NHANES-Calibrated Synthetic Data
Synthetic person records whose health distributions are fitted to the CDC's NHANES 2017–2020 cycle — A1c, BMI, blood pressure, and diabetes & hypertension prevalence by age and sex — so your test data behaves like a real US population without containing a single real person.
No signup needed for the sample. Generate your own: up to 5,000 rows/day with a free account.
Each health attribute's marginal distribution is fitted to published NHANES 2017–2020 estimates (and CDC NDSS for diabetes). A 55-year-old man's A1c, BMI, and blood-pressure values are drawn from the same distributions NHANES measured for that age and sex, not from uniform noise. Cross-field invariants are enforced: BMI = weight (kg) / (height (cm) / 100)², insulin appears only for diagnosed diabetics, and ZIP matches state.
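The cross-field invariants are easy to spot-check on a downloaded file. The following is a minimal sketch, not SimpleIDGen's own validation code: the field names (`weight_kg`, `height_cm`, `bmi`, `diagnosed_diabetes`, `on_insulin`) and the rounding tolerance are assumptions, so substitute the column names from your export. The ZIP-to-state check is omitted because it needs a lookup table.

```python
import math

def check_invariants(row, tol=0.1):
    """Return the names of the invariants that a single record violates.

    Field names and the tolerance are hypothetical; adjust to your export.
    """
    problems = []

    # BMI must equal weight (kg) divided by height (m) squared.
    expected_bmi = row["weight_kg"] / (row["height_cm"] / 100) ** 2
    if not math.isclose(row["bmi"], expected_bmi, abs_tol=tol):
        problems.append("bmi")

    # Insulin may appear only on records with a diabetes diagnosis.
    if row["on_insulin"] and not row["diagnosed_diabetes"]:
        problems.append("insulin")

    return problems

# Hand-made records for illustration only (not generator output).
ok = {"weight_kg": 81.0, "height_cm": 180.0, "bmi": 25.0,
      "diagnosed_diabetes": False, "on_insulin": False}
bad = dict(ok, bmi=31.0, on_insulin=True)

print(check_invariants(ok))   # []
print(check_invariants(bad))  # ['bmi', 'insulin']
```

Running a check like this over every row of a sample file is a quick way to confirm the invariants before wiring the data into a test suite.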
It is not real NHANES data. NHANES supplies the target distributions; SimpleIDGen generates fresh synthetic people that match them. No NHANES respondent's record is ever reproduced — so there is no protected health information to safeguard.
| Attribute | Calibrated to |
|---|---|
| A1c (HbA1c) | NHANES 2017–2020 glycohemoglobin, by age × sex |
| BMI · height · weight · waist | NHANES 2017–2020 body measures |
| Systolic / diastolic blood pressure | NHANES 2017–2020 blood pressure |
| Diabetes status & prevalence | CDC NDSS 2022 + NHANES 2017–2020 |
| Hypertension | NHANES 2017–2020 |
| Demographics (age, sex, race, geography) | ACS 2022 · US Census 2020 |
| Insurance type | KFF 2023 |
Want the evidence? The Person Profile page publishes a full fidelity report — observed vs. target distributions across 45 pairwise comparisons over millions of generated rows.
This is not real NHANES data; it is entirely synthetic. NHANES provides the target distributions, and we generate fake people that match them. No real respondent's record is reproduced, so there is no PII to protect.
Per-attribute marginals are fitted to NHANES 2017–2020 (and CDC NDSS) by age and sex, with cross-field invariants enforced (BMI, insulin↔diabetes, ZIP↔state). Generation is deterministic by seed — the same seed yields the same people.
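To illustrate what "deterministic by seed" means in practice, here is a minimal sketch of a seeded generator. It is not SimpleIDGen's implementation: the `generate` function, the single age band, the normal distribution, and the parameter table are all assumptions, and the (mean, sd) values are placeholders, not NHANES estimates.

```python
import random

# Placeholder (mean, sd) pairs keyed by (age_band, sex). These are NOT NHANES
# estimates; they only stand in for a fitted lookup table.
PLACEHOLDER_A1C = {
    ("40-59", "M"): (5.8, 0.9),
    ("40-59", "F"): (5.7, 0.8),
}

def generate(seed, n, age_band="40-59"):
    """Generate n toy records; the same seed always yields the same records."""
    rng = random.Random(seed)  # private RNG, independent of global random state
    people = []
    for _ in range(n):
        sex = rng.choice(["M", "F"])
        mean, sd = PLACEHOLDER_A1C[(age_band, sex)]
        # Draw A1c conditional on age band and sex, then round for display.
        people.append({"age_band": age_band, "sex": sex,
                       "a1c": round(rng.gauss(mean, sd), 1)})
    return people

first = generate(seed=42, n=5)
second = generate(seed=42, n=5)
print(first == second)  # True: same seed, same people
```

The key design choice is a private `random.Random(seed)` instance rather than the module-level functions, so other code that touches the global random state cannot change the output.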
No real PII ever enters the system: the data is built from public reference distributions, not learned from real records. That makes it GDPR- and DPDP-safe for environments where production data can't be used.
It is free. The 1,000-row sample above needs no account; a free account generates up to 5,000 rows per day in CSV or JSONL.