A Synthea alternative for synthetic patient data without FHIR
Synthea builds full longitudinal EHRs, and it does that well. But if all you need is a table of calibrated patient demographics, vitals, and condition flags, you do not need Java, FHIR bundles, or a build step. SimpleIDGen returns that table as a flat CSV you can download now.
No signup, no install. See every field on the Person Profile page →
Synthea, from MITRE, is an open-source synthetic-patient generator. It simulates a person's clinical life from birth: encounters, diagnoses, medications, and observations unfold over a synthetic lifetime, then export as FHIR, C-CDA, or relational CSV. If you are testing an EHR integration, a FHIR pipeline, or an interoperability workflow that expects rich longitudinal records, Synthea is the right tool, and it is excellent at it.
That depth has a footprint. You clone and build a Java project, configure disease modules, run a generation pass, then parse EHR-shaped output. For longitudinal clinical realism, that is a fair trade. For a simple demographics-and-vitals table, it is more machinery than the job needs.
Plenty of work needs one row per person, not a lifetime of encounters: seeding a database, building ML features, populating a dashboard, or demoing an app. For that, SimpleIDGen generates a cross-sectional snapshot of 65 attributes per record, including identity, geography, vitals (A1c, BMI, blood pressure, height, weight, waist), and condition flags (diabetes, hypertension, CKD, and more).
The values are not independent noise. Marginals are fitted to NHANES 2017–2020 and ACS 2022 and drawn jointly by age and sex, so a 58-year-old man's A1c, BMI, and blood pressure cohere the way they would in a real population. Cross-field invariants hold: BMI follows from weight and height, insulin appears only for diagnosed diabetics, and ZIP matches state. Generation is deterministic by seed: the same seed returns the same people. See the calibration detail on the NHANES-calibrated data page →
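If you want to spot-check those cross-field invariants on a downloaded file, a minimal sketch might look like the one below. The column names (`weight_kg`, `height_cm`, `bmi`, `has_diabetes`, `on_insulin`) and the three sample rows are assumptions made up for illustration, not the documented schema or real output; swap in the field names from the Person Profile page.

```python
import csv
import io

# Illustrative rows only. Column names are hypothetical, not the documented schema.
SAMPLE_CSV = """person_id,weight_kg,height_cm,bmi,has_diabetes,on_insulin
1,80.0,175.0,26.1,0,0
2,95.0,168.0,33.7,1,1
3,62.0,160.0,24.2,1,0
"""


def check_row(row, tol=0.1):
    """Return a list of invariant violations for one record."""
    problems = []
    # BMI should equal weight (kg) divided by height (m) squared, up to rounding.
    height_m = float(row["height_cm"]) / 100.0
    expected_bmi = float(row["weight_kg"]) / height_m ** 2
    if abs(expected_bmi - float(row["bmi"])) > tol:
        problems.append("bmi does not match weight and height")
    # Insulin should appear only alongside a diabetes flag.
    if row["on_insulin"] == "1" and row["has_diabetes"] != "1":
        problems.append("insulin without a diabetes diagnosis")
    return problems


def find_violations(csv_text):
    """Map person_id to its violations, skipping records that pass."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["person_id"]: p for row in reader if (p := check_row(row))}


violations = find_violations(SAMPLE_CSV)
print(violations)
```

An empty result means every row passed both checks. The tolerance of 0.1 assumes BMI is rounded to one decimal place, which is also an assumption about the file.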
| Dimension | Synthea | SimpleIDGen |
|---|---|---|
| What it models | Longitudinal patient histories — encounters, conditions, meds, observations over a synthetic lifetime | One row per person — present-state demographics, vitals & condition flags |
| Output | FHIR, C-CDA, or relational CSV (EHR-shaped) | Flat CSV or JSONL (a single table) |
| Setup | Clone & build a Java project, configure modules | None — download a sample or call a hosted API |
| Population calibration | Driven by clinical modules and published incidence rates | Marginals fitted to NHANES 2017–2020 & ACS 2022, jointly by age and sex |
| Clinical depth | Deep — care plans, claims, full clinical record | Basics — A1c, BMI, blood pressure, body measures, condition flags |
| Best for | EHR / FHIR pipelines, interoperability, clinical workflows | Analytics, ML features, demos, test fixtures |
| Cost | Free, open source | Free sample (no login); free account, 5,000 rows/day |
Different shapes for different jobs. Synthea models a clinical life; SimpleIDGen describes a population at a moment.
SimpleIDGen replaces Synthea only for the cross-sectional case. If you need longitudinal EHRs, FHIR resources, or full clinical histories, stay with Synthea. If you need a calibrated table of patient demographics and vitals, SimpleIDGen is faster to get and simpler to load.
SimpleIDGen does not produce FHIR, by design. Output is flat CSV or JSONL, one record per person. That is the point: no FHIR bundles to parse and no resource graph to flatten before you can load a dataframe.
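To show what "one record per person" means in practice, here is a minimal sketch that reads JSONL with only the Python standard library. The field names (`age`, `sex`, `a1c`, `hypertension`) and the three sample records are hypothetical, invented for illustration rather than taken from the actual schema.

```python
import json
from statistics import mean

# Illustrative JSONL only. Field names are hypothetical, not the documented schema.
SAMPLE_JSONL = """\
{"person_id": 1, "age": 58, "sex": "M", "a1c": 6.1, "hypertension": true}
{"person_id": 2, "age": 34, "sex": "F", "a1c": 5.2, "hypertension": false}
{"person_id": 3, "age": 71, "sex": "F", "a1c": 7.4, "hypertension": true}
"""


def load_jsonl(text):
    """Parse one JSON object per non-empty line into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


people = load_jsonl(SAMPLE_JSONL)

# Each record is already a flat dict, so filtering and aggregating need no unnesting.
hypertensive = [p for p in people if p["hypertension"]]
mean_a1c = mean(p["a1c"] for p in hypertensive)
print(len(people), len(hypertensive), round(mean_a1c, 2))
```

Because every line is a complete, flat record, the same list of dicts can be handed directly to a dataframe constructor or a bulk database insert.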
No record contains real patient data. Every record is synthetic, built from public reference distributions (NHANES, ACS, CDC NDSS) and never learned from real records. No real PII enters the system, so it is GDPR- and DPDP-safe.
You do not need an account or an install to start. The 1,000-row sample downloads with no account; a free account generates up to 5,000 rows per day in CSV or JSONL. No Java, no runtime, no build.
Each record has 65 fields: identity, geography, social, financial, behavioral, vitals, and condition flags. See the full field list on the Person Profile page.