A free Mockaroo alternative for realistic person data
Mockaroo is a capable, flexible mock-data tool: many field types, a formula engine, and a schema designer that exports to several formats. Where it generates each column independently, SimpleIDGen draws every attribute on a record from one jointly calibrated US population, so a 44-year-old's age, BMI, A1c, and income cohere instead of colliding. It's free, with a no-login sample you can download right now.
No signup. Generate your own — 5,000 rows/day free →
Most mock-data generators — Mockaroo among them — fill each column from its own list or formula. Age comes from one generator, weight from another, income from a third. Each field looks plausible in isolation, but the row as a whole doesn't: you get teenagers with retiree incomes, or a normal BMI sitting next to a diabetic A1c, because nothing ties the columns together.
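To make that failure mode concrete, here is a toy Python sketch. It is not Mockaroo's or SimpleIDGen's actual code: the two-column schema, the function names, and every probability are invented for illustration. It draws age and retirement status independently, then draws them again with retirement conditioned on age, and counts the rows that pair an age under 30 with retired status.

```python
import random

def independent_row(rng: random.Random) -> dict:
    """Draw each column on its own; no column knows about the others."""
    return {
        "age": rng.randint(18, 90),
        "retired": rng.random() < 0.25,
    }

def conditioned_row(rng: random.Random) -> dict:
    """Draw age first, then draw retirement status given that age."""
    age = rng.randint(18, 90)
    # Toy conditional probabilities, invented for illustration only.
    if age < 50:
        p_retired = 0.0
    elif age < 65:
        p_retired = 0.2
    else:
        p_retired = 0.85
    return {"age": age, "retired": rng.random() < p_retired}

def retired_under_30(rows: list) -> int:
    """Count rows that pair a young age with retired status."""
    return sum(1 for r in rows if r["age"] < 30 and r["retired"])

rng = random.Random(0)
independent = [independent_row(rng) for _ in range(10_000)]
conditioned = [conditioned_row(rng) for _ in range(10_000)]

print("independent, retired under 30:", retired_under_30(independent))
print("conditioned, retired under 30:", retired_under_30(conditioned))
```

The conditioned generator cannot emit a retired 25-year-old, because the probability for that age band is zero. The independent one emits them at roughly the product of the two marginal rates, which is why each column looks fine alone while the rows do not.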
SimpleIDGen takes the opposite approach. Every record is a synthetic person sampled from distributions fitted to public US references — NHANES 2017–2020, ACS 2022, CDC NDSS 2022, KFF 2023, BLS 2023, US Census 2020. Age and sex condition the health values; geography conditions the ZIP; diabetes status gates whether insulin appears. Cross-field invariants are enforced — BMI follows from height and weight, insulin only appears for diagnosed diabetics, and ZIP always matches state. The result is 65 attributes per record that behave like a real cohort, not unrelated random draws. See exactly how the calibration works →
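The three invariants named above can be spot-checked on any record. The Python sketch below is one way to do it, under stated assumptions: the column names (`height_cm`, `weight_kg`, `bmi`, `has_diabetes`, `on_insulin`, `zip`, `state`) are hypothetical and may not match the real export, the BMI tolerance is arbitrary, and the ZIP lookup is a three-entry illustrative table rather than a full reference.

```python
# Tiny illustrative lookup of 3-digit ZIP prefixes; a real check would
# need a complete ZIP-to-state reference table.
ZIP3_TO_STATE = {"100": "NY", "606": "IL", "900": "CA"}

def check_record(rec: dict, bmi_tol: float = 0.1) -> list:
    """Return a list of invariant violations for one record (empty if clean)."""
    problems = []

    # BMI should follow from height and weight: kg / m^2.
    height_m = rec["height_cm"] / 100
    expected_bmi = rec["weight_kg"] / height_m ** 2
    if abs(rec["bmi"] - expected_bmi) > bmi_tol:
        problems.append("bmi does not follow from height and weight")

    # Insulin should appear only for diagnosed diabetics.
    if rec["on_insulin"] and not rec["has_diabetes"]:
        problems.append("insulin without a diabetes diagnosis")

    # ZIP should match state; prefixes missing from the toy table are skipped.
    expected_state = ZIP3_TO_STATE.get(rec["zip"][:3])
    if expected_state is not None and expected_state != rec["state"]:
        problems.append("zip does not match state")

    return problems

good = {"height_cm": 170, "weight_kg": 72.25, "bmi": 25.0,
        "has_diabetes": False, "on_insulin": False,
        "zip": "60601", "state": "IL"}
bad = {"height_cm": 170, "weight_kg": 72.25, "bmi": 31.0,
       "has_diabetes": False, "on_insulin": True,
       "zip": "90001", "state": "NY"}

print(check_record(good))
print(check_record(bad))
```

Running a check like this over a downloaded sample is a quick way to confirm that the rows hold together before you build tests on top of them.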
| Dimension | Mockaroo | SimpleIDGen |
|---|---|---|
| What it is | General-purpose mock-data tool with many field types | Calibrated synthetic person dataset + API |
| How fields relate | Generated independently per column | Jointly distributed across the whole record |
| Population realism | Plausible values, not fitted to a population | Marginals fitted to NHANES, ACS, CDC, KFF, BLS, Census |
| Cross-field invariants | Manual — via formulas you write | Enforced (BMI, insulin↔diabetes, ZIP↔state) |
| Health depth | Generic fields, not clinically modeled | A1c, BMI, blood pressure, diabetes, hypertension, CKD, meds |
| Reproducibility | Fresh random data each run | Deterministic by seed — same seed, same people |
| Output | CSV, JSON, SQL, and more | CSV or JSONL, instant |
| Try before account | Browser tool, free tier | 1,000-row sample, no login required |
This is a qualitative comparison of the general approach. Mockaroo's exact features change over time, so check their site for current details.
If you need an arbitrary schema — invoice numbers, product SKUs, free-text fields, custom column names in a shape you define — Mockaroo's flexibility is hard to beat, and its formula engine handles bespoke logic well. Reach for it when the shape of the data matters more than its statistical realism.
Reach for SimpleIDGen when you specifically need realistic people: demographics, geography, finances, and health basics that hold together under analysis. It's built for testing health and population software, seeding demos that survive a second glance, and training or benchmarking models that would otherwise learn from incoherent rows. No real PII ever enters the system — records are built from public reference distributions, not learned from real people — so the data is GDPR- and DPDP-safe for environments where production data can't be used. Inspect every field on the generator page →
SimpleIDGen is free to use. The 1,000-row sample above needs no account, and a free account generates up to 5,000 rows per day in CSV or JSONL. See the pricing page for the details.
Mockaroo generates each column independently from field types you choose. SimpleIDGen samples whole people from distributions fitted to real US references, so attributes within a record are statistically consistent and cross-field invariants are enforced.
SimpleIDGen does not generate arbitrary custom schemas; that's where a general tool like Mockaroo shines. It produces a fixed, calibrated person schema of 65 attributes. You pick how many rows and which format; the columns are designed to cohere as a population.
Output is reproducible. Generation is deterministic by seed: the same seed always yields the same people, so test fixtures and benchmarks stay stable across runs.
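As a generic illustration of what seed determinism buys you, the Python sketch below uses a stand-in generator built on the standard library's `random.Random`. It is not SimpleIDGen's API; `generate_people`, its two fields, and the distribution parameters are all hypothetical.

```python
import random

def generate_people(seed: int, n: int) -> list:
    """Stand-in generator: all randomness flows from one seeded RNG."""
    rng = random.Random(seed)
    return [
        {
            "age": rng.randint(18, 90),
            # Made-up height distribution, for illustration only.
            "height_cm": round(rng.gauss(170, 10), 1),
        }
        for _ in range(n)
    ]

first_run = generate_people(seed=42, n=100)
second_run = generate_people(seed=42, n=100)
other_seed = generate_people(seed=43, n=100)

print("same seed, same people:", first_run == second_run)
print("different seed, same people:", first_run == other_seed)
```

Because every draw comes from a single RNG constructed from the seed, a fixture regenerated later with the same seed and row count compares equal to the original, so tests can assert on specific records.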
The data contains no real personal information. Every record is synthetic, built from public reference distributions rather than learned from real records. There is no PII to protect, which keeps it GDPR- and DPDP-safe.