Person Profile (Advanced) — engine v0.5
Synthetic person records with 65 jointly-distributed attributes — demographics, health, behavioral, financial — calibrated against public US reference data.
Demographic-first generator. Returns synthetic person records with 65 jointly-distributed attributes across 9 domains: identity, geography, social, financial, behavioral, health basics, health conditions, healthcare utilization, and medications. Each marginal distribution cites a public source (ACS 2022, NHANES 2017-2020, CDC NDSS, KFF 2023, MEPS 2022, BLS 2023, USPS L005 2024).
Cross-field invariants are enforced: BMI = weight_kg / (height_cm / 100)², ZIP matches state per USPS SCF ranges, and insulin is assigned only to records with diabetes. Output is deterministic for a given seed.
Three locales are supported: en-US at full fidelity, and en-GB / en-IN with locale-native identity attributes and en-US health data as a fallback, disclosed via locale_data_source. Cohort engines (T2DM, Rx pharmacy, and future engines) wrap this generator with longitudinal records and are versioned separately.
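The cross-field invariants can be spot-checked on the client side. The following is a minimal sketch, not the service's own validation: the function name `check_invariants`, the rounding tolerance, and the rule that any `diabetes_status` beginning with `diagnosed` counts as diabetic are all assumptions made for illustration. The field names come from the example output.

```python
def check_invariants(record: dict, tolerance: float = 0.1) -> list[str]:
    """Return a list of invariant violations found in one person record."""
    problems = []

    # BMI should equal weight_kg / (height_cm / 100)^2, up to rounding.
    height_m = record["height_cm"] / 100
    expected_bmi = record["weight_kg"] / height_m**2
    if abs(expected_bmi - record["bmi"]) > tolerance:
        problems.append(f"bmi {record['bmi']} != expected {expected_bmi:.1f}")

    # Insulin should appear only on records with diabetes.
    # Assumption: a status starting with "diagnosed" means the person is diabetic.
    is_diabetic = record["diabetes_status"].startswith("diagnosed")
    if record["on_insulin"] and not is_diabetic:
        problems.append("on_insulin is true for a non-diabetic record")

    return problems


# Values copied from the example output on this page.
sample = {
    "height_cm": 171.8, "weight_kg": 76.4, "bmi": 25.9,
    "diabetes_status": "diagnosed_t2dm", "on_insulin": False,
}
print(check_invariants(sample))  # prints []
```

The ZIP-to-state invariant is left out because checking it would require the USPS SCF range table, which is not reproduced here.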
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| count | integer | optional | 1 | Number of person records to return. Range: 1–100. |
| seed | integer | optional | (time-derived, non-deterministic) | RNG seed for reproducibility. Same seed + same params = byte-identical records. |
| locale | string | optional | en-US | Locale: en-US, en-GB, en-IN. Health attributes use en-US fallback for en-GB / en-IN (Phase 2 will add native data). |
| idFormat | string | optional | ulid | ID format: ulid, uuidv7, uuid, nanoid, cuid2. |
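As a rough illustration of how these parameters might be combined, the sketch below validates them locally and builds a request URL. It assumes the parameters are sent as query-string arguments on the GET endpoint, which this page does not state explicitly; `build_person_url` is a hypothetical helper, not part of any official client.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.simpleidgen.com/v1/mock/person"
LOCALES = {"en-US", "en-GB", "en-IN"}
ID_FORMATS = {"ulid", "uuidv7", "uuid", "nanoid", "cuid2"}


def build_person_url(count=1, seed=None, locale="en-US", id_format="ulid"):
    """Validate the documented parameters and return a request URL.

    Assumption: parameters travel in the query string.
    """
    if not 1 <= count <= 100:
        raise ValueError("count must be between 1 and 100")
    if locale not in LOCALES:
        raise ValueError(f"unsupported locale: {locale}")
    if id_format not in ID_FORMATS:
        raise ValueError(f"unsupported idFormat: {id_format}")

    params = {"count": count, "locale": locale, "idFormat": id_format}
    # Omitting seed keeps the documented default (time-derived, non-deterministic).
    if seed is not None:
        params["seed"] = seed
    return f"{BASE_URL}?{urlencode(params)}"


print(build_person_url(count=5, seed=42))
# prints https://api.simpleidgen.com/v1/mock/person?count=5&locale=en-US&idFormat=ulid&seed=42
```

Validating before the request is a design choice for this sketch only; the server presumably performs its own checks.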
Example output
{
"id": "64PG6RYQXXD7XFEKZJ6AW616M7",
"given_name": "Elizabeth", "family_name": "Robinson",
"age": 31, "sex_at_birth": "female",
"race": "white", "ethnicity": "hispanic",
"locale": "en-US", "country": "US", "state": "IL", "urbanicity": "suburban",
"education": "some_college", "insurance_type": "marketplace",
"height_cm": 171.8, "weight_kg": 76.4, "bmi": 25.9, "waist_circumference_cm": 89.1,
"diabetes_status": "diagnosed_t2dm", "family_history_diabetes": true,
"visits_past_year": 7, "number_of_prescriptions": 1, "on_insulin": false
// ... 49 more attributes
}
API call
curl:
curl -s 'https://api.simpleidgen.com/v1/mock/person'
JavaScript:
const res = await fetch('https://api.simpleidgen.com/v1/mock/person');
const data = await res.json();
console.log(data.data);
Python:
import requests
resp = requests.get('https://api.simpleidgen.com/v1/mock/person')
print(resp.json())
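The seed guarantee (same seed and same params give identical records) can be checked with two identical requests. This is a hedged sketch using only the standard library: it assumes `seed` and `count` are passed as query-string parameters and that the records sit under a top-level `data` key, as the JavaScript snippet suggests. `fetch_records` and `is_reproducible` are hypothetical names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.simpleidgen.com/v1/mock/person"


def fetch_records(params: dict):
    """GET the endpoint and return the payload under the "data" key (assumed shape)."""
    with urlopen(f"{BASE_URL}?{urlencode(params)}", timeout=10) as resp:
        return json.load(resp)["data"]


def is_reproducible(params: dict, fetch=fetch_records) -> bool:
    """Fetch twice with identical params and compare the serialized records."""
    first = json.dumps(fetch(params), sort_keys=True)
    second = json.dumps(fetch(params), sort_keys=True)
    return first == second
```

Calling `is_reproducible({"seed": 42, "count": 3})` would issue two live requests. The `fetch` argument exists so the comparison logic can be exercised with a stub instead of the network.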
Endpoint
/v1/mock/person
Multiple datasets — 10 × 200K records
Variance evidence: 10 independent regenerations, 45 pairwise comparisons. Each ~200K-row dataset is generated with a different base seed.
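The figure of 45 comparisons follows from taking every unordered pair among 10 datasets: C(10, 2) = 10 × 9 / 2 = 45. A small sketch of that enumeration is shown below. The dataset labels, the `max_pairwise_gap` helper, and the toy numbers are placeholders for illustration, not results from the service.

```python
from itertools import combinations
from math import comb

# Hypothetical labels for the 10 independently seeded datasets.
dataset_ids = [f"run_{i:02d}" for i in range(1, 11)]

# Every unordered pair of distinct datasets.
pairs = list(combinations(dataset_ids, 2))
print(len(pairs), comb(10, 2))  # prints 45 45


def max_pairwise_gap(stats: dict) -> float:
    """Largest absolute difference of one summary statistic across all pairs."""
    return max(abs(stats[a] - stats[b]) for a, b in combinations(stats, 2))


# Toy values only, to show the call shape.
print(max_pairwise_gap({"a": 1.0, "b": 2.5, "c": 2.0}))  # prints 1.5
```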
Single large dataset — 1 × 2M records
Scale evidence: 2M-row dataset generated in ~60s via the async /v1/datasets/person endpoint. JSONL streamed to S3 via multipart upload.
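At 2M rows in about 60 seconds, the stated throughput works out to roughly 33,000 records per second. Because the output is JSONL (one JSON object per line), a consumer can process it one record at a time instead of loading the whole file. The reader below is a minimal sketch: `iter_jsonl` is a hypothetical helper, the two sample lines are made up, and nothing here reflects the request or response shape of the async endpoint.

```python
import io
import json
from typing import IO, Iterator


def iter_jsonl(stream: IO[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL text stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline
            yield json.loads(line)


# Illustrative two-line stream; a real file would be opened with open(path).
sample = io.StringIO('{"id": "a", "age": 31}\n{"id": "b", "age": 58}\n')
ages = [record["age"] for record in iter_jsonl(sample)]
print(ages)  # prints [31, 58]
```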