§ · Developer docs

Quickstart

Go from a free account to a downloaded synthetic dataset. There are three ways to generate data: the web app (no code), the REST API (async jobs for scripts and pipelines), and the MCP server (for AI agents like Claude and Cursor). All three share one free tier of 5,000 rows per UTC day, and all need a free account.

§ · 1 · Create an account

Register with an email and password, then enter the 6-digit code we email you. Accounts are free and no card is required. Sign-up and login run in the browser (they're behind a bot check), so this first step happens on the site rather than the command line.

§ · 2 · Get your clientId

Every account has a clientId, a UUID that acts as your API credential. Find it on your profile page, or fetch it with GET /v1/auth/me. Keep it private: it authenticates you to the MCP server and is what your daily quota is metered against.
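Because the clientId is a credential, one common pattern is to keep it out of source files and read it from the environment. The following is a minimal sketch of that idea; the variable name SIDGEN_CLIENT_ID is an assumption made for this example, not something the service defines.

```python
import os
import uuid


def load_client_id(env=os.environ, var="SIDGEN_CLIENT_ID"):
    """Read the clientId from the environment and check that it parses as a UUID.

    SIDGEN_CLIENT_ID is a hypothetical variable name chosen for this sketch.
    """
    raw = env.get(var, "").strip()
    if not raw:
        raise RuntimeError(f"{var} is not set")
    try:
        # uuid.UUID raises ValueError for anything that is not a UUID string;
        # str() returns the canonical lowercase, hyphenated form.
        return str(uuid.UUID(raw))
    except ValueError:
        raise RuntimeError(f"{var} does not look like a UUID") from None
```

Failing early on a malformed value tends to give a clearer error than a rejected request later in a pipeline.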

§ · Which interface to use

| You want to… | Use | Auth |
| --- | --- | --- |
| Click through and download a file | Web app at simpleidgen.com | Log in (handled for you) |
| Generate from a script or pipeline, or in bulk up to 1M rows | REST API | Session cookie + clientId |
| Generate from an AI agent (Claude, Cursor, …) | MCP server | Bearer clientId |

§ · Path A · REST API

The REST API runs generation as an async job: submit it, poll until it finishes, then download the files. This is the path for pipelines, CI, and large datasets.

Authenticate. Dataset jobs need a logged-in session. Because login runs in the browser, the simplest way to call the API from a terminal is to log in on the site, then copy the sidgen_session cookie from your browser's dev tools (Application → Cookies). The cookie stays valid for 30 days. Send it with every request, and include your clientId in the JSON body when you submit a job.

1 · Submit a job — 10,000 US person records, reproducible via seed

curl -sS https://api.simpleidgen.com/v1/datasets/person \
  -H "Content-Type: application/json" \
  -b "sidgen_session=YOUR_SESSION_COOKIE" \
  -d '{ "clientId": "YOUR_CLIENT_ID", "count": 10000, "seed": 42 }'

# 202: { "jobId": "...", "status": "pending", "statusUrl": "...", "estimatedSeconds": 5 }
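If you are scripting this step rather than using curl, the same request could be built with Python's standard library. This is a sketch under the assumptions visible in the curl call above (endpoint, cookie name, JSON fields); the function names are made up for illustration.

```python
import json
import urllib.request

API_BASE = "https://api.simpleidgen.com"


def build_person_job_request(session_cookie, client_id, count=10000, seed=None):
    """Build (but do not send) the POST request for /v1/datasets/person."""
    body = {"clientId": client_id, "count": count}
    if seed is not None:
        body["seed"] = seed  # reusing a seed makes the output reproducible
    return urllib.request.Request(
        f"{API_BASE}/v1/datasets/person",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Cookie": f"sidgen_session={session_cookie}",
        },
        method="POST",
    )


def submit(request, opener=urllib.request.urlopen):
    """Send the request and return the decoded JSON body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so an
    over-quota 429 surfaces as an exception rather than a return value.
    """
    with opener(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Typical use would be `job = submit(build_person_job_request(cookie, client_id, seed=42))`, after which `job["jobId"]` feeds the polling step.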

2 · Poll until it's done — status moves pending → running → completed

curl -sS https://api.simpleidgen.com/v1/datasets/YOUR_JOB_ID \
  -b "sidgen_session=YOUR_SESSION_COOKIE"

# when completed, the response carries a downloadUrls object
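In a script, the polling step is usually a small loop with a timeout. The sketch below takes the status fetch as a callable so it stays independent of any HTTP library. It treats any status other than pending, running, or completed as a failure, which is an assumption: this page only names those three states.

```python
import time


def wait_for_job(fetch_status, interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job reports 'completed'.

    fetch_status is any zero-argument callable that returns the parsed JSON
    status document, for example a wrapper around GET /v1/datasets/{jobId}.
    """
    deadline = clock() + timeout
    while True:
        doc = fetch_status()
        status = doc.get("status")
        if status == "completed":
            return doc  # should now carry the downloadUrls object
        if status not in ("pending", "running"):
            # Assumption: anything else (e.g. a failure state) is terminal.
            raise RuntimeError(f"job ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError("job did not complete in time")
        sleep(interval)
```

The interval and timeout defaults are arbitrary; the estimatedSeconds value returned at submit time is a reasonable hint for tuning them.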

3 · Download — every job produces JSONL, CSV, and a manifest

curl -sS https://api.simpleidgen.com/v1/datasets/YOUR_JOB_ID/files/persons.jsonl \
  -b "sidgen_session=YOUR_SESSION_COOKIE" -o persons.jsonl

Generated files are kept for 7 days.
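For large jobs it can help to stream each file to disk instead of reading it into memory. A minimal sketch, assuming you already have a file URL such as the one in the curl call above; the helper name and its parameters are illustrative.

```python
import shutil
import urllib.request


def download_file(url, dest_path, session_cookie, opener=urllib.request.urlopen):
    """Stream one generated file to dest_path and return that path."""
    req = urllib.request.Request(
        url, headers={"Cookie": f"sidgen_session={session_cookie}"}
    )
    # copyfileobj copies in chunks, so memory use stays flat for big files.
    with opener(req) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

The exact shape of the downloadUrls object is not shown on this page, so how you pick the URL out of the status response is left to the OpenAPI spec.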

T2DM patients. For a type 2 diabetes (T2DM) cohort, send the job to /v1/datasets/t2dm instead. It accepts the same count and seed, plus four options: stage (a severity dial, 1–5), cohort (diabetic or general), timeline: true for year-by-year histories, and formats: ["fhir"] for FHIR R4 output.

curl -sS https://api.simpleidgen.com/v1/datasets/t2dm \
  -H "Content-Type: application/json" \
  -b "sidgen_session=YOUR_SESSION_COOKIE" \
  -d '{ "clientId": "YOUR_CLIENT_ID", "count": 500, "stage": 4, "formats": ["fhir"] }'
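When these options come from configuration or user input, validating them before submitting avoids a wasted round trip. The sketch below builds the JSON body from the options described above; the function name is hypothetical and the checks mirror only what this page states (stage 1–5, cohort diabetic or general).

```python
def build_t2dm_body(client_id, count, seed=None, stage=None,
                    cohort=None, timeline=False, fhir=False):
    """Return the JSON body for POST /v1/datasets/t2dm as a dict."""
    if stage is not None and stage not in range(1, 6):
        raise ValueError("stage must be an integer from 1 to 5")
    if cohort is not None and cohort not in ("diabetic", "general"):
        raise ValueError("cohort must be 'diabetic' or 'general'")

    body = {"clientId": client_id, "count": count}
    # Only include optional fields that were actually set, so the server's
    # own defaults apply to everything else.
    if seed is not None:
        body["seed"] = seed
    if stage is not None:
        body["stage"] = stage
    if cohort is not None:
        body["cohort"] = cohort
    if timeline:
        body["timeline"] = True
    if fhir:
        body["formats"] = ["fhir"]
    return body
```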

Full request and response reference: the OpenAPI spec.

§ · Path B · MCP (for AI agents)

For AI agents the MCP server is the simplest path: no session and no browser step. Point any MCP client at https://api.simpleidgen.com/mcp and pass your clientId as a Bearer token.

# Claude Code
claude mcp add --transport http simpleidgen https://api.simpleidgen.com/mcp \
  --header "Authorization: Bearer YOUR_CLIENT_ID"

Then ask in plain English and the agent picks the tool and format. Four tools are available:

| Tool | What it does | Max inline |
| --- | --- | --- |
| generate_people | Synthetic US adults with calibrated health attributes | 100 |
| generate_t2dm_cohort | Staged T2DM patients, severity dial 1–5 | 100 |
| generate_timeline | Year-by-year longitudinal T2DM histories | 25 |
| describe_schema | The data model: attributes and formats (free, no quota) | |

Inline results are capped to stay chat-sized; for larger sets use the REST jobs above. Setup for Cursor, VS Code, and other clients is on the MCP page.
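If you are wiring both paths into one tool, the choice between an inline MCP call and a REST job comes down to the caps in the table. A small sketch of that decision, using the cap values listed on this page; the helper itself is illustrative and not part of any SDK.

```python
# Inline caps per MCP tool, copied from the table on this page.
INLINE_CAPS = {
    "generate_people": 100,
    "generate_t2dm_cohort": 100,
    "generate_timeline": 25,
}


def fits_inline(tool, count):
    """Return True if `count` results from `tool` fit in one inline MCP reply.

    Raises KeyError for tools with no row cap listed (e.g. describe_schema).
    """
    return 1 <= count <= INLINE_CAPS[tool]
```

Anything that does not fit inline would go through the REST job flow instead.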

§ · Quotas & limits

| Limit | Value |
| --- | --- |
| Free tier | 5,000 rows per UTC day (resets 00:00 UTC; over-quota returns HTTP 429) |
| Rows per job | Up to 1,000,000 (default 10,000) |
| Concurrent jobs | 3 in flight per account |
| MCP inline results | 100 people, 25 timelines |
| File retention | Downloads stay available for 7 days |
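Since the quota resets at 00:00 UTC, a pipeline that hits HTTP 429 can work out how long to wait before retrying. The sketch below does that arithmetic. It assumes you track your own row usage client-side, because this page does not describe an endpoint that reports it.

```python
from datetime import datetime, timedelta, timezone

FREE_ROWS_PER_DAY = 5000  # free-tier limit stated on this page


def seconds_until_reset(now=None):
    """Seconds from `now` (a timezone-aware datetime) to the next 00:00 UTC."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


def rows_left(used_today, daily_limit=FREE_ROWS_PER_DAY):
    """Rows still available today, given a client-side usage count."""
    return max(0, daily_limit - used_today)
```

For example, a job submitted at 23:00 UTC that gets a 429 would need to wait about an hour before the quota is available again.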

Need more than 5,000 rows a day, or a one-off larger dataset? It's still free — just ask. See pricing for the details.

§ · Formats & determinism

Every job produces JSONL and CSV, plus a manifest. Add formats: ["fhir"] for US-Core-aligned FHIR R4, delivered as bulk NDJSON per resource. Output is deterministic by seed: the same seed and options return byte-identical records every time, so fixtures and benchmarks stay stable across runs.
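One way to confirm determinism in your own pipeline is to hash the record files from two runs that used the same seed and options. A minimal sketch; it assumes you compare the record files (for example persons.jsonl), since whether the manifest is also identical between jobs is not stated here.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(path_a, path_b):
    """True if the two files are byte-identical (compared by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Storing the hash next to a test fixture gives a cheap check that a regenerated dataset still matches the one your tests were written against.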