The American Healthcare Conundrum

Data Access Fund

Every dollar buys data, not salary. Every analysis is open-source. Every finding is reproducible.

Total Raised $0 of $44,700
Free data found $448.6B/year in fixable waste. These datasets let us prove it at the patient level.
Funding milestones (cumulative):
$3,500 completes Phase 1: Claims Access
$9,700 completes Phase 2: Multi-State + Legal
$44,700 completes Phase 3: Full Medicare (65M patients)
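The phase milestones appear to be running totals of the per-dataset goals listed on the cards further down this page. A minimal sketch of that arithmetic, assuming the goals are exactly the card amounts shown; the names `PHASE_GOALS` and `cumulative_milestones` are illustrative and not part of any published code:

```python
from itertools import accumulate

# Per-dataset goals in USD, copied from the dataset cards on this page.
PHASE_GOALS = {
    "Phase 1: Claims Access": [1_500, 2_000],
    "Phase 2: Multi-State + Legal": [1_500, 3_500, 1_200],
    "Phase 3: Full Medicare (65M patients)": [35_000],
}


def cumulative_milestones(phase_goals):
    """Map each phase name to the running total needed to complete it."""
    phase_totals = [sum(goals) for goals in phase_goals.values()]
    return dict(zip(phase_goals, accumulate(phase_totals)))


milestones = cumulative_milestones(PHASE_GOALS)
for phase, total in milestones.items():
    print(f"${total:,}  {phase}")
```

Under these assumptions the three running totals match the $3,500, $9,700, and $44,700 figures shown in the progress bar.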
🔓
Have data you can share?

Whether you hold an active DUA (CMS LDS, VRDC, a state APCD, HCUP) or your organization licenses a proprietary dataset (Truven/MarketScan, Optum Clinformatics, IQVIA PharMetrics, Definitive Healthcare, Press Ganey, Sage Transparency), get in touch. Your existing access + our published code = findings neither of us could produce alone. Co-publication under your DUA is fine. Donors can be public or anonymous. Code stays open. Findings stay published.

Get in touch

How Sponsorship Works

1. Click "Fund this dataset" on any card below.
2. Pick a suggested amount ($25, $100, $500, $1,500) or type any amount (minimum $5).
3. You're taken to Stripe Checkout (secure, no account required). Card, Apple Pay, and Google Pay accepted.
4. We purchase the data only once a dataset's goal is fully met. Your money is used only for the dataset you chose.
5. You're listed as a sponsor on this page and thanked by name in every issue that uses your data.
Contributions are project funding, not tax-deductible charitable donations.
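The rules in the steps above (a $5 minimum, money earmarked per dataset, purchase only once the goal is fully met) can be sketched as a small model. This is an illustration under stated assumptions, not the site's actual payment logic: the `DatasetFund` class, its methods, and the sponsor names are all hypothetical, and real payments go through Stripe Checkout.

```python
from dataclasses import dataclass, field

MINIMUM_USD = 5  # minimum contribution stated in step 2


@dataclass
class DatasetFund:
    """Hypothetical ledger for one dataset; funds are never pooled across datasets."""
    name: str
    goal_usd: int
    contributions: list = field(default_factory=list)  # (sponsor, amount) pairs

    @property
    def raised_usd(self):
        return sum(amount for _, amount in self.contributions)

    def contribute(self, sponsor, amount_usd):
        """Record a contribution, rejecting amounts below the minimum."""
        if amount_usd < MINIMUM_USD:
            raise ValueError(f"minimum contribution is ${MINIMUM_USD}")
        self.contributions.append((sponsor, amount_usd))

    def ready_to_purchase(self):
        """The data is bought only when the goal is fully met."""
        return self.raised_usd >= self.goal_usd


fund = DatasetFund("CMS Medicare Claims (5% Sample)", goal_usd=1_500)
fund.contribute("Sponsor A", 500)    # hypothetical sponsors
fund.contribute("Sponsor B", 1_000)
print(fund.raised_usd, fund.ready_to_purchase())
```

Keeping one ledger per dataset is what makes the "only used for the dataset you chose" promise straightforward to check.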
Phase 1 — Claims Access
1. CMS Medicare Claims (5% Sample)
$0 raised. Goal: $1,500
Claim-level Medicare data with diagnosis codes, procedure codes, and denial outcomes. Transforms aggregate inference into patient-level evidence.
Data source ↗
Sponsors
Be the first to fund this dataset

Source: CMS via ResDAC. Data Use Agreement required (no fee), 2-3 week approval. 5% random sample of all fee-for-service Medicare: inpatient, outpatient, physician, SNF, home health, hospice, DME.

The problem this solves: Aggregated Medicare reports flatten the questions that matter most. What happens to a patient after a denial? Which procedures add volume without improving outcomes? When a physician's office is acquired by a hospital, does the bill for the same visit double? Claim-level data answers these one event at a time, tied to the patient. Everything else is inference.

2. Colorado All-Payer Claims Database
$0 raised. Goal: $2,000
Every commercial, Medicare, and Medicaid claim in Colorado. The only way to see what private insurers actually pay. 5.5M lives.
Data source ↗
Sponsors
Be the first to fund this dataset

Source: CIVHC (civhc.org). Custom data extract with Data Use Agreement. Commercial, Medicare Advantage, and Medicaid plan coverage. Standard researcher access.

The problem this solves: Commercial insurer prices are set in private. Hospital chargemasters are fiction. PBM rebates hide inside bilateral contracts. An all-payer claims database is the only way to see what a specific insurer actually pays a specific hospital for a specific procedure, what a plan pays a PBM versus what the PBM pays the pharmacy, and whether denial rates differ by payer or diagnosis. Colorado runs one of the most researcher-accessible APCDs in the country.

Grant writers, we need you: Philanthropic funders like the Colorado Health Foundation run RFPs that can underwrite work like this. If you have experience writing health-research grants, consider donating your time to partner with us on a submission. Reach out via the "Get in touch" link above.

Phase 2 — Multi-State + Legal
3. Hospital Discharge Data: California + New York + Florida
$0 raised. Goal: $1,500
Every hospital inpatient discharge in three of the largest state markets. All payers. If a finding holds in CA, NY, and FL, it scales nationally.
Data source ↗
Sponsors
Be the first to fund this dataset

Source: AHRQ HCUP State Inpatient Databases. ~$500/state. California: 400+ hospitals, 3M+ discharges/year. New York: 200+ hospitals. Florida: 200+ hospitals, one of the largest Medicare-age markets in the country. Student/nonprofit discounts available.

The problem this solves: One state is a case study. Three major state markets, covering roughly a quarter of the US population and every major insurance model, is national evidence. Hospital discharge data lets us map procedure volume, DRG severity drift, readmissions, and end-of-life intensity at the hospital level across California, New York, and Florida. Findings that replicate across all three are very hard to dismiss as regional artifacts.

4. Hospital Price Transparency (Full National)
$0 raised. Goal: $3,500
1 billion+ negotiated rate records. What every hospital charges every insurer for every procedure. The most consumer-facing dataset on this list.
Data source ↗
Sponsors
Be the first to fund this dataset

Source: Turquoise Health. Free tier covers 14 procedures; full dataset covers thousands. Research partnership may reduce cost. Derived from CMS-mandated public hospital price disclosures.

The problem this solves: The CMS price transparency rule requires hospitals to publish their negotiated rates in machine-readable format. In practice the files are huge, inconsistent, and only partially compliant. Turquoise has already parsed, normalized, and cross-walked the universe into queryable records. Without it, we cite studies. With it, we can show a reader exactly what their own hospital charges their own insurer for their own procedure, and compare that to the hospital across town.

5. Legal Research (Case Law + Antitrust)
$0 raised. Goal: $1,200
Full US case law. FTC antitrust opinions, merger challenges, PBM litigation, Safe Harbor history. Answers "how did the system get this way?"
Data source ↗
Sponsors
Be the first to fund this dataset

Source: Midpage (midpage.ai). $99/month, cancel anytime. Already configured as a live connector in our research environment. Just needs activation.

The problem this solves: Every regulatory failure in US healthcare leaves a paper trail in the case law. The Anti-Kickback Safe Harbor that shields intermediary fees. The hospital mergers the FTC lost. The fiduciary-duty litigation that could rewire incentive structures inside public companies. Legal research is how you trace a mechanism back to its origin and understand which levers can actually be moved, rather than guessing at symptoms.

Phase 3 — Full Medicare Access
6. CMS Full Medicare Claims (65M Patients)
$0 raised. Goal: $35,000
100% of Medicare claims, including Medicare Advantage. Patient-level, longitudinal. The same data Harvard, Dartmouth, and RAND use. For less than the cost of one junior researcher.
Data source ↗
Sponsors
Be the first to fund this dataset

Source: CMS Virtual Research Data Center (VRDC) via ResDAC. ~$35K first year ($23K renewal). Full DUA + IRB approval required. Virtual-only access (no data downloads). All outputs reviewed by CMS for patient privacy.

The problem this solves: The 5% sample reveals patterns. The full file follows every Medicare patient across every provider, every year. It's the data that connects a denial to the hospitalization three months later, quantifies whether vertically integrated systems route patients to captive providers at higher cost, and settles the fee-for-service-vs-Medicare-Advantage volume debate with individual-level evidence. This is what peer-reviewed research, regulatory citations, and congressional testimony are built on.

What "Open Source" Means Here

What | Shareable?
Analysis code | Yes, always. Every script, every model. Published to GitHub.
Findings | Yes, always. Hospital stats, denial rates, cost comparisons. Published in the newsletter and on GitHub.
Derived tables | Yes, with cell suppression. No cell under 11 observations (CMS privacy rule).
Raw claims | No. Patient privacy. Any researcher can get their own access and run our code.

The model: We buy access. We publish the code. We publish the findings. CMS actually requires researchers to publish as a condition of access. Any researcher with their own DUA can reproduce every result.
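
For derived tables, the "no cell under 11 observations" rule can be sketched as a masking step applied before anything is published. This is a minimal illustration, not the project's published code: the function name, the marker string, and the sample counts are made up, and it assumes zero counts may be shown as-is. It also does not handle complementary suppression (masking additional cells so a hidden value cannot be recovered from row or column totals), which a real release would likely need.

```python
MIN_CELL_SIZE = 11  # threshold stated in the table above


def suppress_small_cells(table, min_cell_size=MIN_CELL_SIZE):
    """Replace any count from 1 to min_cell_size - 1 with a '<N' marker.

    `table` maps a cell label to an integer count. Zero counts are kept,
    which is an assumption of this sketch.
    """
    marker = f"<{min_cell_size}"
    return {
        label: (marker if 0 < count < min_cell_size else count)
        for label, count in table.items()
    }


# Illustrative counts only; not real data.
denials_by_hospital = {"Hospital A": 240, "Hospital B": 7, "Hospital C": 11}
print(suppress_small_cells(denials_by_hospital))
```

Because the masking is a pure function of the counts, any researcher with their own access could run the same step and arrive at the same published table.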