virtualbackroom.ai - Analysis & Methodology - FDA Warning Letter Pattern Analysis

Data Definitions (Strict, Non-Overlapping)

As of the April 17, 2026 snapshot, deficiency categorization uses strict non-overlapping clause definitions. Each deficiency falls into exactly one category, eliminating the prior overlap between CAPA and Nonconforming Product. The mapping is:

Category	21 CFR Clause	ISO 13485:2016 Clause
CAPA	`820.100`	`8.5.2` / `8.5.3`
Nonconforming Product	`820.90`	`8.3`
Complaint Files	`820.198`	`8.2.2`
Design Controls	`820.30`	`7.3`
Document Controls	`820.40`	`4.2.4`
Records	`820.180`	`4.2.5`

What changed: The April 17 snapshot moved 820.100 into CAPA (where it always belonged) and split 820.90 out as its own Nonconforming Product category. Each clause now maps to exactly one category, so category counts sum to total deficiencies without double-counting.

Data Source

Source: U.S. Food & Drug Administration, Inspections, Compliance, Enforcement, and Criminal Investigations (ICECI) warning letter database
URL: FDA Warning Letters
Letters Analyzed: 224 warning letters issued to medical device companies
Snapshot Date: April 17, 2026 (full re-aggregation with strict non-overlapping categorization)

Date Range of Letters

Year	Letters
2021	63
2022	25
2023	36
2024	45
2025	46
2026	9
Total	224 (all dated)

Why 224 and not ~3,400? The FDA's warning letter database contains approximately 3,400 total entries across all FDA-regulated industries — drugs, biologics, food, tobacco, veterinary, and medical devices. Our analysis filtered to medical device warning letters only, which represent roughly 6.5% of all FDA warning letters. You can verify this yourself on the FDA warning letters page by selecting "Medical Devices" under the product type filter.

What we did NOT do: We did not filter to a single quarter. We analyzed the full set of medical device warning letters available in FDA's database as of April 17, 2026. The dataset represents FDA's enforcement priorities over the past 5 years — not a single-quarter snapshot.

Extraction Pipeline

Step 1: Letter Retrieval

Warning letters were fetched programmatically from FDA's public website. For each letter, we captured:

Company name
Device type
CMS case number
Issue date
Full letter content (HTML text extraction)

Step 2: AI-Assisted Deficiency Extraction

Each letter was processed individually through Claude Opus 4.6 (Anthropic, via OpenRouter) with the following parameters:

Parameter	Value	Note
Temperature	0.2	Low creativity, high consistency
Max Tokens	8,000	Per letter analysis
Retry Logic	3x	Exponential backoff

The AI was instructed to extract from each letter:

Specific CFR/ISO clause cited (e.g., "21 CFR 820.30(g)")
QMSR/ISO 13485 equivalent mapping (e.g., "Clause 7.3")
Short title of the deficiency
Severity classification (Critical / Major / Minor)
Description of what was found deficient
QMSR transition impact
Root causes identified
Risk score (0–100)

The AI output was required in strict JSON format. Non-conforming outputs were retried up to 3 times with exponential backoff.
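The retry behavior described above can be sketched roughly as follows. This is an illustration, not the pipeline's actual code: `call_model` is a hypothetical stand-in for the API call, `REQUIRED_KEYS` is an assumed validation rule based on the prompt's JSON structure, and the 1-second base delay is an assumption (the source states only "3x, exponential backoff").

```python
import json
import time
from typing import Callable

# Assumed validation rule: top-level keys taken from the prompt's JSON structure.
REQUIRED_KEYS = {"company", "device", "deficiencies", "root_causes", "risk_score"}


def extract_with_retry(
    call_model: Callable[[], str],   # hypothetical: returns the raw model response
    max_attempts: int = 3,
    base_delay_s: float = 1.0,       # assumed; the source does not state the delay
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call the model until it returns a JSON object with the expected keys."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            parsed = json.loads(call_model())
            if not isinstance(parsed, dict):
                raise ValueError("response is not a JSON object")
            missing = REQUIRED_KEYS - parsed.keys()
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            return parsed
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay_s * 2 ** attempt)  # delay doubles after each failure
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts") from last_error
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.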

Step 3: Aggregation & Statistical Analysis

Extracted deficiencies were aggregated across all 224 letters using deterministic code (not AI). All counts, percentages, co-occurrence calculations, and category rollups were computed programmatically in Python.

What AI Did vs. What Code Did

This distinction matters. Critics of AI-assisted analysis are right to ask where the AI stops and the math starts.

Step	Performed By	Verifiable?
Fetching letters from FDA	Code (HTTP requests)	Yes (source URLs logged)
Extracting deficiency clauses from letter text	AI (Claude Opus 4.6)	Partially (requires reading source letters)
Mapping 820 clauses to QMSR equivalents	AI (Claude Opus 4.6)	Yes (mappings follow published crosswalk)
Classifying severity (Critical/Major/Minor)	AI (Claude Opus 4.6)	Partially (subjective judgment call)
Assigning risk scores (0–100)	AI (Claude Opus 4.6)	Partially (subjective judgment call)
Counting deficiencies per category	Code (Python aggregation)	Yes (deterministic count)
Calculating co-occurrence patterns	Code (Python aggregation)	Yes (deterministic intersection)
Computing averages, percentages, distributions	Code (Python math)	Yes (deterministic math)

In plain English: The AI reads the letters and extracts what's cited. The code counts things up. If you disagree with a number, the question is whether the AI correctly identified what was in the letter — not whether we added wrong.

Known Limitations & Honest Caveats

1. The "Other" Category (262 deficiencies, 22.7%)

262 of 1,152 deficiencies were categorized as "Other", meaning the keyword classifier found no match against a standard QSR deficiency category. These include premarket authorization violations (510(k)/PMA issues), FD&C Act violations, and MDR-adjacent findings. We did not force-fit them into categories to make the numbers look cleaner.

2. Severity Classification Is Subjective

The AI classified 966 of 1,152 deficiencies as "Critical" (83.9%). This skew likely reflects the nature of warning letters themselves — FDA only sends warning letters for serious violations, so the baseline severity is high. A different AI model or a human reviewer might classify the same deficiency differently. We used a low temperature (0.2) for consistency, but acknowledge this is a judgment call.

3. Risk Scores Are Relative, Not Absolute

Min	Max	Mean	Median
62	98	91.8	92

The tight clustering at the top of the scale reflects that these are warning letters — the worst of the worst by definition. A company receiving a warning letter is already in serious trouble. The risk score is useful for ranking severity within the dataset, not for comparing against companies that haven't received warning letters.

4. QMSR Clause Mapping Consistency

The AI occasionally produced slightly different formatting for the same clause:

ISO 13485:2016 Clause 8.2.2 vs. ISO 13485:2016 Clause 8.2.2 (Complaint Handling)
ISO 13485:2016 Clause 8.5.2 / 8.5.3 vs. ISO 13485:2016 Clause 8.5.2 (Corrective Action) and 8.5.3 (Preventive Action)

We aggregated using primary clause numbers (8.2.2, 8.5.2, 7.3, etc.) for the co-occurrence analysis. The top-cited clause list in the report shows them as the AI labeled them, which is why some appear as separate line items. This slightly undercounts clause totals in the ranked list but does not affect the category-level analysis.
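Reducing a label to its primary clause number could look something like the sketch below. The regular expression and the `primary_clause` helper are illustrative assumptions, not the pipeline's code; the approach simply takes the first dotted number in the label, which skips "13485:2016" because it contains no dot.

```python
import re
from typing import Optional

# Matches dotted clause numbers such as "8.2.2", "8.5.2" or "7.3".
CLAUSE_NUMBER = re.compile(r"\b\d+(?:\.\d+)+\b")


def primary_clause(label: str) -> Optional[str]:
    """Return the first dotted clause number in an AI-generated label, if any."""
    match = CLAUSE_NUMBER.search(label)
    return match.group(0) if match else None


# Both formatting variants from the example above collapse to the same key.
print(primary_clause("ISO 13485:2016 Clause 8.2.2"))
print(primary_clause("ISO 13485:2016 Clause 8.2.2 (Complaint Handling)"))
```

Both calls print `8.2.2`, so the two variants would be counted together.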

5. 820-to-13485 Mapping Validation

The clause mappings (e.g., 820.100 → 8.5.2/8.5.3, 820.30 → 7.3, 820.198 → 8.2.2) follow the FDA's published QMSR crosswalk and are consistent with industry-standard references from MasterControl, PQE Group, and The FDA Group. These are not proprietary interpretations — they're the same mappings every RA/QA professional uses.

6. What This Dataset Does Not Include

483 observations — We analyzed warning letters only, not Form 483 inspectional observations. Warning letters represent escalated enforcement actions.
Consent decrees or injunctions — Not included.
Warning letters to drug, biologic, or food companies — Filtered to medical devices only.
International enforcement — FDA only. No EU notified body actions, Health Canada, or TGA.

How to Verify Our Numbers

Check the category math

Design Controls (271) + Other (262) + CAPA (105) + Complaint Files (103) + MDR Reporting (67) + Process Validation (63) + Management Responsibility (49) + Purchasing Controls (44) + Quality Audit (40) + Nonconforming Product (30) + Document Controls (30) + Production & Process Controls (25) + Device History Record (18) + Records (16) + Personnel (11) + Acceptance Activities (6) + Traceability (3) + Quality System Record (3) + Servicing (3) + Statistical Techniques (1) + Distribution (1) + Installation (1) = 1,152

This matches our reported total. Each deficiency lands in exactly one category under the strict non-overlapping definitions.
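The same sum can be re-run in a few lines of Python. The counts are copied from this page; the variable names are illustrative only.

```python
# Category counts as published on this page (April 17, 2026 snapshot).
category_counts = {
    "Design Controls": 271,
    "Other": 262,
    "CAPA": 105,
    "Complaint Files": 103,
    "MDR Reporting": 67,
    "Process Validation": 63,
    "Management Responsibility": 49,
    "Purchasing Controls": 44,
    "Quality Audit": 40,
    "Nonconforming Product": 30,
    "Document Controls": 30,
    "Production & Process Controls": 25,
    "Device History Record": 18,
    "Records": 16,
    "Personnel": 11,
    "Acceptance Activities": 6,
    "Traceability": 3,
    "Quality System Record": 3,
    "Servicing": 3,
    "Statistical Techniques": 1,
    "Distribution": 1,
    "Installation": 1,
}

total = sum(category_counts.values())
print(len(category_counts), "categories,", total, "deficiencies")
# prints: 22 categories, 1152 deficiencies
```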

Check the risk score math

Critical 90–100 (170) + High 75–89 (53) + Medium 50–74 (1) + Low 0–49 (0) = 224 letters

This matches our reported total.

Check the severity math

Critical (966) + Major (185) + Minor (1) = 1,152 deficiencies

This matches our reported total.

Check the co-occurrence claims

56 letters cite both CAPA and Design Controls. 69 cite both Design Controls and Complaint Files. 41 cite both CAPA and Complaint Files. These were computed by counting letters where both categories appear — a straightforward set intersection under the strict definitions (CAPA = 820.100, Complaint Files = 820.198, Design Controls = 820.30).
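As a minimal sketch of that set intersection, assuming each letter has already been reduced to the set of categories it cites (the letter IDs and data below are hypothetical toy values, not taken from the dataset):

```python
from typing import Dict, Set


def letters_citing_both(
    letter_categories: Dict[str, Set[str]], first: str, second: str
) -> int:
    """Count letters whose category set contains both categories."""
    return sum(1 for cats in letter_categories.values() if {first, second} <= cats)


# Hypothetical toy data for illustration only.
toy_letters = {
    "letter-001": {"CAPA", "Design Controls", "Complaint Files"},
    "letter-002": {"Design Controls", "Complaint Files"},
    "letter-003": {"CAPA"},
}

print(letters_citing_both(toy_letters, "Design Controls", "Complaint Files"))  # 2
print(letters_citing_both(toy_letters, "CAPA", "Design Controls"))             # 1
```

Because each letter contributes a set, a letter with several Design Controls deficiencies still counts once.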

Check the percentage claims

Claim	Calculation	Result
"68% cite design controls"	153 of 224 letters (820.30 / Clause 7.3)	68.3% → 68%
"36% cite CAPA"	81 of 224 letters (strict: 820.100 only)	36.2% → 36%
"76% scored 90+"	170 of 224 letters	75.9% → 76%
"17% cite supplier (purchasing) controls"	39 of 224 letters (820.50)	17.4% → 17%

Spot-check the source data

Pick any company from the FDA warning letter database. Read their letter. Compare what we extracted against what's in the letter. We will happily provide our extracted data for any specific letter upon request.

Fact Check & Traceability

Every claim in our analysis is traceable to a verifiable source. Below is a line-by-line fact check of the key numbers, with direct links to the source data so you can confirm each one independently.

Dataset Scope Claim

Claim	How to Verify	Source
"224 FDA warning letters"	Go to the FDA warning letter search page. Filter by Product = "Medical Devices", date range 2021–2026. Count results. Note: FDA's ~3,400 total includes all industries — medical devices are ~6.5% of that.	FDA Warning Letters Search
"219 unique companies"	4 companies received multiple warning letters during the period. You can verify by searching for repeat company names in the FDA database within the 2021–2026 date range.	FDA Warning Letters Search
"1,152 total deficiencies"	Sum all 22 deficiency categories from our report. Each deficiency traces to a specific clause citation in a specific letter. Our report JSON contains the per-letter breakdown.	Internal: `exports/q2_2026_warning_letter_report_2026-04-17.json`

Top Deficiency Claims

Claim	How to Verify	Source
"Design Controls is #1 at 271 deficiencies (24%)"	Search FDA letters for citations of 21 CFR 820.30 (or any sub-clause 820.30(a)–(j)). Our report aggregates these under "Design Controls." 271 ÷ 1,152 = 23.5%, rounded to 24%.	21 CFR 820.30 (eCFR)
"CAPA at 105 deficiencies (9%)"	Search for citations of 21 CFR 820.100 (and ISO 13485 Clauses 8.5.2 / 8.5.3). 105 ÷ 1,152 = 9.1%. Under the strict non-overlapping definitions, 820.90 (Nonconforming Product) is its own category — not folded into CAPA.	21 CFR 820.90 (eCFR)
"820.100(a) is the most-cited single clause (66 citations)"	This is the specific corrective action sub-clause. Count letters that cite "820.100(a)" explicitly. Our AI extraction identified it in 66 distinct letters.	21 CFR 820.100 (eCFR)

Co-occurrence Claims

Claim	Count	How to Verify	Method
CAPA + Design Controls	56 letters	Count letters where both CAPA (820.100) and Design Controls (820.30) appear in the same letter under the strict definitions.	Set intersection of per-letter deficiency categories
Design Controls + Complaint Files	69 letters	Count letters where both Design Controls (820.30) and Complaint Files (820.198) citations appear.	Set intersection of per-letter deficiency categories
CAPA + Complaint Files	41 letters	Count letters where both CAPA (820.100) and Complaint Files (820.198) citations appear.	Set intersection of per-letter deficiency categories

Note on co-occurrence counting (April 17 snapshot): The report JSON uses strict, non-overlapping CFR/ISO category matching (CAPA = 820.100, Complaint Files = 820.198, Design Controls = 820.30). The numbers shown above (56 / 69 / 41) come from this strict matching. Earlier articles citing 66 / 74 / 49 used broader pre-snapshot matching that folded 820.90 into CAPA — those numbers are superseded by the strict counts here.

Percentage & Risk Score Claims

Claim	Calculation	Result	Verified?
"68% cite design controls"	Letters with ≥1 Design Controls deficiency ÷ 224	153 / 224 = 68.3%	Confirmed
"36% cite CAPA"	Letters with ≥1 CAPA deficiency (820.100 strict) ÷ 224	81 / 224 = 36.2%	Confirmed
"76% scored 90+"	Letters with risk_score ≥ 90 ÷ 224	170 / 224 = 75.9% ≈ 76%	Confirmed
"Average risk score 91.8"	Sum of all risk scores ÷ 224	20,563.2 / 224 = 91.8	Confirmed

QMSR Crosswalk Claims

Every clause mapping in our analysis follows the FDA's published crosswalk from 21 CFR Part 820 to QMSR (ISO 13485:2016). These mappings are independently verifiable:

Old QSR Citation	QMSR/ISO 13485 Equivalent	Independent Verification
`820.30` Design Controls	`ISO 13485:2016 §7.3`	FDA QMSR Page
`820.90` Nonconforming Product	`ISO 13485:2016 §8.3`	QMSR Final Rule (89 FR 7496)
`820.100` Corrective and Preventive Action	`ISO 13485:2016 §8.5.2 / 8.5.3`	21 CFR 820 (eCFR)
`820.198` Complaint Handling	`ISO 13485:2016 §8.2.2`	FDA QMSR Page
`820.184` Device History Record	`ISO 13485:2016 §4.2.5`	QMSR Final Rule (89 FR 7496)
`820.75` Process Validation	`ISO 13485:2016 §7.5.6`	21 CFR 820 (eCFR)

Traceability Chain

Every number flows through a traceable chain from source to published claim:

1. Source

FDA.gov warning letter page — publicly accessible, URL preserved for each letter in our database

2. Extraction

AI reads the letter, outputs structured JSON — logged with SHA-256 hash, timestamp, and full audit trail (see ALCOA+ section)

3. Aggregation

Python code counts, groups, and calculates percentages — deterministic, no AI involved, fully reproducible

4. Publication

Numbers appear in articles and this methodology page — every claim maps back to step 3 output

Regulatory References & Official Sources

Every regulatory citation in our analysis links back to an official government or standards-body source. Below are the authoritative references used throughout.

FDA & U.S. Government Sources

FDA Warning Letters Database

Primary data source. Searchable database of all FDA warning letters across all product areas.

21 CFR Part 820 — Quality System Regulation

The original QSR that governed medical device quality systems. Replaced by the QMSR, effective Feb 2, 2026.

QMSR Final Rule (89 FR 7496)

The final rule establishing the Quality Management System Regulation, incorporating ISO 13485:2016 by reference.

openFDA API — Device Classification

Used for device classification enrichment (product codes, device classes). Does not affect deficiency analysis.

FDA GMLP — Good Machine Learning Practice

10 guiding principles for AI/ML in medical devices. Our AI boundaries and quality measurement align with Principle #9 (users are provided clear, essential information), the transparency principle.

International Standards & Frameworks

ISO 13485:2016 — Medical Device QMS

The international standard for medical device quality management systems, now incorporated by reference into the QMSR.

ISO 14971:2019 — Risk Management

Risk management standard for medical devices. Referenced in the context of risk scores and severity classifications.

EU MDR (2017/745) & IVDR (2017/746)

European medical device and in vitro diagnostic regulations. Referenced for international regulatory context.

Crosswalk & Transition References

We verified our QSR-to-QMSR clause mappings against these authoritative sources:

FDA QMSR Official Page

FDA's official Quality Management System Regulation page with transition guidance, compliance dates, and crosswalk resources.

QMSR Final Rule — Federal Register (89 FR 7496)

The complete final rule text including the official clause-by-clause crosswalk from 21 CFR 820 to ISO 13485:2016.

PQE Group — QMSR Final Rule Analysis

Independent regulatory consulting analysis of the QMSR transition with clause mapping guidance.

Raw Data Access

Full transparency means you should be able to examine the data yourself. Below is everything we used to produce the analysis, downloadable and inspectable.

Complete Analysis Report

JSON • ~130 KB

The full report (April 17, 2026 snapshot) covering 224 warning letters across 219 unique companies, aggregated statistics, 22 strict non-overlapping deficiency categories, co-occurrence patterns, risk score distributions, root cause analysis, and year-over-year trends. This is the exact data behind every number in our articles.

Download Report JSON

Source Letters

FDA.gov

Every warning letter in our dataset links back to its original page on FDA.gov. Each letter is publicly available and can be read in full. Our database preserves the original URL for each of the 224 letters analyzed.

Browse FDA Warning Letters

What’s in the Report JSON

Field	Type	Description
`summary`	Object	Aggregate totals: 224 letters, 1,152 deficiencies, 219 unique companies, average risk score 91.8, date range
`deficiency_categories`	Object	All 22 deficiency categories with counts (e.g., Design Controls: 271, CAPA: 105)
`top_cited_clauses`	Array	Most-cited 21 CFR 820 clauses ranked by frequency (820.100(a) = 66 citations at #1)
`top_qmsr_clauses`	Array	Equivalent QMSR/ISO 13485:2016 clauses with mappings
`co_occurrence_patterns`	Array	Which deficiency categories appear together most often (Design+Complaint: 69 letters)
`risk_score_distribution`	Object	Breakdown by tier: Critical 170, High 53, Medium 1, Low 0
`severity_distribution`	Object	Critical: 966, Major: 185, Minor: 1
`year_distribution`	Object	Letters per year: 2021 (63), 2022 (25), 2023 (36), 2024 (45), 2025 (46), 2026 (9)
`per_company_detail`	Array[219]	Per-company records: company name, letter date, device, risk score, deficiency count, top clauses cited
`common_root_causes`	Array	Most frequent root causes identified across all letters
`device_class_distribution`	Object	Distribution of warning letters by FDA device classification (Class I/II/III)

Data integrity note: The downloadable JSON is the same file used to generate every statistic on this page and in our published articles. It is not a summary or subset — it is the complete output of the analysis pipeline. You can load it in any JSON viewer, spreadsheet tool, or scripting language and independently verify every claim we make.

Technical Appendix

Full technical details of every component in the analysis pipeline. This is the "show your work" section — everything a technical reviewer needs to evaluate or reproduce this analysis.

1. Exact AI Prompt — Verbatim

Each of the 224 warning letters was analyzed individually (not batched). The following prompt was sent to Claude Opus 4.6 for every letter:

System Message

You are an FDA regulatory compliance expert. Always respond with valid JSON only, no markdown code blocks.

User Prompt Template

You are an FDA regulatory compliance expert specializing in medical device quality systems.
Analyze this FDA Warning Letter and extract structured findings.

Company: {company}
Device: {device}
CMS Number: {cms_number}
Date: {date}

Warning Letter Content:
{first 12,000 characters of letter content}

Respond ONLY with valid JSON (no markdown, no code blocks) in this exact structure:
{
    "company": "{company}",
    "device": "{device}",
    "deficiencies": [
        {
            "clause": "the specific CFR/ISO clause cited (e.g., 21 CFR 820.30, ISO 13485:2016 Clause 7.3)",
            "qmsr_mapping": "the equivalent QMSR/ISO 13485 clause number (e.g., 7.3, 8.5, 8.2.1)",
            "title": "short title of the deficiency",
            "severity": "critical|major|minor",
            "description": "brief description of what was found deficient",
            "qmsr_impact": "how this maps to QMSR requirements"
        }
    ],
    "root_causes": ["list of systemic root causes identified"],
    "risk_score": 0-100 (higher = more serious compliance risk),
    "qmsr_readiness_impact": "paragraph explaining implications for QMSR transition"
}

Content window: The first 12,000 characters of each letter's full text were passed to the model. Most warning letters fall well within this window. Letters exceeding 12,000 characters were truncated, so the AI analyzed only the text provided and any deficiency cited after the cutoff was not extracted.
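The content window amounts to a simple slice. The sketch below also returns a flag for letters that were cut off; that flag is an assumption added for illustration, since the source does not say whether the pipeline records truncation.

```python
from typing import Tuple

CONTENT_WINDOW = 12_000  # characters, as stated above


def apply_content_window(
    letter_text: str, limit: int = CONTENT_WINDOW
) -> Tuple[str, bool]:
    """Return the text sent to the model and whether the letter was cut off."""
    return letter_text[:limit], len(letter_text) > limit


window, was_truncated = apply_content_window("x" * 15_000)
print(len(window), was_truncated)  # 12000 True
```

A flag like this would make it easy to list which letters deserve a manual read of the untransmitted tail.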

2. AI Parameters

Parameter	Value	Why
`model`	`anthropic/claude-opus-4.6`	Highest accuracy for regulatory document parsing; chosen for structured extraction quality
`provider`	OpenRouter API	Unified API gateway for multi-model access with consistent billing and rate limiting
`temperature`	`0.2`	Low creativity for high consistency: we want repeatable extraction, not creative interpretation (a low temperature reduces run-to-run variation but does not make output strictly deterministic)
`max_tokens`	`8,000`	Sufficient for complex letters with 10+ deficiencies; smaller models use 3,000 tokens
`retry_logic`	3 attempts, exponential backoff	Handles transient API failures and rate limits without losing progress
`output_format`	Strict JSON	Non-conforming responses (markdown code blocks, prose) are stripped and re-parsed; failures trigger retry
`content_window`	12,000 characters	First 12K chars of each letter's extracted text; covers the full regulatory content of most letters
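One way to implement the "stripped and re-parsed" step for `output_format` is to keep only the text between the first `{` and the last `}` before parsing. This is a hedged sketch of that idea, not the pipeline's code; `parse_model_json` is a hypothetical helper name.

```python
import json


def parse_model_json(raw: str) -> dict:
    """Drop any markdown fence or prose around the JSON object, then parse it."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in response")
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed JSON,
    # which a caller can treat as the trigger for a retry.
    return json.loads(raw[start:end + 1])


wrapped = "Here is the analysis:\n" + '{"risk_score": 92, "deficiencies": []}' + "\nDone."
print(parse_model_json(wrapped)["risk_score"])  # 92
```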

3. Data Integrity: ALCOA+ Audit Trail

Every AI analysis is logged with a full ALCOA+ compliance trail (Attributable, Legible, Contemporaneous, Original, Accurate, Complete, Consistent, Enduring, Available). ALCOA+ is the data integrity framework regulators expect for electronic records in regulated environments, including records subject to 21 CFR Part 11.

Per-Letter Metadata

Each analyzed letter stores the following integrity fields:

Field	Type	Purpose
`analyzed_at`	ISO 8601 timestamp	When the analysis was performed (contemporaneous)
`ai_provider`	String	Which API gateway was used (`openrouter`)
`ai_model`	String	Exact model identifier (`anthropic/claude-opus-4.6`)
`response_hash`	SHA-256	Cryptographic hash of the raw AI response (tamper detection)
`latency_ms`	Integer	Round-trip time in milliseconds (performance tracking)
`audit_log_id`	Integer	Foreign key to the centralized AI audit log table
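The integrity fields above could be assembled along these lines. This is a sketch under assumptions: the source specifies SHA-256 but not the text encoding or digest format, so UTF-8 and a hex digest are assumed here, and `integrity_fields` is a hypothetical helper name.

```python
import hashlib
from datetime import datetime, timezone


def integrity_fields(raw_response: str, latency_ms: int) -> dict:
    """Build per-letter integrity metadata for one raw AI response."""
    return {
        "analyzed_at": datetime.now(timezone.utc).isoformat(),  # ISO 8601 timestamp
        "ai_provider": "openrouter",
        "ai_model": "anthropic/claude-opus-4.6",
        # Assumed: hash the UTF-8 bytes of the raw response, stored as hex.
        "response_hash": hashlib.sha256(raw_response.encode("utf-8")).hexdigest(),
        "latency_ms": latency_ms,
    }


fields = integrity_fields('{"risk_score": 92}', latency_ms=1200)
print(len(fields["response_hash"]))  # 64 hex characters
```

Re-hashing the stored response and comparing it with `response_hash` is what makes later tampering detectable.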

Centralized Audit Log Entry

In addition to per-letter metadata, every analysis creates a record in the AI audit log table containing:

prompt (first 2,000 chars) — what was sent to the AI
response (first 2,000 chars) — what the AI returned
model — exact model used
feature — warning_letter_analyzer
request_type — fda_warning_letter_analysis
regulatory_standards — ["21 CFR 820", "ISO 13485:2016", "QMSR"]
latency_ms, success (boolean), error_message (if failed)
response_hash_full — full SHA-256 hash
alcoa_plus_compliant — boolean flag

Aggregate Trail

{
    "total_analyses": 224,
    "audit_logged": 224,
    "compliance_status": "compliant",
    "data_integrity_verified": true
}

If any analysis failed to log an audit entry, compliance_status would show "partial" instead of "compliant".

4. Deficiency Category Classification

Deficiencies were classified into categories using deterministic keyword matching — not AI. The classifier concatenates each deficiency's clause, qmsr_mapping, and title fields into a single string, then checks for the first matching keyword from this static map:

Keyword Pattern	Category
`820.30` or `7.3`	Design Controls
`820.100` or `8.5.2` or `8.5.3`	CAPA
`820.90`	Nonconforming Product
`820.198` or `8.2.2`	Complaint Files
`820.40` or `4.2.4`	Document Controls
`820.22` or `8.2.4`	Quality Audit
`820.75` or `7.5.6`	Process Validation
`820.184`	Device History Record
`820.186`	Quality System Record
`820.180` or `4.2`	Records
`820.50` or `7.4`	Purchasing Controls
`803` or `8.2.3`	MDR Reporting
`820.80` or `8.2.6`	Acceptance Activities
`820.250`	Statistical Techniques
`820.70` or `7.5`	Production & Process Controls
`820.25` or `6.2`	Personnel
`820.20` or `5.`	Management Responsibility
(no match)	Other

Why "Other" is 262 of 1,152 deficiencies (22.7%): These are deficiencies citing premarket authorization requirements (510(k)/PMA violations), FD&C Act statutory violations, registration/listing requirements, and other regulatory provisions that don't map to a standard QSR subsystem category. We did not force-fit them to inflate category-level numbers.
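A first-match keyword classifier over the map above might look like the following sketch. Plain substring matching and the space-joined search string are assumptions; the source says only that the fields are concatenated and checked for the first matching keyword.

```python
# Order matters: the first matching entry wins, as in the table above.
KEYWORD_MAP = [
    (("820.30", "7.3"), "Design Controls"),
    (("820.100", "8.5.2", "8.5.3"), "CAPA"),
    (("820.90",), "Nonconforming Product"),
    (("820.198", "8.2.2"), "Complaint Files"),
    (("820.40", "4.2.4"), "Document Controls"),
    (("820.22", "8.2.4"), "Quality Audit"),
    (("820.75", "7.5.6"), "Process Validation"),
    (("820.184",), "Device History Record"),
    (("820.186",), "Quality System Record"),
    (("820.180", "4.2"), "Records"),
    (("820.50", "7.4"), "Purchasing Controls"),
    (("803", "8.2.3"), "MDR Reporting"),
    (("820.80", "8.2.6"), "Acceptance Activities"),
    (("820.250",), "Statistical Techniques"),
    (("820.70", "7.5"), "Production & Process Controls"),
    (("820.25", "6.2"), "Personnel"),
    (("820.20", "5."), "Management Responsibility"),
]


def classify(deficiency: dict) -> str:
    """Return the first category whose keyword appears in the combined fields."""
    haystack = " ".join(
        str(deficiency.get(field, "")) for field in ("clause", "qmsr_mapping", "title")
    )
    for keywords, category in KEYWORD_MAP:
        if any(keyword in haystack for keyword in keywords):
            return category
    return "Other"


print(classify({"clause": "21 CFR 820.30(g)", "qmsr_mapping": "7.3", "title": "Design validation"}))
print(classify({"clause": "21 CFR 807.81", "qmsr_mapping": "N/A", "title": "Failure to submit 510(k)"}))
```

The two calls print `Design Controls` and `Other`. Ordering does real work here: `4.2.4` must be tested before `4.2`, and `820.250` before `820.25`. Because this sketch uses plain substring matching, a clause such as 820.200 would also match the `820.20` keyword; the published map does not say how the real classifier handles such cases.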

5. Co-occurrence Algorithm

Co-occurrence patterns identify which clause pairs appear together in the same warning letter most frequently. The algorithm:

For each letter, extract the set of unique QMSR clause mappings from all its deficiencies
Generate all sorted pairwise combinations from that set (order-independent)
Increment a counter for each pair across all 224 letters
Report the top 15 pairs by frequency

Example

Letter A has deficiencies mapped to: [7.3, 8.2.2, 8.5.2]
Pairs generated: (7.3, 8.2.2), (7.3, 8.5.2), (8.2.2, 8.5.2)

Letter B has deficiencies mapped to: [7.3, 8.2.2]
Pairs generated: (7.3, 8.2.2)

Result: (7.3, 8.2.2) = 2 letters, (7.3, 8.5.2) = 1 letter, (8.2.2, 8.5.2) = 1 letter
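The four steps and the worked example can be reproduced with `itertools.combinations` and a `Counter`. This is a minimal sketch of the described algorithm; the function and variable names are illustrative, not the pipeline's.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def co_occurrence(
    letters: Dict[str, List[str]], top_n: int = 15
) -> List[Tuple[Tuple[str, str], int]]:
    """Count how many letters contain each unordered pair of clause mappings."""
    pair_counts: Counter = Counter()
    for clauses in letters.values():
        # set() removes repeats within a letter; sorted() makes pairs order-independent.
        for pair in combinations(sorted(set(clauses)), 2):
            pair_counts[pair] += 1
    return pair_counts.most_common(top_n)


example = {
    "Letter A": ["7.3", "8.2.2", "8.5.2"],
    "Letter B": ["7.3", "8.2.2"],
}
for pair, count in co_occurrence(example):
    print(pair, count)
```

The first line printed is `('7.3', '8.2.2') 2`, followed by the two pairs that appear in one letter each, matching the result above.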

Report vs. article numbers: The top-pairs table produced by this algorithm uses exact QMSR clause strings as output by the AI (e.g., "ISO 13485:2016 Clause 7.3"). The 66 / 74 / 49 co-occurrence numbers cited in earlier articles used broader category matching, checking whether a letter's deficiency text contains "820.30" OR "7.3" anywhere (for Design Controls) and folding 820.90 into CAPA. Category matching captures all variants of the same regulatory area regardless of how the AI formatted the clause string, which is why category-level counts are higher than the exact-match clause pairs. As of the April 17 snapshot, the earlier figures are superseded by the strict category counts of 56 / 69 / 41.

6. Report Generation Pipeline

The analysis runs in three sequential phases:

Phase 1: Fetch

HTTP scrape FDA warning letter pages
Extract company, date, CMS number, URL
Deduplicate by letter_id and cms_number
Store in PostgreSQL warning_letters table
Incremental mode skips already-fetched letters

Phase 2: Analyze

Fetch full letter content from FDA URL
Extract text via HTML parsing (trafilatura)
Send to Claude Opus 4.6 (one letter per API call)
Parse JSON response, validate structure
Store analysis JSON in database
Log ALCOA+ audit trail

Phase 3: Aggregate

Load all analyzed letters from database
Count clauses, categories, severities
Calculate risk score distribution
Compute co-occurrence pairs
Generate per-company detail
Export to exports/ as JSON

Key design decision: Each letter is analyzed individually, not batched. This means the AI sees only one letter per API call and cannot be influenced by content from other letters. It also means each analysis is independently cacheable, retriable, and auditable.

7. Risk Score Distribution

Minimum	Maximum	Mean	Median
62	98	91.8	92

Risk Tier	Score Range	Letters	% of Total
Critical	90 – 100	170	75.9%
High	75 – 89	53	23.7%
Medium	50 – 74	1	0.4%
Low	0 – 49	0	0.0%
Total		224	100%

Interpretation: Risk scores are assigned by the AI based on the severity, number, and systemic nature of deficiencies in each letter. The tight clustering at the high end (75.9% scoring 90+) reflects the nature of warning letters: FDA only issues them for serious, sustained violations. A risk score of 62 (the minimum in our dataset) still represents a significant compliance failure; it simply had fewer individual deficiencies or less systemic impact than the median letter.
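The tier boundaries and percentages in the table can be checked with a short sketch. The `risk_tier` function is an illustrative reading of the score ranges above; the tier counts are copied from the table.

```python
def risk_tier(score: float) -> str:
    """Map a 0-100 risk score to the tier ranges listed in the table."""
    if score >= 90:
        return "Critical"
    if score >= 75:
        return "High"
    if score >= 50:
        return "Medium"
    return "Low"


# Letters per tier, as published in the table above.
tier_counts = {"Critical": 170, "High": 53, "Medium": 1, "Low": 0}
total_letters = sum(tier_counts.values())
tier_share = {
    tier: round(100 * count / total_letters, 1) for tier, count in tier_counts.items()
}

print(total_letters, tier_share)
print(risk_tier(62), risk_tier(98))  # Medium Critical
```

The shares come out to 75.9, 23.7, 0.4 and 0.0 percent of 224 letters, and the dataset minimum of 62 falls in the Medium tier.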

8. Data Enrichment: openFDA API

After fetching and analyzing warning letters, device classification data was enriched using the openFDA Device Enforcement API:

Step	Detail
API endpoint	`https://api.fda.gov/device/enforcement.json`
Search parameter	`recalling_firm:"{company_name}"` (first token before LLC/Inc/Ltd)
Data extracted	Device classification (Class I / II / III), product description
Rate limiting	500 ms delay between requests to avoid API throttling
Fallback	If openFDA has no match, device class remains blank (not imputed)

This enrichment is supplementary — it adds device classification context but does not affect deficiency extraction, clause mapping, severity scoring, or any of the numbers reported in the analysis.
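For illustration, a query URL for the endpoint above could be built as follows. The suffix-stripping rule is one possible reading of "first token before LLC/Inc/Ltd" and is an assumption, as are the helper names and the `limit` of 1; the sketch builds the URL only and makes no network request.

```python
import re
from urllib.parse import urlencode

OPENFDA_ENFORCEMENT_URL = "https://api.fda.gov/device/enforcement.json"

# Assumed rule: drop a trailing LLC/Inc/Ltd suffix and anything after it.
COMPANY_SUFFIX = re.compile(r"[,\s]+(?:LLC|Inc|Ltd)\b\.?.*$", re.IGNORECASE)


def search_name(company: str) -> str:
    """Return the company name without its corporate suffix."""
    return COMPANY_SUFFIX.sub("", company).strip()


def build_query_url(company: str, limit: int = 1) -> str:
    """Build an openFDA enforcement query that searches on recalling_firm."""
    params = {"search": f'recalling_firm:"{search_name(company)}"', "limit": limit}
    return f"{OPENFDA_ENFORCEMENT_URL}?{urlencode(params)}"


# "Acme Medical, Inc." is a made-up company name.
print(build_query_url("Acme Medical, Inc."))
# A real client would pause 0.5 s between requests (e.g. time.sleep(0.5)).
```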

The Bottom Line

This analysis is transparent about what AI did and what code did. The regulatory framework is verified by published crosswalks. The math is deterministic and checkable. The AI's subjective calls (severity, risk scores) are disclosed with their limitations.

If you think a specific number is wrong, tell us which one. We'll show you the source letter and the extraction. That's how reproducible analysis works.

Report generated April 17, 2026 • Analysis engine: Claude Opus 4.6 via OpenRouter • Aggregation and statistics: Python • Raw data available upon request
