What a fail-closed PHI redactor actually does
Most healthcare AI marketing says 'we have a redactor.' Few say what it catches, what happens when it errors, and why fail-closed is non-negotiable.
"Fail-closed" is one of those phrases that sounds like a checkbox until you've watched a fail-OPEN redactor leak PHI in production. We've watched it. This post is about the difference.
What "fail-closed" means
Every PHI redactor has the same shape:
1. Take a prompt
2. Find identifiers in it
3. Replace identifiers with [REDACTED] placeholders
4. Send the redacted prompt downstream
Three things can go wrong in step 2:
- The redactor crashes (exception)
- The redactor returns malformed output (parse error)
- The redactor times out (slow regex, slow ML model)
When any of these happens, the redactor must decide what to do with the original prompt. Two options:
Fail-OPEN: Pass the original (unredacted) prompt through. The argument: "users would rather get a slow but working chat than a hard fail."
Fail-CLOSED: Reject the request entirely. The user sees an error. The original prompt is NOT forwarded.
The correct choice for HIPAA is fail-closed. Always. If you ever route an unredacted prompt to inference because the redactor crashed, you have a PHI breach.
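Here is a minimal sketch of that rule in Python. The names `forward_fail_closed`, `RedactionFailed`, `redact`, and `send` are hypothetical, invented for illustration; the only point is that no code path hands the original prompt to the downstream call.

```python
from typing import Callable


class RedactionFailed(Exception):
    """Raised instead of forwarding anything when redaction cannot be trusted."""


def forward_fail_closed(
    prompt: str,
    redact: Callable[[str], str],  # hypothetical redactor
    send: Callable[[str], str],    # hypothetical downstream inference call
) -> str:
    """Redact, then forward. On any redactor failure, forward nothing."""
    try:
        redacted = redact(prompt)
    except Exception as exc:
        # Redactor crashed: reject the request. The message deliberately
        # omits the prompt so PHI does not end up in error logs.
        raise RedactionFailed("redactor raised an exception") from exc
    if not isinstance(redacted, str):
        # Malformed output is treated exactly like a crash.
        raise RedactionFailed("redactor returned malformed output")
    # Only the redacted text ever reaches the downstream call.
    return send(redacted)
```

The fail-open version of this function is the same code with `return send(prompt)` in the two failure branches. That one-line difference is the breach.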
Most engineering teams default to fail-open when they build their first redactor because it's the more user-friendly behavior. They learn the hard way.
What identifiers a real redactor catches
HIPAA's Safe Harbor de-identification method lists 18 categories of identifiers. A real redactor catches all 18, plus the clinical and institutional details that didn't make the formal list:
- Names (full, first-only, last-only, hyphenated, with apostrophes, with non-ASCII characters)
- Medical Record Numbers (MRN, varies per institution)
- Dates relative to a patient (DOB, admission, discharge, death; within 1 day of admission)
- Phone numbers (US, international, with extensions)
- Fax numbers
- Email addresses
- Social Security Numbers (SSN, including with-dashes, no-dashes, partial)
- Account numbers
- Certificate / license numbers
- Vehicle identifiers (VIN, plate)
- Device identifiers (serial numbers)
- URLs
- IP addresses
- Biometric identifiers
- Photographic images (handled at file-upload, not text)
- Any other unique identifying number, characteristic, or code (ITINs fall here)
- Health plan beneficiary numbers
- Addresses smaller than state level
Plus what's actually scary in clinical text:
- Provider names
- Clinic names + locations
- Unique medical conditions (the only patient at the hospital with a rare cancer is identifiable by that fact alone)
- Family member names from collateral history
- Pharmacy IDs, insurance member IDs
- Geographic subdivisions smaller than 3-digit ZIP
A redactor that catches the first 18 but not the last 6 is technically HIPAA-compliant and operationally useless.
How a real redactor is built
The naïve approach is regex. Regex catches DOB and SSN and phone numbers well. It catches names badly. Names are too varied: "Dr. O'Brien-Smith" doesn't fit a single pattern, and "John" by itself is a name AND a common English word.
The right approach is a layered pipeline:
1. Regex layer: high-precision patterns for structured identifiers (SSN, DOB, MRN, phone, email, ITIN). These have unambiguous shapes.
2. NER (named entity recognition) layer: a fine-tuned transformer that flags names, places, organizations. Fast, ~50ms on CPU.
3. Institutional knowledge layer: tenant-specific rules (your hospital's provider list, your formulary, your patient roster). Lookup, not inference.
4. Aggregation layer: combine signals, decide which spans to redact, replace with [REDACTED:CATEGORY] placeholders.
5. Verification layer: re-scan the redacted output for any identifiers the earlier layers might have missed.
The last step is the safety net. Even a good pipeline misses things; a second pass catches misses without changing the redaction algorithm.
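To make the layering concrete, here is a rough sketch of the regex, aggregation, and verification layers. Everything in it is an assumption for illustration: the three patterns are simplified, the `Detector` interface and function names are invented, and the NER and institutional layers are left out (they would plug in as additional detectors returning the same span tuples).

```python
import re
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end, category)
Detector = Callable[[str], List[Span]]

# Illustrative patterns only; real ones need tuning per institution.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def regex_layer(text: str) -> List[Span]:
    """Regex layer: structured identifiers with unambiguous shapes."""
    return [(m.start(), m.end(), cat)
            for cat, pat in PATTERNS.items()
            for m in pat.finditer(text)]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Aggregation: sort spans and fuse overlaps, keeping the first category."""
    merged: List[Span] = []
    for start, end, cat in sorted(spans):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_cat = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_cat)
        else:
            merged.append((start, end, cat))
    return merged


def redact(text: str, detectors: List[Detector],
           verifiers: Optional[List[Detector]] = None) -> str:
    spans = merge_spans([s for d in detectors for s in d(text)])
    out, cursor = [], 0
    for start, end, cat in spans:
        out.append(text[cursor:start])
        out.append(f"[REDACTED:{cat}]")
        cursor = end
    out.append(text[cursor:])
    redacted = "".join(out)
    # Verification: re-scan the output. Any hit means the request is
    # rejected (fail closed) rather than forwarded.
    for check in (verifiers if verifiers is not None else detectors):
        if check(redacted):
            raise RuntimeError("verification pass found a residual identifier")
    return redacted


print(redact("Pt SSN 123-45-6789, call 555-867-5309", [regex_layer]))
```

Passing broader or stricter detectors as `verifiers` is one way to let the second pass catch what the first pass missed; re-running the same detectors only guards against bugs in the replacement step.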
Where production redactors fail
Three common failure modes:
Catastrophic regex backtracking. A poorly written regex with nested or overlapping quantifiers can take exponential time on certain inputs. A clinician pastes a long chart note, the regex hangs, the request times out. Fail-OPEN systems then send the unredacted text downstream.
Fix: pre-test all regexes against pathological inputs, set hard timeouts on the redactor, fail closed on timeout.
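One way to get a genuinely hard timeout in Python is to run the regex in a child process and kill it when the budget runs out. Python's standard `re` module has no timeout argument, and a stuck thread cannot be killed, so a thread-based timer is not enough. The sketch below is an assumption-laden illustration: `regex_redact_with_deadline`, `RedactionTimeout`, and the inline child script are hypothetical, and spawning an interpreter per request is far too slow for production, where a pool of worker processes would be the more likely design.

```python
import subprocess
import sys


class RedactionTimeout(Exception):
    """The redactor exceeded its time budget; reject the request."""


# Hypothetical child script: applies one regex to stdin, writes to stdout.
_CHILD = (
    "import re, sys\n"
    "pattern = re.compile(sys.argv[1])\n"
    "sys.stdout.write(pattern.sub('[REDACTED]', sys.stdin.read()))\n"
)


def regex_redact_with_deadline(pattern: str, prompt: str, timeout_s: float) -> str:
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern],
            input=prompt,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # run() kills the child when this expires
            check=True,         # a crashed child raises CalledProcessError
        )
    except subprocess.TimeoutExpired:
        # Fail closed: there is no fallback to the original prompt.
        raise RedactionTimeout(f"regex exceeded {timeout_s}s") from None
    return done.stdout
```

The same function doubles as a pre-deployment test: feed each production pattern a pathological input such as `"a" * 64 + "b"` and treat a `RedactionTimeout` as a failed check.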
NER model OOM. The transformer runs out of memory on a long prompt. The model errors out, and the request fails or hangs.
Fix: chunk the prompt to fit, run NER on each chunk, merge results. Set a memory ceiling.
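A sketch of the chunk-and-merge step, under some loud assumptions: `ner` is a stand-in for whatever model call you use and is assumed to return `(start, end, label)` offsets relative to the chunk it was given; chunking here is by characters, whereas a real transformer is limited by tokens; and the function names are invented.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label), offsets into the full prompt


def chunk_offsets(n: int, size: int, overlap: int) -> List[Tuple[int, int]]:
    """Windows of at most `size` chars; neighbours share `overlap` chars."""
    if not 0 <= overlap < size:
        raise ValueError("need 0 <= overlap < size")
    windows, start = [], 0
    while True:
        end = min(start + size, n)
        windows.append((start, end))
        if end == n:
            return windows
        start = end - overlap


def ner_chunked(
    text: str,
    ner: Callable[[str], List[Span]],  # hypothetical model call
    size: int = 2000,
    overlap: int = 200,
) -> List[Span]:
    found = set()
    for start, end in chunk_offsets(len(text), size, overlap):
        for s, e, label in ner(text[start:end]):
            # Shift chunk-relative offsets back into prompt coordinates.
            # The set drops duplicates found in overlapping windows.
            found.add((start + s, start + e, label))
    return sorted(found)
```

The overlap exists so that an entity straddling a window boundary appears whole in at least one window, which holds as long as the entity is no longer than `overlap`. A window may still report a truncated fragment of the same entity, so the merged spans should go through an overlap-fusing step before redaction.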
False-negative on rare names. The NER model wasn't trained on Vietnamese names or West African surnames or hyphenated Indian names. It misses them.
Fix: institutional roster lookup. If a person works at your hospital or is on its patient roster, their name should be in a database the redactor consults.
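A roster lookup can be as plain as one compiled alternation over known names. The sketch below assumes the roster fits in memory and that exact (case-insensitive) matches are enough; the function names and sample names are made up, and a real deployment would also need nicknames, initials, and "Last, First" orderings.

```python
import re
import unicodedata
from typing import Iterable


def build_roster_matcher(names: Iterable[str]) -> "re.Pattern[str]":
    """One case-insensitive alternation over NFC-normalised roster names."""
    cleaned = {unicodedata.normalize("NFC", n.strip()) for n in names if n.strip()}
    if not cleaned:
        raise ValueError("roster is empty")
    # Longest names first so a full name wins over a shorter prefix of it.
    ordered = sorted(cleaned, key=lambda n: (-len(n), n))
    alternation = "|".join(re.escape(n) for n in ordered)
    # Lookarounds instead of \b so names ending in punctuation still match.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def redact_roster(text: str, matcher: "re.Pattern[str]") -> str:
    # Normalise the text the same way as the roster, so composed and
    # decomposed accents compare equal. The result is the normalised text.
    normalised = unicodedata.normalize("NFC", text)
    return matcher.sub("[REDACTED:NAME]", normalised)


matcher = build_roster_matcher(["Nguyễn Thị Minh", "O'Brien-Smith"])
print(redact_roster("Seen by dr. o'brien-smith; pt Nguyễn Thị Minh stable.", matcher))
```

Because this is lookup rather than inference, it does not care whether a name was well represented in anyone's training data.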
The math on fail-closed
If your redactor has 99.9% success rate and serves 100,000 prompts per day:
- 99,900 pass through correctly
- 100 fail
Fail-OPEN: 100 unredacted prompts cross the model boundary. 100 potential breaches.
Fail-CLOSED: 100 user-visible errors. Zero breaches.
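If you want to plug in your own volume and success rate, the arithmetic is a few lines. This is only the calculation above restated; the function and key names are made up, and `Fraction` is used so that 99.9% is represented exactly instead of as a float.

```python
from fractions import Fraction


def daily_outcomes(prompts_per_day: int, success_rate: Fraction) -> dict:
    # Redactor failures per day, rounded to a whole number of prompts.
    failures = round(prompts_per_day * (1 - success_rate))
    return {
        "redacted_ok": prompts_per_day - failures,
        "fail_open_unredacted_forwarded": failures,
        "fail_closed_user_errors": failures,
        "fail_closed_unredacted_forwarded": 0,
    }


print(daily_outcomes(100_000, Fraction(999, 1000)))
```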
The user-experience cost of fail-closed is real but bounded. You can mitigate it with retries, better error messages, and dashboards that surface redactor-failure rate so engineers see it and fix it.
The cost of fail-open is unbounded. Every PHI breach is a potential investigation, a potential settlement, and a potential lifetime breach-notification obligation.
Fail-closed is the only HIPAA-compatible choice. Anyone who tells you otherwise is selling a less-safe system.