7 min read · Sanolith Engineering

What a fail-closed PHI redactor actually does

Most healthcare AI marketing says 'we have a redactor.' Few say what it catches, what happens when it errors, and why fail-closed is non-negotiable.

HIPAA · Engineering · PHI redaction

"Fail-closed" is one of those phrases that sounds like a checkbox until you've watched a fail-OPEN redactor leak PHI in production. We've watched it. This post is about the difference.

What "fail-closed" means

Every PHI redactor has the same shape:

1. Take a prompt
2. Find identifiers in it
3. Replace identifiers with [REDACTED] placeholders
4. Send the redacted prompt downstream

Three things can go wrong in step 2:

  • The redactor crashes (exception)
  • The redactor returns malformed output (parse error)
  • The redactor times out (slow regex, slow ML model)

When any of these happens, the redactor must decide what to do with the original prompt. Two options:

Fail-OPEN: Pass the original (unredacted) prompt through. The argument: "users would rather get a slow but working chat than a hard fail."

Fail-CLOSED: Reject the request entirely. The user sees an error. The original prompt is NOT forwarded.

The correct choice for HIPAA is fail-closed. Always. If you ever route an unredacted prompt to inference because the redactor crashed, you have a PHI breach.
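Here is a minimal sketch of that decision in Python. The names `redact_or_reject`, `RedactionFailed`, `redactor`, and `forward` are hypothetical, made up for illustration. The only point is the control flow: a crash, a timeout surfaced as an exception, or malformed output all end in a rejection, and the original prompt never reaches `forward`.

```python
from typing import Callable


class RedactionFailed(Exception):
    """The request is rejected; the original prompt was not forwarded."""


def redact_or_reject(
    prompt: str,
    redactor: Callable[[str], str],   # hypothetical redactor callable
    forward: Callable[[str], str],    # hypothetical downstream inference call
) -> str:
    try:
        redacted = redactor(prompt)
    except Exception as exc:
        # Crash or timeout. Report only the exception type: the message
        # itself might echo parts of the prompt.
        raise RedactionFailed(type(exc).__name__) from None

    # Malformed output counts as a failure too.
    if not isinstance(redacted, str):
        raise RedactionFailed("malformed redactor output")

    # The only path to the model goes through the redacted text.
    return forward(redacted)
```

A fail-open version would differ by one line: the `except` branch would call `forward(prompt)`. That one line is the breach.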

Most engineering teams default to fail-open when they build their first redactor because it's the more user-friendly behavior. They learn the hard way.

What identifiers a real redactor catches

HIPAA's Safe Harbor de-identification standard lists 18 categories of identifiers. A real redactor catches all 18, plus institutional knowledge that didn't make the formal list:

  • Names (full, first-only, last-only, hyphenated, with apostrophes, with non-ASCII characters)
  • Medical Record Numbers (MRN, varies per institution)
  • Dates related to a patient (DOB, admission, discharge, death, and relative references such as "one day after admission")
  • Phone numbers (US, international, with extensions)
  • Fax numbers
  • Email addresses
  • Social Security Numbers (SSN, including with-dashes, no-dashes, partial)
  • Account numbers
  • Certificate / license numbers
  • Vehicle identifiers (VIN, plate)
  • Device identifiers (serial numbers)
  • URLs
  • IP addresses
  • Biometric identifiers
  • Photographic images (handled at file-upload, not text)
  • Other unique identifying characteristic
  • ITINs
  • Addresses smaller than state level

Plus what's actually scary in clinical text:

  • Provider names
  • Clinic names + locations
  • Unique medical conditions (the only patient at the hospital with a rare cancer is identifiable by that fact alone)
  • Family member names from collateral history
  • Pharmacy IDs, insurance member IDs
  • Geographic subdivisions smaller than 3-digit ZIP

A redactor that catches the 18 formal categories but misses these extras is technically HIPAA-compliant and operationally useless.

How a real redactor is built

The naïve approach is regex. Regex catches DOB and SSN and phone numbers well. It catches names badly. Names are too varied: "Dr. O'Brien-Smith" doesn't fit a single pattern, and "John" by itself is a name AND a common English word.

The right approach is a layered pipeline:

1. Regex layer: high-precision patterns for structured identifiers (SSN, DOB, MRN, phone, email, ITIN). These have unambiguous shapes.
2. NER (named entity recognition) layer: a fine-tuned transformer that flags names, places, organizations. Fast, ~50ms on CPU.
3. Institutional knowledge layer: tenant-specific rules (your hospital's provider list, your formulary, your patient roster). Lookup, not inference.
4. Aggregation layer: combine signals, decide which spans to redact, replace with [REDACTED:CATEGORY] placeholders.
5. Verification layer: re-scan the redacted output for any identifiers the earlier layers might have missed.
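To make the layering concrete, here is a rough Python sketch of layers 1, 3, and 4. The patterns, the `Span` type, and every function name are assumptions made for illustration. The patterns are deliberately simple and not hardened, and the NER layer is left out because it needs a trained model. Each layer only reports spans; the aggregation step merges overlaps and does the replacing.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    start: int
    end: int
    category: str


# Illustrative patterns only; real ones need far more coverage and testing.
REGEX_LAYER = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def regex_layer(text: str) -> list[Span]:
    """Layer 1: structured identifiers with unambiguous shapes."""
    return [
        Span(m.start(), m.end(), category)
        for category, pattern in REGEX_LAYER.items()
        for m in pattern.finditer(text)
    ]


def roster_layer(text: str, roster: list[str]) -> list[Span]:
    """Layer 3: tenant-specific lookup. Exact match, no inference."""
    spans = []
    for name in roster:
        if not name:
            continue  # an empty entry would match everywhere
        for m in re.finditer(re.escape(name), text, flags=re.IGNORECASE):
            spans.append(Span(m.start(), m.end(), "NAME"))
    return spans


def aggregate(text: str, spans: list[Span]) -> str:
    """Layer 4: merge overlapping spans, then replace right to left."""
    merged: list[Span] = []
    for span in sorted(spans, key=lambda s: (s.start, -s.end)):
        if merged and span.start < merged[-1].end:
            last = merged[-1]
            merged[-1] = Span(last.start, max(last.end, span.end), last.category)
        else:
            merged.append(span)
    # Working from the end keeps earlier offsets valid.
    for span in reversed(merged):
        text = text[:span.start] + f"[REDACTED:{span.category}]" + text[span.end:]
    return text


def redact(text: str, roster: list[str]) -> str:
    return aggregate(text, regex_layer(text) + roster_layer(text, roster))
```

Keeping detection (spans) separate from replacement means a new layer can be added without touching the replacement logic, and overlapping hits from two layers collapse into one placeholder.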

The last step is the safety net. Even a good pipeline misses things; a second pass catches misses without changing the redaction algorithm.
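A second pass can be as small as the sketch below. It assumes placeholders shaped like [REDACTED] or [REDACTED:CATEGORY], and it uses two illustrative patterns where a real verifier would use a broader set. All names are hypothetical. This version rejects when it finds a leftover; re-redacting the leftover is another reasonable option.

```python
import re

# Assumed placeholder shape: [REDACTED] or [REDACTED:CATEGORY]
PLACEHOLDER = re.compile(r"\[REDACTED(?::[A-Z_]+)?\]")

# Illustrative residual checks; a real verifier would cover more categories.
RESIDUAL_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


class ResidualIdentifierError(Exception):
    """An identifier survived redaction; the request must be rejected."""


def verify_redacted(redacted: str) -> str:
    # Blank out placeholders so they cannot mask or trigger a match.
    scrubbed = PLACEHOLDER.sub(" ", redacted)
    for category, pattern in RESIDUAL_PATTERNS.items():
        if pattern.search(scrubbed):
            # Report the category only, never the matched text.
            raise ResidualIdentifierError(category)
    return redacted
```

Note that the error carries the category and nothing else. An exception message that quotes the leaked identifier just moves the leak into your logs.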

Where production redactors fail

Three common failure modes:

Catastrophic regex backtracking. A poorly written regex with nested or overlapping quantifiers can take polynomial or even exponential time on certain inputs. A clinician pastes a long chart note, the regex hangs, the request times out. Fail-OPEN systems then send the unredacted text downstream.

Fix: pre-test all regexes against pathological inputs, set hard timeouts on the redactor, fail closed on timeout.
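One way to get a hard timeout in Python, sketched under assumptions: the standard `re` module has no timeout parameter, so this version runs the match in a separate interpreter that can be killed. The names `find_spans_with_deadline` and `RedactorTimeout` are hypothetical. Starting a process per request is slow, so treat this as an illustration of the idea, not a production design.

```python
import json
import subprocess
import sys

# Worker source, run in its own interpreter so a runaway match can be killed.
_WORKER = r"""
import json, re, sys
pattern, text = json.loads(sys.stdin.read())
print(json.dumps([m.span() for m in re.finditer(pattern, text)]))
"""


class RedactorTimeout(Exception):
    """Matching blew the deadline; the caller must reject the request."""


def find_spans_with_deadline(
    pattern: str, text: str, timeout_s: float = 1.0
) -> list[tuple[int, int]]:
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _WORKER],
            input=json.dumps([pattern, text]),
            capture_output=True,
            text=True,
            timeout=timeout_s,  # subprocess.run kills the child on timeout
            check=True,         # a crashed worker raises CalledProcessError
        )
    except subprocess.TimeoutExpired:
        raise RedactorTimeout(f"regex exceeded {timeout_s}s") from None
    return [(start, end) for start, end in json.loads(proc.stdout)]
```

A pattern like `(a+)+$` against a few dozen `a` characters followed by `!` is a handy pathological test input. Both `RedactorTimeout` and a worker crash surface as exceptions, so a fail-closed caller rejects the request in either case.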

NER model OOM. The transformer runs out of memory (GPU or host) on a long prompt. The model errors, or the request hangs.

Fix: chunk the prompt to fit, run NER on each chunk, merge results. Set a memory ceiling.
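A sketch of the chunk-and-merge step, assuming the NER model is exposed as a callable that returns `(start, end, label)` tuples relative to the chunk it was given. That interface, the function names, and the default sizes are all assumptions. The chunks overlap so that an entity straddling a boundary is still seen whole by at least one chunk, which holds as long as the overlap is at least as long as the longest entity you expect.

```python
from typing import Callable, Iterator

Span = tuple[int, int, str]  # (start, end, label), character offsets


def chunk_with_overlap(text: str, size: int, overlap: int) -> Iterator[tuple[int, str]]:
    """Yield (offset, chunk) pairs; consecutive chunks share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("need 0 <= overlap < size")
    step = size - overlap
    for offset in range(0, max(len(text), 1), step):
        yield offset, text[offset:offset + size]
        if offset + size >= len(text):
            break  # this chunk already reached the end of the text


def ner_over_chunks(
    text: str,
    ner: Callable[[str], list[Span]],  # hypothetical model wrapper
    size: int = 2000,
    overlap: int = 200,
) -> list[Span]:
    found: set[Span] = set()
    for offset, chunk in chunk_with_overlap(text, size, overlap):
        for start, end, label in ner(chunk):
            # Shift chunk-relative offsets back into document coordinates.
            # The set removes duplicates from the overlapping region.
            found.add((start + offset, end + offset, label))
    return sorted(found)
```

Because `size` bounds the largest input the model ever sees, it doubles as the memory ceiling: pick it from the longest input the model handles comfortably.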

False-negative on rare names. The NER model wasn't trained on Vietnamese names or West African surnames or hyphenated Indian names. It misses them.

Fix: institutional roster lookup. If a person works at your hospital, their name should be in a database the redactor consults.

The math on fail-closed

If your redactor has a 99.9% success rate and serves 100,000 prompts per day:

  • 99,900 pass through correctly
  • 100 fail

Fail-OPEN: 100 unredacted prompts cross the model boundary. 100 potential breaches.

Fail-CLOSED: 100 user-visible errors. Zero breaches.

The user-experience cost of fail-closed is real but bounded. You can mitigate it with retries, better error messages, and dashboards that surface redactor-failure rate so engineers see it and fix it.
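A sketch of what that mitigation can look like. The retry count, the counter names, and the error text are illustrative assumptions, and `metrics` is a stand-in for whatever feeds your dashboard. Retries stay fail-closed: every attempt either yields redacted text or counts as a failure, and no path returns the original prompt.

```python
from collections import Counter
from typing import Callable

# Stand-in for a real metrics client; a dashboard would plot
# redactor_failures / redactor_attempts over time.
metrics: Counter = Counter()


class RedactionFailed(Exception):
    """All attempts failed; the request is rejected."""


def redact_with_retries(
    prompt: str,
    redactor: Callable[[str], str],  # hypothetical redactor callable
    attempts: int = 3,
) -> str:
    for _ in range(attempts):
        metrics["redactor_attempts"] += 1
        try:
            redacted = redactor(prompt)
        except Exception:
            metrics["redactor_failures"] += 1
            continue
        if isinstance(redacted, str):
            return redacted
        metrics["redactor_failures"] += 1  # malformed output
    # User-facing message: clear, actionable, and free of prompt content.
    raise RedactionFailed("We could not process this request safely. Please try again.")
```

Retries help with transient failures such as a momentary timeout. They do nothing for deterministic ones, like a regex that always hangs on the same input, which is why the failure rate needs to be visible to engineers.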

The cost of fail-open is unbounded. Every PHI breach is a potential investigation, a potential settlement, and a potential lifetime breach-notification obligation.

Fail-closed is the only HIPAA-compatible choice. Anyone who tells you otherwise is selling a less-safe system.