How prior authorization automation actually works.
The phrase "prior authorization automation" hides a workflow with seven steps, each independently automatable, with very different difficulty and very different safety properties. This is a walk through what each step does, where automation works well, where it does not work at all, and the architectural pattern that distinguishes a production system from a demo.
A PA workflow is seven steps, not one.
Most "prior authorization automation" pitches treat the workflow as a single decision: given a request, output a determination. Inside a production system, the workflow has at least seven distinct steps — intake, eligibility, duplicate detection, criteria evaluation, evidence assembly, determination, and audit packet generation. Each step has its own data shape, its own failure mode, and its own automation calculus. Treating them as one decision is how vendors end up with systems that demo well and ship poorly.
The seven steps of a prior authorization workflow.
Each step is independently automatable. Each has different inputs, different outputs, and a different cost when it goes wrong.
-
Step 1 · Intake validation
Validate that the incoming request is structurally correct. CPT/HCPCS codes against active code sets, ICD-10 against current versions, beneficiary identifiers against eligibility systems, X12 278 transactions against the schema. The most common failure mode in legacy PA stacks is silent intake failure — bad inputs propagate downstream and surface as inscrutable denials weeks later. Automated intake validation is high-leverage, low-risk, and almost universally underspecified in vendor pitches.
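Intake validation can be sketched as a pure function that returns structured errors instead of failing silently. This is an illustrative sketch, not a real implementation: the code sets, the ICD-10 pattern, and the field names are assumptions, and a production system would load active CPT/HCPCS sets from the CMS quarterly releases and validate the full X12 278 transaction against its schema.

```python
import re
from dataclasses import dataclass, field

# Hypothetical reference data; a real system loads active code sets
# from the CMS quarterly releases, not a hardcoded set.
ACTIVE_CPT = {"70551", "70553", "97110"}
# Simplified ICD-10-CM shape check: letter, digit, alphanumeric,
# optional dot plus 1-4 alphanumerics. Not the full official grammar.
ICD10_PATTERN = re.compile(r"^[A-TV-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")

@dataclass
class IntakeResult:
    valid: bool
    errors: list = field(default_factory=list)

def validate_intake(request: dict) -> IntakeResult:
    """Reject structurally bad requests at the door, so bad inputs
    never propagate downstream as inscrutable denials."""
    errors = []
    for code in request.get("cpt_codes", []):
        if code not in ACTIVE_CPT:
            errors.append(f"CPT {code} not in active code set")
    for dx in request.get("icd10_codes", []):
        if not ICD10_PATTERN.match(dx):
            errors.append(f"ICD-10 {dx} is malformed")
    if not request.get("beneficiary_id"):
        errors.append("missing beneficiary identifier")
    return IntakeResult(valid=not errors, errors=errors)
```

The point of the data shape is that every failure is named and returned to the submitter, which is what makes this step low-risk to automate end-to-end.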
-
Step 2 · Eligibility verification
Verify that the beneficiary is enrolled, the requesting provider is enrolled and credentialed (PECOS for Medicare), and the service is within scope of coverage. Eligibility is a deterministic question — the answer is in a database somewhere — and the automation pattern is therefore an API call, not an LLM. The architectural choice that matters is: which eligibility system is authoritative for this case, and what happens when its data is stale?
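Because eligibility is deterministic, the interesting code is not the lookup but the staleness policy. A minimal sketch, assuming a snapshot dict pulled from whichever system is authoritative and an illustrative 24-hour staleness limit:

```python
from datetime import datetime, timedelta

STALENESS_LIMIT = timedelta(hours=24)  # assumption: tolerate day-old data

def check_eligibility(snapshot: dict, now: datetime) -> str:
    """Deterministic decision over an eligibility snapshot.
    Returns 'eligible', 'ineligible', or 'refresh' when data is stale."""
    if now - snapshot["fetched_at"] > STALENESS_LIMIT:
        return "refresh"  # stale data triggers a re-pull, never a guess
    if not snapshot["beneficiary_enrolled"]:
        return "ineligible"
    if not snapshot["provider_enrolled"]:  # e.g. a PECOS check for Medicare
        return "ineligible"
    return "eligible"
```

Note the asymmetry: stale data is never interpreted, only refreshed. That is the "what happens when its data is stale" question answered in code.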
-
Step 3 · Duplicate request detection
Detect whether this exact case has been submitted before by this provider for this beneficiary in a defined lookback window. Duplicates are a major source of waste in legacy PA workflows — a clinical reviewer spending fifteen minutes on a case that was already decided last week. Detection is straightforward (matching on CPT/ICD/date-range/provider/beneficiary) but requires careful tuning of the deduplication window and the equivalence rules.
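The matching logic described above reduces to an equivalence key plus a lookback window. A sketch, with an assumed 90-day window and an assumed equivalence rule (same beneficiary, provider, procedure set, and diagnosis set); real deployments tune both:

```python
from datetime import date, timedelta

LOOKBACK = timedelta(days=90)  # assumption: 90-day deduplication window

def dedupe_key(req: dict) -> tuple:
    """Equivalence rule: same beneficiary, provider, procedures, diagnoses.
    Frozensets make code order irrelevant to the match."""
    return (req["beneficiary_id"], req["provider_npi"],
            frozenset(req["cpt_codes"]), frozenset(req["icd10_codes"]))

def is_duplicate(new_req: dict, history: list) -> bool:
    """True if an equivalent request exists within the lookback window."""
    key = dedupe_key(new_req)
    return any(
        dedupe_key(prior) == key
        and new_req["service_date"] - prior["service_date"] <= LOOKBACK
        for prior in history
    )
```

The tuning knobs are exactly the two constants the prose names: the window and the equivalence rule. Too tight and duplicates slip through to reviewers; too loose and legitimate resubmissions get swallowed.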
-
Step 4 · Criteria evaluation
Evaluate the case against the applicable medical policy: NCD, LCD, payer medical policy, or commercial criteria like InterQual or MCG. This is the step where most "AI prior auth" deployments fail — not because the LLM cannot read the criteria, but because criteria evaluation requires evidence-grounded reasoning, not generative reasoning. The production pattern is to decompose each criterion into atomic clinical questions and answer each question with explicit evidence citations.
-
Step 5 · Evidence assembly
Assemble the clinical evidence supporting the determination — medical record excerpts, imaging reports, lab results, prior treatment history. Evidence must be bound to specific criteria and timestamped, so the audit packet can be regenerated months later. Most production failures happen here: the model says "the criteria are met" but the evidence trail does not actually contain the documentation a reviewer would need to validate the claim.
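The failure mode described above, a "criteria met" claim whose evidence trail does not resolve, can be ruled out structurally by making evidence assembly fail loudly on unresolvable citations. A sketch with assumed names and fields:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class EvidenceItem:
    evidence_id: str
    source: str            # e.g. "imaging_report", "lab_result"
    excerpt: str
    recorded_at: datetime  # timestamped so the packet can be regenerated later

def assemble_evidence(bindings: dict, evidence_store: dict) -> dict:
    """Resolve every criterion -> evidence_id binding against the store.
    Raises if any cited item is missing, so a 'criteria met' claim can
    never outrun the documentation a reviewer would need to validate it."""
    packet, missing = {}, []
    for criterion_id, evidence_ids in bindings.items():
        items = []
        for eid in evidence_ids:
            if eid in evidence_store:
                items.append(evidence_store[eid])
            else:
                missing.append((criterion_id, eid))
        packet[criterion_id] = items
    if missing:
        raise ValueError(f"unresolvable evidence citations: {missing}")
    return packet
```

Binding evidence to specific criteria, rather than attaching a flat document pile to the case, is what makes the audit packet regenerable months later.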
-
Step 6 · Determination
For cases where the criteria are clearly met, an auto-affirmation agent issues the determination in seconds. For cases that do not clearly meet criteria, the case routes through a non-affirm research agent to a human reviewer. Auto-deny is architecturally prohibited — see The Auto-Deny Problem for the full argument.
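The routing rule is small enough to write down in full, and the important property is what is absent: there is deliberately no code path that emits a denial without a human. The confidence threshold below is an illustrative assumption, not a recommended value:

```python
AFFIRM_THRESHOLD = 0.95  # assumption: illustrative confidence bar

def route_determination(criteria_status: str, confidence: float) -> dict:
    """Auto-affirmation only. Every outcome other than a confident
    'met' routes to a human; no branch returns 'denied'."""
    if criteria_status == "met" and confidence >= AFFIRM_THRESHOLD:
        return {"decision": "affirmed", "actor": "auto_affirmation_agent"}
    return {"decision": "pending_human_review",
            "actor": "non_affirm_research_agent"}
```

Making "no auto-deny" a property of the code's shape, rather than a configuration flag, is what "architecturally prohibited" means.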
-
Step 7 · Audit packet generation
Emit a structured artifact for every decision: rule pack version, evidence chain, agent reasoning trace, human-review record (where applicable), and the regulatory citations that justified the determination. The audit packet is the artifact a CMS auditor would review — and it is generated for every case, not just the cases that get audited. Most PA stacks treat audit as a backstop; production-grade systems treat it as a primary output.
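A minimal sketch of the packet builder, showing the fields the prose enumerates. Field names are illustrative, not a real CMS schema; the essential property is that the function runs for every case and pins the rule pack version:

```python
from datetime import datetime, timezone

def build_audit_packet(case_id, rule_pack_version, evidence_chain,
                       reasoning_trace, citations, human_review=None):
    """Emit the structured artifact for every decision, not just the
    cases that get audited. human_review stays None for auto-affirmed
    cases; non-affirmations always carry a review record."""
    return {
        "case_id": case_id,
        "rule_pack_version": rule_pack_version,  # pinned for reproducibility
        "evidence_chain": evidence_chain,
        "reasoning_trace": reasoning_trace,
        "regulatory_citations": citations,       # e.g. NCD/LCD references
        "human_review": human_review,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
```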
The cost of being wrong is not symmetric.
Some of these seven steps are cleanly automatable. Some are partially automatable. One of them — clinical denial — is not automatable at all. The architectural principle is to be aggressive on the safe steps and disciplined on the unsafe ones.
Steps 1, 2, 3, 5, 7.
Intake validation, eligibility verification, duplicate detection, evidence assembly, and audit packet generation are all deterministic or near-deterministic operations. Production systems automate these end-to-end. The cost of being wrong is bounded — a malformed intake gets returned to the submitter, a stale eligibility result triggers a refresh, a duplicate detection error costs one extra review cycle.
Steps 4 and 6 (auto-affirmation only).
Criteria evaluation can be automated — but the output is a recommendation, not a decision, when the case does not clearly meet criteria. Auto-affirmation is safe to automate when the criteria are unambiguously met; the agent issues the approval and the case is closed. The architecture must support the agent saying "I am not confident enough to auto-affirm" and routing to a human.
Step 6 (non-affirmation).
Auto-denial of a clinical PA is not safe to automate. Federal litigation against UnitedHealth and Cigna over algorithmic denial practices, multiple state insurance commissioner actions, and the CMS WISeR Model design all reinforce this position. Production systems route every clinical non-affirmation through a human reviewer — by architectural construction, not by configuration.
Decomposed agents. Deterministic guardrails. Audit-grade traceability.
A production-grade PA system is not one big LLM. It is a set of specialist agents, each fluent in one job, each individually testable, each citable in an audit. The pattern matters because the alternative — a monolithic LLM that handles everything — fails in ways that are hard to debug, hard to fix, and hard to defend.
A specialist eligibility agent can be evaluated against eligibility ground truth. A specialist criteria agent can be evaluated against adjudicated criteria determinations. A specialist auto-affirmation agent can be evaluated against the cases it should affirm and the cases it should escalate. A monolithic LLM that handles everything cannot be audited at this level of specificity — and an architecture that cannot be audited at this level of specificity will not survive a CMS audit.
The other property that matters is determinism. Criteria evaluation in a production system is not pure LLM reasoning — it is LLM reasoning constrained by deterministic rule packs that cite NCD/LCD/medical-policy text verbatim. The rule pack is version-controlled. The version is logged in every audit packet. A determination from June can be reproduced in December because the rule pack version, the evidence chain, and the agent reasoning trace are all preserved.
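One way to make "the rule pack is version-controlled" mechanically checkable is to content-address the pack, so a determination from June can be proven to have used byte-identical criteria text in December. A sketch under that assumption; the function names are illustrative:

```python
import hashlib
import json

def rule_pack_version(rule_pack: dict) -> str:
    """Content-address the rule pack: same criteria text, same version.
    Canonical JSON makes the hash independent of key order."""
    canonical = json.dumps(rule_pack, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()[:12]

def reproducible(determination: dict, rule_pack: dict) -> bool:
    """A determination replays faithfully only if the version preserved
    in its audit packet matches the rule pack being replayed."""
    return determination["rule_pack_version"] == rule_pack_version(rule_pack)
```

Any edit to the criteria text, however small, yields a new version string, so a silent policy drift between June and December is detectable rather than invisible.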
FAQ.
What are the steps in a prior authorization workflow?
A complete PA workflow has seven steps: intake validation, eligibility verification, duplicate request detection, criteria evaluation, evidence assembly, determination, and audit packet generation. Each step is independently automatable; the architecture choice is whether to combine them in one big LLM call or decompose them into specialist agents.
Which steps of prior authorization are safe to automate?
Intake validation, eligibility verification, duplicate detection, evidence assembly, and audit packet generation are safely and routinely automated end-to-end. Criteria evaluation can be automated as a recommendation, and auto-affirmation (issuing approvals when criteria are clearly met) is also safe to automate. Auto-denial is not — clinical non-affirmation requires human review for legal, regulatory, and clinical reasons.
Why decompose prior authorization into multiple agents?
Decomposition makes each step individually testable, individually auditable, and independently deployable. A specialist eligibility agent can be evaluated against eligibility ground truth; a specialist criteria agent against adjudicated criteria determinations; and so on. A monolithic LLM that handles everything cannot be audited at this level of specificity.
What separates production prior auth automation from a demo?
Audit packets, determinism, and human-routing on adverse decisions. Production systems generate a complete audit packet for every decision, ground criteria evaluation in deterministic rule packs (not pure LLM reasoning), and route every clinical non-affirmation through a human reviewer — by architectural construction, not by configuration.
Continue the cluster.
More from the prior authorization series — and where this argument shows up in the product.
The Auto-Deny Problem
Why every clinical non-affirmation routes through a human reviewer — and why this is a constraint, not a feature.
Read the note

Note · Engineering · PA Agent Architecture
The reference architecture for a domain-decomposed prior authorization agent system. Agents, state, audit packets.
Read the note

Note · Regulatory · CMS-0057-F Field Guide
What CMS-0057-F actually requires, when, and how — translated into engineering deliverables.
Read the note

Field guide · Prior Authorization Automation
The buyer-side overview. Architecture, regulatory mandates, comparison table, ten-question FAQ.
Read the field guide

Product · HIP One
The full Health Intelligence Platform that operationalizes the seven-step workflow described in this note.
Explore HIP One

Live deployment · WISeR Live Deployment
The CMS Medicare reference deployment. 15K+ authorizations, 100% three-day TAT compliance, zero auto-denials.
See the deployment

A walkthrough on your actual PA volume.
A 30–45 minute conversation with the engineering team running these seven steps in CMS Medicare today.