Engineering note · Architecture · April 30, 2026

How prior authorization automation actually works.

The phrase "prior authorization automation" hides a workflow with seven steps. Each step is independently automatable, but the steps differ sharply in difficulty and in safety properties. This is a walk through what each step does, where automation works well, where it does not work at all, and the architectural pattern that distinguishes a production system from a demo.

~1,800 words · A field note from the team running the live CMS Medicare WISeR deployment

The short version

A PA workflow is seven steps, not one.

Most "prior authorization automation" pitches treat the workflow as a single decision: given a request, output a determination. Inside a production system, the workflow has at least seven distinct steps — intake, eligibility, duplicate detection, criteria evaluation, evidence assembly, determination, and audit packet generation. Each step has its own data shape, its own failure mode, and its own automation calculus. Treating them as one decision is how vendors end up with systems that demo well and ship poorly.

Step by step

The seven steps of a prior authorization workflow.

Each step is independently automatable. Each has different inputs, different outputs, and a different cost when it goes wrong.

  1. Step 1 · Intake validation

    Validate that the incoming request is structurally correct: CPT/HCPCS codes against active code sets, ICD-10 codes against the current version, beneficiary identifiers against eligibility systems, and X12 278 transactions against the schema. The most common failure mode in legacy PA stacks is silent intake failure, where bad inputs propagate downstream and surface as inscrutable denials weeks later. Automated intake validation is high-leverage, low-risk, and almost universally underspecified in vendor pitches.

  2. Step 2 · Eligibility verification

    Verify that the beneficiary is enrolled, the requesting provider is enrolled and credentialed (PECOS for Medicare), and the service is within scope of coverage. Eligibility is a deterministic question — the answer is in a database somewhere — and the automation pattern is therefore an API call, not an LLM. The architectural choice that matters is: which eligibility system is authoritative for this case, and what happens when its data is stale?

  3. Step 3 · Duplicate request detection

    Detect whether the same case has already been submitted by this provider for this beneficiary within a defined lookback window. Duplicates are a major source of waste in legacy PA workflows: a clinical reviewer can spend fifteen minutes on a case that was already decided last week. Detection is straightforward (matching on CPT, ICD, date range, provider, and beneficiary) but requires careful tuning of the deduplication window and the equivalence rules.

  4. Step 4 · Criteria evaluation

    Evaluate the case against the applicable medical policy: NCD, LCD, payer medical policy, or commercial criteria like InterQual or MCG. This is the step where most "AI prior auth" deployments fail — not because the LLM cannot read the criteria, but because criteria evaluation requires evidence-grounded reasoning, not generative reasoning. The production pattern is to decompose each criterion into atomic clinical questions and answer each question with explicit evidence citations.

  5. Step 5 · Evidence assembly

    Assemble the clinical evidence supporting the determination: medical record excerpts, imaging reports, lab results, and prior treatment history. Evidence must be bound to specific criteria and timestamped, so the audit packet can be regenerated months later. Many production failures surface here: the model says "the criteria are met" but the evidence trail does not actually contain the documentation a reviewer would need to validate that conclusion.

  6. Step 6 · Determination

    For cases where the criteria are clearly met, an auto-affirmation agent issues the determination in seconds. For cases that do not clearly meet criteria, the case routes through a non-affirm research agent to a human reviewer. Auto-deny is architecturally prohibited — see The Auto-Deny Problem for the full argument.

  7. Step 7 · Audit packet generation

    Emit a structured artifact for every decision: rule pack version, evidence chain, agent reasoning trace, human-review record (where applicable), and the regulatory citations that justified the determination. The audit packet is the artifact a CMS auditor would review — and it is generated for every case, not just the cases that get audited. Most PA stacks treat audit as a backstop; production-grade systems treat it as a primary output.
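
As a rough illustration of the audit packet described in Step 7, the sketch below models it as an immutable record with a stable serialization. The `AuditPacket` and `EvidenceItem` classes, their field names, and every example value are assumptions made for illustration, not the schema of any real deployment.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class EvidenceItem:
    criterion_id: str     # criterion this excerpt is bound to
    source_document: str  # hypothetical document identifier
    excerpt: str
    captured_at: str      # ISO 8601 timestamp


@dataclass(frozen=True)
class AuditPacket:
    case_id: str
    rule_pack_version: str
    determination: str                     # e.g. "affirmed" or "routed_to_human"
    evidence_chain: Tuple[EvidenceItem, ...]
    reasoning_trace: Tuple[str, ...]
    regulatory_citations: Tuple[str, ...]
    human_review: Optional[dict] = None    # set only when a reviewer was involved

    def to_json(self) -> str:
        # sort_keys makes serialization stable, so equal packets produce equal bytes.
        return json.dumps(asdict(self), sort_keys=True)

    def digest(self) -> str:
        # A content hash lets a regenerated packet be compared with the stored one.
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


def build_example_packet(rule_pack_version: str) -> AuditPacket:
    # Every value here is a placeholder, not real case data.
    return AuditPacket(
        case_id="case-0001",
        rule_pack_version=rule_pack_version,
        determination="affirmed",
        evidence_chain=(
            EvidenceItem("C1", "doc-17", "placeholder excerpt", "2026-04-30T12:00:00Z"),
        ),
        reasoning_trace=("C1 answered yes, citing doc-17",),
        regulatory_citations=("placeholder policy citation",),
    )
```

Because the packet is frozen and serialized with sorted keys, two packets built from the same inputs hash identically, which is one plausible way to check that a packet regenerated months later matches what was stored.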

Where automation works, where it does not

The cost of being wrong is not symmetric.

Some of these seven steps are cleanly automatable. Some are partially automatable. One outcome of the determination step, clinical denial, is not automatable at all. The architectural principle is to be aggressive on the safe steps and disciplined on the unsafe ones.

Cleanly automated

Steps 1, 2, 3, 5, 7.

Intake validation, eligibility verification, duplicate detection, evidence assembly, and audit packet generation are all deterministic or near-deterministic operations. Production systems automate these end-to-end. The cost of being wrong is bounded — a malformed intake gets returned to the submitter, a stale eligibility result triggers a refresh, a duplicate detection error costs one extra review cycle.
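
Of the cleanly automated steps, duplicate detection is perhaps the easiest to make concrete. The sketch below matches on provider, beneficiary, procedure code, and diagnosis codes inside a lookback window. The `PARequest` shape, the 30-day window, and the 7-day service-date tolerance are illustrative assumptions; real equivalence rules would need tuning, as noted in Step 3.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class PARequest:
    provider_id: str
    beneficiary_id: str
    cpt_code: str
    icd10_codes: FrozenSet[str]
    service_date: date
    submitted_on: date


def equivalence_key(req: PARequest) -> Tuple:
    # Hypothetical equivalence rule: same provider, beneficiary, procedure, diagnoses.
    return (req.provider_id, req.beneficiary_id, req.cpt_code, req.icd10_codes)


def find_duplicate(
    new: PARequest,
    history: Iterable[PARequest],
    lookback_days: int = 30,                # assumed window, not a recommended value
    service_date_tolerance_days: int = 7,   # assumed tolerance, not a recommended value
) -> Optional[PARequest]:
    """Return the first prior request that looks like the same case, else None."""
    window_start = new.submitted_on - timedelta(days=lookback_days)
    for prior in history:
        if equivalence_key(prior) != equivalence_key(new):
            continue
        if not (window_start <= prior.submitted_on <= new.submitted_on):
            continue
        if abs((prior.service_date - new.service_date).days) <= service_date_tolerance_days:
            return prior
    return None
```

A false negative here costs one extra review cycle and a false positive can be overridden by the submitter, which is why the failure cost stays bounded.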

Partially automated

Steps 4 and 6 (auto-affirmation only).

Criteria evaluation can be automated — but the output is a recommendation, not a decision, when the case does not clearly meet criteria. Auto-affirmation is safe to automate when the criteria are unambiguously met; the agent issues the approval and the case is closed. The architecture must support the agent saying "I am not confident enough to auto-affirm" and routing to a human.
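
The "not confident enough to auto-affirm" path can be sketched as a small routing function. This is a minimal sketch under stated assumptions: the `CriterionResult` shape, the confidence score, and the 0.95 threshold are all hypothetical, and the function deliberately has only two outcomes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence, Tuple


class CriterionStatus(Enum):
    MET = "met"
    NOT_MET = "not_met"
    UNCLEAR = "unclear"


@dataclass(frozen=True)
class CriterionResult:
    criterion_id: str
    status: CriterionStatus
    confidence: float              # hypothetical score in [0, 1]
    evidence_ids: Tuple[str, ...]  # evidence items cited for this criterion


@dataclass(frozen=True)
class Recommendation:
    action: str                    # "auto_affirm" or "route_to_human", nothing else
    reasons: Tuple[str, ...]


def recommend(results: Sequence[CriterionResult], min_confidence: float = 0.95) -> Recommendation:
    """Affirm only when every criterion is met, confident, and evidenced."""
    reasons = []
    if not results:
        reasons.append("no criteria were evaluated")
    for r in results:
        if r.status is not CriterionStatus.MET:
            reasons.append(f"{r.criterion_id}: status is {r.status.value}")
        elif r.confidence < min_confidence:
            reasons.append(f"{r.criterion_id}: confidence {r.confidence:.2f} below threshold")
        elif not r.evidence_ids:
            reasons.append(f"{r.criterion_id}: no evidence cited")
    if reasons:
        # Anything short of a clean pass goes to a person, with the reasons attached.
        return Recommendation("route_to_human", tuple(reasons))
    return Recommendation("auto_affirm", ())
```

Note that an unmet criterion does not produce a denial here; it produces a routed case plus the reasons a reviewer would want to see first.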

Not automated

Step 6 (non-affirmation).

Auto-denial of a clinical PA is not safe to automate. Federal litigation against UnitedHealth and Cigna over algorithm-driven denials, multiple state insurance commissioner actions, and the CMS WISeR Model design all reinforce this position. Production systems route every clinical non-affirmation through a human reviewer, by architectural construction rather than by configuration.
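
One way to read "by architectural construction" is that the agent-facing types simply have no way to express a denial. The sketch below is an assumption about how that could look, with hypothetical names throughout. Python enforces these checks at runtime only; a statically typed language could reject the same mistakes at compile time.

```python
from dataclasses import dataclass
from enum import Enum


class AgentOutcome(Enum):
    """Everything an automated agent is allowed to emit."""
    AFFIRM = "affirm"
    ROUTE_TO_HUMAN = "route_to_human"
    # Deliberately no DENY member: there is no flag to flip that would add one.


@dataclass(frozen=True)
class HumanReview:
    reviewer_id: str
    rationale: str


@dataclass(frozen=True)
class NonAffirmation:
    """A non-affirmation record that cannot exist without a human review attached."""
    case_id: str
    review: HumanReview

    def __post_init__(self) -> None:
        if not isinstance(self.review, HumanReview):
            raise TypeError("a non-affirmation requires a HumanReview")
        if not self.review.reviewer_id.strip():
            raise ValueError("a non-affirmation requires an identified reviewer")
```

With this shape, turning on auto-denial would require changing the types and therefore a code review, not editing a configuration value.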

The architectural pattern

Decomposed agents. Deterministic guardrails. Audit-grade traceability.

A production-grade PA system is not one big LLM. It is a set of specialist agents, each fluent in one job, each individually testable, each citable in an audit. The pattern matters because the alternative — a monolithic LLM that handles everything — fails in ways that are hard to debug, hard to fix, and hard to defend.

A specialist eligibility agent can be evaluated against eligibility ground truth. A specialist criteria agent can be evaluated against criteria-evaluation ground truth. A specialist auto-affirmation agent can be evaluated against the cases it should affirm and the cases it should escalate. A monolithic LLM that handles everything cannot be audited at this level of specificity, and an architecture that cannot be audited this way will not survive a CMS audit.
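
Per-agent evaluation of this kind can be sketched as a small harness that runs one agent over labeled cases. The harness, the `toy_affirmation_agent` stand-in, and the "affirm" and "escalate" labels are hypothetical; the point is only that a single-purpose agent has a single, checkable notion of ground truth.

```python
from typing import Any, Callable, Iterable, Tuple


def evaluate_agent(agent: Callable[[Any], Any], labeled_cases: Iterable[Tuple[Any, Any]]) -> dict:
    """Run one agent over (input, expected) pairs and collect every mismatch."""
    failures = []
    total = 0
    for case_input, expected in labeled_cases:
        total += 1
        actual = agent(case_input)
        if actual != expected:
            failures.append({"input": case_input, "expected": expected, "actual": actual})
    accuracy = (total - len(failures)) / total if total else 0.0
    return {"total": total, "accuracy": accuracy, "failures": failures}


def count_false_affirmations(report: dict) -> int:
    # For an auto-affirmation agent the costly error is affirming a case
    # that ground truth says should have been escalated.
    return sum(
        1 for f in report["failures"]
        if f["actual"] == "affirm" and f["expected"] == "escalate"
    )


def toy_affirmation_agent(case: dict) -> str:
    # Stand-in for a real agent: affirm only when every criterion is met.
    return "affirm" if case["criteria_met"] == case["criteria_total"] else "escalate"
```

Tracking false affirmations separately from overall accuracy reflects the asymmetry discussed earlier: an unnecessary escalation costs reviewer time, while a wrong affirmation is a wrong determination.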

The other property that matters is determinism. Criteria evaluation in a production system is not pure LLM reasoning — it is LLM reasoning constrained by deterministic rule packs that cite NCD/LCD/medical-policy text verbatim. The rule pack is version-controlled. The version is logged in every audit packet. A determination from June can be reproduced in December because the rule pack version, the evidence chain, and the agent reasoning trace are all preserved.
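
A minimal sketch of the rule-pack idea, assuming a rule pack is a versioned data structure that holds verbatim policy text and a list of atomic questions per criterion. The structure, the identifiers, and the `answers` input (which stands in for whatever an LLM step returns) are all hypothetical. The combination logic itself is plain deterministic code.

```python
import hashlib
import json

# Hypothetical rule pack. In practice policy_text would hold the verbatim
# NCD/LCD/medical-policy wording; placeholders are used here on purpose.
EXAMPLE_RULE_PACK = {
    "version": "example-2026.04",
    "criteria": [
        {"id": "C1", "policy_text": "<verbatim policy text>", "questions": ["C1.Q1", "C1.Q2"]},
        {"id": "C2", "policy_text": "<verbatim policy text>", "questions": ["C2.Q1"]},
    ],
}


def rule_pack_fingerprint(rule_pack: dict) -> str:
    # Hash of a canonical serialization, logged next to the version string.
    canonical = json.dumps(rule_pack, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def evaluate_criteria(rule_pack: dict, answers: dict) -> dict:
    """Combine per-question answers into per-criterion results, deterministically.

    `answers` maps question id -> {"answer": "yes" | "no" | "unknown", "evidence_ids": [...]}.
    """
    results = []
    for criterion in rule_pack["criteria"]:
        unresolved = []
        for question_id in criterion["questions"]:
            answer = answers.get(question_id)
            # A "yes" without cited evidence does not count as answered.
            if answer is None or answer["answer"] != "yes" or not answer["evidence_ids"]:
                unresolved.append(question_id)
        results.append({
            "criterion_id": criterion["id"],
            "policy_text": criterion["policy_text"],
            "met": not unresolved,
            "unresolved_questions": unresolved,
        })
    return {
        "rule_pack_version": rule_pack["version"],
        "rule_pack_sha256": rule_pack_fingerprint(rule_pack),
        "criteria": results,
    }
```

Given the same rule pack and the same stored answers, this function returns the same result every time, which is the property that would let a June determination be replayed in December.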

Common questions

FAQ.

What are the steps in a prior authorization workflow?

A complete PA workflow has seven steps: intake validation, eligibility verification, duplicate request detection, criteria evaluation, evidence assembly, determination, and audit packet generation. Each step is independently automatable; the architecture choice is whether to combine them in one big LLM call or decompose them into specialist agents.

Which steps of prior authorization are safe to automate?

Intake validation, eligibility verification, duplicate detection, evidence assembly, and audit packet generation are safely and routinely automated end-to-end. Criteria evaluation is automated as a recommendation, and auto-affirmation (issuing approvals when criteria are clearly met) is also safe to automate. Auto-denial is not: clinical non-affirmation requires human review for legal, regulatory, and clinical reasons.

Why decompose prior authorization into multiple agents?

Decomposition makes each step individually testable, individually auditable, and independently deployable. A specialist eligibility agent can be evaluated against eligibility ground truth; a specialist criteria agent against criteria-evaluation ground truth; and so on. A monolithic LLM that handles everything cannot be audited at this level of specificity.

What separates production prior auth automation from a demo?

Audit packets, determinism, and human-routing on adverse decisions. Production systems generate a complete audit packet for every decision, ground criteria evaluation in deterministic rule packs (not pure LLM reasoning), and route every clinical non-affirmation through a human reviewer — by architectural construction, not by configuration.
