PA agent architecture.
A reference pattern.
A reference architecture for a domain-decomposed prior authorization agent system. The agents and what each one owns. The state object that flows between them. The audit packet that gets emitted at the end. The failure modes and how they get handled. Written for engineering teams who have decided to build PA agents and need a starting point that has actually shipped.
Decomposed agents, typed state, audit-first design.
A production-grade PA agent system has three load-bearing properties: it decomposes the PA workflow into specialist agents (each fluent in one job), it manages state through a typed object that flows agent-to-agent (so any agent can be replayed independently for debugging), and it emits a complete audit packet for every decision (because every decision will be re-examined by someone). Get these three right and the system survives a CMS audit. Get any of them wrong and the system survives a demo and not much else.
Seven agents, one graph.
The agents are organized in a graph (typically built on LangGraph or a similar agent-orchestration framework). Each agent is a node. Edges describe the routing between them — including conditional edges that route based on agent output.
-
Agent 1 · Intake Agent
Owns: Schema validation, code-set verification (CPT, HCPCS, ICD-10), beneficiary/provider identifier validation, X12 278 conformance, FHIR PAS conformance. Outputs: Either a normalized request envelope (passed to Agent 2) or a structured error returned to the submitter. Implementation: Largely deterministic; LLM only used for free-text field interpretation.
-
Agent 2 · Eligibility Agent
Owns: Beneficiary enrollment verification (HETS for Medicare), provider enrollment and credentialing (PECOS), service scope-of-coverage check. Outputs: Eligibility status, with explicit data-source attribution and timestamp on each finding. Implementation: Deterministic API calls. The LLM is not in the loop here; the data sources are authoritative.
-
Agent 3 · Duplicate Request Agent
Owns: Match against prior submissions on (CPT/HCPCS, ICD-10, provider NPI, beneficiary identifier, date range). Outputs: Duplicate flag with prior case reference, or clean. Implementation: Deterministic match against a case database. Match-equivalence rules are tunable per buyer in YAML config.
-
Agent 4 · Criteria Agent
Owns: Match the case to applicable medical policy (NCD, LCD, payer policy, InterQual, MCG). Decompose each criterion into atomic clinical questions. Answer each question with explicit evidence citations. Outputs: A criterion-by-criterion evaluation, with confidence scores and evidence references. Implementation: LLM-based, but constrained by the rule pack — the LLM cannot invent criteria not in the rule pack.
-
Agent 5 · Evidence Agent
Owns: Locate and extract the clinical evidence supporting each criterion's evaluation. Bind evidence to specific criteria with timestamps. Verify that the evidence trail is sufficient to support an audit. Outputs: An evidence-bound case object — every criterion has either explicit supporting evidence or an explicit "missing evidence" marker.
-
Agent 6 · Auto-Affirmation Agent
Owns: Decide whether the case clearly meets all applicable criteria with confidence above the auto-affirmation threshold. Outputs: Either an auto-affirmation determination (case closed) or a routing decision to Agent 871. Cannot: Issue a denial. The architecture does not expose that affordance.
-
Agent 871 · Non-Affirm Research Agent
Owns: Cases that do not auto-affirm. Prepare the case for the human clinical reviewer — summarize the request, surface the gap, assemble candidate citations, present the case in structured form. Outputs: A reviewer-ready case packet. Cannot: Issue the determination. The reviewer decides.
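The roster above wires into a graph like the following minimal plain-Python sketch. This is framework-free for clarity (the production pattern would sit on LangGraph or similar); the agent bodies, routing rule, and state keys are illustrative stubs, not the real implementation:

```python
# Nodes are callables that take and return a state dict; one conditional
# edge routes out of auto-affirmation. All names here are illustrative.

def intake(state):              state["normalized"] = True;  return state
def eligibility(state):         state["eligible"] = True;    return state
def duplicate(state):           state["duplicate"] = False;  return state
def criteria(state):            return state
def evidence(state):            return state

def auto_affirm(state):
    # Agent 6 may affirm or route onward; it has no denial affordance.
    state["determination"] = "affirmed" if state.get("meets_criteria") else None
    return state

def non_affirm_research(state):
    # Agent 871: prepares the packet; the human reviewer decides.
    state["reviewer_packet"] = True
    return state

NODES = {
    "intake": intake, "eligibility": eligibility, "duplicate": duplicate,
    "criteria": criteria, "evidence": evidence, "auto_affirm": auto_affirm,
    "non_affirm_research": non_affirm_research,
}

# Linear edges, plus one conditional edge out of auto_affirm.
EDGES = {
    "intake": "eligibility", "eligibility": "duplicate",
    "duplicate": "criteria", "criteria": "evidence",
    "evidence": "auto_affirm",
}

def route_affirmation(state):
    return "END" if state.get("determination") == "affirmed" else "non_affirm_research"

def run(state):
    node = "intake"
    while node != "END":
        state = NODES[node](state)
        node = route_affirmation(state) if node == "auto_affirm" else EDGES.get(node, "END")
    return state
```

In a framework like LangGraph, `route_affirmation` would become a conditional edge and each function a registered node; the shape of the graph is the same.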
A typed object flows through the graph.
The state object is a Python dataclass (or equivalent typed structure) that each agent reads from and writes to. Every agent's contribution is additive — agents do not mutate prior agents' fields. The state is persisted at every step.
The state object includes: the original request envelope; the normalized request from Agent 1; the eligibility findings from Agent 2; the duplicate-detection result from Agent 3; the criterion-by-criterion evaluation from Agent 4; the evidence bindings from Agent 5; the auto-affirmation decision from Agent 6; and (where applicable) the non-affirm research packet from Agent 871. Each field is timestamped, versioned, and attributed to the agent that produced it.
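A minimal sketch of such a state object, enforcing the additive-write rule described above (field names and the `Contribution` wrapper are illustrative; the production schema is richer):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class Contribution:
    """One agent's additive, attributed write to the case state."""
    agent: str        # which agent produced the field
    version: str      # agent/rule-pack version at write time
    at: datetime      # timestamp of the write
    payload: dict     # the finding itself

@dataclass
class PACaseState:
    """Typed state flowing through the graph. Fields are write-once:
    each agent appends its own contribution and never mutates others'."""
    request_envelope: dict
    contributions: dict = field(default_factory=dict)  # keyed by field name

    def write(self, name: str, agent: str, version: str, payload: dict):
        if name in self.contributions:
            raise ValueError(
                f"{name} already written by {self.contributions[name].agent}"
            )
        self.contributions[name] = Contribution(
            agent=agent, version=version,
            at=datetime.now(timezone.utc), payload=payload,
        )
```

Rejecting a second write to the same field is what makes "agents do not mutate prior agents' fields" a checked invariant rather than a convention.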
The typing matters because the state is replayable. A debugging session begins with the persisted state at the point of failure; an engineer can re-run any agent against that state to reproduce the issue locally. There is no "what was in the LLM context at the moment of failure" mystery — the state object is the context.
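Replay then reduces to loading the persisted state and re-invoking a single node against a copy of it. A sketch, with the persistence layer and the agent under debug as stand-ins:

```python
import json

def replay(persisted_json: str, agent_fn):
    """Re-run one agent against the state persisted at the point of
    failure. Because the state object *is* the context, this reproduces
    exactly what the agent saw in production."""
    state = json.loads(persisted_json)   # stand-in for the real state store
    return agent_fn(dict(state))         # copy: replay never mutates the record

# Illustrative agent under debug: flags a missing eligibility finding.
def criteria_agent(state):
    state["criteria_error"] = "eligibility" not in state
    return state
```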
An audit-grade artifact for every decision.
For every PA determination, the system emits a structured artifact — the audit packet — that is sufficient to defend the determination if it is later challenged. The audit packet is generated for every case, not just the cases that get audited.
Contents of the audit packet: the rule pack version that was applied (with content hash); the evidence chain (every piece of evidence cited, with provenance); the agent reasoning trace (each agent's structured contribution to the determination); the human reviewer record where applicable (named reviewer, timestamp, clinical rationale, citations); and the explicit regulatory citations that justified the determination.
The audit packet is generated synchronously, not as a background job. The determination is not considered complete until the audit packet has been generated and persisted. This is a deliberate constraint — it prevents the system from issuing determinations that cannot be defended.
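The synchronous constraint can be sketched as follows: the packet is built and persisted inline, and a persistence failure aborts the determination. The field names and `persist` callable are illustrative:

```python
import hashlib
import json
from datetime import datetime, timezone

def content_hash(rule_pack: dict) -> str:
    """Stable content hash of the rule pack actually applied to this case."""
    return hashlib.sha256(
        json.dumps(rule_pack, sort_keys=True).encode()
    ).hexdigest()

def finalize_determination(case: dict, rule_pack: dict, persist) -> dict:
    """The determination is not complete until the audit packet is
    persisted -- packet generation is synchronous, not a background job."""
    packet = {
        "case_id": case["id"],
        "rule_pack_version": rule_pack["version"],
        "rule_pack_hash": content_hash(rule_pack),
        "evidence_chain": case.get("evidence", []),
        "reasoning_trace": case.get("trace", []),
        "reviewer_record": case.get("reviewer"),        # None if auto-affirmed
        "regulatory_citations": case.get("citations", []),
        "emitted_at": datetime.now(timezone.utc).isoformat(),
    }
    persist(packet)   # an exception here means no determination is issued
    return {"determination": case["determination"], "audit_packet": packet}
```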
What happens when an agent fails.
Agents fail. Network calls time out. LLMs hallucinate. Data sources go stale. The architecture has to handle this without producing a wrong determination — which means failure has to be observable, recoverable, and never silent.
Timeout on a deterministic agent.
If Agent 2 (Eligibility) cannot reach HETS within the configured timeout, the case routes to a human eligibility reviewer with the failure context attached. The system does not assume eligibility; it does not deny on missing data; it surfaces the gap. The reviewer either retries the call or makes a manual eligibility determination.
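A sketch of that routing, with `call_hets` as a stand-in for the real eligibility client:

```python
def check_eligibility(call_hets, case, timeout_s=5.0):
    """On timeout the case routes to a human eligibility reviewer with
    the failure context attached -- never assumed eligibility, never a
    denial on missing data."""
    try:
        finding = call_hets(case, timeout=timeout_s)
        return {"status": "resolved", "finding": finding}
    except TimeoutError as exc:
        return {
            "status": "route_to_human",
            "reason": "hets_timeout",
            "context": {"timeout_s": timeout_s, "error": str(exc)},
        }
```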
Hallucination on a reasoning agent.
If Agent 4 (Criteria) produces a finding that does not trace to specific evidence in the case, a grounded-evidence check (a separate validator) flags the finding. The case routes to Agent 871, never to Agent 6. A finding that cannot survive grounded-evidence validation cannot drive an auto-affirmation.
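A minimal version of that validator: every finding must cite evidence, and every citation must resolve to evidence actually present in the case. The finding shape is illustrative:

```python
def grounded_check(findings, evidence_ids):
    """Flag any criteria finding whose citations do not resolve to
    evidence present in the case. A flagged case routes to Agent 871,
    never to Agent 6 (auto-affirmation)."""
    evidence_ids = set(evidence_ids)
    flagged = [
        f for f in findings
        if not f.get("citations") or not set(f["citations"]) <= evidence_ids
    ]
    return {"route": "agent_871" if flagged else "agent_6", "flagged": flagged}
```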
Stale data in a downstream system.
If the eligibility data feed is older than the configured staleness threshold, Agent 2 declines to issue a finding and instead surfaces a staleness flag. The case routes to a human reviewer. The architecture treats stale data as missing data — this is a deliberate, conservative choice.
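The stale-equals-missing rule is a one-condition check; a sketch, with the 24-hour threshold as an assumed default rather than the production value:

```python
from datetime import datetime, timedelta, timezone

def eligibility_finding(feed_timestamp, max_age=timedelta(hours=24), now=None):
    """Stale data is treated as missing data: past the threshold the
    agent declines to issue a finding and surfaces the staleness flag."""
    now = now or datetime.now(timezone.utc)
    age = now - feed_timestamp
    if age > max_age:
        return {"status": "route_to_human", "reason": "stale_feed",
                "feed_age": str(age)}
    return {"status": "ok"}
```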
Infrastructure outage.
If the agent infrastructure itself is unavailable, incoming PA requests are queued and the submitter receives a structured "system processing" response (consistent with FHIR PAS conformance). No request is silently dropped. No determination is issued without the full agent graph having executed.
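A sketch of the intake edge under outage. The payload shape is illustrative, not a conformant FHIR resource; in FHIR PAS terms, a `ClaimResponse` with a `queued` outcome is one conforming way to signal in-process status:

```python
import queue

PENDING = queue.Queue()  # stand-in for a durable queue in production

def accept_request(request, graph_available):
    """During an infrastructure outage requests are queued, never dropped,
    and the submitter receives a structured in-process response. No
    determination is issued until the full agent graph has executed."""
    if not graph_available:
        PENDING.put(request)
        return {"outcome": "queued", "id": request["id"]}
    return {"outcome": "processing", "id": request["id"]}
```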
FAQ.
How are prior authorization agents organized?
In domain-decomposed pipelines: each agent owns one job (intake, eligibility, duplicate detection, criteria, evidence, auto-affirmation, non-affirm research) and passes structured state to the next agent. The orchestration is typically built on LangGraph or a similar agent-graph framework.
What is an audit packet?
A structured artifact emitted for every PA determination — including rule pack version, evidence chain, agent reasoning trace, human reviewer record (if applicable), and explicit regulatory citations. The audit packet is the artifact a CMS auditor would review; it is generated for every case, not just the cases that get audited.
How is state managed between agents?
Through a typed state object (a dataclass in Python) that each agent reads from and writes to. The state object includes the original request, all intermediate findings, evidence references, and the running determination. State is persisted at every step so that any agent can be replayed independently for debugging.
What happens when an agent fails?
Failure modes are explicit and routed. A timeout on the eligibility agent escalates to a human eligibility reviewer. A hallucination on the criteria agent (detected via grounded-evidence checks) routes to the non-affirm research agent. Failures are observable, recoverable, and never silent.
Continue the cluster.
How PA Automation Actually Works
The seven steps of a PA workflow. Where automation works, where it does not, and the architectural pattern.
The Auto-Deny Problem
Why every clinical non-affirmation routes through Agent 871 to a human reviewer. The full argument.
WISeR Production Field Note
Six weeks of running this architecture in CMS Medicare. What worked, what surprised us.
Aether One™ Architecture
The patent-protected substrate that operationalizes this reference pattern.
Patent Portfolio
Ten patents filed across the agent architecture, including the auto-affirmation system that backs the live CMS deployment.
Aether One™ Agents
The same agents productized as standalone marketplace SKUs on Microsoft Azure Marketplace.
An architecture review with the engineering team
A 60-minute deep-dive with the engineers who built this. Bring the code review questions.