Your AI Agent Is Handling Patient Data Wrong. Here's the Fix.
By Michael Martin
Most healthcare teams I talk to are excited about AI agents, and they should be. Connecting an agent to your EHR, payer portal, CRM, and scheduling system, then having it actually do things, is a genuine step change in operational efficiency.
But there's a problem baked into how most of these agents work today. In healthcare, that problem isn't just a performance issue. It's a compliance issue.
How most AI agents work today
When you connect an AI agent to a set of tools, the typical approach is to load every tool definition into the model's context window upfront. When the agent calls a tool and gets data back, that data flows through the model too.
So if you ask your agent to pull a patient's recent claims and update a care coordination note, the claim detail is fetched, read into the model's context, then written back out through the update call. The data touches the model twice.
If it's a complex record, like lab results, clinical notes, or a multi-page EOB, that's tens of thousands of tokens' worth of PHI flowing through an LLM.
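To make the flow concrete, here's a minimal, self-contained sketch of the standard pattern. No real LLM or payer API is involved; `fetch_claims`, the patient values, and the transcript shape are all illustrative. The point is structural: every raw tool result gets appended to the transcript the model reads, so the PHI enters the model's context on the way in and comes back out in the model's own output.

```python
# Hypothetical sketch of the standard tool-calling loop (all names invented).
# Raw tool results are appended to the transcript the model reads, so PHI
# flows through the model's context at least twice.

def fetch_claims(patient_id):
    # Stand-in for a payer API call; the response contains PHI.
    return {"patient": "Jane Doe", "mrn": "MRN-0042", "claim_total": "$1,250"}

def run_agent(user_request):
    transcript = [{"role": "user", "content": user_request}]
    # Step 1: the agent calls the claims tool and the raw result,
    # PHI included, is appended to the model-visible transcript.
    claims = fetch_claims("pt-123")
    transcript.append({"role": "tool", "content": str(claims)})
    # Step 2: the model reads that transcript and drafts the care note,
    # so the same PHI is emitted back out of the model.
    note = f"Updated care note for {claims['patient']} ({claims['mrn']})"
    transcript.append({"role": "assistant", "content": note})
    return transcript

transcript = run_agent("Pull recent claims and update the care note")
phi_hits = sum("MRN-0042" in m["content"] for m in transcript)
print(phi_hits)  # the MRN appears in two model-visible messages
```

In a real harness the loop is driven by the model's tool-call decisions rather than hard-coded steps, but the data path is the same: the transcript is the model's context, and everything in it gets read.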
From a token cost standpoint, that's wasteful. From a compliance standpoint, that should make your privacy officer uncomfortable.
There's a better pattern
The concept of code execution with MCP (Model Context Protocol) is genuinely important for healthcare builders. Instead of the agent directly calling tools and piping data through its own context, you give the agent the ability to write and run code against those tools.
Here's why that matters for PHI.
When the agent writes code to move data from Point A to Point B, say from a payer API to a care management platform, the data flows through the execution environment rather than through the model itself. You can intercept that data before it hits the model and replace it with placeholder tokens. The model sees things like [PATIENT_NAME_1] and [MRN_1]. The real values never enter the model's context at all.
The data flows correctly from source to destination. The agent does its job. But the LLM never read the actual name, DOB, or claim number.
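Here's what an interception layer of that kind might look like, as a hedged sketch. The class name, the vault structure, and especially the two regexes are stand-ins; a production system would use a vetted PHI-detection service, not pattern matching. What the sketch shows is the shape: redact before anything reaches the model, keep the mapping inside the execution environment, restore on the way out to the destination system.

```python
import re

# Illustrative sketch of a PHI tokenization layer for an agent harness.
# The sandbox runs redact() on tool output BEFORE any text is handed to
# the model; the placeholder-to-value mapping never leaves the sandbox.

class PHITokenizer:
    def __init__(self):
        self.vault = {}    # placeholder -> real value (stays server-side)
        self.counts = {}   # per-kind counters for numbering placeholders

    def _placeholder(self, kind, value):
        self.counts[kind] = self.counts.get(kind, 0) + 1
        token = f"[{kind}_{self.counts[kind]}]"
        self.vault[token] = value
        return token

    def redact(self, text):
        # Toy patterns for the demo only; real PHI detection is harder.
        text = re.sub(r"MRN-\d+",
                      lambda m: self._placeholder("MRN", m.group()), text)
        text = re.sub(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b",
                      lambda m: self._placeholder("PATIENT_NAME", m.group()), text)
        return text

    def restore(self, text):
        # Swap real values back in on the way out to the destination system.
        for token, value in self.vault.items():
            text = text.replace(token, value)
        return text

tok = PHITokenizer()
safe = tok.redact("Claim for Jane Doe, MRN-0042, denied.")
print(safe)  # Claim for [PATIENT_NAME_1], [MRN_1], denied.
final = tok.restore(f"Note updated. {safe}")
```

The model only ever sees `safe`; `restore` runs in the execution environment when the agent's output is written back to the target system.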
That's a meaningful architectural distinction when you're operating under HIPAA.
The efficiency gains are real
The performance side is significant too. Benchmarks show reductions from 150,000 tokens down to roughly 2,000 by switching from the standard pattern to code execution. For healthcare integrations that span multiple systems, that number is believable.
EHR integrations are verbose. Payer APIs return deeply nested payloads. If your agent is loading tool definitions for thirty-plus endpoints every time it runs, you're paying for tokens that aren't doing anything useful.
What to ask if you're building AI in healthcare
If your team is building or evaluating AI agents that touch patient data, here are the questions worth asking now.
Does your agent's data ever touch the model context when it doesn't need to? If you're piping API responses directly through the LLM to move them to the next step, the answer is probably yes.
Do you have a PHI tokenization layer in your agent harness? If not, you're relying entirely on access controls and your BAA to protect data that's actively flowing through an LLM.
Are your agents designed to be reusable? The code execution model naturally encourages building agents that save their working logic as persistent, callable skills. That's the right engineering direction.
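The "persistent skill" idea can be sketched in a few lines, under assumptions: the skill source, `normalize_claim`, and the payload shape are all invented for illustration. The mechanism is simply that code the agent has already proven out gets written to disk as a module and loaded directly on later runs, instead of being re-derived (and its tokens re-spent) every time.

```python
import importlib.util
import pathlib
import tempfile

# Hypothetical example of agent-written code that worked once and is
# worth keeping. A real harness would capture this from the sandbox.
SKILL_SOURCE = '''
def normalize_claim(claim):
    """Flatten a nested payer payload into the fields the CRM needs."""
    return {"id": claim["claim"]["id"], "status": claim["claim"]["status"]["code"]}
'''

def save_skill(skills_dir, name, source):
    # Persist the working logic as a named, callable module.
    path = pathlib.Path(skills_dir) / f"{name}.py"
    path.write_text(source)
    return path

def load_skill(path, func_name):
    # Load a previously saved skill without re-generating it.
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, func_name)

with tempfile.TemporaryDirectory() as d:
    path = save_skill(d, "normalize_claim", SKILL_SOURCE)
    normalize = load_skill(path, "normalize_claim")
    result = normalize({"claim": {"id": "C-9", "status": {"code": "PAID"}}})
```

In practice you'd add versioning and review before a saved skill runs against production data, but the engineering direction is the same: the agent's output becomes a library, not a one-off.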
The honest tradeoff
This approach is not plug-and-play. Running agent-generated code requires a sandboxed execution environment, resource limits, and monitoring. That's real infrastructure work. If your integration is simple, standard tool-calling might still be the right call.
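To give a feel for the minimum bar, here's a deliberately small sketch of that sandboxing work: agent-generated code runs in a separate interpreter with a wall-clock timeout, and only its stdout comes back. This is the shape, not the whole job; a production sandbox adds filesystem and network isolation (containers, seccomp, gVisor, or similar), memory limits, and audit logging.

```python
import subprocess
import sys
import textwrap

def run_untrusted(code, timeout_s=5):
    # Run agent-generated code in a separate Python process with a
    # hard timeout; capture output instead of sharing the parent's state.
    proc = subprocess.run(
        [sys.executable, "-c", textwrap.dedent(code)],
        capture_output=True, text=True, timeout=timeout_s,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip())
    return proc.stdout

out = run_untrusted("print(2 + 2)")
```

If the generated code hangs, `subprocess.run` raises `TimeoutExpired` rather than stalling the agent, which is exactly the failure mode you want to contain first.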
But if you're building anything that handles PHI at scale (prior auth automation, care coordination workflows, clinical documentation), the code execution pattern deserves serious consideration. The compliance story alone makes it worth the investment.
The healthcare industry has spent decades learning hard lessons about what happens when data governance is an afterthought. We don't need to repeat those lessons with AI.
At Digital2DNA, we build healthcare AI integrations with FHIR-native architecture and compliance built into the foundation, not bolted on after the fact. If you're working through AI agent architecture for a healthcare use case, let's talk.