“AI is everywhere” is no longer an exaggeration. The use of AI in healthcare is spreading like wildfire.
According to Doximity’s 2026 State of AI in Medicine Report, 63% of physicians surveyed between November 2025 and January 2026 now use AI in clinical practice. That figure was 47% just nine months earlier.
Among non-federal US hospitals, 31.5% had already integrated generative AI into their electronic health records in 2024, and another 24.7% planned to add it during 2025.
So, AI adoption is no longer a developing story. The real concern is now governance.
Nearly half of healthcare organizations have no formal AI approval process, and only about 31% actively monitor the AI systems already running in their environment. In other words, AI is in the building, and in most cases, nobody is watching what it touches.
HIPAA was written in 1996. It was not written for LLMs, ambient scribes, or Retrieval-Augmented Generation (RAG). And yet, it applies to all of them.
The confusion in most healthcare organizations is not whether HIPAA covers AI, because it does. The confusion is how.
This blog answers the specific questions IT teams are asking right now about HIPAA compliant AI: what the rules require in practice, and where deployments actually break.
Why Should You Treat Your AI Vendor as a Business Associate?
If an AI tool creates, receives, maintains, or transmits PHI on your behalf, or if it re-identifies de-identified data, it is a business associate under HIPAA. That makes a signed business associate agreement (BAA) mandatory before the first byte of PHI moves into the tool.
There are no exceptions for AI.
Two things make AI vendors different from a typical SaaS business associate. First, a standard BAA template is rarely enough. It was written for vendors that store and transmit data, not vendors that train models on it.
An AI-grade BAA needs explicit language on four points:
- A written prohibition on using your PHI to train or fine-tune models.
- Defined data retention and deletion timelines.
- Full disclosure of subprocessors across the model and inference chain.
- A breach-notification clock that leaves you room to meet HIPAA’s own 60-day deadline.
We break these provisions down in our guide to BAAs for HIPAA compliant AI.
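The fourth point is easiest to check with simple date arithmetic. The Python sketch below is an illustration only: the function names, the vendor notice window, and the internal buffer are hypothetical values you would take from your own BAA and incident-response plan. It conservatively assumes the 60-day clock starts on the day the vendor discovers the incident.

```python
from datetime import date, timedelta

HIPAA_NOTIFICATION_DAYS = 60  # the 60-day deadline referenced above


def notification_deadline(discovered_on: date) -> date:
    """Latest notification date, counting from the discovery date."""
    return discovered_on + timedelta(days=HIPAA_NOTIFICATION_DAYS)


def days_left_after_vendor_notice(vendor_notice_days: int) -> int:
    """Days you keep if the vendor uses its entire contractual window."""
    return HIPAA_NOTIFICATION_DAYS - vendor_notice_days


def baa_clock_is_workable(vendor_notice_days: int, internal_buffer_days: int) -> bool:
    """True if the vendor's window still leaves your assumed internal buffer."""
    return days_left_after_vendor_notice(vendor_notice_days) >= internal_buffer_days


if __name__ == "__main__":
    print(notification_deadline(date(2026, 1, 1)))
    # Hypothetical BAA terms: 10-day and 45-day vendor notice windows.
    print(baa_clock_is_workable(vendor_notice_days=10, internal_buffer_days=30))
    print(baa_clock_is_workable(vendor_notice_days=45, internal_buffer_days=30))
```

Under these assumptions, a vendor that takes 45 days to notify leaves only 15 days for your own investigation and notices, which is why the window belongs in the contract.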
Second, the product tier decides everything. Consumer AI tools, such as ChatGPT Free, Plus, Pro, and Team, do not come with a BAA and cannot lawfully touch PHI. A single patient identifier pasted into a consumer chatbot is an unauthorized disclosure, and that triggers breach notification whether anyone ever sees the data or not.
In January 2026, OpenAI launched ChatGPT for Healthcare with BAA support built in. But it is an enterprise product that has to be reviewed by a procurement team and bought through a formal purchasing process. It is not something an individual clinician can sign up for.
The lesson here is that the same model can be compliant on one tier and a violation on another. You need to verify the tier, the configuration, and the exact BAA scope before anyone touches it.
What Do Most Organizations Get Wrong About the Minimum Necessary Standard?
The single most common compliance gap I see with AI in healthcare is in how organizations apply HIPAA’s minimum necessary standard.
Under 45 CFR §164.502(b), a covered entity must limit PHI access, use, and disclosure to the minimum necessary for the intended purpose. That rule should apply to an AI tool exactly as it applies to a human employee.
AI models perform better with more data, so vendors and integrators tend to feed entire records into the model directly. HIPAA compliance demands the opposite: AI tools should be designed to touch only the PHI strictly required for their purpose, even though models are built for comprehensive datasets.
An AI assistant that can query a full patient record but only needs demographics to do its job is already over-scoped, and that is a violation waiting for an auditor.
Two failure modes matter. On the input side, an AI tool is granted record-level access when field-level access would do. On the output side, a generative tool returns more PHI than the recipient needs.
AHIMA has testified to the National Committee on Vital and Health Statistics that the terms “reasonable” and “necessary” remain open to interpretation, which is precisely why organizations get this wrong. The practical fix is to scope access at the field level before deployment, not after an incident. Define what the tool can read, what it can return, and to whom. And above all, document it all.
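Field-level scoping on both the input and output side can be sketched in a few lines of Python. This is a minimal illustration, not a production access-control layer: the tool name, field names, and the `ACCESS_POLICY` structure are all hypothetical, and a real deployment would enforce the same policy in the EMR or API gateway rather than in application code alone.

```python
from typing import Any

# Hypothetical policy: tool names and field names are illustrative only.
ACCESS_POLICY: dict[str, dict[str, set[str]]] = {
    "scheduling_assistant": {
        "read": {"name", "date_of_birth", "phone"},  # input side
        "return": {"name"},                          # output side
    },
}


def scope_record(record: dict[str, Any], tool: str, direction: str) -> dict[str, Any]:
    """Keep only the fields the policy allows for this tool and direction.

    direction is "read" (input side) or "return" (output side). A tool or
    direction missing from the policy raises KeyError, so the default is deny.
    """
    allowed = ACCESS_POLICY[tool][direction]
    return {field: value for field, value in record.items() if field in allowed}


if __name__ == "__main__":
    patient = {
        "name": "Jane Doe",
        "date_of_birth": "1980-02-03",
        "phone": "555-0100",
        "diagnosis": "E11.9",
    }
    print(scope_record(patient, "scheduling_assistant", "read"))
    print(scope_record(patient, "scheduling_assistant", "return"))
```

The policy dictionary doubles as documentation: it states what the tool can read and what it can return, which is the record an auditor will ask for.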
When Do You Need De-Identification?
De-identification is the cleanest way to use data with an LLM while keeping it outside HIPAA’s scope.
The idea is simple: if you strip a dataset of everything that ties it to a specific person, it stops being PHI, and the rules that govern PHI no longer apply. That’s genuinely useful for AI, because it means you can train or test a model on the data without dragging the full weight of HIPAA into every step.
If an AI model is trained on patient data, either that data must be de-identified or HIPAA applies to the model and the infrastructure around it.
HHS recognizes two methods:
- Safe Harbor requires removing 18 specified identifiers.
- Expert Determination requires a qualified expert in statistical methods to determine, and document, that the re-identification risk is very small.
Either is acceptable, but neither is optional if you want the data treated as de-identified.
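To give a feel for what Safe Harbor means on a structured record, here is a deliberately partial Python sketch. It is not a complete implementation: it handles only a handful of the 18 identifier categories, the field names are hypothetical, it ignores free-text fields entirely, and it skips the population check that Safe Harbor attaches to three-digit ZIP codes.

```python
# Hypothetical field names covering only a few of the 18 identifier categories.
DIRECT_IDENTIFIER_FIELDS = {"name", "ssn", "phone", "email", "mrn", "street_address"}


def safe_harbor_sketch(record: dict) -> dict:
    """Drop or generalize a small, illustrative subset of Safe Harbor identifiers."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # direct identifiers are removed outright
        if field == "admission_date":
            # Dates are reduced to the year; assumes an ISO "YYYY-MM-DD" string.
            out["admission_year"] = value[:4]
        elif field == "zip":
            # Safe Harbor allows the first three digits only for areas with more
            # than 20,000 people; that census lookup is omitted in this sketch.
            out["zip3"] = value[:3]
        elif field == "age":
            # Ages over 89 are aggregated into a single 90+ category.
            out["age"] = "90+" if value > 89 else value
        else:
            out[field] = value
    return out


if __name__ == "__main__":
    print(safe_harbor_sketch({
        "name": "Jane Doe",
        "ssn": "000-00-0000",
        "zip": "80301",
        "admission_date": "2025-03-14",
        "age": 93,
        "diagnosis_code": "E11.9",
    }))
```

A sketch like this is a starting point for structured fields only. Clinical notes and other free text need separate handling before the dataset can be treated as de-identified.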
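When you feed text into an AI system, it doesn’t work with the raw words alone. It converts them into a long string of numbers called an embedding, essentially a mathematical fingerprint of the original content. It’s tempting to assume those numbers are anonymous because they don’t look like a patient record. They aren’t: an embedding is derived from the original text, and published embedding-inversion research has shown that much of that text can be recovered from it.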
For HIPAA purposes, treat embeddings made from patient data exactly like the records they came from: same encryption, same access controls, same audit logging.
De-identification is also not permanent. When de-identified datasets are combined with other data, re-identification can become possible again. AI models make this easier, because they can link the quasi-identifiers that remain, such as age bands, partial ZIP codes, and visit years, across sources and single out individuals.
Your risk analysis has to account for it.
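One way to put a number on that risk is a k-anonymity style check: count how many records share each combination of quasi-identifiers and look at the smallest group. This technique is not named in HIPAA and is not a substitute for Expert Determination. The Python sketch below is offered only as a simple heuristic, with hypothetical column names.

```python
from collections import Counter


def smallest_group_size(rows: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the smallest group of rows sharing the same quasi-identifier values.

    A result of 1 means at least one person is unique on those columns and is
    therefore the easiest to re-identify by linking with outside data.
    """
    if not rows:
        return 0
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


if __name__ == "__main__":
    dataset = [
        {"age_band": "30-39", "zip3": "803", "sex": "F"},
        {"age_band": "30-39", "zip3": "803", "sex": "F"},
        {"age_band": "90+", "zip3": "803", "sex": "M"},
    ]
    print(smallest_group_size(dataset, ["age_band", "zip3", "sex"]))
    print(smallest_group_size(dataset, ["zip3"]))
```

Re-running a check like this whenever a new column or external dataset is joined in is one way to keep the risk analysis current.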
Why Is Nobody Testing the AI and EMR Integration Layer?
The PHI exposure usually does not happen inside the AI model. It happens in the integration layer between the AI tool and your EMR.
Most EMR integration with AI medical assistants runs on a standard called FHIR, which is simply the common format hospital systems use to share patient data. Ambient scribes like Nuance DAX, Suki, and Abridge plug into Epic or Cerner through it. To get in, the AI tool is handed a digital key, called an access token.
That key comes with a “scope” that decides how many doors it opens: which records the tool can see and what it can do with them. Epic’s own HIPAA compliant AI setup works this way.
When AI EMR integration is done right, the key is tightly limited and expires fast. When it’s done carelessly, one over-broad key quietly gives the AI tool access to far more patient data than its job requires.
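A quick review of a token can catch the careless case. The Python sketch below assumes the integration uses SMART on FHIR style scopes (for example `patient/Observation.read`) and a standard OAuth 2.0 token response with `scope` and `expires_in` fields. The one-hour lifetime limit and the function names are hypothetical local policy, not values taken from HIPAA, FHIR, or any vendor.

```python
MAX_TOKEN_LIFETIME_SECONDS = 3600  # hypothetical local policy, not a HIPAA or FHIR value


def overbroad_scopes(scope_string: str) -> list[str]:
    """Return resource scopes that use a wildcard resource type, e.g. user/*.read."""
    flagged = []
    for scope in scope_string.split():
        context, sep, rest = scope.partition("/")
        if not sep or context not in {"patient", "user", "system"}:
            continue  # skips non-resource scopes such as "openid" or "launch/patient"
        resource_type = rest.split(".", 1)[0]
        if resource_type == "*":
            flagged.append(scope)
    return flagged


def review_token_response(token_response: dict) -> list[str]:
    """List findings for one token response; an empty list means nothing was flagged."""
    findings = [
        f"wildcard scope: {scope}"
        for scope in overbroad_scopes(token_response.get("scope", ""))
    ]
    expires_in = token_response.get("expires_in")
    if expires_in is None or expires_in > MAX_TOKEN_LIFETIME_SECONDS:
        findings.append("token lifetime is missing or exceeds local policy")
    return findings


if __name__ == "__main__":
    careless = {"scope": "openid patient/Observation.read user/*.read", "expires_in": 86400}
    tight = {"scope": "launch/patient patient/Observation.read", "expires_in": 300}
    print(review_token_response(careless))
    print(review_token_response(tight))
```

This only inspects what the token claims. It does not prove the server enforces those scopes, which is why the claims still have to be tested against the live API.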
The middle of that chain is a weak spot. Alissa Knight’s “Playing with FHIR” research showed that the connectors and middlemen sitting between clinical systems are often exploitable: keys left exposed, weak locks, and one organization’s data reachable from another’s. If the layer the AI tool plugs into is already insecure, the AI tool inherits all of that insecurity too.
This is where hands-on penetration testing pays off. Check the keys. Check what they unlock. Check where the data actually flows between the AI tool and the EMR, and not the vendor’s diagram of it.
And make sure it’s all logged. HIPAA requires a record of every time PHI is viewed, changed, or exported, including when an AI tool does it, not just a person.
KLEAP’s application security testing focuses on precisely this surface.
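The logging requirement in this section can be made concrete with a small Python sketch. The record shape, field names, and `actor_type` values are hypothetical. The point is only that an AI tool's access produces the same kind of entry a human's would, tagged so the two can be told apart.

```python
import json
from datetime import datetime, timezone

ALLOWED_ACTIONS = {"view", "change", "export"}


def audit_entry(actor_id: str, actor_type: str, action: str,
                patient_id: str, fields: list[str]) -> str:
    """Build one JSON audit record for a PHI access by a person or an AI tool."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor_id": actor_id,
        "actor_type": actor_type,  # hypothetical values: "human" or "ai_tool"
        "action": action,
        "patient_id": patient_id,
        "fields": sorted(fields),  # which fields were touched, not their values
    }
    return json.dumps(entry)


if __name__ == "__main__":
    print(audit_entry("ambient-scribe-01", "ai_tool", "view", "patient-123",
                      ["name", "medications"]))
```

Logging field names rather than field values keeps the audit trail itself from becoming another copy of the PHI.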
How Can On-Premises AI Be an Alternative to Third-Party Cloud Risk?
A question I hear constantly: “We want an AI-powered internal knowledge base, but we can’t send patient data to a third-party cloud provider. What are our options?”
The answer is self-hosted AI, running inside your own governed perimeter. PHI-laden retrieval workflows should not pass through public cloud LLM APIs if you can keep them in-house. Self-hosting keeps PHI within a boundary you control and audit, and it removes a business associate from the chain entirely.
But on-premises does not automatically mean compliant. Two terms get conflated here.
HIPAA-eligible means a service can be configured to meet HIPAA requirements. HIPAA compliant means it actually is, in your specific deployment.
Self-hosting buys you eligibility, not a finish line. The same safeguards still apply: encryption at rest and in transit, role-based access controls, complete audit logging, and BAAs for any vendor that still touches the environment.
This gives you more control over your data, albeit with higher infrastructure and staffing costs. For many regulated workloads, that trade is worth making.
What Are the State AI Laws on Top of HIPAA?
Federal HIPAA is no longer the only regulation in play. States are adding healthcare-specific AI governance and bringing laws into effect.
Three new state AI laws sit in the 2026 window:
- Texas’s TRAIGA (effective January 1, 2026) adds AI-use disclosure duties for licensed practitioners.
- California’s AB 489 (effective January 1, 2026) bars AI from implying it holds a healthcare license and is enforced by professional licensing boards.
- The Colorado AI Act (effective June 30, 2026) targets algorithmic discrimination in high-risk systems, including healthcare.
This is a fast-moving area. The Colorado AI Act in particular is under active repeal-and-replace pressure, and effective dates and scope are still shifting.
The takeaway is that an organization deploying AI now has to track both federal and state obligations and verify the current status of each law at the time of deployment rather than trusting a summary written months earlier.
What Is the Documentation Checklist Before You Deploy Any AI Tool?
Before a third-party AI platform touches PHI, a healthcare organization should have the following in place and on file:
- An AI-specific BAA signed with the vendor, executed before first use.
- A risk analysis updated to include the AI system. Inadequate risk analysis is still OCR’s most-cited deficiency.
- Minimum necessary access scoped and documented at the field level.
- A data-flow map of what PHI goes where, in which direction, what is stored, and what is returned.
- A validated de-identification method, where de-identification is used.
- A breach-notification protocol updated for AI-specific failure modes, such as a model surfacing PHI in an output or training-data leakage.
- Workforce training on permissible versus impermissible AI use.
- A maintained AI tool inventory, which the proposed 2026 HIPAA Security Rule update would require, and which also closes the “shadow AI” gap.
- Audit logging configured for every AI-initiated PHI access.
If you cannot produce these on request, you are not ready to deploy. And an OCR investigation will start with exactly this list.
Industry analysis of OCR enforcement attributes roughly three-quarters (76%) of 2025 penalties to risk-analysis failures, and names inadequate risk analysis as the single most common finding year after year.
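The checklist above can also be tracked as a simple deployment gate. The Python sketch below is one hypothetical way to do that: the artifact names are shorthand for the checklist items, and the de-identification entry is required only when de-identification is actually used.

```python
# Shorthand keys for the checklist items above; the names are illustrative.
REQUIRED_ARTIFACTS = [
    "ai_specific_baa",
    "risk_analysis",
    "minimum_necessary_scoping",
    "data_flow_map",
    "breach_notification_protocol",
    "workforce_training",
    "ai_tool_inventory",
    "audit_logging",
]
CONDITIONAL_ARTIFACT = "deidentification_validation"


def missing_artifacts(on_file: set[str], uses_deidentification: bool) -> list[str]:
    """Return the checklist items that cannot yet be produced on request."""
    required = list(REQUIRED_ARTIFACTS)
    if uses_deidentification:
        required.append(CONDITIONAL_ARTIFACT)
    return [item for item in required if item not in on_file]


def ready_to_deploy(on_file: set[str], uses_deidentification: bool) -> bool:
    """True only when every required item is on file."""
    return not missing_artifacts(on_file, uses_deidentification)


if __name__ == "__main__":
    on_file = {"ai_specific_baa", "risk_analysis", "data_flow_map"}
    print(missing_artifacts(on_file, uses_deidentification=True))
    print(ready_to_deploy(on_file, uses_deidentification=True))
```

A gate like this does not judge the quality of any document. It only makes a missing one visible before go-live instead of during an investigation.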
How Can KLEAP Help?
HIPAA compliant AI is not a product you buy. It’s a deployment you can stand behind if a regulator asks. The AI model itself matters far less than the data that flows into it, because that data flow is what determines whether patient information is secure.
That layer is where KLEAP works. We manually test the integration between AI tools and EMR systems. This involves the FHIR API surface, the OAuth scopes, and the real data flow between the AI assistant and the EHR, not the architecture diagram.
On the advisory side, we map your AI deployment to the specific HIPAA controls it has to satisfy and tell you, plainly, where the gaps are.
KLEAP works on a concierge model. You get a dedicated security lead and an audit-ready report you can hand to a regulator or a board, not a templated scan. KLEAP works with hospitals, health systems, and healthtech platforms navigating exactly this problem.
If you are deploying or evaluating an AI tool and want to know where your real exposure is, get a clear scope in one call.
