FDA AI SaMD Framework: What Developers Must Build

TL;DR The FDA’s proposed 2026 update for AI software as a medical device raises the bar on how ML systems are designed, documented, and monitored after release. If you ship CDS or diagnostic AI, you need a predetermined change control plan, traceable model versioning, auditable training data lineage, and field monitoring built into the architecture—not bolted on during validation. Teams that treat compliance as a delivery constraint will move faster than teams that treat it as a paperwork exercise.

What changed in the FDA’s AI SaMD framework

The core shift is simple: the FDA is moving from “inspect the finished model” to “inspect the system that safely changes.” That matters because machine learning products are not static software. They drift, they retrain, they pick up new failure modes, and they can behave differently across sites, populations, and device configurations. For developers, that means the regulatory target is no longer just algorithm performance at launch. It is the controls around change, monitoring, and traceability.

For CDS vendors and diagnostic AI teams, the most important concept is the Predetermined Change Control Plan, usually discussed as a PCCP. If your product is expected to learn or update after deployment, the FDA wants to know in advance what can change, how it will change, and how you will verify that those changes remain safe and effective. That includes the model, the thresholds, the inputs, the retraining cadence, and the guardrails around rollback.

We’ve seen this pattern before in regulated healthcare software. When our team builds systems that have to stand up to both clinician scrutiny and compliance review, the bad assumption is always the same: “We’ll get the model working first, then add controls later.” That sequence creates rework. It also creates release trains that nobody in QA, security, or regulatory can trust.


Why this matters to buyers, not just regulators

The buyer problem is not abstract. A hospital or health system doesn’t want an AI feature that is impressive in a demo and impossible to defend in production. They want a tool they can deploy without creating a validation nightmare for clinical leadership, InfoSec, and legal. If your product touches diagnosis, triage, prioritization, or treatment recommendation, the question is no longer “Does the model work?” The question is “Can we prove it works, keep proving it works, and show the controls behind every meaningful change?”

That is why regulatory architecture now affects sales cycles. Procurement teams are asking for audit trails, model documentation, incident response paths, and evidence that updates will not silently change user outcomes. In practice, the software team that can answer those questions with a clean control plane will beat the team that says, “We’ll send you a PDF.”

3 control layers most teams need: model governance, release gating, and post-market monitoring
14+ months of validation burden that can appear when change control is added too late
1 auditable source of truth for training data, model versions, and deployment decisions

AST perspective: where teams usually get stuck

AST’s work in regulated clinical software keeps showing the same failure mode: product teams separate engineering from evidence generation. That works until the first serious review, then every answer lives in a different spreadsheet, a private Slack thread, or a one-off notebook. We’ve integrated compliance-minded delivery into product builds for more than eight years, and the teams that do best are the ones that treat traceability as part of the product, not as an external checklist.

Pro Tip: Build your evidence pipeline the same way you build your model pipeline. Every training run, threshold change, dataset refresh, and deployment approval should produce immutable records automatically. If a reviewer has to reconstruct your history by hand, the architecture is already wrong.
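As a concrete illustration, an evidence log can be made append-only and tamper-evident by hash-chaining each record to the one before it. This is a minimal Python sketch; the event names and payload fields are ours, not an FDA-mandated schema:

```python
import hashlib
import json
import time

class EvidenceLog:
    """Append-only, hash-chained record of pipeline events.

    Hypothetical sketch: event names and payload fields are illustrative.
    """

    def __init__(self):
        self._records = []

    def append(self, event: str, payload: dict) -> str:
        # Chain each record to the previous one so history cannot be
        # edited without breaking every later hash.
        prev_hash = self._records[-1]["hash"] if self._records else "GENESIS"
        body = {
            "event": event,          # e.g. "training_run", "threshold_change"
            "payload": payload,
            "timestamp": time.time(),
            "prev_hash": prev_hash,
        }
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        self._records.append({**body, "hash": digest})
        return digest

    def verify_chain(self) -> bool:
        """Recompute every hash; any tampering breaks the chain."""
        prev = "GENESIS"
        for rec in self._records:
            body = {k: rec[k] for k in ("event", "payload", "timestamp", "prev_hash")}
            recomputed = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if rec["prev_hash"] != prev or recomputed != rec["hash"]:
                return False
            prev = rec["hash"]
        return True
```

In a real deployment the records would land in write-once storage, but the design point stands: the log is produced by the pipeline itself, not reconstructed by hand.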

Four architecture patterns that will pass stricter review

Not every AI SaMD needs the same level of controls, but if you are operating in CDS or diagnostic AI, these four patterns are the difference between a scalable product and a compliance fire drill.

Approach | What it includes | Best fit
Static model release | Frozen weights, fixed thresholds, manual validation per version | Low-change workflows, narrow clinical scope
PCCP with controlled updates | Preapproved update types, versioned test packs, rollback rules | Most ML-based clinical products
Full MLOps with audit trail | Model registry, dataset lineage, CI/CD gates, post-market monitors | High-volume CDS and diagnostics
Ad hoc retraining workflow | Manual scripts, undocumented data refreshes, no immutable evidence | Not acceptable for regulated deployment

1. PCCP-centered release architecture

This is the safest default for teams shipping ML-based clinical tools. Your PCCP should define the update boundaries: which inputs may be expanded, which data distributions are allowed, what performance delta triggers review, and what test suite must pass before release. In engineering terms, the PCCP becomes a policy layer above the model registry and CI pipeline. You are not asking regulators to bless every future change. You are asking them to bless the rules that govern change.
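In code, that policy layer can be as plain as a gate function the CI pipeline calls before promoting a release. A hedged sketch, where the policy fields are illustrative stand-ins for whatever your own PCCP actually defines:

```python
from dataclasses import dataclass

@dataclass
class PCCPPolicy:
    """Illustrative PCCP boundary; field names are assumptions, not a standard."""
    max_sensitivity_delta: float   # allowed drop vs. the approved baseline
    allowed_update_types: set      # e.g. {"retrain", "threshold_tune"}
    required_test_suites: set      # test packs that must pass before release

def gate_release(policy: PCCPPolicy, update_type: str,
                 sensitivity_delta: float, passed_suites: list) -> list:
    """Return a list of violations; an empty list means the release may proceed."""
    violations = []
    if update_type not in policy.allowed_update_types:
        violations.append(f"update type '{update_type}' not preapproved")
    if sensitivity_delta < -policy.max_sensitivity_delta:
        violations.append("sensitivity dropped beyond approved tolerance")
    missing = policy.required_test_suites - set(passed_suites)
    if missing:
        violations.append(f"missing test suites: {sorted(missing)}")
    return violations
```

The point of expressing the PCCP this way is that the rules regulators approved become machine-enforced, not just documented.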

2. Model registry plus lineage-first governance

A proper model registry is more than a place to store weights. It should link the model artifact to the exact training dataset snapshot, labeling version, feature definitions, hyperparameters, evaluation cohort, clinical performance metrics, and approval record. If you are missing lineage, your audit trail will collapse the first time someone asks why one site saw different results than another. We’ve seen this in real deployments: the biggest risk is rarely the model itself. It is the inability to explain the model’s pathway from data to production.
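A minimal sketch of what one registry entry might carry; the field names here are illustrative, not a real registry API:

```python
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class ModelRecord:
    """One registry entry linking a model artifact to its full lineage.

    Hypothetical schema: fields are illustrative, not a standard.
    """
    model_version: str
    weights_sha256: str
    dataset_snapshot_id: str
    labeling_version: str
    feature_definitions_version: str
    evaluation_cohort_id: str
    sensitivity: float
    specificity: float
    approval_record_id: str

def to_audit_json(record: ModelRecord) -> str:
    """Serialize deterministically for an immutable audit store."""
    return json.dumps(asdict(record), sort_keys=True)
```

With a record like this, the question "why did site A see different results than site B" becomes a diff between two entries rather than an archaeology project.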

3. Release gating with test matrices

Every release should be tied to a test matrix that includes functional performance, subgroup analysis, calibration, false positive/negative impact, and failure mode review. For clinical AI, this is where teams often underbuild. General software QA is not enough. You need a release gate that can answer: did the model degrade on a protected cohort, did sensitivity shift above the allowed tolerance, and did the UI or workflow change alter human interpretation?
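One slice of such a gate, subgroup degradation, can be sketched as a pure function the pipeline runs against the approved baseline. The tolerance and metric names are illustrative:

```python
def evaluate_release_gate(metrics_by_subgroup: dict, baseline: dict,
                          tolerance: float = 0.02) -> dict:
    """Fail the gate if any subgroup's sensitivity degrades past tolerance.

    Sketch only: a real gate also checks calibration, false positive/negative
    impact, and human-factors review; the 0.02 tolerance is illustrative.
    """
    failures = {}
    for subgroup, metrics in metrics_by_subgroup.items():
        delta = metrics["sensitivity"] - baseline[subgroup]["sensitivity"]
        if delta < -tolerance:
            failures[subgroup] = round(delta, 4)
    return failures  # empty dict means the gate passes
```

Note that a model can improve overall while failing this gate on one cohort, which is exactly the case general software QA tends to miss.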

4. Post-market monitoring and rollback controls

If the model is in the field, the work is not done. Monitoring needs to detect drift, silent input changes, label mismatch, and operator behavior shifts. Rollback cannot be a manual hero move. It should be a one-click revert to a known-good version with a recorded justification. The FDA framework makes this kind of control non-negotiable because once the product is in use, safety is a continuous property.
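A deliberately simple sketch of both halves, a batch-mean drift check and a rollback that records its own justification. Production monitors would use per-feature statistical tests (PSI, KS) and a real registry rather than a dict:

```python
import statistics

def check_input_drift(reference_mean: float, reference_std: float,
                      live_values: list, z_threshold: float = 3.0):
    """Flag drift when a live batch's mean moves too far from the reference.

    Simplified z-test on the batch mean; thresholds are illustrative.
    """
    n = len(live_values)
    live_mean = statistics.fmean(live_values)
    z = (live_mean - reference_mean) / (reference_std / n ** 0.5)
    return abs(z) > z_threshold, z

def rollback(registry: dict, current_version: str, reason: str, approver: str) -> str:
    """One-step revert to the last known-good version, with a recorded justification."""
    known_good = registry["known_good"]
    registry["deployed"] = known_good
    registry["history"].append({
        "action": "rollback",
        "from": current_version,
        "to": known_good,
        "reason": reason,
        "approver": approver,
    })
    return known_good
```

The design choice that matters is that the justification and approver are captured at the moment of revert, not reconstructed later for a reviewer.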

How AST Handles This: Our integrated pod teams build the model governance layer at the same time as the product layer. That means developers, QA, and DevOps are producing the audit trail as part of delivery: versioned test evidence, release approvals, environment state, and deployment history. For regulated healthcare software, that eliminates the usual scramble where engineering ships first and compliance tries to reconstruct what happened afterward.

Decision framework for developers

  1. Classify the device behavior Decide whether the product is static, updateable under a PCCP, or needs tighter premarket review because changes are too open-ended.
  2. Define the change boundary List exactly what can change: training data, features, thresholds, retraining schedule, model family, and clinical workflow outputs.
  3. Instrument the evidence trail Capture dataset lineage, model versioning, test results, approvals, and deployment metadata automatically in the pipeline.
  4. Design for monitoring Add drift detection, subgroup watchlists, alerting thresholds, and rollback paths before launch.
  5. Prove the control system Make sure your documentation, CI/CD, and validation artifacts tell a single consistent story to engineering, regulatory, and customer reviewers.
Warning: Do not treat the PCCP as a legal document that sits outside engineering. If your platform cannot enforce the plan technically, you do not actually have a controlled change process—you have a promise.

What AST sees in real delivery work

We’ve built healthcare software where reliability, security, and clinical trust all had to hold at once. The pattern is consistent across products: the teams that win are the ones that bake evidence into the system architecture. In practical terms, that means every deployment is tied to an immutable record, every model update has a test envelope, and every exception has a named owner.

Our team currently supports clinical software used across 160+ respiratory care facilities, and that operational reality matters. In environments like that, you do not get to be vague about change. If something shifts in production, the system has to show what changed, when it changed, who approved it, and how you know it is still safe. That is the mindset FDA-regulated AI needs as well.

Key Insight: The future of AI SaMD compliance is not documentation-heavy engineering. It is evidence-native engineering. The best teams will build product, validation, and monitoring as one system.

FAQ: FDA AI SaMD framework

What is the biggest operational change in the FDA’s proposal?
The biggest change is that updateability now has to be designed up front. If your model will change after deployment, you need a predetermined change control plan, traceable evidence, and monitoring that proves the updates stay within approved bounds.
Do all AI clinical tools need the same level of controls?
No. The control burden depends on risk, intended use, degree of autonomy, and whether the model changes over time. Static tools may need simpler controls, while diagnostic AI and higher-risk CDS products usually need stronger registries, validation gates, and production monitoring.
What should engineering teams build first?
Start with lineage and release traceability. If you cannot prove which data created which model and which approvals led to which deployment, everything else becomes harder to defend.
How does AST work with teams on regulated AI software?
AST uses integrated engineering pods that include developers, QA, DevOps, and PM support from day one. For regulated products, that lets us build compliance evidence, release controls, and deployment automation as part of the product rather than as a separate cleanup step.
What if our current ML pipeline was built for speed, not auditability?
Then you likely need to refactor the pipeline around immutable artifacts, model registry discipline, and controlled deployment gates. That is usually less work than retrofitting compliance after the fact, especially once customers or regulators start asking for proof.

Need an FDA-ready AI SaMD architecture?

If you are shipping CDS or diagnostic AI and your current pipeline cannot produce a clean PCCP, audit trail, and post-market monitoring story, we can help you design the control system around the product. Book a free 15-minute discovery call — no pitch, just straight answers from engineers who have done this.

