Building trust and accountability in medical AI

A practical guide to building trustworthy medical AI, covering data quality, bias, security, governance and safe deployment.

Last Updated: Mar 30, 2026, 15:11 IST
While AI has been implemented across the country, there has not been a corresponding increase in the level of trust that patients and physicians place in its use within healthcare. Photo by Adobe Stock

Artificial intelligence in healthcare has advanced to the point where hospitals are using AI to identify potential cases of sepsis earlier, prioritise radiology queues, provide suggestions for chemotherapy treatments, and assist patients with post-discharge instructions. In addition, venture funding and enthusiasm from big tech companies have driven a roughly $5 billion industry, with projections of growth to as much as $50 billion.

However, while AI has been implemented across the country, there has not been a corresponding increase in the level of trust that patients and physicians place in its use within healthcare. The Collingridge Dilemma applies here as well: society often adopts new and exciting technologies before fully understanding their negative consequences. Once these consequences emerge, it becomes difficult to add guardrails to mitigate them.

The solution, therefore, is not to slow down the development of AI in medicine, but to operationalise assurance. This article outlines a practical agenda that engages both vendors and healthcare organisations to transform the amorphous “risk” of AI in healthcare into tangible, actionable items. Examples that clinicians and CIOs can use on Monday morning are outlined below.

The first step is to create an environment for collecting and managing the data required for medical models. The initial focus should be on documenting the source of the data. A critical concern is establishing data provenance before attempting to improve any model. To address this, healthcare organisations should require a written “data bill of materials” for each model they intend to implement.

This “bill of materials” should include specific details such as:
(a) the hospitals that contributed the data;
(b) the type of scanner(s) that produced the images;
(c) time frames and inclusion or exclusion criteria for patient selection; and
(d) how informed consent was obtained, or whether consent was waived.
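One way to make such a bill of materials enforceable is to keep it as structured data rather than free text. The sketch below is a hypothetical schema (the field names and completeness rule are illustrative, not a standard):

```python
from dataclasses import dataclass

# Hypothetical machine-readable "data bill of materials" for one model.
@dataclass
class DataBillOfMaterials:
    contributing_hospitals: list   # (a) sites that supplied the data
    scanner_models: list           # (b) devices that produced the images
    collection_window: tuple       # (c) (start, end) dates of collection
    inclusion_criteria: str        # (c) patient selection rules
    consent_basis: str             # (d) "informed" or "waived"

    def is_complete(self) -> bool:
        # A BOM is only acceptable if every field is populated.
        return all([
            self.contributing_hospitals,
            self.scanner_models,
            all(self.collection_window),
            self.inclusion_criteria,
            self.consent_basis,
        ])

bom = DataBillOfMaterials(
    contributing_hospitals=["Site A", "Site B", "Site C"],
    scanner_models=["GE Revolution", "Siemens SOMATOM", "Philips Brilliance"],
    collection_window=("2019-01-01", "2023-12-31"),
    inclusion_criteria="Adults with incidental lung nodules 4-30 mm",
    consent_basis="waived (ethics-board-approved secondary use)",
)
print(bom.is_complete())  # True only when every field is documented
```

A procurement checklist can then reject any model whose BOM fails `is_complete()`, or whose `scanner_models` list falls short of the required diversity (for instance, fewer than three CT manufacturers).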

For example, a startup developing a cancer imaging model for lung nodules should assert that its dataset includes scans from at least three different CT manufacturers and multiple reconstruction kernels. It should also demonstrate quantifiable evidence of diversity across domains—such as scanner type and geographic location—rather than merely stating the number of samples included.

To address concerns around patient anonymity, synthetic data may be considered. However, synthetic data requires the same level of oversight as real-world data. Organisations should insist on measurable similarity metrics, such as maximum mean discrepancy, between synthetic and actual distributions. Membership inference testing should be conducted to prevent patient re-identification, and the percentage of synthetic samples used per class should be limited.
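Maximum mean discrepancy (MMD) can be estimated directly from samples. The toy sketch below uses one-dimensional Gaussian data and an RBF kernel (the data, kernel bandwidth, and thresholds are assumptions for illustration):

```python
import math, random

def rbf(a, b, gamma=0.5):
    # Gaussian (RBF) kernel between two scalar observations.
    return math.exp(-gamma * (a - b) ** 2)

def mmd2(xs, ys, gamma=0.5):
    # Biased MMD^2 estimate: within-sample similarity minus cross-similarity.
    kxx = sum(rbf(x1, x2, gamma) for x1 in xs for x2 in xs) / len(xs) ** 2
    kyy = sum(rbf(y1, y2, gamma) for y1 in ys for y2 in ys) / len(ys) ** 2
    kxy = sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

random.seed(0)
real = [random.gauss(0.0, 1.0) for _ in range(200)]
good_synth = [random.gauss(0.0, 1.0) for _ in range(200)]  # matches real distribution
bad_synth = [random.gauss(3.0, 1.0) for _ in range(200)]   # shifted distribution

print(mmd2(real, good_synth) < mmd2(real, bad_synth))  # True: shifted data scores worse
```

In practice an organisation would set an acceptance threshold on MMD for each class of synthetic data, alongside the membership-inference testing and per-class caps described above.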

One method for implementing these principles is through a federated learning pilot. In one example, five hospitals trained a diabetic retinopathy model using local training. Instead of sharing patient data, the hospitals shared only weight updates. As a result, patient privacy was preserved, accuracy improved—as reflected in a higher Area Under the Curve—and the need for a central repository of patient data was eliminated.
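The core of such a pilot is federated averaging: each site computes a local update and only the weights travel. A minimal sketch, with a stand-in for the local training step (the model, learning rate, and update rule here are placeholders, not the retinopathy system described above):

```python
import random

def local_update(weights, lr=0.1):
    # Stand-in for one hospital's local training step on its own patients.
    return [w - lr * random.uniform(-1, 1) for w in weights]

def federated_round(global_weights, n_sites=5):
    # Each site trains locally; only weight updates leave the hospital.
    site_weights = [local_update(global_weights) for _ in range(n_sites)]
    # The coordinator averages the updates (FedAvg); patient data is never pooled.
    return [sum(ws) / n_sites for ws in zip(*site_weights)]

random.seed(1)
weights = [0.0, 0.0, 0.0]
for _ in range(10):
    weights = federated_round(weights)
print(len(weights))  # 3: model shape is unchanged, data never moved
```

The privacy property follows from the data flow: the coordinator sees only `site_weights`, never the records that produced them.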

The next phase is data processing, where many projects fail quietly rather than succeed visibly. Labels represent clinical judgement and must be treated accordingly. All labels should be double-read, with annotators completing two independent readings and adjudication protocols established by senior specialists for disagreements beyond a defined threshold.

In a pulmonary nodule service, for instance, radiologists independently record nodule size, attenuation, and morphology, with a third consultant resolving discrepancies. Inter-rater reliability is measured and published. Annotation tools should be used to automate routine tasks, pre-populate fields, prevent impossible entries, and record the exact viewer settings used during labelling.
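A common inter-rater statistic for such double-reads is Cohen's kappa, which corrects raw agreement for chance. A self-contained sketch with hypothetical attenuation labels from two readers:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    # Agreement between two annotators, corrected for chance agreement.
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    pa, pb = Counter(rater_a), Counter(rater_b)
    expected = sum(pa[c] * pb[c] for c in set(rater_a) | set(rater_b)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical nodule-attenuation labels from two radiologists.
a = ["solid", "solid", "ground-glass", "solid", "part-solid", "solid"]
b = ["solid", "solid", "ground-glass", "part-solid", "part-solid", "solid"]
kappa = cohens_kappa(a, b)
print(round(kappa, 3))  # 0.714
```

Cases where the readers disagree (here, the fourth scan) are exactly the ones an adjudication protocol would route to the third consultant.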

Standardisation is essential. Structured data should follow HL7 FHIR standards, while imaging metadata should use DICOM. Free text should be mapped to controlled vocabularies such as SNOMED CT and LOINC before training. Organisations should also develop a “data readiness level” (DRL), a nine-point ladder similar to NASA’s TRL. DRL scores datasets based on completeness, cleanliness, documentation, and bias assessment. No model should be deployed if it is trained on data below DRL-7. This shifts the conversation from “we have a dataset” to “we have a dataset fit for purpose”.
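The article does not specify a DRL formula, so the rubric below is one possible sketch: nine binary criteria, one rung each, with the DRL-7 deployment gate applied on top. All criterion names are illustrative:

```python
# Hypothetical "data readiness level" rubric: each criterion met moves the
# dataset one rung up the nine-point ladder.
CRITERIA = [
    "provenance_documented", "consent_recorded", "deidentified",
    "labels_double_read", "standardised_fhir_dicom", "vocab_mapped",
    "completeness_checked", "bias_assessed", "external_validation",
]

def drl_score(dataset_flags: dict) -> int:
    return sum(1 for c in CRITERIA if dataset_flags.get(c, False))

def deployable(dataset_flags: dict) -> bool:
    return drl_score(dataset_flags) >= 7  # policy: no deployment below DRL-7

flags = {c: True for c in CRITERIA[:6]}  # six of nine criteria met
print(drl_score(flags), deployable(flags))  # 6 False
```

The point of the gate is procedural: a dataset at DRL-6 is a work item with a named gap (here, no completeness check, bias assessment, or external validation), not a deployment candidate.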

Bias and opacity form the political economy of medical AI. While manageable, they cannot be addressed through platitudes. The priority must be relentless measurement of subgroup performance and designing for uncertainty. A dermatology tool trained primarily on images of fair-skinned individuals should not be deployed universally. It should either be retrained using targeted oversampling of dark-skin cases or refuse to provide recommendations when confidence falls below a defined threshold, routing such cases to specialists. This is not failure; it is responsible autonomy.
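The "refuse below a confidence threshold" behaviour is straightforward to encode as a routing rule. A minimal sketch, with an assumed threshold of 0.85:

```python
def triage(prediction: str, confidence: float, threshold: float = 0.85):
    # Below the threshold the model abstains and routes the case to a
    # specialist rather than guessing: "responsible autonomy".
    if confidence >= threshold:
        return ("auto", prediction)
    return ("refer_to_specialist", None)

print(triage("benign", 0.95))  # ('auto', 'benign')
print(triage("benign", 0.60))  # ('refer_to_specialist', None)
```

Subgroup monitoring then checks that abstention rates are not themselves skewed, for example a model that abstains far more often on darker skin tones has merely relocated its bias.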

Organisations should publish “model cards” and “data cards” that disclose intended use, training composition, subgroup metrics—such as sensitivity and specificity by age, sex, and skin type—known failure modes, and update cadence. Explainability should avoid theatre. Saliency maps that appear convincing but mislead are worse than useless. Instead, global feature attribution summaries—for example, the role of oxygen saturation and creatinine in a sepsis score—should be paired with local explanations only when clinically meaningful, and always accompanied by uncertainty.
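A model card can itself be structured data, so dashboards and procurement checks can read it mechanically. The fields below follow the article's list; every number is illustrative, not a real metric:

```python
import json

# Minimal model card as structured data (illustrative values only).
model_card = {
    "intended_use": "Triage support for adult dermatology referrals",
    "training_composition": {"Fitzpatrick I-III": 0.72, "Fitzpatrick IV-VI": 0.28},
    "subgroup_metrics": {
        "skin_type_I-III": {"sensitivity": 0.91, "specificity": 0.88},
        "skin_type_IV-VI": {"sensitivity": 0.79, "specificity": 0.85},
    },
    "known_failure_modes": ["heavily tattooed skin", "acral lesions"],
    "update_cadence": "quarterly",
}

# Serialise for publication alongside the model artefact.
card_json = json.dumps(model_card, indent=2)
print("subgroup_metrics" in card_json)  # True
```

A card structured this way makes the sensitivity gap between subgroups (0.91 versus 0.79 here) impossible to bury in prose.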

Cyber safety and cyber security require particular rigour. Medical AI should not be treated like a sophisticated spreadsheet. Each system must be threat-modelled, with a secure development lifecycle that includes code reviews, dependency scans, and a Software Bill of Materials detailing all libraries used. Model artefacts should be signed so hospitals can verify that the production model matches the validated version.
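The verification step can be as simple as comparing cryptographic digests (production signing would use asymmetric signatures over these digests; the keyless hash check below is a simplified stand-in):

```python
import hashlib

def fingerprint(artifact_bytes: bytes) -> str:
    # SHA-256 digest recorded at validation time; production must match it.
    return hashlib.sha256(artifact_bytes).hexdigest()

validated = fingerprint(b"model-weights-v1.3")
deployed_ok = fingerprint(b"model-weights-v1.3")
deployed_tampered = fingerprint(b"model-weights-v1.3-modified")

print(validated == deployed_ok)        # True: production matches validated model
print(validated == deployed_tampered)  # False: mismatch should block deployment
```

The digest of the validated artefact belongs in the change-control record, so any later mismatch is auditable as well as detectable.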

Routine testing must include adversarial scenarios. In imaging, this may involve small, structured perturbations to chest X-rays to ensure labels remain stable. In NLP systems, prompt injection attacks should be tested to prevent recommendations of contraindicated drugs. Systems where robustness is critical—such as insulin dosing or radiation planning—should include both adversarial training and out-of-distribution detection.
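A basic robustness check is label stability under small, bounded perturbations. The sketch below uses a deliberately trivial stand-in classifier to show the test harness shape, not a real imaging model:

```python
import random

def model(pixels):
    # Hypothetical classifier: mean intensity above a cut-off -> "abnormal".
    return "abnormal" if sum(pixels) / len(pixels) > 0.5 else "normal"

def perturbation_stable(pixels, trials=100, epsilon=0.01):
    # The label should not flip under small, bounded random perturbations.
    base = model(pixels)
    for _ in range(trials):
        noisy = [p + random.uniform(-epsilon, epsilon) for p in pixels]
        if model(noisy) != base:
            return False
    return True

random.seed(2)
image = [0.8] * 64          # comfortably inside the decision region
borderline = [0.5005] * 64  # sits on the boundary; tiny noise flips it
stable_result = perturbation_stable(image)
border_result = perturbation_stable(borderline)
print(stable_result, border_result)  # True False
```

Real adversarial testing uses structured (gradient-based) perturbations rather than random noise, but the pass/fail harness, and the idea that borderline cases fail it, carries over.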

Access controls should follow standard best practices: segmentation, least privilege, disabled USB ports on theatre robots, key rotation using hardware security modules, and multi-factor authentication for model management consoles. Incident response plans should include a “kill switch”—a one-click rollback to a last known good model and a manual workflow. Triage models, for instance, should be stress-tested by simulating concept drift events and reverting to human triage within five minutes.
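The kill switch amounts to a registry that always remembers the last known good version. A minimal sketch (class and version names are illustrative):

```python
class ModelRegistry:
    # Minimal rollback sketch: one call restores the last known good model.
    def __init__(self, good_version: str):
        self.active = good_version
        self.last_known_good = good_version

    def promote(self, version: str):
        # The previously active (validated) model becomes the rollback target.
        self.last_known_good = self.active
        self.active = version

    def kill_switch(self):
        # One click: revert to the validated model and flag manual workflow.
        self.active = self.last_known_good
        return {"active": self.active, "manual_triage": True}

registry = ModelRegistry("triage-v4")
registry.promote("triage-v5")  # drift detected in v5 after deployment
state = registry.kill_switch()
print(state)  # {'active': 'triage-v4', 'manual_triage': True}
```

The five-minute reversion target in the text is then a drill metric: how long between the simulated drift alarm and `kill_switch()` actually firing in production.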

Privacy warrants dedicated attention. De-identification reduces disclosure risk but does not eliminate it. It should be treated as defence-in-depth, not a talisman. High-risk fields such as names and NHS numbers should be tokenised, data encrypted at rest and in transit, and audit logs stored on write-once media. For research or validation, organisations should use secure data enclaves with role-based access and query auditing, rather than sharing CSV files via email.
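Tokenisation of high-risk fields can use a keyed hash, so tokens stay stable for record linkage but are not reversible without the key. A sketch of pseudonymisation, not full de-identification; the secret here is a placeholder, real keys would live in an HSM:

```python
import hmac, hashlib

SECRET = b"placeholder-key"  # illustrative only; production keys live in an HSM

def tokenise(value: str) -> str:
    # Keyed hash: deterministic (supports linkage) but not reversible
    # without the key. This is pseudonymisation, one layer of defence-in-depth.
    return hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()[:16]

record = {"name": "Jane Doe", "nhs_number": "943 476 5919", "hba1c": 48}
safe = {
    "name": tokenise(record["name"]),
    "nhs_number": tokenise(record["nhs_number"]),
    "hba1c": record["hba1c"],  # clinical values stay usable for analysis
}
print(safe["name"] != record["name"])  # True: identifier replaced by a token
```

Because tokenisation is only one layer, the surrounding controls in the text (encryption at rest and in transit, write-once audit logs, enclave access) still apply to the tokenised dataset.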

Governance converts decisions into policy. Hospitals should establish algorithm oversight committees with representation from clinical, data science, security, and legal teams. Their mandate should include approving intended uses, setting evidence thresholds, requiring pre-deployment trials, and enforcing post-market surveillance through monthly dashboards.

Procurement must move beyond demos to documentation. Sensible RFPs should request confusion matrices by subgroup, failure-mode analyses, red-team reports, an SBOM, and change-control plans. Contracts should include service-level agreements for model updates, obligations to preserve logs after incidents, and rights to independent security testing.
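"Confusion matrices by subgroup" is a concrete, checkable deliverable. A sketch of the tally an RFP might require, with hypothetical validation records:

```python
from collections import defaultdict

def subgroup_confusion(records):
    # Tally TP/FP/FN/TN separately for each subgroup (e.g. sex or age band).
    tables = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for group, truth, pred in records:
        key = ("tp" if truth and pred else
               "fp" if pred else
               "fn" if truth else "tn")
        tables[group][key] += 1
    return dict(tables)

# Hypothetical validation results: (subgroup, ground truth, model prediction)
data = [("F", 1, 1), ("F", 0, 0), ("F", 1, 0),
        ("M", 1, 1), ("M", 0, 1), ("M", 0, 0)]
tables = subgroup_confusion(data)
print(tables["F"])  # {'tp': 1, 'fp': 0, 'fn': 1, 'tn': 1}
```

From these per-group tables, sensitivity and specificity by subgroup follow directly, which is exactly what the model cards described earlier should publish.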

Supply chains are unglamorous but critical. Health systems must manage third-party AI risks by mapping dependencies on cloud regions, platforms, and libraries, avoiding concentration risk, and conducting failover drills. Vendors should design for graceful degradation, ensuring systems revert safely to manual or rules-based workflows when components fail.

The objective is not to wrap medicine in bubble wrap. It is to make trust a deliverable alongside AUC and latency. Systems that declare their training data, acknowledge uncertainty, degrade safely, and leave audit trails will earn their place at the bedside. Medical AI does not need another manifesto. It needs evidence, infrastructure, and the humility to route hard cases to humans.

This article has been published with permission from IIM Calcutta. https://www.iimcal.ac.in/ Views expressed are personal.

First Published: Mar 30, 2026, 15:19
