Lack of AI regulatory, clinical standards poses potential risks

While artificial intelligence promises to create actionable insights for clinicians to make better care decisions for patients, the regulations and standards for evaluating AI-based algorithms are lacking.

That’s the contention of Ravi Parikh, MD, a fellow in hematology and oncology at the University of Pennsylvania’s Perelman School of Medicine, and two colleagues.


Writing in the February 22 issue of the journal Science, they make the case that evaluations of AI-based algorithms are not held to traditional clinical trial standards—and, as a result, there has been little prospective evidence that predictive analytics improve patient care.

“Several commercial algorithms have received regulatory approval for broad clinical use. But the barrier for entry of new advanced algorithms has been low,” charge the authors. “To unlock the potential of advanced analytics while protecting patient safety, regulatory and professional bodies should ensure that advanced algorithms meet accepted standards of clinical benefit, just as they do for clinical therapeutics and predictive biomarkers.”

“External validation and prospective testing of advanced algorithms are clearly needed, but recent regulatory clearances raise concerns over the rigor of this process,” contend the authors. “Recent clearances of algorithms demonstrate the limitations of current regulatory standards.”

In particular, Parikh and his colleagues point to the Food and Drug Administration’s January 2018 approval of a clinical monitoring platform that uses a predictive algorithm to alert hospital staff of a patient’s deteriorating condition about six hours in advance.


To address these issues, the authors propose five standards to guide regulation of devices based on predictive analytics and AI: meaningful endpoints, appropriate benchmarks, generalizability and interoperability, specified interventions, and audit mechanisms.

In a series of tweets on Thursday, Parikh elaborated on these five recommendations:

  • Predictive analytics and AI, just like drugs and devices, should show evidence of clinical (not just statistical) benefit. It doesn’t have to be overall survival. Even “time-to-diagnosis” would be nice. The surrogate endpoints used in some of these studies would shock you.
  • We ought to be benchmarking algorithms against standard of care. What are standard of care predictions in medicine? Clinician predictions! If we’re going to use analytics in practice, we should make sure these tools are better than clinicians’ intuition (which is pretty good).
  • Predictive analytics should be generalizable and interoperable. Remember, these tools are cleared for broad clinical use. But many validation studies only used single institution data in specific populations. Would you take a drug if it was only studied in one hospital?
  • Specify the interventions. Better predictive tools only improve care if we know how to act on them. In many prospective studies, the intervention that led to clinical benefit was more important than the algorithm prediction. We should have some idea how to react to predictions.
  • We do post-marketing studies for drugs; we ought to have similar post-clearance audit mechanisms for these algorithms. Unlike a drug, the components and performance of a machine learning algorithm can change over time. These are not static interventions.

“The FDA’s recent Digital Health Innovation Action Plan, issued in 2017, launched a precertification program to study clinical outcomes of AI-based tools and enable streamlined premarket review,” conclude the authors. “Such efforts should be lauded but expanded upon based on our five criteria.”
