AI Risk Management Begins With the Model's Intended Use, Not the Model Itself

The passage of the EU AI Act marks a milestone in artificial intelligence (AI) legislation and is anticipated to be followed by AI legislation in other regions.

The EU AI Act and recently proposed AI risk management frameworks (e.g., ISO/IEC 42001:2023, the AI Risk Management Framework from NIST) are grounded in risk-based approaches to regulating AI. Establishing the credibility of computational models (AI is a type of computational model) has a long history in industries that leverage computational modeling (e.g., aerospace, aviation, automobile, autonomous driving, defense, medical devices, model-informed drug development).

Frameworks from these industries can be studied by other industries that are aiming to capitalize on data using AI but are new to risk-based regulation of computational approaches.

Risk stratification of a computational model and assessing credibility

● It is key that the risk stratification of a model and the evidence of its credibility are tied to a “context of use (COU)” of the model, not to the model in isolation.

● The credibility assessment changes with the COU as well as the model. 

The Verification and Validation (V&V) 40 standard, published by the American Society of Mechanical Engineers (ASME), is a consensus standard recognized by the U.S. Food and Drug Administration (FDA) and provides a framework for assessing the credibility of computational modeling through verification and validation for application to medical devices.

ASME has similar standards for other industries (V&V 10, 20, and 30), and V&V 70 is currently being developed for AI.

As defined by the V&V 40 standard:

● Credibility of a model is the trust in the capability of the algorithm for a context of use (COU) obtained through the collection of evidence.

● A risk-informed credibility assessment framework provides a means to determine the level of credibility needed to support the use of the model for a COU and to gather evidence through verification and validation activities to establish that credibility.

Risk-informed credibility assessment (ASME V&V 40 process)
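To make this risk-informed logic concrete, the sketch below (in Python) combines the two factors the standard considers, model influence and decision consequence, into a model risk tier and maps that tier to the rigor of credibility evidence required. The three-level scales, tier numbers, and the specific mapping are illustrative assumptions, not values prescribed by ASME V&V 40.

# Minimal sketch of a risk-informed credibility assessment in the spirit of
# ASME V&V 40. Scales, tier names, and the mapping are illustrative assumptions,
# not values prescribed by the standard.

MODEL_INFLUENCE = ["low", "medium", "high"]       # how much the model drives the decision
DECISION_CONSEQUENCE = ["low", "medium", "high"]  # severity of harm if the decision is wrong

def model_risk(influence: str, consequence: str) -> int:
    """Combine model influence and decision consequence into a risk tier (1-5)."""
    i = MODEL_INFLUENCE.index(influence)
    c = DECISION_CONSEQUENCE.index(consequence)
    return i + c + 1  # tier 1 (lowest) to tier 5 (highest)

# Higher risk tiers demand more rigorous verification and validation evidence.
REQUIRED_RIGOR = {
    1: "code verification, basic sanity checks",
    2: "code verification, comparison to literature data",
    3: "calculation verification, validation against retrospective data",
    4: "validation against prospective data within the context of use",
    5: "comprehensive V&V, uncertainty quantification, independent review",
}

# Example: a model with high influence on a high-consequence decision.
tier = model_risk("high", "high")
print(f"Model risk tier {tier}: evidence needed -> {REQUIRED_RIGOR[tier]}")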

Model life cycle management

Changes to models are inevitable and are also necessary to stay current in a changing business environment. The first step in managing change is defining the type of change in relation to the model's context of use.

The FDA's Artificial Intelligence/Machine Learning (AI/ML) framework provides for model life cycle management of both “locked” and “adaptive” models and defines three types of modifications to algorithms:

1. Modifications with no change to the context of use and/or inputs/outputs: This type of change can be managed by reassessing model credibility, with no change to the model risk or the V&V plan. Examples: training with new datasets within the context of use from the same type of input signal, or a change in the AI/ML architecture.

2. Modifications related to inputs/outputs, with no change to the context of use: This type of change can be managed by modifying the V&V plan and reassessing model credibility without changes to the model risk.

3. Modifications related to the context of use: This changes the model risk category. To manage the change, model risk and rigor of validation should be reevaluated, and the V&V plan should be modified accordingly prior to reassessing model credibility.
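The sketch below illustrates how these three modification types could be routed to the corresponding actions. The field and function names are illustrative assumptions, not part of the FDA framework.

# Sketch of routing a proposed model change into the three modification types
# described above. Field and function names are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class ProposedChange:
    changes_context_of_use: bool   # e.g., a new patient population or a new decision supported
    changes_inputs_outputs: bool   # e.g., a new input signal type or output format

def required_actions(change: ProposedChange) -> list[str]:
    if change.changes_context_of_use:
        # Type 3: the risk category changes; start from risk and rigor of validation.
        return ["reevaluate model risk and rigor of validation",
                "modify V&V plan",
                "reassess model credibility"]
    if change.changes_inputs_outputs:
        # Type 2: same context of use, new inputs/outputs; model risk unchanged.
        return ["modify V&V plan", "reassess model credibility"]
    # Type 1: e.g., retraining within the context of use on the same input signal type.
    return ["reassess model credibility"]

print(required_actions(ProposedChange(changes_context_of_use=False,
                                      changes_inputs_outputs=True)))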

The FDA AI/ML framework allows approved prespecified changes to occur with:

● An approved change protocol (defined using two documents: the SaMD Pre-Specifications (SPS) and the Algorithm Change Protocol (ACP))

● Culture of quality and organizational excellence

● Real-world performance monitoring and documentation
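As an illustration only (the schema and field names below are assumptions, not an FDA-mandated format), an approved change protocol could be captured alongside the pipeline so that a proposed change is checked against the prespecified envelope before the prespecified-change path is used.

# Illustrative sketch of capturing an approved change protocol with the pipeline.
# The schema, names, and values are assumptions, not an FDA-mandated format.

APPROVED_CHANGE_PROTOCOL = {
    "SPS": {  # pre-specifications: the envelope of anticipated modifications
        "allowed_modifications": ["retrain_with_new_data", "threshold_tuning"],
        "context_of_use": "triage support for condition X",   # hypothetical COU
    },
    "ACP": {  # algorithm change protocol: how changes are implemented and validated
        "validation_dataset": "held-out prospective set",
        "acceptance_criteria": {"sensitivity": 0.90, "specificity": 0.85},
        "real_world_monitoring": True,
        "documentation_required": True,
    },
}

def change_is_prespecified(modification: str) -> bool:
    """A change outside the SPS envelope needs a new review rather than the prespecified path."""
    return modification in APPROVED_CHANGE_PROTOCOL["SPS"]["allowed_modifications"]

print(change_is_prespecified("retrain_with_new_data"))  # True
print(change_is_prespecified("new_input_modality"))     # False -> new review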

Documentation in all stages of credibility assessment and lifecycle management is a key enabler of achieving transparency.

AI governance is both horizontal and vertical, and it can be automated using AI.

Horizontal – DevSecOps (code quality and security), MLOps (monitoring and audit trail generation), exception handling requirements, and documentation are mandatory for all production pipelines.

Unless monitor-by-design is set up (monitoring with embedded credibility rules, acceptance criteria, fallback modes, and audit trail generation) and the DevSecOps and MLOps rules are passed, the AI pipeline cannot be moved to production; in this way, developers are prompted to adhere to governance rules during the development stage.

Vertical – Domain-specific golden paths enforce credibility evidence generation, credibility rules, acceptance criteria, and fallback modes specific to the domain.
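A hedged sketch of how such a promotion gate could combine the horizontal (DevSecOps/MLOps) checks with a vertical, domain-specific golden path is shown below. All rule names, thresholds, and the pipeline structure are illustrative assumptions, not a prescribed implementation.

# Sketch of a promotion gate combining horizontal (DevSecOps/MLOps) checks with
# a vertical, domain-specific golden path. Names and thresholds are illustrative.

HORIZONTAL_REQUIREMENTS = [
    "code_quality_scan_passed",
    "security_scan_passed",
    "monitoring_configured",
    "audit_trail_enabled",
    "exception_handling_defined",
    "documentation_complete",
]

# A domain-specific golden path adds credibility rules, acceptance criteria,
# and a fallback mode for the context of use (hypothetical diagnostics example).
GOLDEN_PATH_DIAGNOSTICS = {
    "credibility_rules": ["input_distribution_within_training_range"],
    "acceptance_criteria": {"auc": 0.85},
    "fallback_mode": "route_to_human_review",
}

def can_promote_to_production(pipeline: dict) -> bool:
    horizontal_ok = all(pipeline.get(req) for req in HORIZONTAL_REQUIREMENTS)
    vertical_ok = all(key in pipeline.get("golden_path", {})
                      for key in ("credibility_rules", "acceptance_criteria", "fallback_mode"))
    return horizontal_ok and vertical_ok

pipeline = {req: True for req in HORIZONTAL_REQUIREMENTS}
pipeline["golden_path"] = GOLDEN_PATH_DIAGNOSTICS
print(can_promote_to_production(pipeline))  # True only if every gate is satisfied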

With an Internal Developer Platform (IDP), AI can be used to automate AI governance.

As proposed in the V&V 40 standard, one of the key factors in a credibility assessment is a team of people with adequate knowledge of both the computational model (the builders and testers) and the COU (i.e., the business).

An IDP allows this to be achieved efficiently, as the community is empowered to write and approve golden paths, creating governance by the builders, testers, and the business, for the builders, testers, and the business. More importantly, AI builders are expensive talent, and it is not optimal for them to spend their time reading governance SOPs and implementing them from scratch for every project. An IDP enables the reusability of governance assets across use cases.

With monitor-by-design, an AI pipeline in production is automatically monitored and an audit trail is generated. If a credibility rule is violated (due to data drift or model drift), the pipeline goes into fallback mode, notifies the users, and triggers change management of the model with the development and business teams, thereby generating transparency.
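The following minimal sketch, with assumed names and thresholds, illustrates this monitor-by-design behavior: a credibility rule is evaluated on live data, and a violation triggers fallback mode, a notification, and an audit-trail entry.

# Minimal sketch of monitor-by-design behavior described above. The drift
# threshold, event names, and in-memory audit trail are illustrative assumptions.

import json, datetime

AUDIT_TRAIL = []          # in practice, an append-only store
DRIFT_THRESHOLD = 0.2     # assumed acceptance criterion for input drift

def log_event(event: dict) -> None:
    event["timestamp"] = datetime.datetime.utcnow().isoformat()
    AUDIT_TRAIL.append(event)

def monitor(drift_score: float, predict, fallback, notify) -> str:
    """Serve the model if the credibility rule holds; otherwise fall back and escalate."""
    if drift_score > DRIFT_THRESHOLD:
        log_event({"event": "credibility_rule_violated", "drift": drift_score})
        notify("Data drift detected; pipeline in fallback mode, change management triggered.")
        return fallback()
    log_event({"event": "prediction_served", "drift": drift_score})
    return predict()

result = monitor(
    drift_score=0.35,
    predict=lambda: "model prediction",
    fallback=lambda: "route to human review",
    notify=print,
)
print(result)                          # "route to human review"
print(json.dumps(AUDIT_TRAIL, indent=2))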

About the Author:

Chathuri Daluwatte, PhD, has led organization-wide, nationwide, and global AI initiatives in the regulated life sciences industry that resulted in a competitive advantage. Her unique thought leadership stems from her technology background, business acumen, and regulatory and health policy experience.

Daluwatte has a bachelor’s degree spanning electronics, telecom, and computer science, a master’s degree in statistics, and a Ph.D. in biomedical engineering. She has worked at bioMérieux and then at the U.S. Food and Drug Administration (FDA), at both the Center for Devices and Radiological Health (CDRH) and the Center for Drug Evaluation and Research (CDER). Her professional experience also includes working at Sanofi, across the value chain and the enterprise Chief Digital Office, leading enterprise AI globally. She is the Head of AI Diagnostics for Alexion, AstraZeneca Rare Disease. The opinions within this article are her own.
