New blueprint for validating AI tools published by NHS AI Lab

The NHS Artificial Intelligence Laboratory (NHS AI Lab) has published a new blueprint for artificial intelligence validation in healthcare.

The document, NCCID case study: Setting standards for testing Artificial Intelligence, includes a proof-of-concept validation process for testing the quality of AI in the health and care sector. It follows an evaluation of the performance of AI models using data from the National Covid-19 Chest Imaging Database (NCCID).

It emphasises the need for a large volume of good quality data when developing AI radiology products so that the performance of the tools is effective and reliable enough for use.

Creating a validation process for AI tools, the document says, is essential to eliminating systematic errors in AI models. Validating the tools, and making sure that AI technologies are safe and ethical, makes it possible to reduce negative outcomes for patients, it adds.

The tests described in the document calculate how accurately the models detected positive and negative Covid-19 cases from medical images, as well as how the models performed across different sub-groups, defined by characteristics such as age, race, ethnicity, and sex. The analysis also assessed the robustness of each algorithm by looking at how it performed in response to changes in the data, such as the inclusion of patients with additional medical conditions, or the use of images captured with different scanning equipment.
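As a rough illustration of this kind of sub-group test, the sketch below computes sensitivity (the share of positive cases detected) and specificity (the share of negative cases correctly ruled out) for each group. The article does not name the exact statistics used in the case study, so these two measures are an assumption, and the field names and sample rows are invented purely for illustration.

```python
from collections import defaultdict


def sensitivity_specificity(pairs):
    """Return (sensitivity, specificity) for (truth, prediction) boolean pairs."""
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    tn = sum(1 for truth, pred in pairs if not truth and not pred)
    fp = sum(1 for truth, pred in pairs if not truth and pred)
    # A rate is undefined (None) when the group has no cases of that class.
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


def by_subgroup(rows, key):
    """Group rows by a demographic field and score each group separately."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append((row["covid_positive"], row["predicted_positive"]))
    return {group: sensitivity_specificity(pairs) for group, pairs in groups.items()}


# Hypothetical records: field names and values are made up, not NCCID data.
rows = [
    {"sex": "F", "covid_positive": True, "predicted_positive": True},
    {"sex": "F", "covid_positive": False, "predicted_positive": False},
    {"sex": "F", "covid_positive": True, "predicted_positive": False},
    {"sex": "M", "covid_positive": True, "predicted_positive": True},
    {"sex": "M", "covid_positive": False, "predicted_positive": True},
    {"sex": "M", "covid_positive": False, "predicted_positive": False},
]

results = by_subgroup(rows, "sex")
for group, (sens, spec) in sorted(results.items()):
    print(group, sens, spec)
```

A large gap between groups on either measure would be the sort of signal that a model may be treating some patients less reliably than others.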

The document splits the validation process into four steps:

1. Creating a validation data set based on the intended use case of each algorithm by using data from the NCCID that had not been used to train the algorithm.
2. Running the algorithm in a cloud-based environment, providing a secure space to protect the developer’s intellectual property.
3. Running the model on the validation set, and performing pre-defined statistical tests to assess the robustness and performance of the model against various demographics.
4. Reporting the results to the organisation that built the models in order to inform model improvements.
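The first of these steps, holding back data the algorithm has never seen, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the NHS AI Lab's actual tooling: the function name, the `image_id` and `modality` fields, and the sample records are all hypothetical.

```python
def build_validation_set(records, training_ids, modality):
    """Keep records that match the intended use case and were not used in training."""
    seen = set(training_ids)  # set lookup keeps the filter fast on large databases
    return [
        record
        for record in records
        if record["modality"] == modality and record["image_id"] not in seen
    ]


# Hypothetical database extract; identifiers and fields are invented.
records = [
    {"image_id": "a1", "modality": "x-ray"},
    {"image_id": "a2", "modality": "x-ray"},
    {"image_id": "a3", "modality": "ct"},
    {"image_id": "a4", "modality": "x-ray"},
]

# Images the developer reports having trained on (also hypothetical).
training_ids = ["a1"]

validation_set = build_validation_set(records, training_ids, "x-ray")
print([record["image_id"] for record in validation_set])
```

Excluding every image used in training matters because a model scored on data it has already seen will tend to look more accurate than it really is.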

Safe, robust and accurate models

The document says that by taking part in the study, the NHS team and vendors were able to learn more about the performance of their algorithms. The proof-of-concept process “proved a valuable blueprint for testing that the AI models adopted for use in health and care are safe, robust and accurate.”

Dominic Cushnan, head of AI imaging at NHS AI Lab, said: “Our rigorous validation and testing procedures have implemented a novel process to test that AI models adopted are safe, robust and accurate in diagnosing Covid-19 – while protecting developers’ intellectual property.

“Unfair and biased models can lead to inconsistent levels of care, a serious problem in these critical circumstances. Outside of the NHS, our validation process has helped guide the use of AI in medical diagnosis and inform new approaches to the international governance of AI in healthcare.”
