Navigating the Terrain of Healthcare AI
Artificial intelligence is a rapidly evolving tool in the healthcare space. AI has the potential to positively impact our healthcare experience, ranging from functioning as a safety net providing a second read on imaging scans, to having an immediate impact at the bedside through clinical decision support tools. As healthcare AI evolves, so do the questions and the need to learn from real-world medical experience.
To help you navigate the healthcare AI terrain, we’re working with Dr. Harpreet Dhatt, a Diagnostic Radiologist at Dignity Health. Together with Dr. Dhatt, we’ll explore AI through the eyes of a physician providing direct patient care. He’ll share his experience and learnings in the blog series Navigating the Terrain of Healthcare AI.
The series will cover:
- An introduction to the field of radiology
- A look at the benefits of AI from the physician’s perspective
- Fundamentals of machine learning in medical imagery
- Machine learning and its role in healthcare
- AI as a tool for preventing missed findings and minimizing malpractice
- Moving from peer review to peer learning and how AI can help
- Will AI replace the radiologist?
- Where does AI fit into current workflows?
Read the first post in the series, Radiology and Artificial Intelligence Alliance.
Will Healthcare AI Save the Day?
Attention-grabbing headlines about the inevitable march of artificial intelligence replacing human intellectual tasks abound. While the general news media are filled with anecdotes about the rise of AI, often supported by superficial, non-scientific data, some ground reality supports this sentiment, especially in post-COVID healthcare. And nowhere is it more palpable than in medical imaging.
Each passing month confirms our suspicions of the immense radiologist shortage and exploding medical imaging volumes. The disconnect between supply and demand has been exacerbated by the COVID-19 pandemic but was nonetheless predictable given the upward trajectory of radiology imaging volumes. Given the current situation, we must assess the role of machine learning (ML) and, very specifically, deep learning based on convolutional neural networks (CNN) as an assistive technology to provide accurate, efficient, and scalable patient care.
Before diving into the benefits and challenges of integrating ML into routine radiology practice, it’s incumbent upon us to understand some basics of CNN so that we may better evaluate the technology’s benefits and limitations. Ultimately, we should be able to distinguish poorly developed algorithms and workflow models from properly developed technology.
Let’s start by defining artificial intelligence. Simply stated, it’s the ability of a computer to perform a task with accuracy and efficiency equal to a human being. Traditional AI relies on predefined features chosen by human experts, such as manually segmented tumors or organs and measurements like tumor size, which are fed into conventional machine learning models; the models “learn” from these data and produce outputs on subsequent testing. While traditional AI remains useful, it demands intensive expert labor and is limited by the predefined features it is given.
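To make the “handcrafted features” idea concrete, here is a minimal sketch, assuming a hypothetical pipeline: a radiologist manually segments a lesion, expert-chosen features (size measurements) are extracted, and a simple statistical rule classifies the result. All names, cutoffs, and pixel sizes are illustrative assumptions, not a real clinical model.

```python
import numpy as np

def handcrafted_features(mask: np.ndarray, pixel_mm: float = 0.5) -> dict:
    """Extract predefined, expert-chosen features from a manual tumor segmentation.

    mask: binary array where 1 marks pixels the radiologist labeled as tumor.
    pixel_mm: assumed physical size of one pixel (hypothetical value).
    """
    area_mm2 = mask.sum() * pixel_mm ** 2           # tumor area
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    height = rows.sum() * pixel_mm                  # bounding-box height
    width = cols.sum() * pixel_mm                   # bounding-box width
    return {"area_mm2": area_mm2, "max_diameter_mm": max(height, width)}

def classify(features: dict, diameter_cutoff_mm: float = 10.0) -> str:
    """Toy statistical model: flag lesions above an expert-set size cutoff."""
    return "suspicious" if features["max_diameter_mm"] > diameter_cutoff_mm else "likely benign"

# A manually segmented 30 x 24 pixel lesion on a 64 x 64 image.
mask = np.zeros((64, 64), dtype=int)
mask[10:40, 10:34] = 1

feats = handcrafted_features(mask)
print(feats["max_diameter_mm"], classify(feats))  # 15.0 suspicious
```

Note the key limitation the post describes: every feature (area, diameter) had to be hand-designed and every mask hand-drawn before the “model” could do anything.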
CNN, in contrast, is an adaptive technology: a mathematical model built from three types of layers (its building blocks) that navigates spatial data such as radiology images automatically, without predefined features, to produce desired outputs, such as independent identification and segmentation of tumors on CT or any other desired prediction. Through independent learning from an immense data space, CNN-based algorithms become increasingly capable of complex problem-solving and continue to improve with more data exposure.
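The three building blocks can be sketched in a few lines. This is a deliberately tiny, numpy-only illustration of a CNN forward pass on a fake 8x8 “image”; the weights are random placeholders, whereas in a real CNN they are learned from large labeled datasets. Nothing here is a production model.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))  # stand-in for a tiny radiology image

def convolve(img, kernel):
    """Convolutional layer: slide a learned filter across the image,
    then apply a ReLU non-linearity."""
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

def max_pool(fmap, size=2):
    """Pooling layer: downsample by keeping the strongest local response."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

kernel = rng.standard_normal((3, 3))   # one "learned" 3x3 filter (random here)
fc_weights = rng.standard_normal(9)    # fully connected layer weights

features = max_pool(convolve(image, kernel))       # 6x6 map pooled to 3x3
score = float(features.flatten() @ fc_weights)     # fully connected output
probability = 1.0 / (1.0 + np.exp(-score))         # e.g. P(abnormality present)
print(round(probability, 3))
```

Training consists of adjusting `kernel` and `fc_weights` so that the output probability matches the ground-truth labels; no human ever specifies what the filter should look for, which is exactly the contrast with handcrafted features.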
In radiology, CNN is generally treated as synonymous with AI/machine learning. But we must remain grounded in the fact that the capabilities of current CNN algorithms are narrow and task-specific: each can address only one problem or abnormality. Predicting outputs for multiple abnormalities remains distant. While the engineering basics of CNN are beyond the scope of this post, acquiring some fundamental knowledge of model training lets us exercise due caution in judging the value of any particular CNN algorithm pitched to radiologists and managers.
The most basic task of CNN in computer vision is identification. A CNN, organized loosely like the human visual cortex, is responsible for identifying a single abnormality on both simple and complex medical imaging. These abnormalities range from fractures to tumors to free air in the abdomen. Given the gravity of identifying such serious abnormalities and their impact on patient health, the robustness of an algorithm must be verified before buying the product and applying it to clinical care. The robustness of any particular CNN algorithm depends predominantly on the quality and quantity of data available for training. Since deep learning technology is a “black box,” we are unable to audit the decision trail for any specific output due to the “hidden layers” between input and output. Therefore, we must be assured that the training data meet high standards, as that is the primary way to determine the algorithm’s efficacy.
What are some general guidelines to ensure data standards are met?
Suppose a particular algorithm produces too many false positives. In this situation, radiologists’ time will be wasted chasing ghosts, rendering the entire process inefficient and making healthcare more costly for patients and enterprises. False positives can also lead to excessive imaging or unnecessary procedures. On the other hand, if the algorithm produces too many false negatives, not only is the patient harmed by missed disease, but the AI developer, the enterprise, and radiologists are exposed to legal liability.
Since we are unable to truly audit a CNN’s outputs, we must assess training data quality to understand the AI’s predictions. Ideal models are trained on meticulously collected images that are appropriately labeled by experts (radiologists), the so-called ground truth. Because high-quality labeled images are scarce and expensive yet essential to the model, one must carefully interrogate this portion of algorithm development to ensure the product meets quality standards before applying it to patient care.
The subsequent “validation data” and “test data” steps are also important, but they are more about fine-tuning the model’s performance; if the initial training data is suboptimal, the algorithm will not be adequate. Radiologists and enterprise stakeholders must delve into the origin of the data and its generalizability to their particular enterprise and patient population. When the training images are obtained from vetted public repositories such as The Cancer Imaging Archive (TCIA), one can be reasonably assured the data is of high quality.
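The training/validation/test discipline mentioned above can be sketched simply. The study IDs and the 70/15/15 ratio below are illustrative assumptions, not a prescribed standard; the essential point is that the splits never overlap.

```python
import random

# Synthetic study IDs standing in for a labeled imaging dataset.
random.seed(42)
study_ids = [f"study_{i:04d}" for i in range(1000)]
random.shuffle(study_ids)

train = study_ids[:700]           # 70%: the model learns from these labels
validation = study_ids[700:850]   # 15%: tune hyperparameters and thresholds
test = study_ids[850:]            # 15%: one final, untouched performance check

# Critically, the splits must not leak: the same study (ideally the same
# patient) must never appear in more than one split, or the reported
# performance will be inflated and will not generalize to local practice.
assert not (set(train) & set(validation)
            | set(train) & set(test)
            | set(validation) & set(test))
print(len(train), len(validation), len(test))  # 700 150 150
```

When evaluating a vendor, asking how these splits were constructed, and whether the test set resembles your own patient population, is a quick proxy for development rigor.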
Another consideration is ensuring the algorithm’s training data has sufficient volume and labeling appropriate to the needs of the buying enterprise and the specific radiology practice. If another source is used for training, the enterprise must identify a qualified radiologist who understands AI training models and the local practice environment to vet the data before an enterprise-wide pilot. Radiologists must also remain highly vigilant by performing regularly scheduled audits (a QA process) to ensure AI behavior is not harming patients or the enterprise. This radiologist involvement is critical, as we’ll see when we delve into the medicolegal issues that may arise from AI implementation in our next post.
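One way such a scheduled QA audit could work is sketched below: sample recent cases, compare the AI’s calls against the radiologist’s final report, and escalate when the miss rate drifts past a locally agreed threshold. The function names, the 2% threshold, and the sample counts are all hypothetical.

```python
def audit(cases, max_miss_rate=0.02):
    """Compare AI calls with radiologist ground truth for sampled cases.

    cases: list of (ai_positive, radiologist_positive) booleans.
    Returns (miss_rate, passed) where a miss is a finding the AI cleared.
    """
    misses = sum(1 for ai, rad in cases if rad and not ai)    # false negatives
    positives = sum(1 for _, rad in cases if rad)             # true findings
    miss_rate = misses / positives if positives else 0.0
    return miss_rate, miss_rate <= max_miss_rate

# One month of sampled cases: 100 true findings, of which the AI missed 3,
# plus 400 true negatives (illustrative numbers).
sample = [(True, True)] * 97 + [(False, True)] * 3 + [(False, False)] * 400
rate, passed = audit(sample)
print(rate, passed)  # 0.03 False -> escalate to the vendor and the QA committee
```

Running such a check on a fixed schedule, with a radiologist adjudicating every disagreement, is what turns a one-time validation into an ongoing safety process.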
While AI might save the day, it must be rigorously tested to be trusted.