Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Distributional drift detection is important in medical applications as it helps ensure the accuracy and reliability of models by identifying changes in the underlying data distribution that could affect the prediction results of machine learning models. However, current methods have limitations in detecting drift, for example, the inclusion of abnormal datasets can lead to unfair comparisons. This paper presents an accurate and sensitive approach to detect distributional drift in CT-scan medical images by leveraging data-sketching and fine-tuning techniques. We developed a robust baseline library model for real-time anomaly detection, allowing for efficient comparison of incoming images and identification of anomalies. Additionally, we fine-tuned a pre-trained Vision Transformer model to extract relevant features, using mammography as a case study, significantly enhancing model accuracy to 99.11%. Combining with data-sketches and fine-tuning, our feature extraction evaluation demonstrated that cosine similarity scores between similar datasets provide greater improvements, from around 50% increased to 99.1%. Finally, the sensitivity evaluation shows that our solutions are highly sensitive to even 1% salt-and-pepper and speckle noise, and it is not sensitive to lighting noise (e.g., lighting conditions have no impact on data drift). The proposed methods offer a scalable and reliable solution for maintaining the accuracy of diagnostic models in dynamic clinical environments.more » « lessFree, publicly-accessible full text available July 7, 2026
-
Astley, Susan M; Chen, Weijie (Ed.)Devices enabled by artificial intelligence (AI) and machine learning (ML) are being introduced for clinical use at an accelerating pace. In a dynamic clinical environment, these devices may encounter conditions different from those they were developed for. The statistical data mismatch between training/initial testing and production is often referred to as data drift. Detecting and quantifying data drift is significant for ensuring that AI model performs as expected in clinical environments. A drift detector signals when a corrective action is needed if the performance changes. In this study, we investigate how a change in the performance of an AI model due to data drift can be detected and quantified using a cumulative sum (CUSUM) control chart. To study the properties of CUSUM, we first simulate different scenarios that change the performance of an AI model. We simulate a sudden change in the mean of the performance metric at a change-point (change day) in time. The task is to quickly detect the change while providing few false-alarms before the change-point, which may be caused by the statistical variation of the performance metric over time. Subsequently, we simulate data drift by denoising the Emory Breast Imaging Dataset (EMBED) after a pre-defined change-point. We detect the change-point by studying the pre- and post-change specificity of a mammographic CAD algorithm. Our results indicate that with the appropriate choice of parameters, CUSUM is able to quickly detect relatively small drifts with a small number of false-positive alarms.more » « less
An official website of the United States government
