skip to main content

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Friday, December 13 until 2:00 AM ET on Saturday, December 14 due to maintenance. We apologize for the inconvenience.


Search for: All records

Creators/Authors contains: "Yesha, Yelena"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available May 1, 2025
  2. Astley, Susan M ; Chen, Weijie (Ed.)
    Devices enabled by artificial intelligence (AI) and machine learning (ML) are being introduced for clinical use at an accelerating pace. In a dynamic clinical environment, these devices may encounter conditions different from those they were developed for. The statistical data mismatch between training/initial testing and production is often referred to as data drift. Detecting and quantifying data drift is significant for ensuring that AI model performs as expected in clinical environments. A drift detector signals when a corrective action is needed if the performance changes. In this study, we investigate how a change in the performance of an AI model due to data drift can be detected and quantified using a cumulative sum (CUSUM) control chart. To study the properties of CUSUM, we first simulate different scenarios that change the performance of an AI model. We simulate a sudden change in the mean of the performance metric at a change-point (change day) in time. The task is to quickly detect the change while providing few false-alarms before the change-point, which may be caused by the statistical variation of the performance metric over time. Subsequently, we simulate data drift by denoising the Emory Breast Imaging Dataset (EMBED) after a pre-defined change-point. We detect the change-point by studying the pre- and post-change specificity of a mammographic CAD algorithm. Our results indicate that with the appropriate choice of parameters, CUSUM is able to quickly detect relatively small drifts with a small number of false-positive alarms. 
    more » « less
    Free, publicly-accessible full text available April 3, 2025
  3. Free, publicly-accessible full text available January 1, 2025
  4. We present a novel algorithm that is able to generate deep synthetic COVID-19 pneumonia CT scan slices using a very small sample of positive training images in tandem with a larger number of normal images. This generative algorithm produces images of sufficient accuracy to enable a DNN classifier to achieve high classification accuracy using as few as 10 positive training slices (from 10 positive cases), which to the best of our knowledge is one order of magnitude fewer than the next closest published work at the time of writing. Deep learning with extremely small positive training volumes is a very difficult problem and has been an important topic during the COVID-19 pandemic, because for quite some time it was difficult to obtain large volumes of COVID-19-positive images for training. Algorithms that can learn to screen for diseases using few examples are an important area of research. Furthermore, algorithms to produce deep synthetic images with smaller data volumes have the added benefit of reducing the barriers of data sharing between healthcare institutions. We present the cycle-consistent segmentation-generative adversarial network (CCS-GAN). CCS-GAN combines style transfer with pulmonary segmentation and relevant transfer learning from negative images in order to create a larger volume of synthetic positive images for the purposes of improving diagnostic classification performance. The performance of a VGG-19 classifier plus CCS-GAN was trained using a small sample of positive image slices ranging from at most 50 down to as few as 10 COVID-19-positive CT scan images. CCS-GAN achieves high accuracy with few positive images and thereby greatly reduces the barrier of acquiring large training volumes in order to train a diagnostic classifier for COVID-19. 
    more » « less
  5. We introduce an active, semisupervised algorithm that utilizes Bayesian experimental design to address the shortage of annotated images required to train and validate Artificial Intelligence (AI) models for lung cancer screening with computed tomography (CT) scans. Our approach incorporates active learning with semisupervised expectation maximization to emulate the human in the loop for additional ground truth labels to train, evaluate, and update the neural network models. Bayesian experimental design is used to intelligently identify which unlabeled samples need ground truth labels to enhance the model’s performance. We evaluate the proposed Active Semi-supervised Expectation Maximization for Computer aided diagnosis (CAD) tasks (ASEM-CAD) using three public CT scans datasets: the National Lung Screening Trial (NLST), the Lung Image Database Consortium (LIDC), and Kaggle Data Science Bowl 2017 for lung cancer classification using CT scans. ASEM-CAD can accurately classify suspicious lung nodules and lung cancer cases with an area under the curve (AUC) of 0.94 (Kaggle), 0.95 (NLST), and 0.88 (LIDC) with significantly fewer labeled images compared to a fully supervised model. This study addresses one of the significant challenges in early lung cancer screenings using low-dose computed tomography (LDCT) scans and is a valuable contribution towards the development and validation of deep learning algorithms for lung cancer screening and other diagnostic radiology examinations. 
    more » « less
  6. Venous thromboembolism (VTE) is a preventable complication of hospitalization. VTE risk-assessment models (RAMs) including the Caprini and Padua RAMs quantify VTE risk based on demographic and clinical characteristics. Both RAMs have performed well in selected high-risk cohorts with relatively small sample sizes but few studies have evaluated the RAMs in large, unselected cohorts. We assessed the ability of both RAMs to predict VTE in a large, nationwide, diverse cohort of surgical and nonsurgical patients. 
    more » « less
  7. While our society accelerates its transition to the Internet of Things, billions of IoT devices are now linked to the network. While these gadgets provide enormous convenience, they generate a large amount of data that has already beyond the network’s capacity. To make matters worse, the data acquired by sensors on such IoT devices also include sensitive user data that must be appropriately treated. At the moment, the answer is to provide hub services for data storage in data centers. However, when data is housed in a centralized data center, data owners lose control of the data, since data centers are centralized solutions that rely on data owners’ faith in the service provider. In addition, edge computing enables edge devices to collect, analyze, and act closer to the data source, the challenge of data privacy near the edge is also a tough nut to crack. A large number of user information leakage both for IoT hub and edge made the system untrusted all along. Accordingly, building a decentralized IoT system near the edge and bringing real trust to the edge is indispensable and significant. To eliminate the need for a centralized data hub, we present a prototype of a unique, secure, and decentralized IoT framework called Reja, which is built on a permissioned Blockchain and an intrusion-tolerant messaging system ChiosEdge, and the critical components of ChiosEdge are reliable broadcast and BFT consensus. We evaluated the latency and throughput of Reja and its sub-module ChiosEdge. 
    more » « less