Title: Reducing manual labeling requirements and improved retinal ganglion cell identification in 3D AO-OCT volumes using semi-supervised learning

Adaptive optics-optical coherence tomography (AO-OCT) allows for the three-dimensional visualization of retinal ganglion cells (RGCs) in the living human eye. Quantitative analyses of RGCs have significant potential for improving the diagnosis and monitoring of diseases such as glaucoma. Recent advances in machine learning (ML) have made possible the automatic identification and analysis of RGCs within the complex three-dimensional retinal volumes obtained with such imaging. However, the current state-of-the-art ML approach relies on fully supervised training, which demands large amounts of labeled training data: each volume requires many hours of expert manual annotation. Here, two semi-supervised training schemes are introduced, (i) cross-consistency training and (ii) cross pseudo supervision, that utilize unlabeled AO-OCT volumes together with a minimal set of labels, vastly reducing the labeling demands. Moreover, these methods outperformed their fully supervised counterpart and achieved accuracy comparable to that of human experts.
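The cross pseudo supervision idea can be illustrated at toy scale: two networks each predict on the same unlabeled batch, and each is penalized against the hard pseudo-labels produced by the other (an ordinary supervised loss on the small labeled set is added during real training, omitted here). The following is a minimal NumPy sketch of that unsupervised term only; the function names and the toy setup are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable row-wise softmax."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cps_loss(logits_a, logits_b):
    """Cross pseudo supervision term: each network is trained with hard
    pseudo-labels produced by the other on the same unlabeled batch."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    pseudo_a = p_a.argmax(axis=1)  # pseudo-labels from network A
    pseudo_b = p_b.argmax(axis=1)  # pseudo-labels from network B
    n = logits_a.shape[0]
    # cross-entropy of A's predictions against B's pseudo-labels, and vice versa
    loss_a = -np.log(p_a[np.arange(n), pseudo_b] + 1e-12).mean()
    loss_b = -np.log(p_b[np.arange(n), pseudo_a] + 1e-12).mean()
    return loss_a + loss_b
```

Two confident, agreeing sets of logits yield a small loss, while disagreement is heavily penalized; minimizing this term is what pushes the two networks toward consistent decisions on unlabeled volumes.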

 
NSF-PAR ID:
10522206
Publisher / Repository:
Optical Society of America
Journal Name:
Biomedical Optics Express
Volume:
15
Issue:
8
ISSN:
2156-7085
Format(s):
Medium: X Size: Article No. 4540
Sponsoring Org:
National Science Foundation
More Like this
  1. Optical coherence tomography (OCT) and scanning laser ophthalmoscopy (SLO) are imaging technologies invented in the 1980s that have revolutionized the field of in vivo retinal diagnostics and are now commonly used in ophthalmology clinics as well as in vision science research. Adaptive optics (AO) technology enables high-fidelity correction of ocular aberrations, resulting in improved resolution and sensitivity for both SLO and OCT systems. The potential of gathering multi-modal cellular-resolution information in a single instrument is of great interest to the ophthalmic imaging community. Although similar instruments have been developed for imaging the human retina, developing such a system for mice will benefit basic science research and should help with further dissemination of AO technology. Here, we present our work integrating OCT into an existing mouse retinal AO-SLO system, resulting in a multi-modal AO-enhanced imaging system for the living mouse eye. The new system allows either independent or simultaneous data acquisition of AO-SLO and AO-OCT, depending on the requirements of specific scientific experiments. The system supports a data acquisition speed of 200 kHz, corresponding to the A-scan rate for OCT and the pixel rate for SLO, respectively. It offers ∼6 µm axial resolution for AO-OCT and ∼1 µm lateral resolution for AO-SLO-OCT imaging.

     
  2. Glaucoma is a group of eye diseases characterized by the thinning of the retinal nerve fiber layer (RNFL), which is primarily caused by the progressive death of retinal ganglion cells (RGCs). Precise monitoring of these changes at cellular resolution in living eyes is important for glaucoma research. In this study, we aimed to assess the effectiveness of temporal speckle averaging optical coherence tomography (TSA-OCT) and dynamic OCT (dOCT) in examining the static and potential dynamic properties of RGCs and the RNFL in living mouse eyes. We evaluated parameters such as RNFL thickness and possible dynamics, and compared the ganglion cell layer (GCL) soma density obtained from in vivo OCT, fluorescence scanning laser ophthalmoscopy (SLO), and ex vivo histology.

     
  3. Abstract

    Objective. Recent advances in neural decoding have accelerated the development of brain–computer interfaces aimed at assisting users with everyday tasks such as speaking, walking, and manipulating objects. However, current approaches for training neural decoders commonly require large quantities of labeled data, which can be laborious or infeasible to obtain in real-world settings. Alternatively, self-supervised models that share self-generated pseudo-labels between two data streams have shown exceptional performance on unlabeled audio and video data, but it remains unclear how well they extend to neural decoding. Approach. We learn neural decoders without labels by leveraging multiple simultaneously recorded data streams, including neural, kinematic, and physiological signals. Specifically, we apply cross-modal, self-supervised deep clustering to train decoders that can classify movements from brain recordings. After training, we then isolate the decoders for each input data stream and compare the accuracy of decoders trained using cross-modal deep clustering against supervised and unimodal, self-supervised models. Main results. We find that sharing pseudo-labels between two data streams during training substantially increases decoding performance compared to unimodal, self-supervised models, with accuracies approaching those of supervised decoders trained on labeled data. Next, we extend cross-modal decoder training to three or more modalities, achieving state-of-the-art neural decoding accuracy that matches or slightly exceeds the performance of supervised models. Significance. We demonstrate that cross-modal, self-supervised decoding can be applied to train neural decoders when few or no labels are available, and we extend the cross-modal framework to share information among three or more data streams, further improving self-supervised training.
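The pseudo-label-sharing step can be sketched at toy scale: cluster one stream, then use its cluster assignments as training targets for a decoder on the other stream, with no human labels involved. This NumPy stand-in (2-means on raw features, nearest-centroid decoding) is a deliberate simplification of the paper's cross-modal deep clustering; all names and the toy setup are assumptions.

```python
import numpy as np

def two_means(x, iters=20):
    """Tiny 2-means with deterministic far-point initialization (a stand-in
    for the deep-clustering step, which clusters learned embeddings)."""
    centers = np.stack([x[0], x[np.argmax(((x - x[0]) ** 2).sum(-1))]])
    for _ in range(iters):
        labels = ((x[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        centers = np.stack([x[labels == j].mean(0) for j in range(2)])
    return labels

def cross_modal_decoder(kinematic, neural):
    """Pseudo-labels from the kinematic stream supervise a nearest-centroid
    'decoder' for the neural stream; no ground-truth labels are used."""
    pseudo = two_means(kinematic)
    centroids = np.stack([neural[pseudo == j].mean(0) for j in range(2)])
    def decode(batch):
        return ((batch[:, None] - centroids[None]) ** 2).sum(-1).argmin(1)
    return pseudo, decode
```

After training, the kinematic stream can be discarded and `decode` used on neural data alone, mirroring how the isolated decoders are evaluated per stream.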

     
  4. Optical coherence tomography (OCT) is an ideal imaging technique for noninvasive, longitudinal monitoring of multicellular tumor spheroids (MCTS). However, the internal structural features of MCTS visible in OCT images are still not fully utilized. In this study, we developed cross-statistical, cross-screening, and composite-hyperparameter feature processing methods in conjunction with 12 machine learning models to assess changes within the MCTS internal structure. Our results indicate that the effective features, combined with supervised learning models, successfully classify OVCAR-8 MCTS cultured with 5,000 and 50,000 cells, MCTS containing pancreatic tumor cells (Panc02-H7) cultured with fibroblast ratios of 0%, 33%, 50%, and 67%, and OVCAR-4 MCTS treated with 2-methoxyestradiol, AZD1208, and R-ketorolac at concentrations of 1, 10, and 25 µM. This approach holds promise for obtaining multi-dimensional physiological and functional evaluations in anticancer studies that pair OCT with MCTS.
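To make the "features plus supervised model" recipe concrete, here is a much-simplified NumPy sketch: a few intensity statistics stand in for the feature-processing pipeline, and a plain k-nearest-neighbour vote stands in for the 12 ML models. Everything here (function names, feature choices) is illustrative, not the published method.

```python
import numpy as np

def intensity_features(patch):
    """Toy statistical features from an OCT intensity patch (a stand-in for
    the paper's cross-statistical feature processing)."""
    v = patch.ravel().astype(float)
    return np.array([v.mean(), v.std(), np.median(v), (v > v.mean()).mean()])

def knn_predict(train_x, train_y, test_x, k=3):
    """Plain k-nearest-neighbour vote, one of many supervised models that
    could classify spheroid conditions from such features."""
    d = ((test_x[:, None] - train_x[None]) ** 2).sum(-1)
    idx = np.argsort(d, axis=1)[:, :k]       # indices of the k closest points
    votes = train_y[idx]                     # their class labels
    return np.array([np.bincount(row).argmax() for row in votes])
```

With well-separated conditions (e.g. low- versus high-scattering spheroids), even these crude features let the classifier recover the culture condition from a patch alone.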

     
  5. The use of audio and video modalities for Human Activity Recognition (HAR) is common, given the richness of the data and the availability of pre-trained ML models built on large corpora of labeled training data. However, audio and video sensors also raise significant consumer privacy concerns. Researchers have thus explored less privacy-invasive modalities such as mmWave Doppler radars, IMUs, and motion sensors. The key limitation of these approaches is that most do not readily generalize across environments and require significant in-situ training data. Recent work has proposed cross-modality transfer learning approaches to alleviate the lack of labeled training data, with some success. In this paper, we generalize this concept to create a novel system called VAX (Video/Audio to 'X'), where training labels acquired from existing Video/Audio ML models are used to train ML models for a wide range of 'X' privacy-sensitive sensors. Notably, in VAX, once the ML models for the privacy-sensitive sensors are trained, with little to no user involvement, the Audio/Video sensors can be removed altogether to better protect the user's privacy. We built and deployed VAX in ten participants' homes while they performed 17 common activities of daily living. Our evaluation results show that after training, VAX can use its onboard camera and microphone to detect approximately 15 out of 17 activities with an average accuracy of 90%. For the activities that can be detected using a camera and microphone, VAX trains a per-home model for the privacy-preserving sensors. These models (average accuracy = 84%) require no in-situ user input. In addition, when VAX is augmented with just one labeled instance for the activities not detected by the VAX A/V pipeline (~2 out of 17), it can detect all 17 activities with an average accuracy of 84%. Our results show that VAX is significantly better than a baseline supervised-learning approach that uses one labeled instance per activity in each home (average accuracy of 79%), since VAX reduces the user burden of providing activity labels by 8x (~2 labels vs. 17 labels).
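The core VAX training loop (use labels derived from an audio/video pipeline to fit a model on the privacy-preserving sensor stream, then remove the A/V sensors) can be sketched as follows. This is a toy per-class Gaussian scorer in NumPy; the real system's sensing pipeline and models are far richer, and every name here is an assumption.

```python
import numpy as np

def train_sensor_model(sensor_feats, av_labels, n_classes):
    """Fit per-activity Gaussians on privacy-preserving sensor features,
    using labels produced by an audio/video pipeline (VAX-style transfer).
    Once trained, the A/V sensors are no longer needed."""
    means = np.stack([sensor_feats[av_labels == c].mean(0) for c in range(n_classes)])
    stds = np.stack([sensor_feats[av_labels == c].std(0) + 1e-6 for c in range(n_classes)])
    def predict(feats):
        # Diagonal-Gaussian log-likelihood per class; pick the most likely activity
        ll = -(((feats[:, None] - means[None]) / stds[None]) ** 2).sum(-1) \
             - np.log(stds).sum(-1)[None]
        return ll.argmax(axis=1)
    return predict
```

The returned `predict` runs on the privacy-preserving sensor stream alone, which is what allows the camera and microphone to be removed after the bootstrap phase.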

     