This content will become publicly available on June 20, 2025

Title: Knee or ROC
Self-attention transformers have demonstrated accuracy for image classification with smaller data sets. However, a limitation is that tests to date are based upon single-class image detection with known representation of image populations. For instances where the input image classes may be greater than one and test sets lack full information on representation of image populations, accuracy calculations must adapt. The Receiver Operating Characteristic (ROC) accuracy threshold can address the instances of multi-class input images. However, this approach is unsuitable in instances where image population representation is unknown. We consider calculating accuracy using the knee method to determine threshold values on an ad hoc basis. Results of ROC curve and knee thresholds for a multi-class data set, created from CIFAR-10 images, are discussed for multi-class image detection.
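The contrast the abstract draws between the two thresholds can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how either threshold is computed, so the sketch assumes Youden's J statistic (maximum TPR minus FPR) for the ROC threshold and a "farthest point from the chord" rule for the knee. The function names and the ten per-class scores are hypothetical. The point it shows is that the ROC threshold needs ground-truth labels, while the knee threshold is read off the sorted scores alone.

```python
import numpy as np

def roc_youden_threshold(scores, labels):
    """ROC-based threshold (assumption: Youden's J = TPR - FPR is maximized).

    Requires ground-truth labels with at least one positive and one negative.
    A score is predicted positive when it is >= the returned threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_t, best_j = None, -np.inf
    for t in np.unique(scores):          # candidate thresholds, ascending
        pred = scores >= t
        tpr = (pred & labels).sum() / labels.sum()
        fpr = (pred & ~labels).sum() / (~labels).sum()
        if tpr - fpr > best_j:
            best_t, best_j = t, tpr - fpr
    return float(best_t)

def knee_threshold(scores):
    """Knee of the descending sorted-score curve; uses no labels.

    The knee is taken as the point farthest from the straight line joining
    the first and last points of the normalized curve. Scores strictly
    above the returned value are treated as detections.
    """
    s = np.sort(np.asarray(scores, dtype=float))[::-1]
    span = s[0] - s[-1]
    if span == 0:                        # flat curve: no knee to find
        return float(s[0])
    x = np.linspace(0.0, 1.0, len(s))    # normalized rank
    y = (s - s[-1]) / span               # normalized score, from 1 down to 0
    distance = np.abs(x + y - 1.0)       # proportional to distance from chord
    return float(s[np.argmax(distance)])

# Made-up confidences for one image over ten classes (three classes present).
scores = [0.91, 0.88, 0.84, 0.12, 0.09, 0.07, 0.05, 0.04, 0.03, 0.02]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

roc_t = roc_youden_threshold(scores, labels)   # needs labels
knee_t = knee_threshold(scores)                # needs scores only
detected = [s for s in scores if s > knee_t]
```

On this toy input both rules separate the three high-confidence classes from the rest, but only the knee rule could be applied if `labels` were unavailable.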
Award ID(s):
2131269
PAR ID:
10546088
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Springer LNCS
Date Published:
Subject(s) / Keyword(s):
Image Classification, Knee Threshold, ROC Curve, and Transformers.
Format(s):
Medium: X
Location:
Zakopane, Poland
Sponsoring Org:
National Science Foundation
More Like this
  1. Multi-instance learning (MIL) has demonstrated its usefulness in many real-world image applications in recent years. However, two critical challenges prevent one from effectively using MIL in practice. First, existing MIL methods routinely model the predictive targets using the instances of input images, but rarely utilize an input image as a whole. As a result, the useful information conveyed by the holistic representation of an input image could be potentially lost. Second, the varied numbers of the instances of the input images in a data set make it infeasible to use traditional learning models that can only deal with single-vector inputs. To tackle these two challenges, in this paper we propose a novel image representation learning method that can integrate the local patches (the instances) of an input image (the bag) and its holistic representation into one single-vector representation. Our new method first learns a projection to preserve both global and local consistencies of the instances of an input image. It then projects the holistic representation of the same image into the learned subspace for information enrichment. Taking into account the content and characterization variations in natural scenes and photos, we develop an objective that maximizes the ratio of the summations of a number of L1-norm distances, which is difficult to solve in general. To solve our objective, we derive a new efficient non-greedy iterative algorithm and rigorously prove its convergence. Promising results in extensive experiments have demonstrated improved performances of our new method that validate its effectiveness.
  2. In the medical sector, three-dimensional (3D) images such as computed tomography (CT) and magnetic resonance imaging (MRI) are commonly used. The 3D MRI is a non-invasive method of studying the soft-tissue structures in a knee joint for osteoarthritis studies. It can greatly improve the accuracy of segmenting structures such as cartilage, bone marrow lesion, and meniscus by identifying the bone structure first. U-net is a convolutional neural network that was originally designed to segment biological images with limited training data. The input of the original U-net is a single 2D image and the output is a binary 2D image. In this study, we modified the U-net model to identify the knee bone structures using 3D MRI, which is a sequence of 2D slices. A fully automatic model has been proposed to detect and segment knee bones. The proposed model was trained, tested, and validated using 99 knee MRI cases where each case consists of 160 2D slices for a single knee scan. To evaluate the model’s performance, the similarity, dice coefficient (DICE), and area error metrics were calculated. Separate models were trained using different knee bone components including tibia, femur, patella, as well as a combined model for segmenting all the knee bones. Using the whole MRI sequence (160 slices), the method was able to detect the beginning and ending bone slices first, and then segment the bone structures for all the slices in between. On the testing set, the detection model accomplished 98.79% accuracy and the segmentation model achieved DICE 96.94% and similarity 93.98%. The proposed method outperforms several state-of-the-art methods, i.e., it outperforms U-net by 3.68%, SegNet by 14.45%, and FCN-8 by 2.34%, in terms of DICE score using the same dataset.
  3. Instance detection (InsDet) is a long-lasting problem in robotics and computer vision, aiming to detect object instances (predefined by some visual examples) in a cluttered scene. Despite its practical significance, its advancement is overshadowed by Object Detection, which aims to detect objects belonging to some predefined classes. One major reason is that current InsDet datasets are too small in scale by today's standards. For example, the popular InsDet dataset GMU (published in 2016) has only 23 instances, far less than COCO (80 classes), a well-known object detection dataset published in 2014. We are motivated to introduce a new InsDet dataset and protocol. First, we define a realistic setup for InsDet: training data consists of multi-view instance captures, along with diverse scene images allowing synthesizing training images by pasting instance images on them with free box annotations. Second, we release a real-world database, which contains multi-view capture of 100 object instances, and high-resolution (6k × 8k) testing images. Third, we extensively study baseline methods for InsDet on our dataset, analyze their performance and suggest future work. Somewhat surprisingly, using the off-the-shelf class-agnostic segmentation model (Segment Anything Model, SAM) and the self-supervised feature representation DINOv2 performs the best, achieving >10 AP better than end-to-end trained InsDet models that repurpose object detectors (e.g., FasterRCNN and RetinaNet).
  4. Osteoarthritis (OA) is the most common form of arthritis and can often occur in the knee. While convolutional neural networks (CNNs) have been widely used to study medical images, the application of a 3-dimensional (3D) CNN in knee OA diagnosis is limited. This study utilizes a 3D CNN model to analyze sequences of knee magnetic resonance (MR) images to perform knee OA classification. An advantage of using 3D CNNs is the ability to analyze the whole sequence of 3D MR images as a single unit as opposed to a traditional 2D CNN, which examines one image at a time. Therefore, 3D features could be extracted from adjacent slices, which may not be detectable from a single 2D image. The input data for each knee were a sequence of double-echo steady-state (DESS) MR images, and each knee was labeled by the Kellgren and Lawrence (KL) grade of severity at levels 0–4. In addition to the 5-category KL grade classification, we further examined a 2-category classification that distinguishes non-OA (KL ≤ 1) from OA (KL ≥ 2) knees. Clinically, diagnosing a patient with knee OA is the ultimate goal of assigning a KL grade. On a dataset with 1100 knees, the 3D CNN model that classifies knees with and without OA achieved an accuracy of 86.5% on the validation set and 83.0% on the testing set. We further conducted a comparative study between MRI and X-ray. Compared with a CNN model using X-ray images trained from the same group of patients, the proposed 3D model with MR images achieved higher accuracy in both the 5-category classification (54.0% vs. 50.0%) and the 2-category classification (83.0% vs. 77.0%). The result indicates that MRI, with the application of a 3D CNN model, has greater potential to improve diagnosis accuracy for knee OA clinically than the currently used X-ray methods.
  5. Messinger, David W.; Velez-Reyes, Miguel (Ed.)
    Recently, multispectral and hyperspectral data fusion models based on deep learning have been proposed to generate images with a high spatial and spectral resolution. The general objective is to obtain images that improve spatial resolution while preserving high spectral content. In this work, two deep learning data fusion techniques are characterized in terms of classification accuracy. These methods fuse a high spatial resolution multispectral image with a lower spatial resolution hyperspectral image to generate a high spatial-spectral hyperspectral image. The first model is based on a multi-scale long short-term memory (LSTM) network. The LSTM approach performs the fusion using a multiple-step process that transitions from low to high spatial resolution using an intermediate step capable of reducing spatial information loss while preserving spectral content. The second fusion model is based on a convolutional neural network (CNN) data fusion approach. We present fused images using four multi-source datasets with different spatial and spectral resolutions. Both models provide fused images with spatial resolution improved from 8 m to 1 m. The fused images obtained with the two models are evaluated in terms of classification accuracy on several classifiers: Minimum Distance, Support Vector Machines, Class-Dependent Sparse Representation, and CNN classification. The classification results show better performance in both overall and average accuracy for the images generated with the multi-scale LSTM fusion than for those generated with the CNN fusion.