Title: Evaluating the Variable Stride Algorithm in the Identification of Diabetic Retinopathy
An experiment was performed to investigate a modified pooling method for use in convolutional neural networks for image recognition. This algorithm, Variable Stride, allows the user to segment an image and change the amount of subsampling in each region. This control lets the user retain more data in the more important regions of the image while subsampling the less important regions more aggressively to increase training speed. Three Variable Stride methods were compared to the preexisting pooling algorithms, Maximum Pool and Average Pool, in three different network configurations tasked with classifying Diabetic Retinopathy images as early or advanced stage. Each combination was run multiple times, and the AUC, Validation Loss, Validation Accuracy, and number of training epochs until convergence were collected for each run. Maximum Pool and Average Pool were both found to be superior to Variable Stride when deployed in these scenarios.
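The abstract does not say how Variable Stride is implemented, so the following is only a minimal sketch of the general idea, under stated assumptions: the image is split into vertical bands, each band is max-pooled along its width with its own stride, and the results are concatenated. The function names (`max_pool_cols`, `variable_stride_pool`), the band-based segmentation, and the width-only pooling are all hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np


def max_pool_cols(band, window, stride):
    """Max-pool a 2-D array along its columns (axis 1) only."""
    width = band.shape[1]
    if width < window:
        raise ValueError("band is narrower than the pooling window")
    starts = range(0, width - window + 1, stride)
    # One output column per window position.
    return np.stack([band[:, s:s + window].max(axis=1) for s in starts], axis=1)


def variable_stride_pool(image, boundaries, strides, window=2):
    """Pool vertical bands of `image`, each with its own horizontal stride.

    boundaries: column indices that split the image, e.g. [0, 4, 8, 12]
    strides:    one stride per band, e.g. [2, 1, 2]
                (a smaller stride keeps more samples in that band)
    """
    if len(strides) != len(boundaries) - 1:
        raise ValueError("need exactly one stride per band")
    pooled = [
        max_pool_cols(image[:, lo:hi], window, stride)
        for lo, hi, stride in zip(boundaries[:-1], boundaries[1:], strides)
    ]
    return np.concatenate(pooled, axis=1)


# Toy 2 x 12 "image": the middle band is treated as the important region
# (stride 1), the outer bands are subsampled more aggressively (stride 2).
image = np.arange(24).reshape(2, 12)
result = variable_stride_pool(image, boundaries=[0, 4, 8, 12], strides=[2, 1, 2])
print(result.shape)
print(result)
```

In this toy setup the middle band contributes three output columns while each outer band contributes two, which is the retention-versus-speed trade-off the abstract describes. A real network layer would also need to pool along the height and handle batches and channels.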
Award ID(s):
2050754
PAR ID:
10402463
Journal Name:
Beyond: Undergraduate Research Journal
Volume:
6
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Pantea, Casian (Ed.)
    Limited testing capacity for COVID-19 has hampered the pandemic response. Pooling is a testing method wherein samples from specimens (e.g., swabs) from multiple subjects are combined into a pool and screened with a single test. If the pool tests positive, then new samples from the collected specimens are individually tested, while if the pool tests negative, the subjects are classified as negative for the disease. Pooling can substantially expand COVID-19 testing capacity and throughput, without requiring additional resources. We develop a mathematical model to determine the best pool size for different risk groups, based on each group’s estimated COVID-19 prevalence. Our approach takes into consideration the sensitivity and specificity of the test, and a dynamic and uncertain prevalence, and provides a robust pool size for each group. For practical relevance, we also develop a companion COVID-19 pooling design tool (through a spreadsheet). To demonstrate the potential value of pooling, we study COVID-19 screening using testing data from Iceland for the period February 28, 2020 to June 14, 2020, for subjects stratified into high- and low-risk groups. We implement the robust pooling strategy within a sequential framework, which updates pool sizes each week, for each risk group, based on the prior week’s testing data. Robust pooling reduces the number of tests, over individual testing, by 88.5% to 90.2%, and 54.2% to 61.9%, respectively, for the low-risk and high-risk groups (based on test sensitivity values in the range [0.71, 0.98] as reported in the literature). This results in much shorter times, on average, to get the test results compared to individual testing (due to the higher testing throughput), and also allows for expanded screening to cover more individuals. Thus, robust pooling can potentially be a valuable strategy for COVID-19 screening.
  2. Managing large-scale systems often involves simultaneously solving thousands of unrelated stochastic optimization problems, each with limited data. Intuition suggests that one can decouple these unrelated problems and solve them separately without loss of generality. We propose a novel data-pooling algorithm called Shrunken-SAA that disproves this intuition. In particular, we prove that combining data across problems can outperform decoupling, even when there is no a priori structure linking the problems and data are drawn independently. Our approach does not require strong distributional assumptions and applies to constrained, possibly nonconvex, nonsmooth optimization problems such as vehicle-routing, economic lot-sizing, or facility location. We compare and contrast our results to a similar phenomenon in statistics (Stein’s phenomenon), highlighting unique features that arise in the optimization setting that are not present in estimation. We further prove that, as the number of problems grows large, Shrunken-SAA learns whether pooling can improve upon decoupling and the optimal amount to pool, even if the average amount of data per problem is fixed and bounded. Importantly, we present a simple intuition based on stability that explains when and why data pooling offers a benefit, elucidating this perhaps surprising phenomenon. This intuition further suggests that data pooling offers the most benefits when there are many problems, each of which has a small amount of relevant data. Finally, we demonstrate the practical benefits of data pooling using real data from a chain of retail drug stores in the context of inventory management. This paper was accepted by Chung Piaw Teo, Special Issue on Data-Driven Prescriptive Analytics.
  3. Deep learning models have demonstrated significant advantages over traditional algorithms in image processing tasks like object detection. However, a large amount of data is needed to train such deep networks, which limits their application to tasks such as biometric recognition that require more training samples for each class (i.e., each individual). Researchers developing such complex systems rely on real biometric data, which raises privacy concerns and is restricted by the availability of extensive, varied datasets. This paper proposes a generative adversarial network (GAN)-based solution to produce training data (palm images) for improved biometric (palmprint-based) recognition systems. We investigate the performance of the most recent StyleGAN models in generating a thorough contactless palm image dataset for application in biometric research. Training on publicly available H-PolyU and IIDT palmprint databases, a total of 4839 images were generated using StyleGAN models. SIFT (Scale-Invariant Feature Transform) was used to find uniqueness and features at different sizes and angles, which showed a similarity score of 16.12% with the most recent StyleGAN3-based model. For the regions of interest (ROIs) in both the palm and finger, the average similarity scores were 17.85%. We present the Frechet Inception Distance (FID) of the proposed model, which achieved a score of 16.1, demonstrating significant performance. These results demonstrated StyleGAN as effective in producing unique synthetic biometric images.
  4. Passive prostheses cannot provide the net positive work required at the knee and ankle for step-over stair ascent. Powered prostheses can provide this net positive work, but user synchronization of joint motion and power input are critical to enabling natural stair ascent gaits. In this work, we build on previous phase variable-based control methods for walking and propose a stair ascent controller driven by the motion of the user's residual thigh. We use reference kinematics from an able-bodied dataset to produce knee and ankle joint trajectories parameterized by gait phase. We redefine the gait cycle to begin at the point of maximum hip flexion instead of heel strike to improve the phase estimate. Able-bodied bypass adapter experiments demonstrate that the phase variable controller replicates normative able-bodied kinematic trajectories with a root mean squared error of 12.66 deg and 2.64 deg for the knee and ankle, respectively. The knee and ankle joints provided on average 0.387 J/kg and 0.212 J/kg per stride, compared to the normative averages of 0.335 J/kg and 0.207 J/kg, respectively. Thus, this controller allows powered knee-ankle prostheses to perform net positive mechanical work to assist stair ascent.
  5. The spatial distribution of forest stands is one of the fundamental properties of forests. Timely and accurately obtained stand distribution can help people better understand, manage, and utilize forests. The development of remote sensing technology has made it possible to map the distribution of tree species in a timely and accurate manner. At present, a large amount of remote sensing data have been accumulated, including high-spatial-resolution images, time-series images, light detection and ranging (LiDAR) data, etc. However, these data have not been fully utilized. To accurately identify the tree species of forest stands, various and complementary data need to be synthesized for classification. A curve matching based method called the fusion of spectral image and point data (FSP) algorithm was developed to fuse high-spatial-resolution images, time-series images, and LiDAR data for forest stand classification. In this method, the multispectral Sentinel-2 image and high-spatial-resolution aerial images were first fused. Then, the fused images were segmented to derive forest stands, which are the basic unit for classification. To extract features from forest stands, the gray histogram of each band was extracted from the aerial images. The average reflectance in each stand was calculated and stacked for the time-series images. The profile curve of forest structure was generated from the LiDAR data. Finally, the features of forest stands were compared with training samples using curve matching methods to derive the tree species. The developed method was tested in a forest farm to classify 11 tree species. The average accuracy of the FSP method for ten performances was between 0.900 and 0.913, and the maximum accuracy was 0.945. The experiments demonstrate that the FSP method is more accurate and stable than traditional machine learning classification methods. 