In this project, competition-winning deep neural networks with pretrained weights are used for image-based gender recognition and age estimation. Transfer learning is explored using both VGG19 and VGGFace pretrained models by testing the effect of changes to various design schemes and training parameters in order to improve prediction accuracy. Training techniques such as input standardization, data augmentation, and label distribution age encoding are compared. Finally, a hierarchy of deep CNNs is tested that first classifies subjects by gender and then uses separate male and female age models to predict age. A gender recognition accuracy of 98.7% and an MAE of 4.1 years are achieved. This paper shows that, with proper training techniques, good results can be obtained by retasking existing convolutional filters towards a new purpose.
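The paper does not spell out its exact label distribution age encoding; a minimal sketch of one common scheme, a Gaussian distribution over integer age bins (the bin count and sigma below are illustrative assumptions), might look like:

```python
import numpy as np

def age_to_label_distribution(age, num_bins=101, sigma=2.0):
    """Encode a scalar age as a Gaussian distribution over integer age bins.

    Soft targets spread probability mass over neighboring ages, which
    typically trains more stably than a one-hot age class.
    """
    bins = np.arange(num_bins)
    dist = np.exp(-0.5 * ((bins - age) / sigma) ** 2)
    return dist / dist.sum()

# At inference time, the predicted age is the expectation over bins:
dist = age_to_label_distribution(30)
expected_age = float(np.dot(np.arange(101), dist))
```

The network is then trained against the soft distribution (e.g. with a KL or cross-entropy loss) rather than a single age class.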
Deep Learning Detection and Recognition of Spot Elevations on Historical Topographic Maps
Some information contained in historical topographic maps has yet to be captured digitally, which limits the ability to automatically query such data. For example, the U.S. Geological Survey's historical topographic map collection (HTMC) displays millions of spot elevations at locations that were carefully chosen to best represent the terrain at the time. Although research has attempted to reproduce these data points, it has proven inadequate to automatically detect and recognize spot elevations in the HTMC. We propose a deep learning workflow pretrained using large benchmark text datasets. To these datasets we add manually crafted training image/label pairs, and test how many are required to improve prediction accuracy. We find that the initial model, pretrained solely with benchmark data, fails to predict any HTMC spot elevations correctly, whereas the addition of just 50 custom image/label pairs increases the predictive ability by ∼50%, and the inclusion of 350 data pairs increases performance by ∼80%. Data augmentation in the form of rotation, scaling, and translation (offset) expanded the size and diversity of the training dataset and vastly improved recognition accuracy, up to ∼95%. Visualization methods, such as heat map generation and salient feature detection, can be used to better understand why some predictions fail.
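The rotation/scaling/translation augmentation described above can be sketched as random sampling of affine transform parameters; the ranges below are illustrative assumptions, not the paper's actual settings:

```python
import random
import numpy as np

def sample_affine_params(max_rotation=15.0, scale_range=(0.9, 1.1), max_offset=8):
    """Sample random rotation (degrees), scale, and pixel offset for one
    augmented copy of a training image."""
    angle = random.uniform(-max_rotation, max_rotation)
    scale = random.uniform(*scale_range)
    dx = random.randint(-max_offset, max_offset)
    dy = random.randint(-max_offset, max_offset)
    return angle, scale, (dx, dy)

def affine_matrix(angle_deg, scale, offset):
    """Build the 2x3 affine matrix (rotation + scale + translation) that an
    image library can apply to warp the image."""
    theta = np.deg2rad(angle_deg)
    c, s = scale * np.cos(theta), scale * np.sin(theta)
    dx, dy = offset
    return np.array([[c, -s, dx],
                     [s,  c, dy]])
```

Each original image/label pair then yields many warped variants, expanding both the size and the diversity of the training set.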
- Award ID(s):
- 1853864
- PAR ID:
- 10344345
- Date Published:
- Journal Name:
- Frontiers in Environmental Science
- Volume:
- 10
- ISSN:
- 2296-665X
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Image restoration aims to recover a clean image from a noisy one. It has long been a topic of interest for researchers in imaging, optical science, and computer vision, and the problem becomes more challenging as the imaging environment deteriorates. Several computational approaches, ranging from statistical methods to deep learning, have been proposed over the years. Deep learning-based approaches provide promising restoration results, but they are purely data driven, and their need for large (paired or unpaired) training datasets can limit their utility for certain physical problems. Recently, physics-informed image restoration techniques have gained importance due to their ability to enhance performance, to reveal something about the degradation process, and to quantify the uncertainty in the prediction results. In this paper, we propose a physics-informed deep learning approach with simultaneous parameter estimation using 3D integral imaging and a Bayesian neural network (BNN). An image-to-image mapping architecture is first pretrained to generate a clean image from the degraded input, and is then trained jointly with the BNN to estimate the degradation parameters. For network training, data simulated using the physical model is used instead of actual degraded data. The proposed approach has been tested experimentally under degradations such as low illumination and partial occlusion, and the recovery results are promising despite training on a simulated dataset. We have tested the performance of the approach under varying illumination levels, and have also compared it against the corresponding 2D imaging-based approach; the results show significant improvements over 2D even when training on similar datasets.
The parameter estimation results further demonstrate the utility of the approach in estimating the degradation parameter, in addition to image restoration, under the experimental conditions considered.
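The uncertainty quantification that motivates the BNN above is commonly estimated by Monte Carlo sampling over stochastic forward passes. A minimal sketch of that idea, with a hypothetical dropout-style stand-in for the network (not the paper's actual model), might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_forward(x, drop_prob=0.2):
    """Stand-in for one stochastic forward pass of a Bayesian network:
    here, inverted dropout applied to the input (a hypothetical model)."""
    mask = rng.random(x.shape) > drop_prob
    return (x * mask) / (1.0 - drop_prob)

def predict_with_uncertainty(x, n_samples=200):
    """Monte Carlo predictive mean and variance from repeated stochastic
    passes; the variance serves as a per-pixel uncertainty estimate."""
    samples = np.stack([stochastic_forward(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.var(axis=0)
```

In a real BNN, `stochastic_forward` would draw network weights from their posterior; the aggregation step is the same.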
Finding Friends and Flipping Frenemies: Automatic Paraphrase Dataset Augmentation Using Graph Theory
Most NLP datasets are manually labeled, and so suffer from inconsistent labeling or limited size. We propose methods for automatically improving datasets by viewing them as graphs with expected semantic properties. We construct a paraphrase graph from the provided sentence pair labels, and create an augmented dataset by directly inferring labels from the original sentence pairs using a transitivity property. We use structural balance theory to identify likely mislabelings in the graph, and flip their labels. We evaluate our methods on paraphrase models trained using these datasets starting from a pretrained BERT model, and find that the automatically enhanced training sets result in more accurate models.
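The transitivity step described above (if A paraphrases B and B paraphrases C, then A paraphrases C) can be sketched as finding the connected components of the positive-edge subgraph; this sketch covers only that step, not the structural-balance label flipping, and the data representation is an assumption:

```python
from itertools import combinations

def augment_paraphrase_pairs(labeled_pairs):
    """Infer new paraphrase labels by transitivity.

    labeled_pairs: dict mapping (sent_a, sent_b) -> 1 (paraphrase) / 0 (not).
    Returns the set of unlabeled pairs implied positive by transitivity.
    """
    # Union-find over sentences connected by positive (paraphrase) edges.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for (a, b), label in labeled_pairs.items():
        if label == 1:
            union(a, b)

    # Every unlabeled pair inside one positive component is a new paraphrase.
    groups = {}
    for (a, b) in labeled_pairs:
        for s in (a, b):
            groups.setdefault(find(s), set()).add(s)
    new_pairs = set()
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            if (a, b) not in labeled_pairs and (b, a) not in labeled_pairs:
                new_pairs.add((a, b))
    return new_pairs
```

For example, labeling (A, B) and (B, C) as paraphrases yields the new inferred pair (A, C).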
Vision-language models are integral to computer vision research, yet many high-performing models remain closed-source, obscuring their data, design, and training recipe. The research community has responded by using distillation from black-box models to label training data, achieving strong benchmark results; however, without knowing the details of the teacher model and its data sources, scientific progress remains difficult to measure. In this paper, we study building a Perception Language Model (PLM) in a fully open and reproducible framework for transparent research in image and video understanding. We analyze standard training pipelines without distillation from proprietary models and explore large-scale synthetic data to identify critical data gaps, particularly in detailed video understanding. To bridge these gaps, we release 2.8M human-labeled instances of fine-grained video question-answer pairs and spatio-temporally grounded video captions. Additionally, we introduce PLM-VideoBench, a suite for evaluating challenging video understanding tasks focusing on the ability to reason about the "what", "where", "when", and "how" of a video. We make our work fully reproducible by providing data, training recipes, code, and models.
Proc. 2023 Int. Conf. on Machine Learning
Recent studies have revealed the intriguing few-shot learning ability of pretrained language models (PLMs): they can quickly adapt to a new task when fine-tuned on a small amount of labeled data formulated as prompts, without requiring abundant task-specific annotations. Despite their promising performance, most existing few-shot approaches that learn only from the small training set still underperform fully supervised training by nontrivial margins. In this work, we study few-shot learning with PLMs from a different perspective: we first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples that augment the original training set. To encourage the generator to produce label-discriminative samples, we train it via weighted maximum likelihood, where the weight of each token is automatically adjusted based on a discriminative meta-learning objective. A classification PLM can then be fine-tuned on both the few-shot and the synthetic samples with regularization for better generalization and stability. Our approach, FewGen, achieves an overall better result across seven classification tasks of the GLUE benchmark than existing few-shot learning methods, improving no-augmentation methods by 5+ average points and outperforming augmentation methods by 3+ average points.
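The weighted maximum likelihood objective described above reduces to scaling each token's negative log-likelihood by a per-token weight before averaging; a minimal sketch (the weights here are inputs, standing in for the meta-learned weights of the paper) might look like:

```python
import numpy as np

def weighted_nll(token_log_probs, token_weights):
    """Weighted maximum likelihood objective: each token's negative
    log-likelihood is scaled by its weight, then normalized by the
    total weight. Uniform weights recover the standard mean NLL."""
    lp = np.asarray(token_log_probs, dtype=float)
    w = np.asarray(token_weights, dtype=float)
    return float(-(w * lp).sum() / w.sum())
```

Upweighting label-discriminative tokens steers the generator toward samples whose class identity is easy for a downstream classifier to learn.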