Given raster imagery features and imperfect vector training labels with registration uncertainty, this paper studies a deep learning framework that can quantify and reduce the registration uncertainty of training labels while simultaneously training neural network parameters. The problem is important in broad applications such as streamline classification on Earth imagery or tissue segmentation on medical imagery, where annotating precise vector labels is expensive and time-consuming. However, the problem is challenging due to the gap between the vector representation of class labels and the raster representation of image features, and due to the need to train neural networks with uncertain label locations. Existing research on uncertain training labels often focuses on uncertainty in label class semantics or characterizes label registration uncertainty at the pixel level (not contiguous vectors). To fill the gap, this paper proposes a novel learning framework that explicitly quantifies the registration uncertainty of vector labels. We propose a registration-uncertainty-aware loss function and design an iterative uncertainty reduction algorithm that re-estimates the posterior distribution of true vector label locations based on a Gaussian process. Evaluations on real-world datasets for National Hydrography Dataset refinement show that the proposed approach significantly outperforms several baselines in both registration uncertainty estimation and classification performance.
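As a rough illustration of the Gaussian-process re-estimation step mentioned above, the sketch below smooths noisy per-vertex registration offsets along a vector label (a polyline) with a GP and shifts the vertices by the posterior mean; the remaining posterior standard deviation acts as the registration uncertainty. This is a minimal sketch under assumed inputs (`polyline`, `noisy_offsets`), not the paper's algorithm.

```python
# Minimal sketch: smooth per-vertex registration offsets of a vector label
# (polyline) with a Gaussian process, then shift the vertices by the posterior
# mean. Inputs and hyperparameters are hypothetical, not the paper's.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
polyline = np.cumsum(rng.normal(size=(50, 2)), axis=0)   # vertices of a vector label
arc_length = np.linalg.norm(np.diff(polyline, axis=0), axis=1).cumsum()
arc_length = np.insert(arc_length, 0, 0.0).reshape(-1, 1)

# Hypothetical noisy estimates of the perpendicular offset (registration error)
# at each vertex, e.g. obtained by matching the label against the imagery.
noisy_offsets = np.sin(arc_length.ravel() / 5.0) + rng.normal(scale=0.3, size=50)

# GP posterior over the true offset as a function of arc length: the RBF kernel
# enforces smoothness along the polyline, WhiteKernel absorbs per-vertex noise.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=5.0) + WhiteKernel(0.1),
                              normalize_y=True)
gp.fit(arc_length, noisy_offsets)
mean_offset, std_offset = gp.predict(arc_length, return_std=True)

# Shift each vertex along its unit normal by the posterior mean offset;
# std_offset quantifies the remaining registration uncertainty.
tangents = np.gradient(polyline, axis=0)
normals = np.column_stack([-tangents[:, 1], tangents[:, 0]])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
refined_polyline = polyline + mean_offset[:, None] * normals
```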
                            Sparse Coding for Classification Via a Locality Regularizer: with Applications to Agriculture
                        
                    
    
            High-dimensional data is commonly encountered in various applications, including genomics, as well as image and video processing. Analyzing, computing, and visualizing such data pose significant challenges. Feature extraction methods are crucial in addressing these challenges by obtaining compressed representations that are suitable for analysis and downstream tasks. One effective technique along these lines is sparse coding, which involves representing data as a sparse linear combination of a set of exemplars. In this study, we propose a local sparse coding framework within the context of a classification problem. The objective is to predict the label of a given data point based on labeled training data. The primary optimization problem encourages the representation of each data point using nearby exemplars. We leverage the optimized sparse representation coefficients to predict the label of a test data point by assessing its similarity to the sparse representations of the training data. The proposed framework is computationally efficient and provides interpretable sparse representations. To illustrate the practicality of our proposed framework, we apply it to agriculture for the classification of crop diseases. 
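As a rough sketch of the locality-regularized sparse coding idea, the code below codes a point against a dictionary of exemplars with an l1 penalty whose per-atom weights grow with distance to the point, then assigns the class whose exemplars carry the most coefficient mass. The weighting scheme and the per-class decision rule are illustrative simplifications of the framework described in the abstract, not its exact formulation.

```python
# Minimal sketch of locality-regularized sparse coding for classification.
# Locality weights and the decision rule are illustrative choices.
import numpy as np
from sklearn.linear_model import Lasso

def local_sparse_code(x, D, lam=0.05, sigma=1.0):
    """Solve min_a ||x - D a||^2 + lam * sum_j w_j |a_j|,
    with w_j = exp(||x - d_j|| / sigma) penalizing far-away exemplars."""
    dists = np.linalg.norm(D - x[:, None], axis=0)
    w = np.exp(dists / sigma)
    # Weighted l1 via column rescaling: substitute b_j = w_j * a_j.
    D_scaled = D / w                        # scales each column d_j by 1 / w_j
    lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
    lasso.fit(D_scaled, x)
    return lasso.coef_ / w                  # recover a from b

def predict(x, D, labels, **kw):
    """Assign the class whose exemplars carry the most coefficient mass."""
    a = local_sparse_code(x, D, **kw)
    classes = np.unique(labels)
    scores = [np.abs(a[labels == c]).sum() for c in classes]
    return classes[int(np.argmax(scores))]

# Toy usage: exemplars are the columns of D, one label per exemplar.
rng = np.random.default_rng(1)
D = rng.normal(size=(20, 40))               # 40 exemplars in 20 dimensions
labels = np.repeat([0, 1], 20)
x = D[:, 3] + 0.05 * rng.normal(size=20)    # noisy copy of a class-0 exemplar
print(predict(x, D, labels))                # expected output: 0
```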
- Award ID(s): 2208392
- PAR ID: 10532471
- Publisher / Repository: Proceedings of the 16th International Conference on Precision Agriculture
- Date Published:
- Subject(s) / Keyword(s): Classification, Sparse coding, Crop disease identification, Feature extraction
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Since the cost of labeling data keeps rising, we aim to make full use of the large amount of unlabeled data and to improve image classification by adding unlabeled samples to the training set. In addition, we aim to handle two tasks in a unified way: clustering the unlabeled data and recognizing the query image. We achieve this goal by designing a novel sparse model based on the manifold assumption, which has proved to work well in many tasks. Based on the assumptions that images of the same class lie on a sub-manifold and that an image can be approximately represented as a linear combination of its neighboring data due to the locally linear property of the manifold, we propose a sparse representation model on the manifold. Specifically, there are two regularizers: a variant of the trace lasso norm and a manifold Laplacian regularization. The first term encourages representation coefficients that are sparse between groups and dense within a group. The second term is the manifold Laplacian regularization, by which labels can be accurately propagated from labeled to unlabeled data (a minimal sketch of this label-propagation step appears after this list). An Augmented Lagrange Multiplier (ALM) scheme and a Gauss-Seidel Alternating Direction Method of Multipliers (GS-ADMM) are given to solve the problem numerically. We conduct experiments on three human face databases and compare the proposed work with several state-of-the-art methods. For each subject, some labeled face images are randomly chosen to train the supervised methods, and a small number of unlabeled images are added to form the training set of the proposed approach. All experiments show that our method obtains better classification results due to the addition of unlabeled samples.
- Automatic coding of the International Classification of Diseases (ICD) is a multi-label text categorization task that involves extracting disease or procedure codes from clinical notes. Despite the application of state-of-the-art natural language processing (NLP) techniques, challenges remain, including the limited availability of data due to privacy constraints and the high variability of clinical notes caused by the different writing habits of medical professionals and the varied pathological features of patients. In this work, we investigate the semi-structured nature of clinical notes and propose an automatic algorithm to segment them into sections. To address the variability issues in existing ICD coding models with limited data, we introduce a contrastive pre-training approach on sections using a soft multi-label similarity metric based on tree edit distance (a sketch of a soft-target contrastive loss appears after this list). Additionally, we design a masked section training strategy to enable ICD coding models to locate sections related to ICD codes. Extensive experimental results demonstrate that our proposed training strategies effectively enhance the performance of existing ICD coding methods.
- Sleep staging is a key challenge in diagnosing and treating sleep-related diseases because it is labor-intensive, time-consuming, costly, and error-prone. With the availability of large-scale sleep signal data, many deep learning methods have been proposed for automatic sleep staging. However, these existing methods face several challenges, including the heterogeneity of patients' underlying health conditions and the difficulty of modeling complex interactions between sleep stages. In this paper, we propose a neural network architecture named DREAM to tackle these issues for automatic sleep staging. DREAM consists of (i) a feature representation network that generates robust representations for sleep signals via the variational auto-encoder framework and contrastive learning and (ii) a sleep stage classification network that explicitly models the interactions between sleep stages in the sequential context at both the feature representation and label classification levels via Transformer and conditional random field architectures (a sequence-model sketch appears after this list). Our experimental results indicate that DREAM significantly outperforms existing methods for automatic sleep staging on three sleep signal datasets.
- Discriminative features extracted from the sparse coding model have been shown to perform well for classification. Recent deep learning architectures have further improved reconstruction in inverse problems by considering new dense priors learned from data. We propose a novel dense and sparse coding model that integrates both representation capability and discriminative features. The model studies the problem of recovering a dense vector x and a sparse vector u given measurements of the form y = Ax + Bu. Our first analysis relies on a geometric condition, specifically the minimal angle between the spanning subspaces of matrices A and B, which ensures a unique solution to the model. The second analysis shows that, under some conditions on A and B, a convex program recovers the dense and sparse components (a recovery sketch appears after this list). We validate the effectiveness of the model on simulated data and propose a dense and sparse autoencoder (DenSaE) tailored to learning the dictionaries from the dense and sparse model. We demonstrate that (i) DenSaE denoises natural images better than architectures derived from the sparse coding model (Bu), (ii) in the presence of noise, training the biases in the latter amounts to implicitly learning the Ax + Bu model, (iii) A and B capture low- and high-frequency contents, respectively, and (iv) compared to the sparse coding model, DenSaE offers a balance between discriminative power and representation.
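For the semi-supervised face classification work above, the sketch below illustrates only the manifold-Laplacian ingredient: propagating labels from labeled to unlabeled points over a kNN graph (a Zhou-style iteration). It is a minimal illustration under assumed inputs; the trace-lasso term and the ALM/GS-ADMM solver are omitted.

```python
# Minimal sketch of label propagation over a kNN graph (manifold Laplacian idea).
import numpy as np
from sklearn.neighbors import kneighbors_graph

def propagate_labels(X, y, labeled_mask, k=10, alpha=0.99, iters=50):
    """X: (n, d) data; y: (n,) integer labels (ignored where labeled_mask is False)."""
    W = kneighbors_graph(X, k, mode="connectivity", include_self=False)
    W = 0.5 * (W + W.T)                        # symmetrize the graph
    d = np.asarray(W.sum(axis=1)).ravel()
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W.toarray() @ D_inv_sqrt  # normalized affinity matrix

    classes = np.unique(y[labeled_mask])
    Y = np.zeros((len(X), len(classes)))       # one-hot targets for labeled points
    for j, c in enumerate(classes):
        Y[(y == c) & labeled_mask, j] = 1.0

    F = Y.copy()
    for _ in range(iters):                     # F <- alpha * S @ F + (1 - alpha) * Y
        F = alpha * (S @ F) + (1 - alpha) * Y
    return classes[F.argmax(axis=1)]           # predicted label for every point
```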
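For the ICD coding work above, the sketch below shows a generic soft-target contrastive loss of the kind described: each section embedding is pulled toward sections with high similarity scores. The tree-edit-distance similarity itself is not implemented here; `soft_sim` is assumed to be given.

```python
# Minimal sketch of a contrastive loss with soft (multi-label) targets.
import numpy as np

def soft_contrastive_loss(Z, soft_sim, tau=0.1):
    """Z: (n, d) section embeddings; soft_sim: (n, n) similarity scores >= 0."""
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)       # cosine geometry
    logits = (Z @ Z.T) / tau
    np.fill_diagonal(logits, -np.inf)                      # exclude self-pairs
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    targets = soft_sim.astype(float).copy()
    np.fill_diagonal(targets, 0.0)                         # no self-targets
    targets = targets / np.maximum(targets.sum(axis=1, keepdims=True), 1e-12)

    # Soft cross-entropy: pull each section toward its similar sections.
    log_p = np.where(np.isfinite(log_p), log_p, 0.0)
    return float(-(targets * log_p).sum(axis=1).mean())
```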
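For the sleep staging work above, the sketch below captures only the sequence-level idea: a Transformer encoder contextualizes per-epoch features before a per-epoch stage classifier. The VAE/contrastive feature network and the CRF layer are omitted, and all shapes and hyperparameters are illustrative, not DREAM's.

```python
# Minimal sketch: Transformer over a sequence of per-epoch sleep features.
import torch
import torch.nn as nn

class SleepSequenceClassifier(nn.Module):
    def __init__(self, feat_dim=128, n_stages=5, n_layers=2, n_heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(feat_dim, n_stages)

    def forward(self, epoch_feats):             # (batch, seq_len, feat_dim)
        ctx = self.encoder(epoch_feats)          # contextualized epoch features
        return self.head(ctx)                    # (batch, seq_len, n_stages) logits

# Toy usage: 8 recordings, sequences of 20 sleep epochs, 128-d features each.
model = SleepSequenceClassifier()
logits = model(torch.randn(8, 20, 128))
stages = logits.argmax(dim=-1)                   # predicted stage per epoch
```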
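For the dense and sparse coding work above, the sketch below recovers x and u from y = Ax + Bu by projecting out range(A) and solving an l1-penalized fit for u; this is one simple convex-programming route, not necessarily the exact program analyzed in the paper.

```python
# Minimal sketch: recover a dense x and a sparse u from y = A x + B u.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
m, dx, du = 100, 10, 200
A = rng.normal(size=(m, dx))
B = rng.normal(size=(m, du))
x_true = rng.normal(size=dx)                      # dense component
u_true = np.zeros(du)                             # sparse component
u_true[rng.choice(du, size=5, replace=False)] = 3.0
y = A @ x_true + B @ u_true + 0.01 * rng.normal(size=m)

# Project out range(A): P @ A = 0, so P @ y ~= (P @ B) @ u, a lasso in u alone.
P = np.eye(m) - A @ np.linalg.pinv(A)
lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
lasso.fit(P @ B, P @ y)
u_hat = lasso.coef_

# Recover the dense component by least squares on the residual.
x_hat, *_ = np.linalg.lstsq(A, y - B @ u_hat, rcond=None)
print(np.linalg.norm(x_hat - x_true), np.linalg.norm(u_hat - u_true))
```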