Title: Robust Text Classifier on Test-Time Budgets
We design a generic framework for learning a robust text classification model that achieves high accuracy under different selection budgets (a.k.a. selection rates) at test time. We take a different approach from existing methods and learn to dynamically filter out a large fraction of unimportant words with a low-complexity selector, so that any high-complexity state-of-the-art classifier only needs to process the small fraction of text that is relevant to the target task. To this end, we propose a data aggregation method to train the classifier, allowing it to achieve competitive performance on fractured sentences. On four benchmark text classification tasks, we demonstrate that the framework gains consistent speedup with little degradation in accuracy on various selection budgets.
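The selector-plus-aggregation idea in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' implementation: the per-word score table, the function names `select_words` and `aggregate_training_data`, and the top-k selection rule are all assumptions made for illustration. The selector keeps roughly a `budget` fraction of the words, and the aggregated training set pairs each label with the full text and with its filtered versions so that a downstream classifier sees fractured sentences during training.

```python
from typing import Dict, List, Sequence, Tuple

Example = Tuple[List[str], str]  # (tokens, label)


def select_words(tokens: Sequence[str], scores: Dict[str, float],
                 budget: float) -> List[str]:
    """Keep about `budget` (0 < budget <= 1) of the tokens, in original order.

    `scores` is a hypothetical word-importance table standing in for a
    learned low-complexity selector; unknown words score 0.0.
    """
    if not tokens:
        return []
    k = max(1, int(budget * len(tokens)))  # always keep at least one word
    # Rank positions by descending importance; sorted() is stable, so ties
    # keep their left-to-right order.
    ranked = sorted(range(len(tokens)),
                    key=lambda i: -scores.get(tokens[i], 0.0))
    kept = sorted(ranked[:k])  # restore original word order
    return [tokens[i] for i in kept]


def aggregate_training_data(dataset: Sequence[Example],
                            scores: Dict[str, float],
                            budgets: Sequence[float]) -> List[Example]:
    """Build a training set from the full texts plus one filtered copy
    per budget (an assumed form of the data aggregation step)."""
    aggregated: List[Example] = []
    for tokens, label in dataset:
        aggregated.append((list(tokens), label))
        for budget in budgets:
            aggregated.append((select_words(tokens, scores, budget), label))
    return aggregated


if __name__ == "__main__":
    toy_scores = {"wonderful": 0.9, "truly": 0.6, "movie": 0.4}  # made up
    sentence = ["the", "movie", "was", "truly", "wonderful", "overall"]
    print(select_words(sentence, toy_scores, budget=0.5))
    print(len(aggregate_training_data([(sentence, "pos")], toy_scores,
                                      budgets=[0.25, 0.5])))
```

At test time, only the output of `select_words` would be passed to the expensive classifier, so lowering `budget` trades accuracy for speed.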
Award ID(s):
1760523
PAR ID:
10144860
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Page Range / eLocation ID:
1167-1172
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The Local Climate Zone (LCZ) classification is already widely used in urban heat island and other climate studies. The current classification method does not incorporate crucial urban auxiliary GIS data on building height and imperviousness that could significantly improve the utility and accuracy of urban-type LCZ classification. This study utilized a hybrid GIS- and remote sensing imagery-based framework to systematically compare and evaluate different machine and deep learning methods. The Convolutional Neural Network (CNN) classifier achieves the highest accuracy, but it requires multi-pixel input, which reduces the output’s spatial resolution and creates a tradeoff between accuracy and spatial resolution. The Random Forest (RF) classifier performs best among the single-pixel classifiers. This study also shows that incorporating a building height dataset improves the accuracy of the high- and mid-rise classes in the RF classifiers, whereas an imperviousness dataset improves the low-rise classes. The single-pass forward permutation test reveals that both auxiliary datasets dominate the classification accuracy in the RF classifier, while near-infrared and thermal infrared are the dominating features in the CNN classifier. These findings show that the conventional LCZ classification framework used in the World Urban Database and Access Portal Tools (WUDAPT) can be improved by adopting building height and imperviousness information. This framework can be easily applied to different cities to generate LCZ maps for urban models.
  2. End-to-end spoken language understanding (SLU) systems are typically trained on large amounts of data. In many practical scenarios, labeled speech is limited compared with text. In this study, we investigate the use of non-parallel speech and text to improve the performance of dialog act recognition as an example SLU task. We propose a multiview architecture that can handle each modality separately. To effectively train on such data, this model enforces the internal speech and text encodings to be similar using a shared classifier. On the Switchboard Dialog Act corpus, we show that pretraining the classifier using large amounts of text helps learn better speech encodings, resulting in classification accuracies up to 40% higher in relative terms. We also show that when the speech embeddings from an automatic speech recognition (ASR) system are used in this framework, the speech-only accuracy exceeds the performance of ASR-text based tests by up to 15% relative and approaches the performance of using true transcripts.
  3. In deep learning (DL) based human activity recognition (HAR), sensor selection seeks to balance prediction accuracy and sensor utilization (how often a sensor is used). With advances in on-device inference, sensors have become tightly integrated with DL, often restricting access to the underlying model used. Given only sensor predictions, how can we derive a selection policy that classifies efficiently while maximizing accuracy? We propose a cascaded inference approach which, given the prediction of any one sensor, determines whether to query all other sensors. Typically, cascades use a sequence of classifiers and terminate once the confidence of a classifier exceeds a threshold. However, a threshold-based policy for sensor selection may be suboptimal; we define a more general class of policies which can surpass the threshold policy. We extend to settings where little or no labeled data is available for tuning the policy. Our analysis is validated on three HAR datasets by improving upon the F1-score of a threshold policy across several utilization budgets. Overall, our work enables practical analytics for HAR by relaxing the requirement of labeled data for sensor selection and reducing sensor utilization to directly extend a sensor system’s lifetime.
  4. Large scientific institutions, such as the Space Telescope Science Institute, track the usage of their facilities to understand the needs of the research community. Astrophysicists incorporate facility usage data into their scientific publications, embedding this information in plain text. Traditional automatic search queries prove unreliable for accurate tracking because facility names are misidentified in plain text. When automatic search queries fail, researchers must manually classify publications for facility usage, which consumes valuable research time. In this work, we introduce a machine learning classification framework for the automatic identification of facility usage of observation sections in astrophysics publications. Our framework identifies sentences containing telescope mission keywords (e.g., Kepler and TESS) in each publication. Subsequently, the identified sentences are transformed using term frequency–inverse document frequency and classified with a support vector machine. The classification framework leverages the context surrounding the identified telescope mission keywords to provide relevant information to the classifier. The framework successfully classifies the usage of MAST-hosted missions with 92.9% accuracy. Furthermore, our framework demonstrates robustness when compared to other approaches, considering common metrics and computational complexity. The framework’s interpretability makes it adaptable for use across observatories and other scientific facilities worldwide.
  5. Weakly-supervised text classification trains a classifier using the label name of each target class as the only supervision, which largely reduces human annotation efforts. Most existing methods first use the label names as static keyword-based features to generate pseudo labels, which are then used for final classifier training. While reasonable, such a commonly adopted framework suffers from two limitations: (1) keywords can have different meanings in different contexts and some text may not have any keyword, so keyword matching can induce noisy and inadequate pseudo labels; (2) the errors made in the pseudo label generation stage will directly propagate to the classifier training stage without a chance of being corrected. In this paper, we propose a new method, PIEClass, consisting of two modules: (1) a pseudo label acquisition module that uses zero-shot prompting of pre-trained language models (PLM) to get pseudo labels based on contextualized text understanding beyond static keyword matching, and (2) a noise-robust iterative ensemble training module that iteratively trains classifiers and updates pseudo labels by utilizing two PLM fine-tuning methods that regularize each other. Extensive experiments show that PIEClass achieves overall better performance than existing strong baselines on seven benchmark datasets and even achieves similar performance to fully-supervised classifiers on sentiment classification tasks. 