

Title: Optimally ordering IDK classifiers subject to deadlines
Abstract: A classifier is a software component, often based on Deep Learning, that categorizes each input provided to it into one of a fixed set of classes. An IDK classifier may additionally output “I Don’t Know” (IDK) for certain inputs. Multiple distinct IDK classifiers may be available for the same classification problem, offering different trade-offs between effectiveness, i.e. the probability of successful classification, and efficiency, i.e. execution time. Optimal offline algorithms are proposed for sequentially ordering IDK classifiers such that the expected duration to successfully classify an input is minimized, optionally subject to a hard deadline on the maximum time permitted for classification. Solutions are provided considering independent and dependent relationships between pairs of classifiers, as well as a mix of the two.
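For illustration, here is a minimal sketch of the optimization problem described above, under the simplifying assumption that classifiers succeed independently: the expected duration of a cascade is the sum of each classifier's execution time weighted by the probability that all earlier classifiers returned IDK, while a hard deadline bounds the worst case in which every classifier runs. The execution times, probabilities, and the brute-force search below are illustrative placeholders, not the paper's algorithms.

```python
# A minimal sketch (not the paper's algorithm): expected duration of an IDK
# cascade under the *independence* assumption, plus a brute-force search for
# the best ordering whose worst-case duration meets a hard deadline.
from itertools import permutations

def expected_duration(cascade):
    """cascade: list of (exec_time, success_prob) pairs, run in order.
    Each classifier runs only if all earlier ones returned IDK."""
    expected, p_reach = 0.0, 1.0
    for exec_time, success_prob in cascade:
        expected += p_reach * exec_time     # time paid if we reach this stage
        p_reach *= (1.0 - success_prob)     # probability we still have no answer
    return expected

def best_cascade(classifiers, deadline=None):
    """Brute-force over orderings; an ordering is feasible if the worst case
    (every classifier runs) fits within the deadline."""
    best, best_cost = None, float("inf")
    for order in permutations(classifiers):
        if deadline is not None and sum(t for t, _ in order) > deadline:
            continue
        cost = expected_duration(order)
        if cost < best_cost:
            best, best_cost = order, cost
    return best, best_cost

# Hypothetical ensemble: (execution time in ms, probability of a non-IDK output)
ensemble = [(5.0, 0.60), (12.0, 0.85), (30.0, 0.99)]
print(best_cascade(ensemble, deadline=50.0))
```

For independent classifiers, a well-known greedy rule orders them by execution time divided by success probability; the exhaustive search here simply keeps the sketch self-contained and makes the deadline check explicit.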
Award ID(s):
1932530 1814739
NSF-PAR ID:
10354275
Author(s) / Creator(s):
Date Published:
Journal Name:
Real-Time Systems
ISSN:
0922-6443
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: This paper introduces and evaluates a general construct for trading off accuracy and overall execution duration in classification-based machine perception problems—namely, the generalized IDK classifier cascade. The aim is to select the optimal sequence of classifiers required to minimize the expected (i.e. average) execution duration needed to achieve successful classification, subject to a constraint on quality, and optionally a latency constraint on the worst-case execution duration. An IDK classifier is a software component that attempts to categorize each input provided to it into one of a fixed set of classes, returning “I Don’t Know” (IDK) if it is unable to do so with the required level of confidence. An ensemble of several different IDK classifiers may be available for the same classification problem, offering different trade-offs between effectiveness (i.e. the probability of successful classification) and timeliness (i.e. execution duration). A model for representing such characteristics is defined, and a method is proposed for determining the values of the model parameters for a given ensemble of IDK classifiers. Optimal algorithms are developed for sequentially ordering IDK classifiers into an IDK cascade, such that the expected duration to successfully classify an input is minimized, optionally subject to a latency constraint on the worst-case overall execution duration of the IDK cascade. The entire methodology is applied to two real-world case studies. In contrast to prior work, the methodology developed in this paper caters for arbitrary dependences between the probabilities of successful classification for different IDK classifiers. Effective practical solutions are developed considering both single and multiple processors.
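The extension to dependent classifiers can be sketched by replacing the single per-classifier success probability with a conditional probability of success given that every earlier classifier in the cascade returned IDK. The helper below is a hedged illustration under that framing; the classifier names, times, and conditional probabilities are invented, and the sketch does not reproduce the paper's optimal algorithms.

```python
# A hedged sketch of the expected-duration computation when classifier outcomes
# may be *dependent*: we assume access to cond_success(i, prefix), the
# probability that classifier i succeeds given that every classifier in
# `prefix` returned IDK (e.g. estimated from profiling data).

def expected_duration_dependent(order, exec_time, cond_success):
    """order: tuple of classifier ids, e.g. ('A', 'B', 'C')
    exec_time[i]: execution time of classifier i
    cond_success(i, prefix): P(i succeeds | every classifier in prefix said IDK)"""
    expected, p_reach = 0.0, 1.0
    for k, clf in enumerate(order):
        expected += p_reach * exec_time[clf]
        p_reach *= 1.0 - cond_success(clf, order[:k])
    return expected

# Toy dependence: 'B' rarely helps on inputs where 'A' already said IDK.
exec_time = {'A': 5.0, 'B': 12.0, 'C': 30.0}
def cond_success(clf, prefix):
    base = {'A': 0.60, 'B': 0.85, 'C': 0.99}[clf]
    if clf == 'B' and 'A' in prefix:
        return 0.40          # success probability drops once A has failed
    return base

print(expected_duration_dependent(('A', 'B', 'C'), exec_time, cond_success))
```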
  2. This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinate of the position where participants believe a target object is located; level of confidence in the binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and the participant's perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experiment results suggest that more accurate results can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. Moreover, when a relatively larger, properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods.
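As a rough illustration of the feature construction described above, the sketch below feeds each image's crowdsourced vote fraction and average self-reported confidence into an off-the-shelf classifier. The tiny dataset and the scikit-learn model are stand-ins, not the study's actual pipeline.

```python
# A minimal sketch, not the paper's pipeline: combine crowdsourced binary votes
# with the workers' self-reported confidence as features for a standard ML
# classifier.  The responses and labels below are purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def image_features(votes, confidences):
    """votes: 0/1 worker labels for one image; confidences: 0-100 self-reports."""
    votes, confidences = np.asarray(votes), np.asarray(confidences)
    return [votes.mean(),                 # fraction of workers saying "positive"
            confidences.mean() / 100.0]   # average confidence, rescaled to [0, 1]

# Hypothetical crowdsourced responses for four images and their true labels.
X = np.array([image_features(v, c) for v, c in [
    ([1, 1, 1, 0], [90, 80, 70, 40]),
    ([0, 0, 1, 0], [85, 60, 30, 75]),
    ([1, 0, 1, 1], [60, 50, 95, 80]),
    ([0, 0, 0, 0], [90, 95, 85, 70]),
]])
y = np.array([1, 0, 1, 0])

clf = LogisticRegression().fit(X, y)
print(clf.predict(X))
```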
  3. Explainability helps users trust deep learning solutions for time series classification. However, existing explainability methods for multi-class time series classifiers focus on one class at a time, ignoring relationships between the classes. Instead, when a classifier is choosing between many classes, an effective explanation must show what sets the chosen class apart from the rest. We now formalize this notion, studying the open problem of class-specific explainability for deep time series classifiers, a challenging and impactful problem setting. We design a novel explainability method, DEMUX, which learns saliency maps for explaining deep multi-class time series classifiers by adaptively ensuring that its explanation spotlights the regions in an input time series that a model uses specifically for its predicted class. DEMUX adopts a gradient-based approach composed of three interdependent modules that combine to generate consistent, class-specific saliency maps that remain faithful to the classifier’s behavior yet are easily understood by end users. Our experimental study demonstrates that DEMUX outperforms nine state-of-the-art alternatives on five popular datasets when explaining two types of deep time series classifiers. Further, through a case study, we demonstrate that DEMUX’s explanations indeed highlight what separates the predicted class from the others in the eyes of the classifier.
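The general idea of class-specific, gradient-based saliency can be illustrated as follows: contrast the gradient of the predicted class's logit against that of the strongest competing class, so the resulting map emphasizes what distinguishes the chosen class. This is only a generic PyTorch sketch, not the DEMUX method itself; the model interface and input shape are assumptions.

```python
# A hedged sketch of class-contrastive gradient saliency (NOT DEMUX itself):
# highlight the timesteps that push the predicted class's logit above the
# runner-up class's logit.
import torch

def class_contrastive_saliency(model, x):
    """x: time series tensor of shape (1, channels, length);
    model: any torch.nn.Module returning class logits of shape (1, n_classes)."""
    x = x.detach().clone().requires_grad_(True)
    logits = model(x)
    pred = logits.argmax(dim=1).item()               # predicted class
    runner_up = logits.topk(2, dim=1).indices[0, 1].item()  # strongest competitor
    # Gradient of (predicted logit - runner-up logit) with respect to the input.
    contrast = logits[0, pred] - logits[0, runner_up]
    contrast.backward()
    return x.grad.abs().squeeze(0)                   # per-timestep importance

# Usage with any trained classifier `model` and series `x`:
# saliency = class_contrastive_saliency(model, x)
```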
  4. Abstract: The Local Climate Zone (LCZ) classification is already widely used in urban heat island and other climate studies. The current classification method does not incorporate crucial urban auxiliary GIS data on building height and imperviousness that could significantly improve urban-type LCZ classification utility as well as accuracy. This study utilized a hybrid GIS- and remote sensing imagery-based framework to systematically compare and evaluate different machine and deep learning methods. The Convolutional Neural Network (CNN) classifier outperforms in terms of accuracy, but it requires multi-pixel input, which reduces the output’s spatial resolution and creates a tradeoff between accuracy and spatial resolution. The Random Forest (RF) classifier performs best among the single-pixel classifiers. This study also shows that incorporating a building height dataset improves the accuracy of the high- and mid-rise classes in the RF classifiers, whereas an imperviousness dataset improves the low-rise classes. The single-pass forward permutation test reveals that both auxiliary datasets dominate the classification accuracy in the RF classifier, while near-infrared and thermal infrared are the dominating features in the CNN classifier. These findings show that the conventional LCZ classification framework used in the World Urban Database and Access Portal Tools (WUDAPT) can be improved by adopting building height and imperviousness information. This framework can be easily applied to different cities to generate LCZ maps for urban models.
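A minimal sketch of the single-pixel Random Forest setup with the two auxiliary layers as extra features, followed by a permutation-importance check, is given below. The feature names, synthetic arrays, and scikit-learn permutation test are illustrative stand-ins for the study's hybrid GIS/remote-sensing framework and its single-pass forward permutation test.

```python
# A minimal sketch, assuming per-pixel feature vectors are already assembled:
# a Random Forest LCZ classifier whose inputs include two auxiliary GIS layers
# (building height, imperviousness), followed by a permutation-importance check.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

feature_names = ["red", "green", "blue", "nir", "tir",
                 "building_height", "imperviousness"]

# X: (n_pixels, n_features) spectral + auxiliary values; y: LCZ class labels.
rng = np.random.default_rng(0)
X = rng.random((500, len(feature_names)))       # placeholder pixel features
y = rng.integers(0, 10, size=500)               # placeholder LCZ class ids

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
result = permutation_importance(rf, X, y, n_repeats=5, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda p: -p[1]):
    print(f"{name:17s} {score:.3f}")
```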
  5. Roth, A. (Ed.)
    It is well understood that classification algorithms, for example, for deciding on loan applications, cannot be evaluated for fairness without taking context into account. We examine what can be learned from a fairness oracle equipped with an underlying understanding of “true” fairness. The oracle takes as input a (context, classifier) pair satisfying an arbitrary fairness definition, and accepts or rejects the pair according to whether the classifier satisfies the underlying fairness truth. Our principal conceptual result is an extraction procedure that learns the underlying truth; moreover, the procedure can learn an approximation to this truth given access to a weak form of the oracle. Since every “truly fair” classifier induces a coarse metric, in which those receiving the same decision are at distance zero from one another and those receiving different decisions are at distance one, this extraction process provides the basis for ensuring a rough form of metric fairness, also known as individual fairness. Our principal technical result is a higher fidelity extractor under a mild technical constraint on the weak oracle’s conception of fairness. Our framework permits the scenario in which many classifiers, with differing outcomes, may all be considered fair. Our results have implications for interpretability – a highly desired but poorly defined property of classification systems that endeavors to permit a human arbiter to reject classifiers deemed to be “unfair” or illegitimately derived.
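The coarse metric mentioned above is simple enough to state directly: a classifier induces a distance of zero between individuals who receive the same decision and one between individuals who receive different decisions. The toy classifier below is a hypothetical stand-in used only to make that construction concrete; it is unrelated to the paper's oracle or extraction procedure.

```python
# A hedged sketch of the "coarse metric" induced by a classifier's decisions:
# d(x, y) = 0 if the classifier agrees on x and y, and 1 otherwise.

def induced_metric(classifier):
    """Return the coarse metric induced by a deterministic classifier."""
    return lambda x, y: 0 if classifier(x) == classifier(y) else 1

# Toy "loan" classifier over (income, debt) pairs -- purely illustrative.
approve = lambda applicant: applicant[0] - applicant[1] > 30
d = induced_metric(approve)

a, b, c = (80, 20), (90, 10), (40, 35)
print(d(a, b), d(a, c))   # 0 (both approved), 1 (different decisions)
```

Individual (metric) fairness asks that individuals who are close under such a metric receive similar decisions; the induced metric satisfies this by construction, which is the "rough form" the abstract refers to.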