This paper presents a novel technique to reduce the energy consumption of a machine learning classifier based on incremental-precision feature computation and classification. Specifically, the algorithm starts with features computed at the lowest possible precision. If the desired accuracy is not achieved, the features of the previous level are combined with incremental-precision features to compute higher-precision features. This process continues until the desired accuracy is obtained. Choosing thresholds that allow many samples to be classified by a low-precision classifier reduces energy consumption but increases misclassification error. To implement hardware that provides the required updates in precision, an incremental-precision architecture based on data-path decomposition is proposed. One novel aspect of this work lies in the design of appropriate thresholds for multi-level classification using training data, such that a family of designs can be obtained that trade off classification accuracy against energy consumption. Another novel aspect involves the design of hardware architectures based on data-path decomposition that can incrementally increase precision on demand. Using a seizure detection example, it is shown that the proposed incremental-precision multi-level classification approach can reduce energy consumption by 35% while maintaining high sensitivity, or by about 50% at the expense of a 15% degradation in sensitivity, compared to similar approaches to seizure detection in the literature. The reduction in energy comes at the expense of small area, timing, and memory overheads, as multiple classification steps are used instead of a single step.
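To make the control flow concrete, the sketch below gives a minimal Python rendering of multi-level classification with per-level confidence thresholds. It is illustrative only: the uniform quantizer, the level list, and the thresholds are hypothetical stand-ins, the paper's threshold-design procedure is not reproduced, and the score is recomputed at each level for clarity (the incremental reuse enabled by data-path decomposition is sketched after the second abstract below).

```python
import numpy as np

def quantize(v, bits):
    """Hypothetical uniform quantizer to a signed fixed-point grid."""
    scale = 2.0 ** (bits - 1)
    return np.clip(np.round(v * scale), -scale, scale - 1) / scale

def multilevel_classify(x, w, b, levels, thresholds):
    """Evaluate the linear score at increasing precision; stop early
    once the score magnitude clears that level's confidence threshold."""
    for bits, tau in zip(levels, thresholds):
        score = quantize(x, bits) @ quantize(w, bits) + b
        if abs(score) >= tau:            # confident enough: terminate early
            return int(score > 0), bits  # label and precision level used
    return int(score > 0), levels[-1]    # final level always decides
```

For example, with `levels = [4, 8, 16]` and thresholds tuned on training data, most easy samples would terminate at 4 bits and only hard samples would pay for the 16-bit evaluation.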
Low-Energy Architectures of Linear Classifiers for IoT Applications using Incremental Precision and Multi-Level Classification
This paper presents a novel incremental-precision classification approach that reduces the energy consumption of linear classifiers for IoT applications. Features are first input to a low-precision classifier. If the classifier successfully classifies the sample, the process terminates. Otherwise, classification performance is incrementally improved by using a classifier of higher precision. This process is repeated until the classification is complete. The key observation is that many samples can be classified using the low-precision classifier, leading to a reduction in energy. To achieve incremental precision, a novel data-path decomposition is proposed for the design of fixed-width adders and multipliers. These components increase the precision without recomputing the outputs, thus reducing energy. Using a linear classification example, it is shown that the proposed incremental-precision multi-level classifier can reduce energy by about 41% while achieving accuracy comparable to that of a full-precision system.
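The arithmetic that makes the refinement cheap can be seen in a small worked example: splitting each operand into a high-order part and an exact residual lets the low-precision score use only the high parts, and the full-precision score is recovered by adding correction terms rather than recomputing the dot product. This is a behavioral software sketch with an assumed 4-bit split point, not the proposed fixed-width adder/multiplier hardware.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 8)   # feature vector
w = rng.uniform(-1, 1, 8)   # linear classifier weights

def split(v, hi_bits):
    """Split each value into a coarse high-order part and the exact
    residual, so that v == hi + lo (assumed 2**-hi_bits grid)."""
    scale = 2.0 ** hi_bits
    hi = np.floor(v * scale) / scale
    lo = v - hi
    return hi, lo

x_hi, x_lo = split(x, 4)
w_hi, w_lo = split(w, 4)

score_low = x_hi @ w_hi                      # low-precision pass
# Refinement adds cross terms instead of recomputing the dot product.
score_ref = score_low + x_hi @ w_lo + x_lo @ w_hi + x_lo @ w_lo
assert np.isclose(score_ref, x @ w)          # matches full precision
```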
- Award ID(s): 1749494
- PAR ID: 10074877
- Date Published:
- Journal Name: Proc. 2018 ACM Great Lakes Symposium on VLSI (GLSVLSI)
- Page Range / eLocation ID: 291 to 296
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract. Background: It is a computational challenge for current metagenomic classifiers to keep up with the pace of training data generated from genome sequencing projects, such as the exponentially growing NCBI RefSeq bacterial genome database. When new reference sequences are added to the training data, statically trained classifiers must be rerun on all data, a highly inefficient process. The rich literature on "incremental learning" addresses the need to update an existing classifier to accommodate new data without sacrificing much accuracy compared to retraining the classifier on all data. Results: We demonstrate how classification improves over time by incrementally training a classifier on progressive RefSeq snapshots and testing it on: (a) all known current genomes (as a ground-truth set) and (b) a real experimental metagenomic gut sample. We demonstrate that as a classifier model's knowledge of genomes grows, classification accuracy increases. The proof-of-concept naïve Bayes implementation, when updated yearly, now runs in one quarter of the non-incremental time with no accuracy loss. Conclusions: It is evident that classification improves by having the most current knowledge at its disposal. Therefore, it is of utmost importance to make classifiers computationally tractable enough to keep up with the data deluge. The incremental learning classifier can be efficiently updated without reprocessing, or even having access to, the existing database, saving storage as well as computational resources. (A sketch of such an incremental update appears after this list.)
-
DNA sequencing of microbial communities from environmental samples generates large volumes of data, which can be analyzed using various bioinformatics pipelines. Unsupervised clustering algorithms are usually an early and critical step in an analysis pipeline, since much of such data is unlabeled, unstructured, or novel. However, curated reference databases that provide taxonomic label information are also growing, which can help in the classification of sequences, not just their clustering. In this contribution, we report on our progress in developing a semi-supervised approach for genomic clustering algorithms, such as U/VSEARCH. The primary contribution of this approach is the ability to recognize previously seen or unseen novel sequences using an incremental approach: for sequences whose examples were previously seen by the algorithm, the algorithm can predict a correct label; for previously unseen novel sequences, the algorithm assigns a temporary label and then updates that label with a permanent one if/when such a label is established in a future reference database. The incremental learning aspect of the proposed approach provides the additional capability to process data continuously as new datasets become available. This functionality is notable, as most sequence data processing platforms are static in nature, designed to run on a single batch of data, whose only remedy for processing additional data is to combine the new and old data and rerun the entire analysis. We report promising preliminary results on an extended 16S rRNA database. (A sketch of the provisional-labeling scheme appears after this list.)
-
Abstract. Accurate classification of high-dimensional data is important in many scientific applications. We propose a family of high-dimensional classification methods based upon a comparison of the component-wise distances of the feature vector of a sample to the within-class population quantiles. These methods are motivated by the fact that quantile classifiers based on these component-wise distances are the most powerful univariate classifiers for an optimal choice of the quantile level. A simple aggregation approach for constructing a multivariate classifier based upon these component-wise distances to the within-class quantiles is proposed. It is shown that this classifier is consistent with the asymptotically optimal classifier as the sample size increases. Our proposed classifiers result in simple piecewise-linear decision boundaries that can be efficiently trained. Numerical results demonstrate competitive performance for the proposed classifiers on both simulated data and a benchmark email spam application. (A sketch of the component-wise quantile rule appears after this list.)
-
In this study, we explored machine learning approaches for predictive diagnosis using surface-enhanced Raman scattering (SERS), applied to the detection of COVID-19 infection in biological samples. To do this, we utilized SERS data collected from 20 patients at the University of Maryland Baltimore School of Medicine. As a preprocessing step, positive/negative labels were obtained using polymerase chain reaction (PCR) testing. First, we compared the performance of linear and nonlinear dimensionality-reduction techniques for projecting the high-dimensional Raman spectra to a low-dimensional space in which a smaller number of variables defines each sample. The appropriate number of reduced features was chosen by comparing the mean accuracy from 10-fold cross-validation. Finally, we employed Gaussian process (GP) classification, a probabilistic machine learning approach, to predict the occurrence of a negative or positive sample as a function of the low-dimensional space variables. As opposed to providing rigid class labels, the GP classifier provides a probability (ranging from zero to one) that a given sample is positive or negative. In practice, the proposed framework can be used to provide high-throughput rapid testing, and a follow-up PCR can be used for confirmation in cases where the model's uncertainty is unacceptably high. (A sketch of this pipeline appears after this list.)
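Regarding the first related item above: a minimal sketch of a count-based naïve Bayes whose sufficient statistics absorb a new batch without revisiting old data. This is a generic illustration of incremental learning, not the paper's implementation; the k-mer features and taxon labels are hypothetical.

```python
import math
from collections import defaultdict

class IncrementalNB:
    """Multinomial naive Bayes over discrete features (e.g. k-mers) whose
    counts can absorb new reference data without touching old data."""
    def __init__(self):
        self.class_counts = defaultdict(int)                      # N_c
        self.feat_counts = defaultdict(lambda: defaultdict(int))  # N_{c,f}
        self.vocab = set()

    def partial_fit(self, samples, labels):
        """Update sufficient statistics with the new batch only."""
        for feats, y in zip(samples, labels):
            self.class_counts[y] += 1
            for f in feats:
                self.feat_counts[y][f] += 1
                self.vocab.add(f)

    def predict(self, feats):
        total = sum(self.class_counts.values())
        def log_posterior(c):
            lp = math.log(self.class_counts[c] / total)
            n_c = sum(self.feat_counts[c].values())
            v = len(self.vocab)
            for f in feats:  # Laplace-smoothed likelihoods
                lp += math.log((self.feat_counts[c][f] + 1) / (n_c + v))
            return lp
        return max(self.class_counts, key=log_posterior)

nb = IncrementalNB()
nb.partial_fit([["AAC", "ACG"], ["TTG", "TGA"]], ["taxonA", "taxonB"])
nb.partial_fit([["ACG", "CGT"]], ["taxonA"])   # new snapshot: counts only
print(nb.predict(["ACG"]))                     # -> "taxonA"
```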
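Regarding the second related item: a minimal sketch of the provisional-labeling idea. The class and method names are hypothetical, and the clustering step (e.g., U/VSEARCH) is assumed to have already produced cluster IDs.

```python
class LabelLedger:
    """Track labels across incremental runs: clusters matching a reference
    get that label; novel clusters get a temporary ID that is promoted
    once a future reference database names them."""
    def __init__(self):
        self.labels = {}        # cluster_id -> current label
        self._next_tmp = 0

    def assign(self, cluster_id, ref_label=None):
        """Label a cluster: reference label if known, temporary otherwise."""
        if ref_label is not None:
            self.labels[cluster_id] = ref_label
        elif cluster_id not in self.labels:
            self.labels[cluster_id] = f"novel-{self._next_tmp}"
            self._next_tmp += 1
        return self.labels[cluster_id]

    def promote(self, cluster_id, ref_label):
        """Replace a temporary label when the reference DB catches up."""
        self.labels[cluster_id] = ref_label

ledger = LabelLedger()
print(ledger.assign("c1", ref_label="E. coli"))  # previously seen -> known label
print(ledger.assign("c2"))                       # unseen -> "novel-0"
ledger.promote("c2", "B. subtilis")              # later release names it
print(ledger.labels["c2"])                       # -> "B. subtilis"
```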
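Regarding the third related item: a compact sketch of a component-wise quantile classifier at a single quantile level theta, using the check (pinball) loss as the distance. The paper's optimal choice of quantile level and its aggregation scheme are not reproduced; the data here is synthetic.

```python
import numpy as np

def quantile_distance(x, q, theta):
    """Check (pinball) distance of each component to a quantile."""
    u = x - q
    return u * (theta - (u < 0))

def fit_quantiles(X, y, theta):
    """Per-class, component-wise theta-quantiles of the training data."""
    return {c: np.quantile(X[y == c], theta, axis=0) for c in np.unique(y)}

def quantile_classify(x, class_quantiles, theta):
    """Assign the class minimizing the summed component-wise distances."""
    return min(class_quantiles,
               key=lambda c: quantile_distance(x, class_quantiles[c], theta).sum())

# Example with synthetic data and theta = 0.5 (median classifier).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(2, 1, (50, 5))])
y = np.array([0] * 50 + [1] * 50)
q = fit_quantiles(X, y, 0.5)
print(quantile_classify(np.full(5, 2.0), q, 0.5))   # -> 1
```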
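Regarding the fourth related item: a minimal scikit-learn sketch of such a pipeline, with synthetic stand-ins for the SERS spectra and PCR labels, and with PCA assumed as the dimensionality-reduction step (the study also compared nonlinear techniques, which are not shown).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-ins for SERS spectra (samples x wavenumbers) and
# PCR-derived labels (0 = negative, 1 = positive).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 500))
y = rng.integers(0, 2, size=60)

# Choose the reduced dimension by comparing mean 10-fold CV accuracy.
scores = {k: cross_val_score(
              make_pipeline(PCA(n_components=k), GaussianProcessClassifier()),
              X, y, cv=10).mean()
          for k in range(2, 7)}
best_k = max(scores, key=scores.get)

model = make_pipeline(PCA(n_components=best_k), GaussianProcessClassifier())
model.fit(X, y)
p_pos = model.predict_proba(X)[:, 1]  # probability each sample is positive
labels = p_pos > 0.5                  # samples near 0.5 merit follow-up PCR
```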