skip to main content

Search for: All records

Award ID contains: 2021871

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available August 1, 2023
  2. In this paper we consider the training stability of recurrent neural networks (RNNs) and propose a family of RNNs, namely SBO-RNN, that can be formulated using stochastic bilevel optimization (SBO). With the help of stochastic gradient descent (SGD), we manage to convert the SBO problem into an RNN where the feedforward and backpropagation solve the lower and upper-level optimization for learning hidden states and their hyperparameters, respectively. We prove that under mild conditions there is no vanishing or exploding gradient in training SBO-RNN. Empirically we demonstrate our approach with superior performance on several benchmark datasets, with fewer parameters, less training data, and much faster convergence. Code is available at https://zhang-vislab.github.io.
  3. Regression ensembles consisting of a collection of base regression models are often used to improve the estimation/prediction performance of a single regression model. It has been shown that the individual accuracy of the base models and the ensemble diversity are the two key factors affecting the performance of an ensemble. In this paper, we derive a theory for regression ensembles that illustrates the subtle trade-off between individual accuracy and ensemble diversity from the perspective of statistical correlations. Then, inspired by our derived theory, we further propose a novel loss function and a training algorithm for deep learning regression ensembles. We then demonstrate the advantage of our training approach over standard regression ensemble methods including random forest and gradient boosting regressors with both benchmark regression problems and chemical sensor problems involving analysis of Raman spectroscopy. Our key contribution is that our loss function and training algorithm is able to manage diversity explicitly in an ensemble, rather than merely allowing diversity to occur by happenstance.
  4. Cutting-edge machine learning techniques often require millions of labeled data objects to train a robust model. Because relying on humans to supply such a huge number of labels is rarely practical, automated methods for label generation are needed. Unfortunately, critical challenges in auto-labeling remain unsolved, including the following research questions: (1) which objects to ask humans to label, (2) how to automatically propagate labels to other objects, and (3) when to stop labeling. These three questions are not only each challenging in their own right, but they also correspond to tightly interdependent problems. Yet existing techniques provide at best isolated solutions to a subset of these challenges. In this work, we propose the first approach, called LANCET, that successfully addresses all three challenges in an integrated framework. LANCET is based on a theoretical foundation characterizing the properties that the labeled dataset must satisfy to train an effective prediction model, namely the Covariate-shift and the Continuity conditions. First, guided by the Covariate-shift condition, LANCET maps raw input data into a semantic feature space, where an unlabeled object is expected to share the same label with its near-by labeled neighbor. Next, guided by the Continuity condition, LANCET selects objects for labeling, aimingmore »to ensure that unlabeled objects always have some sufficiently close labeled neighbors. These two strategies jointly maximize the accuracy of the automatically produced labels and the prediction accuracy of the machine learning models trained on these labels. Lastly, LANCET uses a distribution matching network to verify whether both the Covariate-shift and Continuity conditions hold, in which case it would be safe to terminate the labeling process. Our experiments on diverse public data sets demonstrate that LANCET consistently outperforms the state-of-the-art methods from Snuba to GOGGLES and other baselines by a large margin - up to 30 percentage points increase in accuracy.« less