
Title: Intermittent human-in-the-loop model selection using cerebro: a demonstration
Deep learning (DL) is revolutionizing many fields. However, there is a major bottleneck for the wide adoption of DL: the pain of model selection, which requires exploring a large config space of model architectures and training hyper-parameters before picking the best model. The two existing popular paradigms for exploring this config space pose a false dichotomy. AutoML-based model selection explores configs with high throughput but uses human intuition minimally. Alternatively, interactive human-in-the-loop model selection relies completely on human intuition to explore the config space but often has very low throughput. To mitigate the above drawbacks, we propose a new paradigm for model selection that we call intermittent human-in-the-loop model selection. In this demonstration, we will showcase our approach using five real-world DL model selection workloads. A short video of our demonstration can be found here:
Journal Name: Proceedings of the VLDB Endowment
Page Range or eLocation-ID: 2687 to 2690
Sponsoring Org: National Science Foundation
More Like this
  1. Autonomous process optimization involves the human intervention-free exploration of a range of process parameters to improve responses such as product yield and selectivity. Utilizing off-the-shelf components, we develop a closed-loop system for carrying out parallel autonomous process optimization experiments in batch. Upon implementation of our system in the optimization of a stereoselective Suzuki-Miyaura coupling, we find that the definition of a set of meaningful, broad, and unbiased process parameters is the most critical aspect of successful optimization. Importantly, we discern that phosphine ligand, a categorical parameter, is vital to determination of the reaction outcome. To date, categorical parameter selection has relied on chemical intuition, potentially introducing bias into the experimental design. In seeking a systematic method for selecting a diverse set of phosphine ligands, we develop a strategy that leverages computed molecular feature clustering. The resulting optimization uncovers conditions to selectively access the desired product isomer in high yield.
  2. Deep learning (DL) is growing in popularity for many data analytics applications, including among enterprises. Large business-critical datasets in such settings typically reside in RDBMSs or other data systems. The DB community has long aimed to bring machine learning (ML) to DBMS-resident data. Given past lessons from in-DBMS ML and recent advances in scalable DL systems, DBMS and cloud vendors are increasingly interested in adding more DL support for DB-resident data. Recently, a new parallel DL model selection execution approach called Model Hopper Parallelism (MOP) was proposed. In this paper, we characterize the particular suitability of MOP for DL on data systems, but to bring MOP-based DL to DB-resident data, we show that there is no single "best" approach, and an interesting tradeoff space of approaches exists. We explain four canonical approaches and build prototypes upon Greenplum Database, compare them analytically on multiple criteria (e.g., runtime efficiency and ease of governance) and compare them empirically with large-scale DL workloads. Our experiments and analyses show that it is non-trivial to meet all practical desiderata well and there is a Pareto frontier; for instance, some approaches are 3x-6x faster but fare worse on governance and portability. Our results and insights can help DBMS and cloud vendors design better DL support for DB users. All of our source code, data, and other artifacts are available at
  3. Hyperspectral fluorescence imaging is widely used when multiple fluorescent probes with close emission peaks are required. In particular, Fourier transform imaging spectroscopy (FTIS) provides unrivaled spectral resolution; however, the imaging throughput is very low due to the amount of interferogram sampling required. In this work, we apply deep learning to FTIS and show that the interferogram sampling can be drastically reduced by an order of magnitude without noticeable degradation in the image quality. For the demonstration, we use bovine pulmonary artery endothelial cells stained with three fluorescent dyes and 10 types of fluorescent beads with close emission peaks. Further, we show that the deep learning approach is more robust to the translation stage error and environmental vibrations. Thereby, the He-Ne correction, which is typically required for FTIS, can be bypassed, thus reducing the cost, size, and complexity of the FTIS system. Finally, we construct neural network models using Hyperband, an automatic hyperparameter selection algorithm, and compare the performance with our manually-optimized model.
  4. Many physical tasks such as pulling out a drawer or wiping a table can be modeled with geometric constraints. These geometric constraints are characterized by restrictions on kinematic trajectories and reaction wrenches (forces and moments) of objects under the influence of the constraint. This paper presents a method to infer geometric constraints involving unmodeled objects in human demonstrations using both kinematic and wrench measurements. Our approach takes a recording of a human demonstration and determines what constraints are present, when they occur, and their parameters (e.g. positions). By using both kinematic and wrench information, our methods are able to reliably identify a variety of constraint types, even if the constraints only exist for short durations within the demonstration. We present a systematic approach to fitting arbitrary scleronomic constraint models to kinematic and wrench measurements. Reaction forces are estimated from measurements by removing friction. Position, orientation, force, and moment error metrics are developed to provide systematic comparison between constraint models. By conducting a user study, we show that our methods can reliably identify constraints in realistic situations and confirm the value of including forces and moments in the model regression and selection process.
  5. Achilefu, Samuel; Raghavachari, Ramesh (Ed.)
    Invented in 2010, NanoCluster Beacons (NCBs) (1) are an emerging class of turn-on probes that show unprecedented capabilities in single-nucleotide polymorphism (2) and DNA methylation (3) detection. As the activation colors of NCBs can be tuned by a nearby, guanine-rich activator strand, NCBs are versatile, multicolor probes suitable for multiplexed detection at low cost. Whereas a variety of NCB designs have been explored and reported, further diversification and optimization of NCBs require a full scan of the ligand composition space. However, the current methods rely on microarray and multi-well plate selection, which only screen tens to hundreds of activator sequences (4, 5). Here we take advantage of the next-generation-sequencing (NGS) platform for high-throughput, large-scale selection of activator strands. We first generated a ~10^4 activator sequence library on the Illumina MiSeq chip. Hybridizing this activator sequence library with a common nucleation sequence (which carried a nonfluorescent silver cluster) resulted in hundreds of MiSeq chip images with millions of bright spots (i.e. light-up polonies) of various intensities and colors. With a method termed Chip-Hybridized Associated Mapping Platform (CHAMP) (6), we were able to map these bright spots to the original DNA sequencing map, thus recovering the activator sequence behind each bright spot. After assigning an "activation score" to each "light-up polony", we used a computational algorithm to select the best activator strands and validated these strands using the traditional in-solution preparation and fluorometer measurement method. By exploring a vast ligand composition space and observing the corresponding activation behaviors of silver clusters, we aim to elucidate the design rules of NCBs.