The increasing popularity of deep learning models has created new opportunities for developing AI-based recommender systems. Designing recommender systems with deep neural networks requires careful architecture design, and further optimization demands extensive co-design effort to jointly optimize model architecture and hardware. Design automation, such as Automated Machine Learning (AutoML), is necessary to fully exploit the potential of recommender model design, including model choices and model-hardware co-design strategies. We introduce a novel paradigm that uses weight sharing to explore abundant solution spaces. Our paradigm creates a large supernet to search for optimal architectures and co-design strategies that address the challenges of data multi-modality and heterogeneity in the recommendation domain. From a model perspective, the supernet includes a variety of operators, dense connectivity, and dimension search options. From a co-design perspective, it encompasses versatile Processing-In-Memory (PIM) configurations to produce hardware-efficient models. The scale, heterogeneity, and complexity of our solution space pose several challenges, which we address by proposing various techniques for training and evaluating the supernet. When focusing solely on architecture search, our crafted models achieve state-of-the-art performance on three Click-Through Rate (CTR) prediction benchmarks, outperforming both manually designed and AutoML-crafted models. From a co-design perspective, we achieve 2× FLOPs efficiency, 1.8× energy efficiency, and 1.5× performance improvements in recommender models.
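The weight-sharing idea in the abstract above can be illustrated with a toy sketch. This is not the authors' supernet: the operators, the scalar weights, the `Supernet` class, and the `search` function are all hypothetical, and supernet training is omitted. The sketch only shows that every sampled sub-architecture reuses the same stored weights instead of being trained from scratch.

```python
import random

# Hypothetical candidate operators for each supernet layer (illustrative only).
CANDIDATE_OPS = {
    "scale": lambda x, w: w * x,
    "shift": lambda x, w: x + w,
    "relu_scale": lambda x, w: max(0.0, w * x),
}


class Supernet:
    """Stores one weight per (layer, operator); every subnet reuses them."""

    def __init__(self, num_layers, seed=0):
        rng = random.Random(seed)
        self.num_layers = num_layers
        self.weights = {
            (layer, op): rng.uniform(0.5, 1.5)
            for layer in range(num_layers)
            for op in CANDIDATE_OPS
        }

    def sample_architecture(self, rng):
        # A sub-architecture picks exactly one operator per layer.
        return [rng.choice(sorted(CANDIDATE_OPS)) for _ in range(self.num_layers)]

    def forward(self, arch, x):
        # The chosen operators read their weights from the shared table.
        for layer, op in enumerate(arch):
            x = CANDIDATE_OPS[op](x, self.weights[(layer, op)])
        return x


def search(supernet, data, num_samples=50, seed=1):
    """Rank randomly sampled subnets by squared error using shared weights."""
    rng = random.Random(seed)
    best_arch, best_loss = None, float("inf")
    for _ in range(num_samples):
        arch = supernet.sample_architecture(rng)
        loss = sum((supernet.forward(arch, x) - y) ** 2 for x, y in data)
        if loss < best_loss:
            best_arch, best_loss = arch, loss
    return best_arch, best_loss


# Toy regression data, assumed purely for illustration.
data = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0)]
net = Supernet(num_layers=3)
best_arch, best_loss = search(net, data)
```

In a real one-shot search the shared weights would first be trained, typically by sampling a different sub-architecture at each training step. Here they are random constants so that the sampling and evaluation loop stays short.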
Toward Fully Automated Machine Learning for Routability Estimator Development
The rise of machine learning (ML) technology has inspired a boom in its applications in electronic design automation (EDA) and helps improve the degree of automation in chip design. However, manually crafting ML models remains a complex and time-consuming process because it requires extensive human expertise and tremendous engineering effort to carefully extract features and design model architectures. In this work, we leverage automated ML techniques to automate ML model development for routability prediction, a well-established technique that can help guide cell placement toward routable solutions. We present an automated feature selection method to identify suitable features for model inputs. We develop a neural architecture search method to search for high-quality neural architectures without human intervention. Our search method supports various operations and highly flexible connections, leading to architectures significantly different from all previous human-crafted models. Our experimental results demonstrate that our automatically generated models clearly outperform multiple representative manually crafted solutions, with a 9.9% improvement. Moreover, whereas human-crafted models easily take weeks or months to develop, our efficient automated machine-learning framework completes the whole model development process in only 1 day.
- Award ID(s): 2106828
- PAR ID: 10534415
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
- Volume: 43
- Issue: 3
- ISSN: 0278-0070
- Page Range / eLocation ID: 970 to 982
- Subject(s) / Keyword(s): Automated machine learning (AutoML); neural architecture search; physical design
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Analog integrated circuit (IC) placement is a heavily manual and time-consuming task that has a significant impact on chip quality. Several recent studies apply machine learning (ML) techniques to directly predict the impact of placement on circuit performance or even guide the placement process. However, the significant diversity in analog design topologies can lead to different impacts on performance metrics (e.g., common-mode rejection ratio (CMRR) or offset voltage). Thus, it is unlikely that the same ML model structure will achieve the best performance for all designs and metrics. In addition, customizing ML models for different designs requires even greater engineering effort and longer development cycles. In this work, we leverage Neural Architecture Search (NAS) to automatically develop customized neural architectures for different analog circuit designs and metrics. Our proposed NAS methodology supports an unconstrained DAG-based search space containing a wide range of ML operations and topological connections. Our search strategy can efficiently explore this flexible search space and provide every design with the best-customized model to boost model performance. We make unprejudiced comparisons with the claimed performance of the previous representative work on exactly the same dataset. After fully automated development within only 0.5 days, the generated models achieve 3.61% higher accuracy than the prior art.
- Background: Trust is a critical driver of technology usage behaviors and is essential for technology adoption. Thus, nurses’ participation in software development is critical for influencing their involvement, competency, and overall perceptions of software quality. Purpose: To engage nurses as subject matter experts to develop a machine learning (ML) Pain Recognition Automated Monitoring System. Method: Using the Human-centered Design for Embedded Machine Learning Solutions (HCDe-MLS) model, nurses informed the development of an intuitive data labeling software solution, Human-to-Artificial Intelligence (H2AI). Findings: H2AI facilitated efficient data labeling, stored labeled data to train ML models, and tracked inter-rater reliability. OpenCV provided efficient video-to-image data pre-processing for data labeling. MobileFaceNet demonstrated superior results for default landmark placement on neonatal video images. Discussion: Nurses’ engagement in clinical decision support software development is critical for ensuring that the end product addresses nurses’ priorities, reflects nurses’ actual cognitive and decision-making processes, and garners nurses’ trust and technology adoption.
- Accurately predicting the ridership of public-transit routes provides substantial benefits both to transit agencies, which can dispatch additional vehicles proactively before the vehicles that serve a route become crowded, and to passengers, who can avoid crowded vehicles based on publicly available predictions. The spread of the coronavirus disease has further elevated the importance of ridership prediction, as crowded vehicles now present not only an inconvenience but also a public-health risk. At the same time, accurately predicting ridership has become more challenging due to evolving ridership patterns, which may make all data except for the most recent records stale. One promising approach for improving prediction accuracy is to fine-tune the hyper-parameters of machine-learning models for each transit route based on the characteristics of the particular route, such as the number of records. However, manually designing a machine-learning model for each route is a labor-intensive process, which may require experts to spend a significant amount of their valuable time. To help experts with designing machine-learning models, we propose a neural-architecture and feature search approach, which optimizes the architecture and features of a deep neural network for predicting the ridership of a public-transit route. Our approach is based on a randomized local hyper-parameter search, which minimizes both the prediction error and the complexity of the model. We evaluate our approach on real-world ridership data provided by the public transit agency of Chattanooga, TN, and we demonstrate that training neural networks whose architectures and features are optimized for each route provides significantly better performance than training neural networks whose architectures and features are generic.
- Surface tension is a critical property that influences polymer behavior at interfaces and affects applications ranging from coatings to biomedical devices. Traditional experimental methods for measuring polymer surface tension are time-consuming, costly, and sensitive to environmental conditions. Computational approaches such as molecular dynamics (MD) simulations are valuable but computationally intensive, especially for polymers with long chains. This study investigates the use of machine learning (ML) techniques to predict polymer surface tension using different levels of molecular representation, focusing on multilinear regression (MLR), random forest (RF), and graph neural networks (GNNs). A data set of 317 homopolymers collected from the PolyInfo database is used to train and evaluate these models. Descriptors are derived at various levels of complexity, ranging from manually calculated features to graph-based representations. The GNN approach captures the intrinsic connectivity of polymer structures, while the MLR and RF models rely on manually crafted descriptors. The performance of these models is compared with experimental data, with the GNN model demonstrating superior accuracy due to its ability to learn directly from molecular graphs. Our results show that GNNs can better capture complex nonlinear relationships in polymer structures than traditional descriptor-based methods, suggesting their significant potential for accelerating polymer design and development. The study also includes validation of model predictions against molecular dynamics simulations, highlighting the potential of GNNs to accurately model polymer interfacial properties.
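The ridership-prediction abstract in the list above describes a randomized local hyper-parameter search that minimizes both prediction error and model complexity. A minimal sketch of that general technique follows. It is not the authors' implementation: the configuration keys, the `toy_validation_error` function (a synthetic stand-in for training and validating a network), and the `penalty` weight are all assumptions made for illustration.

```python
import random


def toy_validation_error(config):
    """Hypothetical stand-in for the validation error of a trained model."""
    layers, width = config["layers"], config["width"]
    return (layers - 3) ** 2 + ((width - 64) / 16) ** 2


def complexity(config):
    # Crude proxy for model size: number of layers times layer width.
    return config["layers"] * config["width"]


def objective(config, penalty=0.001):
    # Combined objective: prediction error plus a complexity penalty.
    return toy_validation_error(config) + penalty * complexity(config)


def random_neighbor(config, rng):
    """Perturb one randomly chosen hyper-parameter by a single step."""
    neighbor = dict(config)
    if rng.choice(["layers", "width"]) == "layers":
        neighbor["layers"] = max(1, neighbor["layers"] + rng.choice([-1, 1]))
    else:
        neighbor["width"] = max(8, neighbor["width"] + rng.choice([-8, 8]))
    return neighbor


def randomized_local_search(start, steps=500, seed=0):
    rng = random.Random(seed)
    best, best_score = dict(start), objective(start)
    for _ in range(steps):
        candidate = random_neighbor(best, rng)
        score = objective(candidate)
        if score < best_score:  # accept only strict improvements
            best, best_score = candidate, score
    return best, best_score


start = {"layers": 8, "width": 256}
best, best_score = randomized_local_search(start)
```

In practice the error term would come from training and validating a network for a given route, which is far more expensive than the toy function here, so the number of search steps would be chosen with that cost in mind.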

