- Award ID(s):
- 1704845
- NSF-PAR ID:
- 10095676
- Date Published:
- Journal Name:
- International Conference on Learning Representations (ICLR)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Fast inference of numerical model parameters from data is an important prerequisite to generate predictive models for a wide range of applications. Use of sampling-based approaches such as Markov chain Monte Carlo may become intractable when each likelihood evaluation is computationally expensive. New approaches combining variational inference with normalizing flow are characterized by a computational cost that grows only linearly with the dimensionality of the latent variable space, and rely on gradient-based optimization instead of sampling, providing a more efficient approach for Bayesian inference about the model parameters. Moreover, the cost of frequently evaluating an expensive likelihood can be mitigated by replacing the true model with an offline trained surrogate model, such as neural networks. However, this approach might generate significant bias when the surrogate is insufficiently accurate around the posterior modes. To reduce the computational cost without sacrificing inferential accuracy, we propose Normalizing Flow with Adaptive Surrogate (NoFAS), an optimization strategy that alternatively updates the normalizing flow parameters and surrogate model parameters. We also propose an efficient sample weighting scheme for surrogate model training that preserves global accuracy while effectively capturing high posterior density regions. We demonstrate the inferential and computational superiority of NoFAS against various benchmarks, including cases where the underlying model lacks identifiability. The source code and numerical experiments used for this study are available at https://github.com/cedricwangyu/NoFAS.more » « less
-
This paper describes a generalizable framework for creating context-aware wall-time prediction models for HPC applications. This framework: (a) cost-effectively generates comprehensive application-specific training data, (b) provides an application-independent machine learning pipeline that trains different regression models over the training datasets, and (c) establishes context-aware selection criteria for model selection. We explain how most of the training data can be generated on commodity or contention-free cyberinfrastructure and how the predictive models can be scaled to the production environment with the help of a limited number of resource-intensive generated runs (we show almost seven-fold cost reductions along with better performance). Our machine learning pipeline does feature transformation, and dimensionality reduction, then reduces sampling bias induced by data imbalance. Our context-aware model selection algorithm chooses the most appropriate regression model for a given target application that reduces the number of underpredictions while minimizing overestimation errors. Index Terms—AI4CI, Data Science Workflow, Custom ML Models, HPC, Data Generation, Scheduling, Resource Estimationsmore » « less
-
Abstract To date, many AI initiatives (eg, AI4K12, CS for All) developed standards and frameworks as guidance for educators to create accessible and engaging Artificial Intelligence (AI) learning experiences for K‐12 students. These efforts revealed a significant need to prepare youth to gain a fundamental understanding of how intelligence is created, applied, and its potential to perpetuate bias and unfairness. This study contributes to the growing interest in K‐12 AI education by examining student learning of modelling real‐world text data. Four students from an Advanced Placement computer science classroom at a public high school participated in this study. Our qualitative analysis reveals that the students developed nuanced and in‐depth understandings of how text classification models—a type of AI application—are trained. Specifically, we found that in modelling texts, students: (1) drew on their social experiences and cultural knowledge to create predictive features, (2) engineered predictive features to address model errors, (3) described model learning patterns from training data and (4) reasoned about noisy features when comparing models. This study contributes to an initial understanding of student learning of modelling unstructured data and offers implications for scaffolding in‐depth reasoning about model decision making.
Practitioner notes What is already known about this topic
Scholarly attention has turned to examining Artificial Intelligence (AI) literacy in K‐12 to help students understand the working mechanism of AI technologies and critically evaluate automated decisions made by computer models.
While efforts have been made to engage students in understanding AI through building machine learning models with data, few of them go in‐depth into teaching and learning of feature engineering, a critical concept in modelling data.
There is a need for research to examine students' data modelling processes, particularly in the little‐researched realm of unstructured data.
What this paper adds
Results show that students developed nuanced understandings of models learning patterns in data for automated decision making.
Results demonstrate that students drew on prior experience and knowledge in creating features from unstructured data in the learning task of building text classification models.
Students needed support in performing feature engineering practices, reasoning about noisy features and exploring features in rich social contexts that the data set is situated in.
Implications for practice and/or policy
It is important for schools to provide hands‐on model building experiences for students to understand and evaluate automated decisions from AI technologies.
Students should be empowered to draw on their cultural and social backgrounds as they create models and evaluate data sources.
To extend this work, educators should consider opportunities to integrate AI learning in other disciplinary subjects (ie, outside of computer science classes).
-
Abstract Motivation Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed and produce functional proteins.
Results We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and non-coding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or non-coding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products and we propose that they may commonly act as cryptic factors in disease.
Availability and implementation The software is available from geneprediction.org/SGRF.
Supplementary information Supplementary information is available at Bioinformatics online.
-
A key challenge facing the use of machine learning (ML) in organizational selection settings (e.g., the processing of loan or job applications) is the potential bias against (racial and gender) minorities. To address this challenge, a rich literature of Fairness-Aware ML (FAML) algorithms has emerged, attempting to ameliorate biases while maintaining the predictive accuracy of ML algorithms. Almost all existing FAML algorithms define their optimization goals according to a selection task, meaning that ML outputs are assumed to be the final selection outcome. In practice, though, ML outputs are rarely used as-is. In personnel selection, for example, ML often serves a support role to human resource managers, allowing them to more easily exclude unqualified applicants. This effectively assigns to ML a screening rather than a selection task. It might be tempting to treat selection and screening as two variations of the same task that differ only quantitatively on the admission rate. This paper, however, reveals a qualitative difference between the two in terms of fairness. Specifically, we demonstrate through conceptual development and mathematical analysis that miscategorizing a screening task as a selection one could not only degrade final selection quality but also result in fairness problems such as selection biases within the minority group. After validating our findings with experimental studies on simulated and real-world data, we discuss several business and policy implications, highlighting the need for firms and policymakers to properly categorize the task assigned to ML in assessing and correcting algorithmic biases.