Title: Multiscale Modeling Meets Machine Learning: What Can We Learn?
Machine learning is increasingly recognized as a promising technology in the biological, biomedical, and behavioral sciences. There can be no argument that this technique is incredibly successful in image recognition with immediate applications in diagnostics including electrophysiology, radiology, or pathology, where we have access to massive amounts of annotated data. However, machine learning often performs poorly in prognosis, especially when dealing with sparse data. This is a field where classical physics-based simulation seems to remain irreplaceable. In this review, we identify areas in the biomedical sciences where machine learning and multiscale modeling can mutually benefit from one another: Machine learning can integrate physics-based knowledge in the form of governing equations, boundary conditions, or constraints to manage ill-posed problems and robustly handle sparse and noisy data; multiscale modeling can integrate machine learning to create surrogate models, identify system dynamics and parameters, analyze sensitivities, and quantify uncertainty to bridge the scales and understand the emergence of function. With a view towards applications in the life sciences, we discuss the state of the art of combining machine learning and multiscale modeling, identify applications and opportunities, raise open questions, and address potential challenges and limitations. This review serves as an introduction to a special issue on Uncertainty Quantification, Machine Learning, and Data-Driven Modeling of Biological Systems that will help identify current roadblocks and areas where computational mechanics, as a discipline, can play a significant role. We anticipate that it will stimulate discussion within the community of computational mechanics and reach out to other disciplines including mathematics, statistics, computer science, artificial intelligence, biomedicine, systems biology, and precision medicine to join forces towards creating robust and efficient models for biological systems.
Award ID(s):
1762063 1904444
NSF-PAR ID:
10157400
Journal Name:
Archives of Computational Methods in Engineering
ISSN:
1134-3060
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Density functional theory (DFT) has been applied to modeling molecular interactions in water for over three decades. The ubiquity of water in chemical and biological processes demands a unified understanding of its physics, from the single molecule to the thermodynamic limit and everything in between. Recent advances in the development of data-driven and machine-learning potentials have accelerated simulation of water and aqueous systems with DFT accuracy. However, anomalous properties of water in the condensed phase, where a rigorous treatment of both local and non-local many-body (MB) interactions is in order, are often unsatisfactory or partially missing in DFT models of water. In this review, we discuss the modeling of water and aqueous systems based on DFT and provide a comprehensive description of a general theoretical/computational framework for the development of data-driven many-body potentials from DFT reference data. This framework, coined MB-DFT, readily enables efficient many-body molecular dynamics (MD) simulations of small molecules, in both gas and condensed phases, while preserving the accuracy of the underlying DFT model. Theoretical considerations are emphasized, including the role that the delocalization error plays in MB-DFT potentials of water and the possibility to elevate DFT and MB-DFT to near-chemical-accuracy through a density-corrected formalism. The development of the MB-DFT framework is described in detail, along with its application in MB-MD simulations and recent extension to the modeling of reactive processes in solution within a quantum mechanics/MB molecular mechanics (QM/MB-MM) scheme, using water as a prototypical solvent. Finally, we identify open challenges and discuss future directions for MB-DFT and QM/MB-MM simulations in condensed phases. 
  2. In the past few decades, we have witnessed tremendous advancements in biology, life sciences and healthcare. These advancements are due in no small part to the big data made available by various high-throughput technologies, the ever-advancing computing power, and the algorithmic advancements in machine learning. Specifically, big data analytics such as statistical and machine learning has become an essential tool in these rapidly developing fields. As a result, the subject has drawn increased attention and many review papers have been published on it in just the past few years. Different from all existing reviews, this work focuses on the application of systems engineering principles and techniques in addressing some of the common challenges in big data analytics for biological, biomedical and healthcare applications. Specifically, this review focuses on the following three key areas in biological big data analytics where systems engineering principles and techniques have been playing important roles: the principle of parsimony in addressing overfitting, the dynamic analysis of biological data, and the role of domain knowledge in biological data analytics.
  3. Quantitative prediction of natural and induced phenomena in fractured rock is one of the great challenges in the Earth and Energy Sciences with far‐reaching economic and environmental impacts. Fractures occupy a very small volume of a subsurface formation but often dominate fluid flow, solute transport and mechanical deformation behavior. They play a central role in CO2 sequestration, nuclear waste disposal, hydrogen storage, geothermal energy production, nuclear nonproliferation, and hydrocarbon extraction. These applications require predictions of fracture‐dependent quantities of interest such as CO2 leakage rate, hydrocarbon production, radionuclide plume migration, and seismicity; to be useful, these predictions must account for uncertainty inherent in subsurface systems. Here, we review recent advances in fractured rock research covering field‐ and laboratory‐scale experimentation, numerical simulations, and uncertainty quantification. We discuss how these have greatly improved the fundamental understanding of fractures and one's ability to predict flow and transport in fractured systems. Dedicated field sites provide quantitative measurements of fracture flow that can be used to identify dominant coupled processes and to validate models. Laboratory‐scale experiments fill critical knowledge gaps by providing direct observations and measurements of fracture geometry and flow under controlled conditions that cannot be obtained in the field. Physics‐based simulations of flow and transport provide a bridge in understanding between controlled simple laboratory experiments and the massively complex field‐scale fracture systems. Finally, we review the use of machine learning‐based emulators to rapidly investigate different fracture property scenarios and accelerate physics‐based models by orders of magnitude to enable uncertainty quantification and near real‐time analysis.
  4. This Work-in-Progress paper in the Research Category uses a retrospective mixed-methods study to better understand the factors that mediate learning of computational modeling by life scientists. Key stakeholders, including leading scientists, universities and funding agencies, have promoted computational modeling to enable life sciences research and improve the translation of genetic and molecular biology high-throughput data into clinical results. Software platforms to facilitate computational modeling by biologists who lack advanced mathematical or programming skills have had some success, but none has achieved widespread use among life scientists. Because computational modeling is a core engineering skill of value to other STEM fields, it is critical for engineering and computer science educators to consider how we help students from across STEM disciplines learn computational modeling. Currently we lack sufficient research on how best to help life scientists learn computational modeling. To address this gap, in 2017, we observed a short-format summer course designed for life scientists to learn computational modeling. The course used a simulation environment designed to lower programming barriers. We used semi-structured interviews to understand students' experiences while taking the course and in applying computational modeling after the course. We conducted interviews with graduate students and post-doctoral researchers who had completed the course. We also interviewed students who took the course between 2010 and 2013. Among these past attendees, we selected equal numbers of interview subjects who had and had not successfully published journal articles that incorporated computational modeling. This Work-in-Progress paper applies social cognitive theory to analyze the motivations of life scientists who seek training in computational modeling and their attitudes towards computational modeling. Additionally, we identify important social and environmental variables that influence successful application of computational modeling after course completion. The findings from this study may therefore help us educate biomedical and biological engineering students more effectively. Although this study focuses on life scientists, its findings can inform engineering and computer science education more broadly. Insights from this study may be especially useful in aiding incoming engineering and computer science students who do not have advanced mathematical or programming skills and in preparing undergraduate engineering students for collaborative work with life scientists.
  5. The current availability of soil moisture data over large areas comes from satellite remote sensing technologies (i.e., radar-based systems), but these data have coarse resolution and often exhibit large spatial information gaps. Where data are too coarse or sparse for a given need (e.g., precision farming), one can leverage machine-learning techniques coupled with other sources of environmental information (e.g., topography) to generate gap-free information at a finer spatial resolution (i.e., increased granularity). To this end, we develop a spatial inference engine consisting of modular stages for processing spatial environmental data, generating predictions with machine-learning techniques, and analyzing these predictions. We demonstrate the functionality of this approach and the effects of data processing choices via multiple prediction maps over a United States ecological region with a highly diverse soil moisture profile (i.e., the Middle Atlantic Coastal Plains). The relevance of our work derives from a pressing need to improve the spatial representation of soil moisture for applications in environmental sciences (e.g., ecological niche modeling, carbon monitoring systems, and other Earth system models) and precision farming (e.g., optimizing irrigation practices and other land management decisions). 