Generative AI is generating much enthusiasm about its potential to advance biological design in computational biology. In this paper we take a somewhat contrarian view, arguing that a broader and deeper understanding of existing biological sequences is essential before undertaking the design of novel ones. We draw attention, for instance, to protein function prediction methods, which currently face significant limitations due to incomplete data and inherent challenges in defining and measuring function. We propose a “blue sky” vision centered on both comprehensive and precise annotation of existing protein and DNA sequences, aiming to develop a more complete and precise understanding of biological function. By contrasting recent studies that leverage generative AI for biological design with the pressing need for enhanced data annotation, we underscore the importance of prioritizing robust predictive models over premature generative efforts. We advocate for a strategic shift toward thorough sequence annotation and predictive understanding, laying a solid foundation for future advances in biological design.
Artificial Intelligence for Biology
Abstract: Despite efforts to integrate research across different subdisciplines of biology, the scale of integration remains limited. We hypothesize that future generations of Artificial Intelligence (AI) technologies specifically adapted for biological sciences will help enable the reintegration of biology. AI technologies will allow us not only to collect, connect and analyze data at unprecedented scales, but also to build comprehensive predictive models that span various subdisciplines. They will make possible both targeted (testing specific hypotheses) and untargeted discoveries. AI for biology will be the cross-cutting technology that will enhance our ability to do biological research at every scale. We expect AI to revolutionize biology in the 21st century much like statistics transformed biology in the 20th century. The difficulties, however, are many, including data curation and assembly, development of new science in the form of theories that connect the subdisciplines, and new predictive and interpretable AI models that are more suited to biology than existing machine learning and AI techniques. Development efforts will require strong collaborations between biological and computational scientists. This white paper provides a vision for AI for Biology and highlights some challenges.
- PAR ID: 10293333
- Date Published:
- Journal Name: Integrative and Comparative Biology
- ISSN: 1540-7063
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- It’s critical to foster artificial intelligence (AI) literacy for high school students, the first generation to grow up surrounded by AI, so that they understand the working mechanisms of data-driven AI technologies and can critically evaluate automated decisions from predictive models. While efforts have been made to engage youth in understanding AI through developing machine learning models, few have provided in-depth insights into the nuanced learning processes. In this study, we examined high school students’ data modeling practices and processes. Twenty-eight students developed machine learning models with text data for classifying negative and positive reviews of ice cream stores. We identified nine data modeling practices that describe students’ processes of model exploration, development, and testing and two themes about evaluating automated decisions from data technologies. The results provide implications for designing accessible data modeling experiences for students to understand data justice as well as the role and responsibility of data modelers in creating AI technologies.
- Byers, Karen B; Johnson, Barbara (Ed.) Introduction: Rapid advances in biotechnologies and transdisciplinary research are enhancing the ability to perform full-scale engineering of biology, contributing to worldwide efforts to create bioengineered plants, medicines, and commodities, which promise sustainability and innovative properties. Objective: This rapidly evolving biotechnology landscape is prompting focused scrutiny on biosecurity frameworks in place to mitigate harmful exploitation of biotechnology by state and non-state actors. Concerns about biosafety and biosecurity of engineering biology research have existed for decades as views about how advances in this and associated fields might provide new capabilities to malicious actors. This article considers biosecurity concerns using examples of research advances in engineering biology. Methods: The authors explore risk assessment and mitigation of transdisciplinary biotechnology research and development, using the framework developed in the National Academies' study on Biodefense in an Age of Synthetic Biology. Results: The Synthetic Biology Assessment Framework focuses on risks of using advanced approaches and technologies to enhance or create novel pathogens and toxins. The field of engineering biology continues to advance at a pace that challenges current risk assessment frameworks. Conclusions: This framework likely is sufficient to assess new science and technology advances affecting conventional biological agents. However, the risk assessment framework may have limited applicability for technologies that are not usable with conventional biological agents and result in economic or broader national security concerns. Finally, the vast majority of discourse has been focused primarily on risks rather than benefits, and analyzing both in future evaluations is critical to balancing scientific progress with risk reduction.
- Synopsis: Mechanistically connecting genotypes to phenotypes is a longstanding and central mission of biology. Deciphering these connections will unite questions and datasets across all scales from molecules to ecosystems. Although high-throughput sequencing has provided a rich platform on which to launch this effort, tools for deciphering mechanisms further along the genome to phenome pipeline remain limited. Machine learning approaches and other emerging computational tools hold the promise of augmenting human efforts to overcome these obstacles. This vision paper is the result of a Reintegrating Biology Workshop, bringing together the perspectives of integrative and comparative biologists to survey challenges and opportunities in cracking the genotype to phenotype code and thereby generating predictive frameworks across biological scales. Key recommendations include promoting the development of minimum “best practices” for the experimental design and collection of data; fostering sustained and long-term data repositories; promoting programs that recruit, train, and retain a diversity of talent; and providing funding to effectively support these highly cross-disciplinary efforts. We follow this discussion by highlighting a few specific transformative research opportunities that will be advanced by these efforts.
- Cells interact as dynamically evolving ecosystems. While recent single-cell and spatial multi-omics technologies quantify individual cell characteristics, predicting their evolution requires mathematical modeling. We propose a conceptual framework, a cell behavior hypothesis grammar, that uses natural language statements (cell rules) to create mathematical models. This enables systematic integration of biological knowledge and multi-omics data to generate in silico models, enabling virtual “thought experiments” that test and expand our understanding of multicellular systems and generate new testable hypotheses. This paper motivates and describes the grammar, offers a reference implementation, and demonstrates its use in developing both de novo mechanistic models and those informed by multi-omics data. We show its potential through examples in cancer and its broader applicability in simulating brain development. This approach bridges biological, clinical, and systems biology research for mathematical modeling at scale, allowing the community to predict emergent multicellular behavior.