Digital pathology has transformed the traditional practice of analyzing tissue under a microscope into a computer vision workflow. Whole-slide imaging allows pathologists to view and analyze microscopic images on a computer monitor, enabling computational pathology. By leveraging artificial intelligence (AI) and machine learning (ML), computational pathology has emerged as a promising field in recent years. Recently, task-specific AI/ML (eg, convolutional neural networks) has risen to the forefront, achieving above-human performance in many image-processing and computer vision tasks. The performance of task-specific AI/ML models depends on the availability of large annotated training datasets, which is a rate-limiting factor for AI/ML development in pathology. Task-specific AI/ML models cannot benefit from multimodal data and lack generalization; eg, the models often struggle to generalize to new datasets or unseen variations in image acquisition, staining techniques, or tissue types. The 2020s are witnessing the rise of foundation models and generative AI. A foundation model is a large AI model trained on sizable data that is later adapted (or fine-tuned) to perform different tasks using a modest amount of task-specific annotated data. These models provide in-context learning, can self-correct mistakes, and promptly adjust to user feedback. In this review, we provide a brief overview of recent advances in computational pathology enabled by task-specific AI, discuss their challenges and limitations, and then introduce various foundation models. We propose to create a pathology-specific generative AI based on multimodal foundation models and present its potentially transformative role in digital pathology. We describe different use cases, delineating how it could serve as an expert companion for pathologists and help them efficiently and objectively perform routine laboratory tasks, including quantitative image analysis, generating pathology reports, diagnosis, and prognosis. We also outline the potential role that foundation models and generative AI can play in standardizing the pathology laboratory workflow, education, and training.
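As a concrete, hedged illustration of the fine-tuning step described above, the sketch below adapts a generic ImageNet-pretrained encoder (standing in for a pathology foundation model) to a small annotated tile-classification task with PyTorch. The `tiles/` directory, class layout, and hyperparameters are hypothetical placeholders; the review itself does not prescribe this recipe.

```python
# Minimal sketch: adapting a generic pretrained image encoder to a small,
# task-specific pathology dataset (e.g., tumor vs. normal tiles).
# ResNet-50/ImageNet stands in for a pathology foundation model; the
# directory "tiles/" with one sub-folder per class is hypothetical.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models

weights = models.ResNet50_Weights.DEFAULT
preprocess = weights.transforms()                      # preprocessing used at pre-training time
dataset = datasets.ImageFolder("tiles/", transform=preprocess)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

model = models.resnet50(weights=weights)
for p in model.parameters():                           # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(dataset.classes))  # new task head

optimizer = torch.optim.AdamW(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

Freezing the backbone and training only the new head is what lets a modest amount of task-specific annotated data suffice in this kind of adaptation.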
A definition and taxonomy of digital twins: case studies with machine learning and scientific applications
As next-generation scientific instruments and simulations generate ever larger datasets, there is a growing need for high-performance computing (HPC) techniques that can provide timely and accurate analysis. With artificial intelligence (AI) and hardware breakthroughs at the forefront in recent years, interest has increased in using these technologies to perform decision-making tasks on continuously evolving real-world datasets. Digital twinning is one method in which virtual replicas of real-world objects are modeled, updated, and interpreted to perform such tasks. However, the interface between AI techniques, digital twins (DTs), and HPC technologies has yet to be thoroughly investigated despite the natural synergies among them. This paper explores the interface between digital twins, scientific computing, and machine learning (ML) by presenting a consistent definition of the digital twin, performing a systematic analysis of the literature to build a taxonomy of ML-enhanced digital twins, and discussing case studies from various scientific domains. We identify several promising future research directions, including hybrid assimilation frameworks and physics-informed techniques for improved accuracy. Through this comprehensive analysis, we aim to highlight both the current state of the art and critical paths forward in this rapidly evolving field.
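The "model, update, interpret" loop that defines a digital twin can be made concrete with a minimal, hedged sketch: a scalar state is advanced by an assumed linear model and corrected with streaming sensor data via a Kalman-style assimilation step. The dynamics, noise levels, and alert rule below are illustrative placeholders, not a framework taken from the surveyed literature.

```python
# Toy digital-twin loop: model the virtual state, update it with sensor data,
# and interpret the corrected state for a decision-making task.
import numpy as np

rng = np.random.default_rng(0)

a = 0.98            # assumed linear dynamics of the physical asset
q, r = 1e-3, 1e-2   # assumed process / measurement noise variances

x_twin, p_twin = 0.0, 1.0   # twin's state estimate and its uncertainty
x_true = 1.0                # hidden "real-world" state, for the demo only

for step in range(100):
    # Real world evolves and is observed by a (noisy) sensor.
    x_true = a * x_true + rng.normal(0.0, np.sqrt(q))
    z = x_true + rng.normal(0.0, np.sqrt(r))

    # 1) Model: advance the virtual replica with the assumed dynamics.
    x_twin, p_twin = a * x_twin, a * a * p_twin + q

    # 2) Update: assimilate the measurement (Kalman gain weighs model vs. data).
    k = p_twin / (p_twin + r)
    x_twin, p_twin = x_twin + k * (z - x_twin), (1.0 - k) * p_twin

    # 3) Interpret: use the corrected state for a decision-making task.
    if abs(x_twin) > 2.0:
        print(f"step {step}: twin predicts out-of-range state {x_twin:.3f}")
```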
- Award ID(s): 2321123
- PAR ID: 10632565
- Publisher / Repository: Frontiers in High Performance Computing
- Date Published:
- Journal Name: Frontiers in High Performance Computing
- Volume: 3
- ISSN: 2813-7337
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- High-Performance Computing (HPC) is increasingly being used in traditional scientific domains as well as emerging areas like Deep Learning (DL). This has led to a diverse set of professionals who interact with state-of-the-art HPC systems. The deployment of science gateways for HPC systems, such as Open OnDemand, has a significant positive impact on these users as they migrate their workflows to HPC systems. Although computing capabilities are ubiquitously available (as on-premises or cloud HPC infrastructure), significant effort and expertise are required to use them effectively. This is particularly challenging for domain scientists and other users whose primary expertise lies outside of computer science. In this paper, we seek to minimize the steep learning curve and associated complexities of using state-of-the-art high-performance systems by creating SAI: an AI-Enabled Speech Assistant Interface for Science Gateways in High Performance Computing. We use state-of-the-art AI models for speech and text and fine-tune them for the HPC arena by retraining them on a new HPC dataset we create. We use ontologies and knowledge graphs to capture the complex relationships between various components of the HPC ecosystem. We finally show how one can integrate and deploy SAI in Open OnDemand and evaluate its functionality and performance on real HPC systems. To the best of our knowledge, this is the first effort aimed at designing and developing an AI-powered speech-assisted interface for science gateways in HPC. (See the intent-resolution sketch after this list.)
- Material characterization techniques are widely used to characterize the physical and chemical properties of materials at the nanoscale and, thus, play central roles in material scientific discoveries. However, the large and complex datasets generated by these techniques often require significant human effort to interpret and extract meaningful physicochemical insights. Artificial intelligence (AI) techniques such as machine learning (ML) have the potential to improve the efficiency and accuracy of surface analysis by automating data analysis and interpretation. In this perspective paper, we review the current role of AI in surface analysis and discuss its future potential to accelerate discoveries in surface science, materials science, and interface science. We highlight several applications where AI has already been used to analyze surface analysis data, including the identification of crystal structures from XRD data, the analysis of XPS spectra for surface composition, and the interpretation of TEM and SEM images for particle morphology and size. We also discuss the challenges and opportunities associated with the integration of AI into surface analysis workflows. These include the need for large and diverse datasets for training ML models, the importance of feature selection and representation, and the potential for ML to enable new insights and discoveries by identifying patterns and relationships in complex datasets. Most importantly, AI-driven analysis must not just find the best mathematical description of the data; it must find the most physically and chemically meaningful results. In addition, the need for reproducibility in scientific research has become increasingly important in recent years. The advancement of AI, including both conventional ML and the increasingly popular deep learning, shows promise in addressing these challenges by supporting the execution and verification of scientific analyses. By training models on large experimental datasets and providing automated analysis and data interpretation, AI can help ensure that scientific results are reproducible and reliable. Although the integration of domain knowledge with AI models must be considered to ensure transparency and interpretability, incorporating AI into the data collection and processing workflow will significantly enhance the efficiency and accuracy of various surface analysis techniques and deepen our understanding at an accelerated pace.
- Error-bounded lossy compression has been a critical technique for significantly reducing the sheer volume of simulation datasets produced by high-performance computing (HPC) scientific applications while effectively controlling data distortion based on a user-specified error bound. In many real-world use cases, users must perform computational operations on the compressed data. However, none of the existing error-bounded lossy compressors support such operations, inevitably resulting in undesired decompression costs. In this paper, we propose a novel error-bounded lossy compressor (called SZOps) that supports not only error-bounding features but also efficient computations (including negation, scalar addition, scalar multiplication, mean, variance, etc.) on the compressed data without a complete decompression step, which is the first attempt to the best of our knowledge. We develop several optimization strategies to maximize the overall compression ratio and execution performance. We evaluate SZOps against other state-of-the-art lossy compressors on multiple real-world scientific application datasets. (See the toy compressed-operations sketch after this list.)
- As the size and complexity of high-performance computing (HPC) systems keep growing, scientists' ability to trust the data produced is paramount, because data corruption can occur for various reasons and may stay undetected. While employing machine learning-based anomaly detection techniques could relieve scientists of this concern, it is practically infeasible due to the need for labels on large volumes of scientific data and the unwanted extra overhead involved. In this paper, we exploit the spatial sparsity profiles exhibited by scientific datasets and propose an approach to detect anomalies effectively. Our method first extracts block-level sparse representations of the original datasets in a transformed domain. It then learns from the extracted sparse representations and builds a boundary threshold between normal and abnormal without relying on labeled data. Experiments using real-world scientific datasets show that the proposed approach requires 13% of the entire dataset on average (less than 10% in most cases and as low as 0.3%) to achieve competitive detection accuracy (70.74%-100.0%) compared to two state-of-the-art unsupervised techniques. (See the sparsity-based detection sketch after this list.)
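For the SAI speech-assistant entry above, the following is a hedged, self-contained sketch of the general idea of mapping a transcribed utterance to an HPC action through a small knowledge graph. The concepts, keywords, and commands are illustrative placeholders rather than SAI's actual ontology, models, or dataset, and the transcript is hard-coded where a speech model would normally supply it.

```python
# Toy intent resolution: match a transcribed utterance against a tiny
# hand-built "knowledge graph" of HPC concepts to pick a command.
from dataclasses import dataclass

# Illustrative concepts linked to the commands that realize them (not SAI's ontology).
KG = {
    "job status": {"keywords": {"status", "queue", "running", "jobs"}, "command": "squeue -u $USER"},
    "submit job": {"keywords": {"submit", "launch", "run"},            "command": "sbatch job.slurm"},
    "disk usage": {"keywords": {"disk", "storage", "quota"},           "command": "quota -s"},
}

@dataclass
class Resolution:
    intent: str
    command: str
    score: int

def resolve(transcript: str) -> Resolution:
    """Pick the concept whose keywords best overlap the transcribed words."""
    words = set(transcript.lower().split())
    intent, entry = max(KG.items(), key=lambda kv: len(kv[1]["keywords"] & words))
    return Resolution(intent, entry["command"], len(entry["keywords"] & words))

# In a real gateway the transcript would come from a speech model; hard-coded here.
print(resolve("how many of my jobs are still running in the queue"))
```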
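For the SZOps entry above, the toy compressor below illustrates, under simplified assumptions, why some operations need no decompression: it stores per-block means plus residuals quantized to an error bound, together with a tiny affine header, so negation, scalar addition, and scalar multiplication become header-only updates, and the global mean is answered from per-block summaries. This is not SZOps' actual format or algorithm; the block size, error bound, and data are placeholders.

```python
import numpy as np

class ToyCompressed:
    """Block-wise toy compressor: per-block mean + residuals quantized to an
    error bound, plus a tiny affine header (offset, scale) for cheap updates."""

    def __init__(self, data: np.ndarray, error_bound: float, block: int = 256):
        assert data.size % block == 0, "toy sketch: length must be a block multiple"
        self.eb, self.offset, self.scale = error_bound, 0.0, 1.0
        x = data.astype(float).reshape(-1, block)
        self.means = x.mean(axis=1)                      # per-block means
        self.codes = np.round((x - self.means[:, None]) / (2 * error_bound)).astype(np.int32)
        self.code_sums = self.codes.sum(axis=1)          # small summary kept for mean queries

    # Operations on compressed data: header-only, never touch `codes`.
    def negate(self):
        self.offset, self.scale = -self.offset, -self.scale

    def add_scalar(self, c: float):
        self.offset += c

    def mul_scalar(self, c: float):
        self.offset *= c
        self.scale *= c                                  # effective error bound scales by |c|

    def mean(self) -> float:
        block = self.codes.shape[1]
        raw = self.means.sum() * block + 2 * self.eb * self.code_sums.sum()
        return self.offset + self.scale * raw / self.codes.size

    def decompress(self) -> np.ndarray:
        x = self.means[:, None] + 2 * self.eb * self.codes
        return (self.offset + self.scale * x).ravel()

data = np.random.default_rng(1).normal(size=10_240)
c = ToyCompressed(data, error_bound=1e-3)
c.add_scalar(5.0)
c.mul_scalar(2.0)
print(c.mean(), ((data + 5.0) * 2.0).mean())                 # agree within the scaled bound
print(np.max(np.abs(c.decompress() - (data + 5.0) * 2.0)))   # <= 2 * 1e-3
```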
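For the sparsity-based anomaly-detection entry above, the following sketch shows one plausible instantiation of the idea: score each block by the sparsity of its transform-domain representation, derive an unsupervised boundary from a small sample, and flag outliers. The transform (FFT), scoring rule, synthetic data, and 3-sigma boundary are assumptions for illustration, not the paper's exact method.

```python
import numpy as np

def sparsity_score(block: np.ndarray) -> float:
    """Fraction of spectral energy carried by the largest 10% of coefficients."""
    energy = np.abs(np.fft.rfft(block)) ** 2
    k = max(1, energy.size // 10)
    return float(np.sort(energy)[-k:].sum() / (energy.sum() + 1e-12))

rng = np.random.default_rng(2)
block_size, n_blocks = 512, 200
t = np.linspace(0.0, 8 * np.pi, block_size)

# Smooth "normal" blocks are sparse in the frequency domain ...
blocks = [np.sin(rng.uniform(1.0, 4.0) * t) + 0.05 * rng.normal(size=block_size)
          for _ in range(n_blocks)]
# ... while a few corrupted blocks (silent corruption modeled as noise) are not.
corrupted = rng.choice(n_blocks, size=5, replace=False)
for i in corrupted:
    blocks[i] = rng.normal(size=block_size)

scores = np.array([sparsity_score(b) for b in blocks])

# Build the decision boundary from a small unlabeled sample (13% here, echoing
# the average sample size reported in the abstract).
sample = scores[rng.choice(n_blocks, size=int(0.13 * n_blocks), replace=False)]
threshold = sample.mean() - 3.0 * sample.std()

flagged = np.where(scores < threshold)[0]
print("corrupted:", sorted(int(i) for i in corrupted),
      "flagged:", sorted(int(i) for i in flagged))
```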