Abstract Optimal transport (OT) methods seek a transformation map (or plan) between two probability measures, such that the transformation has the minimum transportation cost. Such a minimum transport cost, with a certain power transform, is called the Wasserstein distance. Recently, OT methods have drawn great attention in statistics, machine learning, and computer science, especially in deep generative neural networks. Despite its broad applications, the estimation of high‐dimensional Wasserstein distances is a well‐known challenging problem owing to the curse‐of‐dimensionality. There are some cutting‐edge projection‐based techniques that tackle high‐dimensional OT problems. Three major approaches of such techniques are introduced, respectively, the slicing approach, the iterative projection approach, and the projection robust OT approach. Open challenges are discussed at the end of the review. This article is categorized under:Statistical and Graphical Methods of Data Analysis > Dimension ReductionStatistical Learning and Exploratory Methods of the Data Sciences > Manifold Learning 
                        more » 
                        « less   
                    
                            
                            Applications of Statistical Methods and Machine Learning in the Space Sciences
                        
                    
    
            Applications and methods of artificial intelligence (AI) are becoming powerful tools for scientific investigations in the space sciences, particularly in the analysis and interpretation of the plethora of spacecraft and ground-based observations of the near-Earth and faraway space. Applications of Statistical Methods and Machine Learning in the Space Sciences was a virtual conference held during 17-21 May 2021 to bring together experts in AI and in the various subfields of space sciences to further explore the utility of AI, machine learning, and statistical analysis techniques while sharing the current status of applications in these fields. The conference concluded by emphasizing the scope of AI techniques available to the space science community for addressing outstanding problems with great success as revealed in the number of research works presented. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 2114219
- PAR ID:
- 10327680
- Date Published:
- Journal Name:
- Journal of geophysical research
- ISSN:
- 2169-9402
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Abstract Machine learning (ML) provides a powerful framework for the analysis of high‐dimensional datasets by modelling complex relationships, often encountered in modern data with many variables, cases and potentially non‐linear effects. The impact of ML methods on research and practical applications in the educational sciences is still limited, but continuously grows, as larger and more complex datasets become available through massive open online courses (MOOCs) and large‐scale investigations. The educational sciences are at a crucial pivot point, because of the anticipated impact ML methods hold for the field. To provide educational researchers with an elaborate introduction to the topic, we provide an instructional summary of the opportunities and challenges of ML for the educational sciences, show how a look at related disciplines can help learning from their experiences, and argue for a philosophical shift in model evaluation. We demonstrate how the overall quality of data analysis in educational research can benefit from these methods and show how ML can play a decisive role in the validation of empirical models. Specifically, we (1) provide an overview of the types of data suitable for ML and (2) give practical advice for the application of ML methods. In each section, we provide analytical examples and reproducible R code. Also, we provide an extensive Appendix on ML‐based applications for education. This instructional summary will help educational scientists and practitioners to prepare for the promises and threats that come with the shift towards digitisation and large‐scale assessment in education. Context and implicationsRationale for this studyIn 2020, the worldwide SARS‐COV‐2 pandemic forced the educational sciences to perform a rapid paradigm shift with classrooms going online around the world—a hardly novel but now strongly catalysed development. In the context of data‐driven education, this paper demonstrates that the widespread adoption of machine learning techniques is central for the educational sciences and shows how these methods will become crucial tools in the collection and analysis of data and in concrete educational applications. Helping to leverage the opportunities and to avoid the common pitfalls of machine learning, this paper provides educators with the theoretical, conceptual and practical essentials.Why the new findings matterThe process of teaching and learning is complex, multifaceted and dynamic. This paper contributes a seminal resource to highlight the digitisation of the educational sciences by demonstrating how new machine learning methods can be effectively and reliably used in research, education and practical application.Implications for educational researchers and policy makersThe progressing digitisation of societies around the globe and the impact of the SARS‐COV‐2 pandemic have highlighted the vulnerabilities and shortcomings of educational systems. These developments have shown the necessity to provide effective educational processes that can support sometimes overwhelmed teachers to digitally impart knowledge on the plan of many governments and policy makers. Educational scientists, corporate partners and stakeholders can make use of machine learning techniques to develop advanced, scalable educational processes that account for individual needs of learners and that can complement and support existing learning infrastructure. The proper use of machine learning methods can contribute essential applications to the educational sciences, such as (semi‐)automated assessments, algorithmic‐grading, personalised feedback and adaptive learning approaches. However, these promises are strongly tied to an at least basic understanding of the concepts of machine learning and a degree of data literacy, which has to become the standard in education and the educational sciences.Demonstrating both the promises and the challenges that are inherent to the collection and the analysis of large educational data with machine learning, this paper covers the essential topics that their application requires and provides easy‐to‐follow resources and code to facilitate the process of adoption.more » « less
- 
            Abstract Benchmark datasets and benchmark problems have been a key aspect for the success of modern machine learning applications in many scientific domains. Consequently, an active discussion about benchmarks for applications of machine learning has also started in the atmospheric sciences. Such benchmarks allow for the comparison of machine learning tools and approaches in a quantitative way and enable a separation of concerns for domain and machine learning scientists. However, a clear definition of benchmark datasets for weather and climate applications is missing with the result that many domain scientists are confused. In this paper, we equip the domain of atmospheric sciences with a recipe for how to build proper benchmark datasets, a (nonexclusive) list of domain-specific challenges for machine learning is presented, and it is elaborated where and what benchmark datasets will be needed to tackle these challenges. We hope that the creation of benchmark datasets will help the machine learning efforts in atmospheric sciences to be more coherent, and, at the same time, target the efforts of machine learning scientists and experts of high-performance computing to the most imminent challenges in atmospheric sciences. We focus on benchmarks for atmospheric sciences (weather, climate, and air-quality applications). However, many aspects of this paper will also hold for other aspects of the Earth system sciences or are at least transferable. Significance Statement Machine learning is the study of computer algorithms that learn automatically from data. Atmospheric sciences have started to explore sophisticated machine learning techniques and the community is making rapid progress on the uptake of new methods for a large number of application areas. This paper provides a clear definition of so-called benchmark datasets for weather and climate applications that help to share data and machine learning solutions between research groups to reduce time spent in data processing, to generate synergies between groups, and to make tool developments more targeted and comparable. Furthermore, a list of benchmark datasets that will be needed to tackle important challenges for the use of machine learning in atmospheric sciences is provided.more » « less
- 
            null (Ed.)Abstract Machine learning and artificial intelligence (ML/AI) methods have been used successfully in recent years to solve problems in many areas, including image recognition, unsupervised and supervised classification, game-playing, system identification and prediction, and autonomous vehicle control. Data-driven machine learning methods have also been applied to fusion energy research for over 2 decades, including significant advances in the areas of disruption prediction, surrogate model generation, and experimental planning. The advent of powerful and dedicated computers specialized for large-scale parallel computation, as well as advances in statistical inference algorithms, have greatly enhanced the capabilities of these computational approaches to extract scientific knowledge and bridge gaps between theoretical models and practical implementations. Large-scale commercial success of various ML/AI applications in recent years, including robotics, industrial processes, online image recognition, financial system prediction, and autonomous vehicles, have further demonstrated the potential for data-driven methods to produce dramatic transformations in many fields. These advances, along with the urgency of need to bridge key gaps in knowledge for design and operation of reactors such as ITER, have driven planned expansion of efforts in ML/AI within the US government and around the world. The Department of Energy (DOE) Office of Science programs in Fusion Energy Sciences (FES) and Advanced Scientific Computing Research (ASCR) have organized several activities to identify best strategies and approaches for applying ML/AI methods to fusion energy research. This paper describes the results of a joint FES/ASCR DOE-sponsored Research Needs Workshop on Advancing Fusion with Machine Learning, held April 30–May 2, 2019, in Gaithersburg, MD (full report available at https://science.osti.gov/-/media/fes/pdf/workshop-reports/FES_ASCR_Machine_Learning_Report.pdf ). The workshop drew on broad representation from both FES and ASCR scientific communities, and identified seven Priority Research Opportunities (PRO’s) with high potential for advancing fusion energy. In addition to the PRO topics themselves, the workshop identified research guidelines to maximize the effectiveness of ML/AI methods in fusion energy science, which include focusing on uncertainty quantification, methods for quantifying regions of validity of models and algorithms, and applying highly integrated teams of ML/AI mathematicians, computer scientists, and fusion energy scientists with domain expertise in the relevant areas.more » « less
- 
            Integrating Artificial Intelligence (AI) techniques with remote sensing holds great potential for revolutionizing data analysis and applications in many domains of Earth sciences. This review paper synthesizes the existing literature on AI applications in remote sensing, consolidating and analyzing AI methodologies, outcomes, and limitations. The primary objectives are to identify research gaps, assess the effectiveness of AI approaches in practice, and highlight emerging trends and challenges. We explore diverse applications of AI in remote sensing, including image classification, land cover mapping, object detection, change detection, hyperspectral and radar data analysis, and data fusion. We present an overview of the remote sensing technologies, methods employed, and relevant use cases. We further explore challenges associated with practical AI in remote sensing, such as data quality and availability, model uncertainty and interpretability, and integration with domain expertise as well as potential solutions, advancements, and future directions. We provide a comprehensive overview for researchers, practitioners, and decision makers, informing future research and applications at the exciting intersection of AI and remote sensing.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    