Estimating the quality of transmission (QoT) of a lightpath before its establishment is a critical procedure for efficient design and management of optical networks. Recently, supervised machine learning (ML) techniques for QoT estimation have been proposed as an effective alternative to well-established, yet approximate, analytic models that often require the introduction of conservative margins to compensate for model inaccuracies and uncertainties. Unfortunately, to ensure high estimation accuracy, the training set (i.e., the set of historical field data, or “samples,” required to train these supervised ML algorithms) must be very large, while in real network deployments, the number of monitored/monitorable lightpaths is limited by several practical considerations. This is especially true for lightpaths with an above-threshold bit error rate (BER) (i.e., malfunctioning or wrongly dimensioned lightpaths), which are infrequently observed during network operation. Samples with above-threshold BERs can be acquired by deploying probe lightpaths, but at the cost of increased operational expenditures and wastage of spectral resources. In this paper, we propose to use active learning to reduce the number of probes needed for ML-based QoT estimation. We build an estimation model based on Gaussian processes, which allows iterative identification of those QoT instances that minimize estimation uncertainty. Numerical results using synthetically generated datasets show that, by using the proposed active learning approach, we can achieve the same performance as standard offline supervised ML methods, but with a remarkable reduction (at least 5% and up to 75%) in the number of training samples.
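As a rough illustration of the idea in the abstract above, the following sketch runs a Gaussian-process active learning loop on a one-dimensional toy problem. It is not the authors' implementation: the RBF kernel, the maximum-predictive-variance query rule (a common stand-in for "minimizing estimation uncertainty"), and the names `rbf_kernel`, `gp_posterior`, `active_learning`, and `oracle` are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between two 1-D arrays of inputs.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_pool, length_scale=1.0, noise=1e-3):
    # Posterior mean and variance of a zero-mean GP at the pool points.
    k_tt = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_tp = rbf_kernel(x_train, x_pool, length_scale)
    solved = np.linalg.solve(k_tt, k_tp)        # K_tt^{-1} K_tp
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_tp * solved, axis=0)   # prior variance k(x, x) = 1
    return mean, np.maximum(var, 0.0)

def active_learning(oracle, x_pool, n_queries, seed_idx=0, length_scale=1.0):
    # `oracle` is a hypothetical stand-in for deploying a probe and measuring it.
    chosen = [seed_idx]
    for _ in range(n_queries - 1):
        x_train = x_pool[chosen]
        y_train = oracle(x_train)
        _, var = gp_posterior(x_train, y_train, x_pool, length_scale)
        var[chosen] = -np.inf                   # never re-query a labeled point
        chosen.append(int(np.argmax(var)))      # query the most uncertain candidate
    return chosen
```

Each round labels only the candidate the model is least sure about, which is how such a loop can need fewer probes than labeling the whole pool up front.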
                    
                            
                            Weakly Supervised Classification of Vital Sign Alerts as Real or Artifact
                        
                    
    
            A significant proportion of clinical physiologic monitoring alarms are false. This often leads to alarm fatigue in clinical personnel, inevitably compromising patient safety. To combat this issue, researchers have attempted to build Machine Learning (ML) models capable of accurately adjudicating Vital Sign (VS) alerts raised at the bedside of hemodynamically monitored patients as real or artifact. Previous studies have utilized supervised ML techniques that require substantial amounts of hand-labeled data. However, manually harvesting such data can be costly, time-consuming, and mundane, and is a key factor limiting the widespread adoption of ML in healthcare (HC). Instead, we explore the use of multiple, individually imperfect heuristics to automatically assign probabilistic labels to unlabeled training data using weak supervision. Our weakly supervised models perform competitively with traditional supervised techniques and require less involvement from domain experts, demonstrating their use as efficient and practical alternatives to supervised learning in HC applications of ML. 
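To make the weak supervision idea above concrete, here is a minimal sketch of how several imperfect heuristics could be combined into a probabilistic label for one alert. It is an illustration under stated assumptions, not the method used in the paper: the three heuristics, their thresholds, the assumed accuracies, and the alert fields (`hr`, `duration_s`, `signal_std`) are all hypothetical, and the combination rule is a simple accuracy-weighted log-odds vote.

```python
import math

REAL, ARTIFACT, ABSTAIN = 1, 0, -1

# Hypothetical heuristics: each votes REAL, ARTIFACT, or ABSTAIN.
def lf_implausible_value(alert):
    # Heart rates outside an assumed 20-250 bpm range suggest a sensor artifact.
    return ARTIFACT if not 20 <= alert["hr"] <= 250 else ABSTAIN

def lf_sustained(alert):
    # Alerts lasting at least an assumed 180 s are more likely real.
    return REAL if alert["duration_s"] >= 180 else ABSTAIN

def lf_flat_signal(alert):
    # A nearly constant waveform suggests a disconnected lead.
    return ARTIFACT if alert["signal_std"] < 0.01 else ABSTAIN

# (heuristic, assumed accuracy when it does not abstain)
LABELING_FUNCTIONS = [
    (lf_implausible_value, 0.9),
    (lf_sustained, 0.7),
    (lf_flat_signal, 0.8),
]

def probabilistic_label(alert, prior_real=0.5):
    # Return P(alert is real) from an accuracy-weighted log-odds vote.
    log_odds = math.log(prior_real / (1 - prior_real))
    for lf, accuracy in LABELING_FUNCTIONS:
        vote = lf(alert)
        if vote == ABSTAIN:
            continue
        weight = math.log(accuracy / (1 - accuracy))
        log_odds += weight if vote == REAL else -weight
    return 1 / (1 + math.exp(-log_odds))
```

The resulting probabilities can then serve as soft training targets for a downstream classifier in place of hand-assigned labels. In practice the heuristic accuracies would be estimated from the data rather than fixed by hand as they are here.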
        
    
- Award ID(s): 1950811
- PAR ID: 10441685
- Date Published:
- Journal Name: AMIA Annual Symposium proceedings
- ISSN: 1942-597X
- Page Range / eLocation ID: 405-414
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract: We discuss the emerging advances and opportunities at the intersection of machine learning (ML) and climate physics, highlighting the use of ML techniques, including supervised, unsupervised, and equation discovery, to accelerate climate knowledge discoveries and simulations. We delineate two distinct yet complementary aspects: (a) ML for climate physics and (b) ML for climate simulations. Although physics-free ML-based models, such as ML-based weather forecasting, have demonstrated success when data are abundant and stationary, the physics knowledge and interpretability of ML models become crucial in the small-data/nonstationary regime to ensure generalizability. Given the absence of observations, the long-term future climate falls into the small-data regime. Therefore, ML for climate physics holds a critical role in addressing the challenges of ML for climate simulations. We emphasize the need for collaboration among climate physics, ML theory, and numerical analysis to achieve reliable ML-based models for climate applications.
- Abstract: This work presents a deep reinforcement learning (DRL) approach for procedural content generation (PCG) to automatically generate three-dimensional (3D) virtual environments that users can interact with. The primary objective of PCG methods is to algorithmically generate new content in order to improve user experience. Researchers have started exploring the use of machine learning (ML) methods to generate content. However, these approaches frequently implement supervised ML algorithms that require initial datasets to train their generative models. In contrast, RL algorithms do not require training data to be collected a priori since they take advantage of simulation to train their models. Considering the advantages of RL algorithms, this work presents a method that generates new 3D virtual environments by training an RL agent using a 3D simulation platform. This work extends the authors’ previous work and presents the results of a case study that supports the capability of the proposed method to generate new 3D virtual environments. The ability to automatically generate new content has the potential to maintain users’ engagement in a wide variety of applications such as virtual reality applications for education and training, and engineering conceptual design.
- Abstract: Artificial intelligence (AI) and machine learning (ML) pose a challenge for achieving science that is both reproducible and replicable. The challenge is compounded in supervised models that depend on manually labeled training data, as they introduce additional decision-making and processes that require thorough documentation and reporting. We address these limitations by providing an approach to hand labeling training data for supervised ML that integrates quantitative content analysis (QCA), a method from social science research. The QCA approach provides a rigorous and well-documented hand labeling procedure to improve the replicability and reproducibility of supervised ML applications in Earth systems science (ESS), as well as the ability to evaluate them. Specifically, the approach requires (a) the articulation and documentation of the exact decision-making process used for assigning hand labels in a “codebook” and (b) an empirical evaluation of the reliability of the hand labelers. In this paper, we outline the contributions of QCA to the field, along with an overview of the general approach. We then provide a case study to further demonstrate how this framework has been and can be applied when developing supervised ML models for applications in ESS. With this approach, we provide an actionable path forward for addressing ethical considerations and goals outlined by recent AGU work on ML ethics in ESS.
- Context: Machine learning (ML) allows for the analysis of massive quantities of high-dimensional clinical laboratory data, thereby revealing complex patterns and trends. Thus, ML can potentially improve the efficiency of clinical data interpretation and the practice of laboratory medicine. However, the risks of generating biased or unrepresentative models, which can lead to misleading clinical conclusions or overestimation of the model performance, should be recognized. Objectives: To discuss the major components for creating ML models, including data collection, data preprocessing, model development, and model evaluation. We also highlight many of the challenges and pitfalls in developing ML models, which could result in misleading clinical impressions or inaccurate model performance, and provide suggestions and guidance on how to circumvent these challenges. Data Sources: The references for this review were identified through searches of the PubMed database, US Food and Drug Administration white papers and guidelines, conference abstracts, and online preprints. Conclusions: With the growing interest in developing and implementing ML models in clinical practice, laboratorians and clinicians need to be educated in order to collect sufficiently large and high-quality data, properly report the data set characteristics, and combine data from multiple institutions with proper normalization. They will also need to assess the reasons for missing values, determine the inclusion or exclusion of outliers, and evaluate the completeness of a data set. In addition, they require the necessary knowledge to select a suitable ML model for a specific clinical question and accurately evaluate the performance of the ML model, based on objective criteria. Domain-specific knowledge is critical in the entire workflow of developing ML models.
 An official website of the United States government