Title: Pairing Bayesian Methods and Systems Theory to Enable Test and Evaluation of Learning‐Based Systems
ABSTRACT Modern engineered systems, and learning‐based systems in particular, present unprecedented complexity that requires advances in our methods to achieve confidence in mission success through test and evaluation (T&E). We define learning‐based systems as engineered systems that incorporate a learning algorithm (artificial intelligence) as a component of the overall system. Part of this complexity is the rate at which learning‐based systems change compared with traditional engineered systems. Where traditional systems are expected to decline steadily in performance as they age, learning‐based systems undergo constant change, which must be better understood to achieve high confidence in mission success. To this end, we propose pairing Bayesian methods with systems theory to quantify changes in operational conditions, changes in adversarial actions, resultant changes in the learning‐based system structure, and resultant confidence measures in mission success. In this article, we provide insights into our overall goal and our progress toward developing a framework for evaluation through an understanding of equivalence of testing.
Award ID(s):
2108791
PAR ID:
10476144
Author(s) / Creator(s):
Publisher / Repository:
Wiley
Date Published:
Journal Name:
INSIGHT
Volume:
25
Issue:
4
ISSN:
2156-485X
Page Range / eLocation ID:
65 to 70
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Physics-based modeling aids in designing efficient data center power and cooling systems. These systems have traditionally been modeled independently under the assumption that the inherent coupling of effects between the systems has negligible impact. This study tests the assumption through uncertainty quantification of models for a typical 300 kW data center supplied through either an alternating current (AC)-based or direct current (DC)-based power distribution system. A novel calculation scheme is introduced that couples the calculations of these two systems to estimate the resultant impact on predicted power usage effectiveness (PUE), computer room air conditioning (CRAC) return temperature, total system power requirement, and system power loss values. A two-sample z-test for comparing means is used to test for statistical significance with 95% confidence. The power distribution component efficiencies are calibrated to available published and experimental data. The predictions for a typical data center with an AC-based system suggest that the coupling of system calculations results in statistically significant differences for the cooling system PUE, the overall PUE, the CRAC return air temperature, and total electrical losses. However, none of the tested metrics show statistically significant differences for a DC-based system. The predictions also suggest that a DC-based system provides statistically significantly lower overall PUE and electrical losses compared to the AC-based system, but only when coupled calculations are used. These results indicate that the coupled calculations impact predicted general energy efficiency metrics and enable statistically significant conclusions when comparing different data center cooling and power distribution strategies.
  2. Multitask learning models provide benefits by reducing model complexity and improving accuracy by concurrently learning multiple tasks with shared representations. Leveraging inductive knowledge transfer, these models mitigate the risk of overfitting on any specific task, leading to enhanced overall performance. However, supervised multitask learning models, like many neural networks, require substantial amounts of labeled data. Given the cost associated with data labeling, there is a need for an efficient label acquisition mechanism, known as multitask active learning (MTAL). In wearable sensor systems, the success of MTAL largely hinges on its query strategies because active learning in such settings involves interaction with end-users (e.g., patients) for annotation. However, these strategies have not been studied in mobile health settings and wearable systems to date. While strategies like one-sided sampling, alternating sampling, and rank-combination-based sampling have been proposed in the past, their applicability in mobile sensor settings, a domain constrained by label deficit, remains largely unexplored. This study investigates the MTAL querying approaches and addresses crucial questions related to the choice of sampling methods and the effectiveness of multitask learning in mobile health applications. Utilizing two datasets on activity recognition and emotion classification, our findings reveal that rank-based sampling outperforms other techniques, particularly in tasks with high correlation. However, sole reliance on informativeness for sample selection may introduce biases into models. To address this issue, we also propose a Clustered Stratified Sampling (CSS) method in tandem with the multitask active learning query process. CSS identifies clustered mini-batches of samples, optimizing budget utilization and maximizing performance. When employed alongside rank-based query selection, our proposed CSS algorithm demonstrates up to 9% improvement in accuracy over traditional querying approaches for a 2000-query budget.
  3. The overall objective of this project funded by the NSF-IUSE program is to employ a sociotechnical systems lens and framework to identify and evaluate organization-wide capacities and change catalysts in a predominantly white institution's college of engineering. The college of engineering is viewed as a sociotechnical organization with social and technical subsystems. The social subsystem models who talks to whom about what. The technical subsystem models the main activities and programs in the organization. Our project aims to: (1) assess the technical system’s capacity to support recruitment and retention through a technical system analysis; (2) assess the social system’s capacity to support recruitment and retention through a social system analysis; and (3) generate systemwide catalysts for URM student success. We conducted semi-structured hour-long interviews with 38 stakeholders including students, faculty, administrators and staff from various departments and student organizations within and outside the college. We are qualitatively analyzing the interview data to identify technical and social system barriers and enablers. Data analysis is ongoing, but our preliminary findings and insights are as follows: (1) Social system barriers for URM students were interactions with peers in the classroom environment (leading to a sense of isolation and a lack of belonging), interactions with faculty and staff, especially in relating to their needs and being empathetic, and familial concerns, including being able to support their family financially. (2) Interactions with their friends were the top social system enabler for URM students. Family also provided them comfort and solace while attending to the rigors of college. They also felt that living at home would alleviate some of the financial burdens they faced. (3) The lack in numbers (and hence the lack of diversity and identity), curricular and instructional methods, and high school preparation were cited as the most important technical system barriers these students faced. (4) Students identified as technical system enablers the professional development opportunities they had and their participation in student organizations, particularly identity-based organizations such as NSBE, SHPE and WISE, which helped them forge new contacts and provided emotional support during their stay here. (5) There is recognition among the administrators and the staff working with URM students that diversity is important in the student body and that the mission of enabling URM student success is important, although the mission itself with respect to URM students is somewhat poorly defined and understood.
  4. The advent of deep learning has inspired research into end-to-end learning for a variety of problem domains in robotics. For navigation, the resulting methods may not have the desired generalization properties, let alone match the performance of traditional methods. Instead of learning a navigation policy, we explore learning an adaptive policy in the parameter space of an existing navigation module. Having adaptive parameters provides the navigation module with a family of policies that can be dynamically reconfigured based on the local scene structure and addresses the common assertion in machine learning that engineered solutions are inflexible. Of the methods tested, reinforcement learning (RL) is shown to provide a significant performance boost to a modern navigation method through reduced sensitivity of its success rate to environmental clutter. The outcomes indicate that RL as a meta-policy learner, or dynamic parameter tuner, effectively robustifies algorithms sensitive to external, measurable nuisance factors.
  5. Abstract Deep Reinforcement Learning (DRL) has shown promise for voltage control in power systems due to its speed and model‐free nature. However, learning optimal control policies through trial and error on a real grid is infeasible due to the mission‐critical nature of power systems. Instead, DRL agents are typically trained on a simulator, which may not accurately represent the real grid. This discrepancy can lead to suboptimal control policies and raises concerns for power system operators. In this paper, we revisit the problem of RL‐based voltage control and investigate how model inaccuracies affect the performance of the DRL agent. Extensive numerical experiments are conducted to quantify the impact of model inaccuracies on learning outcomes. Specifically, we focus on techniques that enable the DRL agent to learn robust policies that can still perform well in the presence of model errors. Furthermore, the impact of the agent's decisions on the overall system loss is analyzed to provide additional insight into the control problem. This work aims to address the concerns of power system operators and make DRL‐based voltage control more practical and reliable.