Title: Model-Centered Assurance for Autonomous Systems
The functions of an autonomous system can generally be partitioned into those concerned with perception and those concerned with action. Perception builds and maintains an internal model of the world (i.e., the system's environment) that is used to plan and execute actions to accomplish a goal established by human supervisors. Accordingly, assurance decomposes into two parts: a) ensuring that the model is an accurate representation of the world as it changes through time and b) ensuring that the actions are safe (and effective), given the model. Both perception and action may employ AI, including machine learning (ML), and these present challenges to assurance. However, it is usually feasible to guard the actions with traditionally engineered and assured monitors, and thereby ensure safety, given the model. Thus, the model becomes the central focus for assurance. We propose an architecture and methods to ensure the accuracy of models derived from sensors whose interpretation uses AI and ML. Rather than derive the model from sensors bottom-up, we reverse the process and use the model to predict sensor interpretation. Small prediction errors indicate the world is evolving as expected, and the model is updated accordingly. Large prediction errors indicate surprise, which may be due to errors in sensing or interpretation, or to unexpected changes in the world (e.g., a pedestrian steps into the road). The former initiate error masking or recovery, while the latter require revision of the model. Higher-level AI functions assist in the diagnosis and execution of these tasks. Although this two-level architecture, in which the lower level performs "predictive processing" and the upper level performs more reflective tasks, both focused on maintaining a world model, is derived from engineering considerations, it also matches a widely accepted theory of human cognition.
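To make the loop concrete, here is a minimal Python sketch of one predictive-processing cycle of the kind the abstract describes. The interfaces (predict_observation, interpret, diagnose) and the surprise threshold are illustrative assumptions for exposition, not the paper's implementation.

```python
# A minimal sketch of the two-level, model-centered assurance loop.
# All names and the threshold value are assumptions, not the authors' code.

import numpy as np

SURPRISE_THRESHOLD = 0.5  # assumed tolerance separating noise from surprise

def assurance_step(model, sensors, diagnoser):
    """One cycle: model-predicted sensing (lower level) with
    reflective diagnosis (upper level) on large prediction errors."""
    predicted = model.predict_observation()   # model -> expected sensor reading
    observed = sensors.interpret()            # AI/ML sensor interpretation
    error = np.linalg.norm(observed - predicted)

    if error < SURPRISE_THRESHOLD:
        # World evolving as expected: routine model update.
        model.update(observed)
    else:
        # Surprise: higher-level AI decides whether sensing failed
        # or the world really changed.
        cause = diagnoser.diagnose(predicted, observed)
        if cause == "sensor_fault":
            model.mask_or_recover()           # error masking / recovery
        else:
            model.revise(observed)            # e.g., pedestrian entered road
    return model
```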
Award ID(s):
1740079
PAR ID:
10181035
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
39th International Conference on Computer Safety, Reliability and Security (SafeComp), 2020
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Crowd workers struggle to earn adequate wages. Given the limited task-related information provided on crowd platforms, workers often fail to estimate how long it would take to complete certain microtasks. Although a few third-party tools and online communities provide estimates of working times, such information is limited to microtasks that have previously been completed by other workers, and such tasks are usually booked immediately by experienced workers. This paper presents a computational technique for predicting microtask working times (i.e., how much time it takes to complete microtasks) based on workers' past experience with similar tasks. Two challenges were addressed in developing the proposed predictive model: (i) collecting sufficient training data labeled with accurate working times, and (ii) evaluating and optimizing the prediction model. The paper first describes how 7,303 microtask submission records were collected using a web browser extension, installed by 83 Amazon Mechanical Turk (AMT) workers, created to characterize the diversity of worker behavior and to record working times accurately. Next, it describes the challenges encountered in defining evaluation and objective functions, based on the tolerance workers show toward prediction errors. To this end, surveys were conducted on AMT asking workers how they felt about prediction errors in working times for microtasks simulated with an "imaginary" AI system. Based on 91,060 survey responses submitted by 875 workers, objective and evaluation functions were derived for use in the prediction model, reflecting whether the predicted errors would be tolerated by workers. Evaluation based on worker perceptions of prediction errors revealed that the proposed model predicted worker-tolerable working times in 73.6% of all tested microtask cases. Further, the derived objective function enabled accurate predictions across microtasks with more diverse durations.
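A hedged sketch of one plausible form such a tolerance-based objective could take: a weighted squared error in which underestimates of working time are penalized more heavily than overestimates (since underestimates lead workers to accept underpaid tasks). The weights and asymmetry here are illustrative assumptions; the paper's actual functions were derived from the 91,060 survey responses.

```python
# Illustrative tolerance-weighted objective; not the paper's fitted function.

import numpy as np

def tolerance_weight(pred_seconds, true_seconds):
    """Assumed: workers tolerate overestimated working times more than
    underestimated ones."""
    err = pred_seconds - true_seconds
    return np.where(err >= 0, 1.0, 2.0)  # penalize underestimates twice as much

def tolerance_loss(pred_seconds, true_seconds):
    w = tolerance_weight(pred_seconds, true_seconds)
    return np.mean(w * (pred_seconds - true_seconds) ** 2)

# Example: a 60 s underestimate costs more than a 60 s overestimate.
print(tolerance_loss(np.array([240.0]), np.array([300.0])))  # 7200.0
print(tolerance_loss(np.array([360.0]), np.array([300.0])))  # 3600.0
```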
  2. Abstract: Vertical profiles of temperature and dewpoint are useful in predicting deep convection that leads to severe weather, which threatens property and lives. Currently, forecasters rely on observations from radiosonde launches and on numerical weather prediction (NWP) models. Radiosonde observations are, however, temporally and spatially sparse, and NWP models contain inherent errors that influence short-term predictions of high-impact events. This work explores using machine learning (ML) to postprocess NWP model forecasts, combining them with satellite data to improve vertical profiles of temperature and dewpoint. We focus on different ML architectures, loss functions, and input features to optimize predictions. Because we are predicting vertical profiles at 256 levels in the atmosphere, this work provides a unique perspective on using ML for 1D tasks. Compared to baseline profiles from the Rapid Refresh (RAP), ML predictions offer the largest improvement for dewpoint, particularly in the middle and upper atmosphere. Temperature improvements are modest, but CAPE values improve by up to 40%. Feature-importance analyses indicate that the ML models primarily correct incoming RAP biases. While additional model and satellite data offer some improvement to the predictions, architecture choice matters more than feature selection in fine-tuning the results. Our proposed deep residual U-Net performs best by leveraging spatial context from the input RAP profiles; however, the results are remarkably robust across model architectures. Further, uncertainty estimates for every level are well calibrated and can provide useful information to forecasters.
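The following is a minimal PyTorch sketch of a 1D residual U-Net of the kind described, operating on 256 vertical levels. Channel widths, depth, and the residual connection to the RAP input profile are assumptions, not the paper's exact architecture.

```python
# Sketch of a 1D residual U-Net over 256 levels; hyperparameters assumed.

import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv1d(c_in, c_out, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv1d(c_out, c_out, kernel_size=3, padding=1), nn.ReLU(),
    )

class ResidualUNet1D(nn.Module):
    """Predicts corrected profiles as RAP input plus a learned correction."""
    def __init__(self, channels=2):  # e.g., temperature and dewpoint
        super().__init__()
        self.enc1 = block(channels, 32)
        self.enc2 = block(32, 64)
        self.pool = nn.MaxPool1d(2)
        self.bottleneck = block(64, 128)
        self.up2 = nn.ConvTranspose1d(128, 64, kernel_size=2, stride=2)
        self.dec2 = block(128, 64)
        self.up1 = nn.ConvTranspose1d(64, 32, kernel_size=2, stride=2)
        self.dec1 = block(64, 32)
        self.head = nn.Conv1d(32, channels, kernel_size=1)

    def forward(self, rap_profile):  # shape: (batch, channels, 256 levels)
        e1 = self.enc1(rap_profile)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        # Residual connection: the network learns a correction to RAP biases.
        return rap_profile + self.head(d1)

profiles = torch.randn(4, 2, 256)
print(ResidualUNet1D()(profiles).shape)  # torch.Size([4, 2, 256])
```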
  3. Abstract: This paper tackles the problem of learning value functions from undirected state-only experience (state transitions without action labels, i.e., (s, s', r) tuples). We first theoretically characterize the applicability of Q-learning in this setting. We show that tabular Q-learning in discrete Markov decision processes (MDPs) learns the same value function under any arbitrary refinement of the action space. This theoretical result motivates the design of Latent Action Q-learning (LAQ), an offline RL method that can learn effective value functions from state-only experience. LAQ learns value functions using Q-learning on discrete latent actions obtained through a latent-variable future-prediction model. We show that LAQ can recover value functions that correlate highly with value functions learned using ground-truth actions. Value functions learned with LAQ lead to sample-efficient acquisition of goal-directed behavior, can be used with domain-specific low-level controllers, and facilitate transfer across embodiments. Our experiments in five environments, ranging from a 2D grid world to 3D visual navigation in realistic environments, demonstrate the benefits of LAQ over simpler alternatives, imitation learning oracles, and competing methods.
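A compact sketch of the LAQ recipe under stated assumptions: a pre-trained latent-variable future-prediction model (assumed here as infer_latent_action) assigns each (s, s') pair a discrete latent action, and ordinary tabular Q-learning then runs over those latent actions.

```python
# Tabular LAQ sketch; infer_latent_action is an assumed interface to a
# pre-trained latent-variable future-prediction model.

import numpy as np

def laq(transitions, infer_latent_action, n_states, n_latent,
        gamma=0.99, alpha=0.1, epochs=50):
    """transitions: list of (s, s_next, r) tuples with no action labels."""
    Q = np.zeros((n_states, n_latent))
    for _ in range(epochs):
        for s, s_next, r in transitions:
            z = infer_latent_action(s, s_next)   # discrete latent action id
            target = r + gamma * Q[s_next].max()
            Q[s, z] += alpha * (target - Q[s, z])
    return Q  # V(s) = Q[s].max() correlates with ground-truth-action values
```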
  4. Vision-based formation control systems are attractive because they can use inexpensive sensors and can work in GPS-denied environments. Safety assurance for such systems is challenging: the vision component's accuracy depends on the environment in complicated ways, these errors propagate through the system and lead to incorrect control actions, and there is no formal specification for end-to-end reasoning. We address this problem and propose a technique for safety assurance of vision-based formation control. First, we propose a scheme for constructing quantizers that are consistent with vision-based perception. Next, we show how the convergence analysis of a standard quantized consensus algorithm can be adapted to the constructed quantizers. We use the recently defined notion of perception contracts to create error bounds on the actual vision-based perception pipeline using sampled data from different ground-truth states, environments, and weather conditions. Specifically, we use a quantizer in logarithmic polar coordinates, and we show that this quantizer suits the constructed perception contracts for vision-based position estimation, where the error grows with the absolute distance between agents. We build our formation control algorithm with this nonuniform quantizer and prove its convergence using an existing result for quantized consensus.
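A minimal sketch of a logarithmic polar quantizer consistent with the description above: cell width grows with range, matching perception error that grows with inter-agent distance. The bin counts and log base are illustrative assumptions.

```python
# Illustrative log-polar quantizer; parameters assumed, not from the paper.

import numpy as np

def log_polar_quantize(dx, dy, r_min=0.5, log_base=1.2, n_angles=36):
    """Quantize a relative position (dx, dy): logarithmic bins in range,
    uniform bins in bearing. Cells widen with distance, so a fixed number
    of cells tolerates distance-dependent perception error."""
    r = np.hypot(dx, dy)
    theta = np.arctan2(dy, dx)
    r_bin = int(np.floor(np.log(max(r, r_min) / r_min) / np.log(log_base)))
    theta_bin = int(np.floor((theta + np.pi) / (2 * np.pi / n_angles))) % n_angles
    return r_bin, theta_bin

# The consensus/formation update then acts on the quantized relative state,
# so a perception contract bounding error per cell yields convergence.
print(log_polar_quantize(3.0, 4.0))  # (12, 23) for this configuration
```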
  5. We present ChainedDiffuser, a policy architecture that unifies action keypose prediction and trajectory diffusion generation for learning robot manipulation from demonstrations. Our main innovation is to use a global transformer-based action predictor to predict actions at keyframes, a task that requires multimodal semantic scene understanding, and to use a local trajectory diffuser to predict trajectory segments that connect the predicted macro-actions. ChainedDiffuser sets a new record on established manipulation benchmarks, outperforming both state-of-the-art keypose (macro-action) prediction models that use motion planners for trajectory prediction and trajectory diffusion policies that do not predict keyframe macro-actions. We conduct experiments in both simulated and real-world environments and demonstrate ChainedDiffuser's ability to solve a wide range of manipulation tasks involving interactions with diverse objects.
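A high-level sketch of the chained inference loop this describes: a global keypose predictor proposes macro-actions, and a local trajectory diffuser fills in the segments between consecutive keyposes. All interfaces here are assumed for illustration, not the authors' released code.

```python
# Assumed interfaces; shows only the chaining structure of the rollout.

def chained_diffuser_rollout(obs, keypose_predictor, trajectory_diffuser, robot):
    keyposes = keypose_predictor(obs)        # global, multimodal scene reasoning
    current = robot.current_pose()
    for target in keyposes:
        # Diffusion model denoises a short trajectory connecting the poses.
        segment = trajectory_diffuser.sample(start=current, goal=target, obs=obs)
        robot.execute(segment)
        current = target
```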