Guaranteeing safety in human-centric applications is critical in robot learning as the learned policies may demonstrate unsafe behaviors in formerly unseen scenarios. We present a framework to locally repair an erroneous policy network to satisfy a set of formal safety constraints using Mixed Integer Quadratic Programming (MIQP). Our MIQP formulation explicitly imposes the safety constraints to the learned policy while minimizing the original loss function. The policy network is then verified to be locally safe. We demonstrate the application of our framework to derive safe policies for a robotic lower-leg prosthesis.
more »
« less
Safe Robot Learning in Assistive Devices through Neural Network Repair
Assistive robotic devices are a particularly promising field of application for neural networks (NN) due to the need for personalization and hard-to-model human-machine interaction dynamics. However, NN based estimators and controllers may produce potentially unsafe outputs over previously unseen data points. In this paper, we introduce an algorithm for updating NN control policies to satisfy a given set of formal safety constraints, while also optimizing the original loss function. Given a set of mixed-integer linear constraints, we define the NN repair problem as a Mixed Integer Quadratic Program (MIQP). In extensive experiments, we demonstrate the efficacy of our repair method in generating safe policies for a lower-leg prosthesis.
more »
« less
- Award ID(s):
- 1932189
- PAR ID:
- 10474224
- Publisher / Repository:
- Proc. of Machine Learning Resaerch
- Date Published:
- Journal Name:
- Proceedings of Machine Learning Research
- Volume:
- 205
- ISSN:
- 2640-3498
- Page Range / eLocation ID:
- 2148-2158
- Format(s):
- Medium: X
- Location:
- https://proceedings.mlr.press/v205/majd23a.html
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Given a neural network (NN) and a set of possible inputs to the network described by polyhedral constraints, we aim to compute a safe over-approximation of the set of possible output values. This operation is a fundamental primitive enabling the formal analysis of neural networks that are extensively used in a variety of machine learning tasks such as perception and control of autonomous systems. Increasingly, they are deployed in high-assurance applications, leading to a compelling use case for formal verification approaches. In this paper, we present an efficient range estimation algorithm that iterates between an expensive global combinatorial search using mixed-integer linear programming problems, and a relatively inexpensive local optimization that repeatedly seeks a local optimum of the function represented by the NN. We implement our approach and compare it with Reluplex, a recently proposed solver for deep neural networks. We demonstrate applications of our approach to computing flowpipes for neural network-based feedback controllers. We show that the use of local search in conjunction with mixed-integer linear programming solvers effectively reduces the combinatorial search over possible combinations of active neurons in the network by pruning away suboptimal nodes.more » « less
-
Given a neural network (NN) and a set of possible inputs to the network described by polyhedral constraints, we aim to compute a safe over-approximation of the set of possible output values. This operation is a fundamental primitive enabling the formal analysis of neural networks that are extensively used in a variety of machine learning tasks such as perception and control of autonomous systems. Increasingly, they are deployed in high-assurance applications, leading to a compelling use case for formal verification approaches. In this paper, we present an efficient range estimation algorithm that iterates between an expensive global combinatorial search using mixed-integer linear programming problems, and a relatively inexpensive local optimization that repeatedly seeks a local optimum of the function represented by the NN. We implement our approach and compare it with Reluplex, a recently proposed solver for deep neural networks. We demonstrate applications of our approach to computing flowpipes for neural network-based feedback controllers. We show that the use of local search in conjunction with mixed-integer linear programming solvers effectively reduces the combinatorial search over possible combinations of active neurons in the network by pruning away suboptimal nodes.more » « less
-
Semiactive model predictive control (sMPC) can be very effective, but its computational cost due to the inherent mixed-integer quadratic programming (MIQP) optimization precludes its use in real-time vibration control. This study proposes training neural networks (NNs) to predict in real-time only the MIQP's integer variables' values, called a strategy, for a given structure state. Because the number of strategies is exponential in the number of sMPC horizon steps, the resulting NN can be massive. This study proposes to reduce the NN dimension by exploiting the homogeneity-of-order-one nature of this control problem and, using state vector statistics, to efficiently choose training samples. The single large NN is proposed to be split into several much smaller NNs, each predicting a strategy grouping, that together uniquely and efficiently predict the strategy. Given the strategy's integer values, the MIQP optimization reduces to a quadratic programming (QP) problem, solved using a fast QP solver with proposed adaptations: exploiting optimization efficiencies and bounding sub-optimality; using several NN predictions; and reverting to a simpler (suboptimal) semiactive control algorithm upon occasional incorrect NN predictions or QP solver nonconvergence. Shear building examples demonstrate significant online computational cost reductions with control performance comparable to the conventional MIQP-based control.more » « less
-
null (Ed.)Action selection policies (ASPs), used to compose low-level robot skills into complex high-level tasks are commonly represented as neural networks (NNs) in the state of the art. Such a paradigm, while very effective, suffers from a few key problems: 1) NNs are opaque to the user and hence not amenable to verification, 2) they require significant amounts of training data, and 3) they are hard to repair when the domain changes. We present two key insights about ASPs for robotics. First, ASPs need to reason about physically meaningful quantities derived from the state of the world, and second, there exists a layered structure for composing these policies. Leveraging these insights, we introduce layered dimension-informed program synthesis (LDIPS) – by reasoning about the physical dimensions of state variables, and dimensional constraints on operators, LDIPS directly synthesizes ASPs in a human-interpretable domain-specific language that is amenable to program repair. We present empirical results to demonstrate that LDIPS 1) can synthesize effective ASPs for robot soccer and autonomous driving domains, 2) enables tractable synthesis for robot action selection policies not possible with state of the art synthesis techniques, 3) requires two orders of magnitude fewer training examples than a comparable NN representation, and 4) can repair the synthesized ASPs with only a small number of corrections when transferring from simulation to real robots.more » « less
An official website of the United States government

