Title: Learning to Play Cup-and-Ball with Noisy Camera Observations
Playing the cup-and-ball game is an intriguing task for robotics research since it abstracts important problem characteristics, including system nonlinearity, contact forces, and precise positioning as a terminal goal. In this paper, we present a learning model-based control strategy for the cup-and-ball game, where a Universal Robots UR5e manipulator arm learns to catch a ball in one of the cups on a Kendama. Our control problem is divided into two sub-tasks, namely (i) swinging the ball up in a constrained motion, and (ii) catching the free-falling ball. The swing-up trajectory is computed offline and applied in open loop to the arm. Subsequently, a convex optimization problem is solved online during the ball’s free fall to control the manipulator and catch the ball. The controller utilizes noisy position feedback of the ball from an Intel RealSense D435 depth camera. We propose a novel iterative framework, where data is used to learn the support of the camera noise distribution iteratively in order to update the control policy. The probability of a catch with a fixed policy is computed empirically with a user-specified number of roll-outs. Our design guarantees that the probability of a catch increases in the limit, as the learned support nears the true support of the camera noise distribution. High-fidelity MuJoCo simulations and preliminary experimental results support our theoretical analysis.
Award ID(s):
1931853
NSF-PAR ID:
10176536
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
IEEE International Conference on Automation Science and Engineering CASE
ISSN:
2161-8070
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the following problem, which to our knowledge has been addressed only partially in the literature and not in full generality. An agent observes two players play a zero-sum game that is known to the players but not the agent. The agent observes the actions and state transitions of their game play, but not rewards. The players may play either optimally (according to some Nash equilibrium) or according to any other solution concept, such as a quantal response equilibrium. Following these observations, the agent must recommend a policy for one player, say Player 1. The goal is to recommend a policy that is minimally exploitable under the true, but unknown, game. We take a Bayesian approach. We establish a likelihood function based on observations and the specified solution concept. We then propose an approach based on Markov chain Monte Carlo (MCMC), which allows us to approximately sample games from the agent’s posterior belief distribution. Once we have a batch of independent samples from the posterior, we use linear programming and backward induction to compute a policy for Player 1 that minimizes the sum of exploitabilities over these games. This approximates the policy that minimizes the expected exploitability under the full distribution. Our approach is also capable of handling counterfactuals, where known modifications are applied to the unknown game. We show that our Bayesian MCMC-based technique outperforms two other techniques—one based on the equilibrium policy of the maximum-probability game and the other based on imitation of observed behavior—on all the tested stochastic game environments.
  2. Mechanical search is a robotic problem where a robot needs to retrieve a target item that is partially or fully occluded from its camera. State-of-the-art approaches for mechanical search either require an expensive search process to find the target item, or they require the item to be tagged with a radio frequency identification tag (e.g., RFID), making their approach beneficial only to tagged items in the environment. We present FuseBot, the first robotic system for RF-Visual mechanical search that enables efficient retrieval of both RF-tagged and untagged items in a pile. Rather than requiring all target items in a pile to be RF-tagged, FuseBot leverages the mere existence of an RF-tagged item in the pile to benefit both tagged and untagged items. Our design introduces two key innovations. The first is RF-Visual Mapping, a technique that identifies and locates RF-tagged items in a pile and uses this information to construct an RF-Visual occupancy distribution map. The second is RF-Visual Extraction, a policy formulated as an optimization problem that minimizes the number of actions required to extract the target object by accounting for the probabilistic occupancy distribution, the expected grasp quality, and the expected information gain from future actions. We built a real-time end-to-end prototype of our system on a UR5e robotic arm with in-hand vision and RF perception modules. We conducted over 180 real-world experimental trials to evaluate FuseBot and compare its performance to a state-of-the-art vision-based system named X-Ray. Our experimental results demonstrate that FuseBot outperforms X-Ray’s efficiency by more than 40% in terms of the number of actions required for successful mechanical search. Furthermore, in comparison to X-Ray’s success rate of 84%, FuseBot achieves a success rate of 95% in retrieving untagged items, demonstrating for the first time that the benefits of RF perception extend beyond tagged objects in the mechanical search problem.
  3. Control systems are increasingly targeted by malicious adversaries, who may inject spurious sensor measurements in order to bias the controller behavior and cause suboptimal performance or safety violations. This paper investigates the problem of tracking a reference trajectory while satisfying safety and reachability constraints in the presence of such false data injection attacks. We consider a linear, time-invariant system with additive Gaussian noise in which a subset of sensors can be compromised by an attacker, while the remaining sensors are regarded as secure. We propose a control policy in which two estimates of the system state are maintained, one based on all sensors and one based on only the secure sensors. The optimal control action based on the secure sensors alone is then computed at each time step, and the chosen control action is constrained to lie within a given distance of this value. We show that this policy can be implemented by solving a quadratically constrained quadratic program at each time step. We develop a barrier function approach to choosing the parameters of our scheme in order to provide provable guarantees on safety and reachability, and derive bounds on the probability that our control policies deviate from the optimal policy when no attacker is present. Our framework is validated through a numerical study.
  4. We investigate sublinear classical and quantum algorithms for matrix games, a fundamental problem in optimization and machine learning, with provable guarantees. Given a matrix, sublinear algorithms for the matrix game were previously known only for two special cases: (1) the maximizing vectors live in the L1-norm unit ball, and (2) the minimizing vectors live in either the L1- or the L2-norm unit ball. We give a sublinear classical algorithm that can interpolate smoothly between these two cases: for any fixed q between 1 and 2, we solve, within some additive error, matrix games where the minimizing vectors are in an Lq-norm unit ball. We also provide a corresponding sublinear quantum algorithm that solves the same task with a quadratic improvement in dimensions of the maximizing and minimizing vectors. Both our classical and quantum algorithms are optimal in the dimension parameters up to poly-logarithmic factors. Finally, we propose sublinear classical and quantum algorithms for the approximate Carathéodory problem and the Lq-margin support vector machines as applications.
  5. In nature, animals with soft body parts demonstrate remarkable control over their shape, such as an elephant trunk wrapping around a tree branch to pick it up. However, most research on robotic manipulators focuses on controlling the end effector, partly because the manipulator’s arm is rigidly articulated. With recent advances in soft robotics research, controlling a soft manipulator into many different shapes will significantly improve the robot’s functionality, such as medical robots morphing their shape to navigate the digestive system and deliver drugs to specific locations. However, controlling the shape of soft robots is challenging due to their highly nonlinear dynamics that are computationally intensive. In this paper, we leverage a physics-informed, data-driven approach using the Koopman operator to realize the shape control of soft robots. We simulate the dynamics of a soft manipulator using a physics-based simulator (PyElastica) to generate the input-output data, which is then used to identify an approximated linear model based on the Koopman operator. We then formulate the shape-control problem as a convex optimization problem that is computationally efficient. Our linear model is over 12 times faster than the physics-based model in simulating the manipulator’s motion. Further, we can control a soft manipulator into different shapes using model predictive control. We envision that the proposed method can be effectively used to control the shapes of soft robots to interact with uncertain environments or enable shape-morphing robots to fulfill diverse tasks.