

This content will become publicly available on June 8, 2024

Title: Learning Constrained Corner Node Trajectories of a Tether Net System for Space Debris Capture
The Earth's orbit is becoming increasingly crowded with debris that poses significant safety risks to the operation of existing and new spacecraft and satellites. The active tether-net system, consisting of a flexible net with maneuverable corner nodes launched from a small autonomous spacecraft, is a promising solution for capturing and disposing of such space debris. The requirements of autonomous operation and of generalizing over debris scenarios with different rotational rates make the capture process significantly challenging. The space debris could rotate about multiple axes, which, along with sensing/estimation and actuation uncertainties, calls for a robust, generalizable approach to guiding the net launch and flight, one that can guarantee robust capture. Building on prior work in designing and controlling this tether-net system, this paper proposes a decentralized actuation system combined with reinforcement learning. In this new system, four thruster-equipped microsatellites, acting as maneuverable units (MUs), serve as the corner nodes of the net and can thus help control its flight after launch; the MUs pull the net to complete the task of approaching and capturing the space debris. The proposed method uses a reinforcement learning framework that integrates proximal policy optimization (PPO) to find the optimal solution, based on the dynamics simulation of the net and the MUs in Vortex Studio. The reinforcement learning framework finds optimal trajectories that are both energy-efficient and ensure a desired level of capture quality.
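The abstract names proximal policy optimization as the learning algorithm. As an illustration only (not the authors' implementation, which trains against the Vortex Studio net dynamics), here is a minimal sketch of the PPO clipped surrogate objective that such a framework optimizes; all names and values are placeholders:

```python
# Minimal sketch of the PPO clipped surrogate loss (illustrative, not the
# paper's code). Given probability ratios pi_new/pi_old and advantage
# estimates for a batch of sampled corner-node actions, PPO maximizes the
# minimum of the unclipped and clipped terms, which limits the policy update.

def ppo_clip_loss(ratios, advantages, eps=0.2):
    """Average (negated) clipped surrogate objective over a batch.

    ratios     -- pi_new(a|s) / pi_old(a|s) for each sampled action
    advantages -- advantage estimates for the same samples
    eps        -- clipping parameter (0.2 is a common default)
    """
    total = 0.0
    for r, adv in zip(ratios, advantages):
        clipped = max(min(r, 1.0 + eps), 1.0 - eps)
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(r * adv, clipped * adv)
    return -total / len(ratios)  # negated so an optimizer can minimize it
```

For example, a ratio of 2.0 with advantage 1.0 is clipped to 1.2, so large policy steps gain no extra credit from the objective.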
Award ID(s):
2128578
NSF-PAR ID:
10453051
Author(s) / Creator(s):
Date Published:
Journal Name:
AIAA AVIATION 2023 Forum
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Su, Zhongqing; Limongelli, Maria Pina; Glisic, Branko (Eds.)
    The battery-powered wireless sensor network (WSN) is a promising solution for structural health monitoring (SHM) applications because of its low cost and easy installation. However, long-term WSN operation suffers from several practical concerns: uneven battery degradation across wireless sensors, the associated battery management and replacement requirements, and the need to ensure the desired quality of service (QoS) of the WSN. Battery life is one of the biggest limiting factors for long-term WSN operation. Considering the costly maintenance trips required for battery replacement, a lack of effective battery degradation management at the system level can lead to a failure of WSN operation. Moreover, the QoS needs to be ensured under various practical uncertainties, and optimally selecting a maximal number of nodes in the WSN under those uncertainties is a critical task. This study proposes a reinforcement learning (RL)-based framework for active control of battery degradation at the WSN system level, with the aim of enabling group battery replacement while extending the service life and ensuring the QoS of the WSN. A comprehensive simulation environment was developed for a real-life WSN setup, i.e., a WSN for SHM of a cable-stayed bridge, considering various practical uncertainties. The RL agent was trained in this environment to learn optimal node selections and duty cycles while managing battery health at the network level. A mode-shape-based quality index is proposed for the demonstration. The training and test results showed the prominence of the proposed framework in achieving effective battery health management of the WSN for SHM.
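The abstract above describes an RL environment whose actions set node duty cycles and whose reward balances QoS against uneven battery degradation. A toy sketch of how such an environment step could look (all rates, the QoS proxy, and the reward shape are invented placeholders, not the paper's model):

```python
# Illustrative one-step WSN battery environment (not the paper's simulator).
# Action: per-node duty cycles in [0, 1]. Batteries drain with duty cycle;
# reward trades a crude QoS proxy against uneven degradation across nodes.

def step(batteries, duty_cycles, drain_per_cycle=0.01, idle_drain=0.001):
    """Advance one monitoring period; all rates are made-up placeholders."""
    new_batteries = []
    active = 0
    for level, duty in zip(batteries, duty_cycles):
        drain = idle_drain + duty * drain_per_cycle  # sensing costs energy
        level = max(0.0, level - drain)
        new_batteries.append(level)
        if level > 0.0 and duty > 0.0:
            active += 1
    qos = active / len(batteries)                    # fraction of live, sensing nodes
    spread = max(new_batteries) - min(new_batteries) # uneven-degradation penalty
    reward = qos - spread
    return new_batteries, reward
```

An RL agent trained against repeated calls to such a `step` would learn duty-cycle schedules that keep battery levels even, which is the precondition for replacing all batteries in one group trip.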
    This work validates lumped-parameter models and cable-based models for nets against data from a parabolic flight experiment. The capabilities of a simulator based on Vortex Studio, a multibody dynamics simulation framework, are expanded by introducing i) a lumped-parameter model of the net with lumped masses placed along the threads and ii) a flexible-cable-based model, both of which enable collision detection with thin bodies. An experimental scenario is recreated in simulation, and the deployment and capture phases are analyzed. Good agreement with experiments is observed in both phases, with differences primarily due to imperfect knowledge of the experimental initial conditions. It is demonstrated that both a lumped-parameter model with inner nodes and a cable-based model can enable the detection of collisions between the net and thin geometries of the target. While both models notably improve capture realism compared to a lumped-parameter model with no inner nodes, the cable-based model is found to be the more computationally efficient of the two. The effect of modeling thread-to-thread collisions (i.e., collisions among parts of the net) is analyzed and determined to be negligible during deployment and initial target wrapping. The results of this work validate the models and increase confidence in the practicality of this simulator as a tool for research on net-based capture of debris. A cable-based model is validated for the first time in the literature.
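To make the lumped-parameter idea concrete, here is a deliberately tiny one-dimensional analogue: net threads modeled as springs connecting point masses, advanced with explicit Euler integration. All parameter values are arbitrary placeholders and this is not the validated Vortex Studio model:

```python
# Toy 1-D lumped-parameter chain (illustrative only): point masses joined by
# identical linear springs, integrated with explicit Euler steps.

def spring_force(x1, x2, rest_len, k):
    """Force on the mass at x1 from a spring to the mass at x2."""
    stretch = (x2 - x1) - rest_len
    return k * stretch

def integrate(positions, velocities, rest_len, k, mass, dt, steps):
    """Advance a chain of lumped masses joined by identical springs."""
    for _ in range(steps):
        forces = [0.0] * len(positions)
        for i in range(len(positions) - 1):
            f = spring_force(positions[i], positions[i + 1], rest_len, k)
            forces[i] += f       # pulled toward the neighbor
            forces[i + 1] -= f   # equal and opposite reaction
        for i in range(len(positions)):
            velocities[i] += forces[i] / mass * dt
            positions[i] += velocities[i] * dt
    return positions
```

Placing inner nodes along each thread, as the abstract describes, amounts to adding more masses per chain, which is what enables collisions with thin target geometries to be detected between net intersections.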
  3.
    Autonomous robotic experimentation strategies are rapidly rising in use because, without the need for user intervention, they can efficiently and precisely converge onto optimal intrinsic and extrinsic synthesis conditions for a wide range of emerging materials. However, as material syntheses become more complex, the meta-decisions of the artificial intelligence (AI)-guided decision-making algorithms used in autonomous platforms become more important. In this work, a surrogate model is developed using data from over 1000 in-house syntheses of metal halide perovskite quantum dots in a self-driven modular microfluidic material synthesizer. The model is designed to represent the global failure rate, unfeasible regions of the synthesis space, synthesis ground truth, and sampling noise of a real robotic material synthesis system with multiple output parameters (peak emission, emission linewidth, and quantum yield). With this model, over 150 AI-guided decision-making strategies within a single-period-horizon reinforcement learning framework are automatically explored across more than 600,000 simulated experiments, the equivalent of 7.5 years of continuous robotic operation and 400 L of reagents, to identify the most effective methods for accelerated materials development with multiple objectives. Specifically, the structure and meta-decisions of an ensemble neural network-based material development strategy are investigated, which offers a favorable technique for intelligently and efficiently navigating a complex material synthesis space with multiple targets. The developed ensemble neural network-based decision-making algorithm enables more efficient material formulation optimization than well-established algorithms in an environment with no prior information.
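The core mechanism behind an ensemble-based experiment-selection strategy like the one described can be sketched in a few lines: several surrogate models score each candidate condition, and the next experiment maximizes an upper-confidence-style combination of the mean prediction and the ensemble disagreement. The models below are stand-in callables, not the paper's neural networks:

```python
# Sketch of ensemble-based candidate selection (illustrative, not the
# paper's algorithm). Each "model" is any callable scoring a candidate;
# disagreement among models serves as an exploration bonus.

def mean_and_spread(predictions):
    """Mean and standard deviation of a list of model predictions."""
    m = sum(predictions) / len(predictions)
    var = sum((p - m) ** 2 for p in predictions) / len(predictions)
    return m, var ** 0.5

def pick_next(candidates, ensemble, explore=1.0):
    """Return the candidate maximizing mean + explore * spread."""
    best, best_score = None, float("-inf")
    for c in candidates:
        m, s = mean_and_spread([model(c) for model in ensemble])
        score = m + explore * s
        if score > best_score:
            best, best_score = c, score
    return best
```

With `explore=0` this reduces to pure exploitation of the ensemble mean; raising it biases selection toward regions where the surrogate models disagree, which is where new experiments are most informative.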
    Cameras are increasingly being deployed in cities, enterprises, and roads worldwide to enable many applications in public safety, intelligent transportation, retail, healthcare, and manufacturing. Often, after the initial deployment of the cameras, the environmental conditions and the scenes around them change, and our experiments show that these changes can adversely impact the accuracy of insights from video analytics. This is because the camera parameter settings, though optimal at deployment time, are no longer the best settings for good-quality video capture once the environmental conditions and scenes around a camera change during operation, and capturing poor-quality video degrades the accuracy of analytics. To mitigate this loss in accuracy, we propose APT, a novel reinforcement learning-based system that dynamically and remotely (over 5G networks) tunes the camera parameters to ensure high-quality video capture, restoring the accuracy of insights when environmental conditions or scene content change. APT uses reinforcement learning with no-reference perceptual quality estimation as the reward function. We conducted extensive real-world experiments in which we deployed two cameras side by side overlooking an enterprise parking lot: one camera used only the manufacturer-suggested default settings, while the other was dynamically tuned by APT during operation. Our experiments demonstrated that, due to dynamic tuning by APT, the analytics insights were consistently better at all times of the day: the accuracy of an object-detection video analytics application improved on average by ∼42%. Since our reward function is independent of any analytics task, APT can be readily used for different video analytics tasks.
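The tuning loop described above, which uses a quality score as the reward for adjusting camera parameters, can be illustrated with a toy one-parameter version. The quality function and step sizes below are invented placeholders; APT's actual agent and no-reference quality estimator are far richer:

```python
# Toy illustration of quality-driven parameter tuning (not APT's algorithm):
# treat one camera parameter as a knob and greedily hill-climb toward a
# higher score from a stand-in quality function.

def tune(quality, knob, step=1.0, iters=20):
    """Greedy one-parameter tuner: move the knob toward higher quality."""
    for _ in range(iters):
        here = quality(knob)
        up, down = quality(knob + step), quality(knob - step)
        if up > here and up >= down:
            knob += step
        elif down > here:
            knob -= step
        else:
            break  # local optimum for this step size
    return knob
```

The key property the abstract highlights carries over even to this sketch: because the reward is a quality score rather than the output of any particular analytics task, the same loop works regardless of which downstream application consumes the video.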
    Autonomous Mobility-on-Demand (AMoD) systems represent an attractive alternative to existing transportation paradigms, which are currently challenged by urbanization and increasing travel needs. By centrally controlling a fleet of self-driving vehicles, these systems provide mobility service to customers and are starting to be deployed in a number of cities around the world. Current learning-based approaches for controlling AMoD systems are limited to the single-city scenario, whereby the service operator is allowed to take an unlimited number of operational decisions within the same transportation system. However, real-world system operators can hardly afford to fully re-train AMoD controllers for every city they operate in, as this could result in a high number of poor-quality decisions during training, making the single-city strategy a potentially impractical solution. To address these limitations, we propose to formalize the multi-city AMoD problem through the lens of meta-reinforcement learning (meta-RL) and devise an actor-critic algorithm based on recurrent graph neural networks. In our approach, AMoD controllers are explicitly trained such that a small amount of experience within a new city produces good system performance. Empirically, we show that control policies learned through meta-RL achieve near-optimal performance on unseen cities by adapting rapidly, making them more robust not only to novel environments but also to distribution shifts common in real-world operations, such as special events, unexpected congestion, and dynamic pricing schemes.
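The meta-RL intuition above, learning an initialization that adapts to a new city in a few updates rather than training from scratch, can be shown with a heavily simplified Reptile-style toy on a scalar "policy parameter". This is only a sketch of the meta-learning principle, not the paper's recurrent-graph-network actor-critic:

```python
# Reptile-style meta-learning toy (illustrative only). Each "city" has a
# scalar optimum; the inner loop adapts the parameter to one city, and the
# outer loop nudges the shared initialization toward each adapted solution.

def adapt(theta, city_optimum, lr=0.25, steps=3):
    """Inner loop: a few gradient steps on loss (theta - optimum)^2."""
    for _ in range(steps):
        theta -= lr * 2.0 * (theta - city_optimum)
    return theta

def meta_train(theta, city_optima, meta_lr=0.1, rounds=200):
    """Outer loop: move the initialization toward each city's adapted value."""
    for _ in range(rounds):
        for opt in city_optima:
            theta += meta_lr * (adapt(theta, opt) - theta)
    return theta
```

The learned initialization ends up between the per-city optima, so a few inner-loop steps suffice to specialize it to any one city, which mirrors the abstract's claim that a small amount of experience in a new city yields good performance.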