Title: Do More with Less: Single-Model, Multi-Goal Architectures for Resource-Constrained Robots
Deep learning methods are widely used in robotic applications. By learning from prior experience, a robot can abstract knowledge of the environment and use this knowledge to accomplish different goals, such as object search, frontier exploration, or scene understanding, with fewer resources than would be needed without it. Most existing methods require a significant amount of sensing, which carries significant costs in power consumption for acquisition and processing, and they typically rely on models tuned to each specific goal, so each model must be trained, stored, and run separately. These issues are particularly important in resource-constrained settings, such as small-scale robots or long-duration missions. We propose a single, multi-task deep learning architecture that takes advantage of the structure of the partially observed environment to predict different abstractions of the environment (thus reducing the need for rich sensing), and to leverage these predictions to simultaneously achieve different high-level goals (thus sharing computation between goals). As an example application of the proposed architecture, we consider a robot equipped with a 2-D laser scanner and an object detector, tasked with searching for an object (such as an exit) in a residential building while constructing a topological map that can be used for future missions. The prior knowledge of the environment is encoded using a U-Net deep network architecture. In this context, our work leads to an object search algorithm that is complete and that outperforms a more traditional frontier-based approach. The topological map we produce uses scene trees to qualitatively represent the environment as a graph at a fraction of the cost of existing SLAM-based solutions. Our results demonstrate that it is possible to extract multi-task semantic information that is useful for navigation and mapping directly from bare-bones, non-semantic measurements.
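
To make the single-model, multi-goal idea concrete, here is a minimal sketch, assuming PyTorch, of a shared U-Net-style encoder-decoder with two lightweight prediction heads operating on a partial 2-D occupancy grid. The layer sizes and the head semantics (completed occupancy and object likelihood) are illustrative assumptions, not the authors' released architecture.

    # Hedged sketch: one shared U-Net backbone, two task heads.
    import torch
    import torch.nn as nn

    def conv_block(c_in, c_out):
        return nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

    class MultiGoalUNet(nn.Module):
        def __init__(self, in_ch=1, base=16):
            super().__init__()
            self.enc1 = conv_block(in_ch, base)
            self.enc2 = conv_block(base, base * 2)
            self.pool = nn.MaxPool2d(2)
            self.bott = conv_block(base * 2, base * 4)
            self.up2 = nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2)
            self.dec2 = conv_block(base * 4, base * 2)
            self.up1 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
            self.dec1 = conv_block(base * 2, base)
            # Two cheap heads share all encoder/decoder computation.
            self.head_map = nn.Conv2d(base, 1, 1)  # assumed: completed occupancy
            self.head_obj = nn.Conv2d(base, 1, 1)  # assumed: object likelihood

        def forward(self, x):
            e1 = self.enc1(x)
            e2 = self.enc2(self.pool(e1))
            b = self.bott(self.pool(e2))
            d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
            d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
            return torch.sigmoid(self.head_map(d1)), torch.sigmoid(self.head_obj(d1))

    # Usage: a 128x128 partial occupancy grid in, two per-cell predictions out.
    net = MultiGoalUNet()
    map_pred, obj_pred = net(torch.rand(1, 1, 128, 128))

Sharing the backbone is what lets the different goals reuse a single forward pass, which is the computational saving the abstract refers to.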
Award ID(s):
2212051
PAR ID:
10512587
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems
ISSN:
2153-0858
ISBN:
978-1-6654-9190-7
Page Range / eLocation ID:
1940 to 1946
Format(s):
Medium: X
Location:
Detroit, MI, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper proposes a bandwidth-tunable technique for real-time probabilistic scene modeling and mapping to enable co-robotic exploration in communication-constrained environments such as the deep sea. The parameters of the system enable the user to characterize the scene complexity represented by the map, which in turn determines the bandwidth requirements. The approach is demonstrated using an underwater robot that learns an unsupervised scene model of the environment and then uses this scene model to communicate the spatial distribution of various high-level semantic scene constructs to a human operator. Preliminary experiments in an artificially constructed tank environment, as well as simulated missions over a 10 m x 10 m coral reef using real data, show the tunability of the maps to different bandwidth constraints and science interests. To our knowledge this is the first paper to quantify how the free parameters of the unsupervised scene model impact both the scientific utility of the resulting scene model and the bandwidth required to communicate it.
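
As a rough illustration of how scene complexity translates into bandwidth, the sketch below estimates the message size of a gridded semantic map as a function of cell count and the number of learned scene categories; the per-cell label encoding is an assumption made for illustration, not the paper's actual representation.

    # Hedged back-of-the-envelope: bits needed to transmit a labeled grid map.
    import math

    def map_message_bits(cells_x, cells_y, num_categories):
        bits_per_cell = math.ceil(math.log2(num_categories))
        return cells_x * cells_y * bits_per_cell

    # A 10 m x 10 m area at 0.5 m cells, with 8 vs. 2 scene categories.
    print(map_message_bits(20, 20, 8))  # 1200 bits
    print(map_message_bits(20, 20, 2))  # 400 bits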
  2. We consider the problem of time-limited robotic exploration in previously unseen environments, where exploration is limited to a predefined amount of time. We propose a novel exploration approach using learning-augmented model-based planning. We generate a set of subgoals associated with frontiers on the current map and derive a Bellman equation for exploration with these subgoals. Visual sensing and advances in semantic mapping of indoor scenes are exploited to train a deep convolutional neural network that estimates properties associated with each frontier: the expected unobserved area beyond the frontier and the expected time steps (discretized actions) required to explore it. The proposed model-based planner is guaranteed to explore the whole scene if time permits. We thoroughly evaluate our approach on a large-scale pseudo-realistic indoor dataset (Matterport3D) with the Habitat simulator, and compare it with classical and more recent RL-based exploration methods. Our approach surpasses the greedy strategies by 2.1% and the RL-based exploration methods by 8.4% in terms of coverage.
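
One way to picture a Bellman equation over frontier subgoals is the small recursion below, which assumes a learned model has already supplied, for each frontier, the expected unobserved area behind it and the expected time steps needed to explore it; the order-independent costs and the names are simplifying assumptions, not the paper's planner.

    # Hedged sketch: value of a set of remaining frontiers under a time budget.
    from functools import lru_cache

    def plan_exploration(frontiers, time_budget):
        # frontiers: list of (expected_area, expected_cost) pairs, costs in steps.
        @lru_cache(maxsize=None)
        def value(remaining, t_left):
            best = 0.0
            for i in remaining:
                area, cost = frontiers[i]
                if cost <= t_left:
                    best = max(best, area + value(remaining - {i}, t_left - cost))
            return best
        return value(frozenset(range(len(frontiers))), time_budget)

    # Three frontiers with (area, time-cost) estimates and 10 steps remaining.
    print(plan_exploration([(12.0, 4), (7.5, 3), (20.0, 8)], 10))  # 20.0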
  3. We present the Semantic Robot Programming (SRP) paradigm as a convergence of robot programming by demonstration and semantic mapping. In SRP, a user can directly program a robot manipulator by demonstrating a snapshot of their intended goal scene in the workspace. The robot then parses this goal as a scene graph composed of object poses and inter-object relations, assuming known object geometries. Task and motion planning is then used to realize the user's goal from an arbitrary initial scene configuration. Even when faced with different initial scene configurations, SRP enables the robot to seamlessly adapt to reach the user's demonstrated goal. For scene perception, we propose the Discriminatively-Informed Generative Estimation of Scenes and Transforms (DIGEST) method to infer the initial and goal states of the world from RGBD images. The efficacy of SRP with DIGEST perception is demonstrated for the task of tray-setting with a Michigan Progress Fetch robot. Scene perception and task execution are evaluated with a public household occlusion dataset and our cluttered scene dataset.
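
A minimal sketch of the kind of goal representation SRP parses is shown below: a scene graph whose nodes carry object poses and whose edges carry inter-object relations. Class and relation names are illustrative assumptions, not the paper's API.

    # Hedged sketch: a goal scene graph built from a demonstrated scene.
    from dataclasses import dataclass, field

    @dataclass
    class ObjectNode:
        name: str
        pose: tuple  # assumed (x, y, z, qx, qy, qz, qw) in the workspace frame

    @dataclass
    class SceneGraph:
        objects: dict = field(default_factory=dict)    # name -> ObjectNode
        relations: list = field(default_factory=list)  # (subject, relation, object)

        def add_object(self, name, pose):
            self.objects[name] = ObjectNode(name, pose)

        def add_relation(self, subj, rel, obj):
            self.relations.append((subj, rel, obj))

    # Example goal: a demonstrated tray-setting scene.
    goal = SceneGraph()
    goal.add_object("tray", (0.5, 0.0, 0.02, 0, 0, 0, 1))
    goal.add_object("mug", (0.55, 0.10, 0.05, 0, 0, 0, 1))
    goal.add_relation("mug", "on", "tray")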
  4. Humans often use natural language instructions to control and interact with robots for task execution. This poses a big challenge to robots, which must not only parse and understand human instructions but also build a semantic understanding of an unknown environment and its constituent elements. To address this challenge, this study presents a vision-language model (VLM)-driven approach to scene understanding of an unknown environment to enable robotic object manipulation. Given language instructions, a pretrained vision-language model built on the open-sourced Llama2-chat (7B) language model backbone is adopted for image description and scene understanding, translating visual information into text descriptions of the scene. Next, a zero-shot approach to fine-grained visual grounding and object detection is developed to extract and localize the objects of interest from the scene. After 3D reconstruction and pose estimation of the object, a code-writing large language model (LLM) is adopted to generate high-level control code and link language instructions with robot actions for downstream tasks. The performance of the developed approach is experimentally validated through table-top object manipulation by a robot.
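
The staged flow described above can be pictured with the sketch below, in which every function is a hypothetical local placeholder (the actual VLM, grounding model, and code-writing LLM interfaces are not reproduced here): describe the scene, ground the object of interest, then assemble a prompt asking a code-writing LLM for high-level robot actions.

    # Hedged sketch: placeholder stages standing in for the VLM/LLM pipeline.
    def describe_scene(image):
        # Placeholder for the image-description step (VLM with an LLM backbone).
        return "a red mug and a blue plate on a wooden table"

    def ground_object(image, query):
        # Placeholder for zero-shot grounding/detection; returns a 2-D box.
        return {"label": query, "box": (120, 80, 200, 160)}

    def build_codegen_prompt(instruction, scene_text, detection):
        return (f"Scene: {scene_text}\n"
                f"Target: {detection['label']} at pixels {detection['box']}\n"
                f"Instruction: {instruction}\n"
                "Write high-level robot actions (pick, place, move_to) as Python calls.")

    image = None  # stands in for an RGB-D frame
    prompt = build_codegen_prompt("Put the red mug on the blue plate.",
                                  describe_scene(image),
                                  ground_object(image, "red mug"))
    print(prompt)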
  5. We present a filtering-based method for semantic mapping that simultaneously detects objects and localizes their 6-degree-of-freedom pose. In our method, called Contextual Temporal Mapping (CT-Map), we represent the semantic map as a belief over object classes and poses across an observed scene. Inference for the semantic mapping problem is then modeled as a Conditional Random Field (CRF). CT-Map is a CRF that considers two forms of relationship potentials, accounting for contextual relations between objects and temporal consistency of object poses, as well as a measurement potential on observations. A particle filtering algorithm is then proposed to perform inference in the CT-Map model. We demonstrate the efficacy of CT-Map with a Michigan Progress Fetch robot equipped with an RGB-D sensor. Our results demonstrate that the particle-filter-based inference of CT-Map provides improved object detection and pose estimation with respect to baseline methods that treat observations as independent samples of a scene.
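
A minimal sketch of the weighted-resampling step such a particle filter performs is given below; the Gaussian measurement and temporal-consistency potentials, and the 2-D position state, are simplifying assumptions rather than CT-Map's actual model.

    # Hedged sketch: one particle-filter update combining two potentials.
    import numpy as np

    def pf_update(particles, observation, prev_estimate, sigma_meas=0.1, sigma_temp=0.3):
        # particles: (N, 2) candidate planar object positions.
        d_meas = np.linalg.norm(particles - observation, axis=1)
        d_temp = np.linalg.norm(particles - prev_estimate, axis=1)
        # Measurement potential times temporal-consistency potential.
        w = np.exp(-0.5 * (d_meas / sigma_meas) ** 2) * np.exp(-0.5 * (d_temp / sigma_temp) ** 2)
        w /= w.sum()
        idx = np.random.choice(len(particles), size=len(particles), p=w)
        return particles[idx]  # resampled belief over the object pose

    particles = np.random.default_rng(0).uniform(0.0, 1.0, size=(500, 2))
    updated = pf_update(particles, observation=np.array([0.6, 0.4]),
                        prev_estimate=np.array([0.55, 0.45]))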