Foundation models (FMs) have shown immense human-like capabilities for generating digital media. However, foundation models that can freely sense, interact with, and actuate the physical domain are far from being realized. This is because 1) fully covering and analyzing large spaces requires dense deployments of sensors, while 2) events are often localized to small areas, making it difficult for FMs to pinpoint the areas of interest relevant to the current task. We propose FlexiFly, a platform that enables FMs to “zoom in” and analyze relevant areas with higher granularity to better understand the physical environment and carry out tasks. FlexiFly accomplishes this by introducing 1) a novel image segmentation technique that aids in identifying relevant locations and 2) a modular and reconfigurable sensing and actuation drone platform that FMs can actuate to “zoom in” with relevant sensors and actuators. We demonstrate through real smart home deployments that FlexiFly enables FMs and LLMs to complete diverse tasks up to 85% more successfully. FlexiFly is a critical step towards FMs and LLMs that can naturally interface with the physical world.
Connecting Foundation Models with the Physical World using Reconfigurable Drone Agents
Foundation models excel in tasks such as content generation, zero-shot classification, and reasoning. However, they struggle with sensing, interacting, and actuating in the physical world because they depend on a limited set of sensors and actuators to provide timely contextual information or physical interactions. This reliance restricts the system’s adaptability and coverage. To address these issues and create an embodied AI with foundation models (FMs), we introduce the Embodied Reconfigurable Drone Agent (EmbodiedRDA). EmbodiedRDA features a custom drone platform that can autonomously swap payloads to reconfigure itself with a diverse list of sensors and actuators. We designed FM agents to instruct the drone to equip itself with appropriate physical modules, analyze sensor data, make decisions, and control the drone’s actions. This enables the system to perform a variety of tasks in dynamic physical environments, bridging the gap between the digital and physical worlds.
- Award ID(s):
- 1943396
- PAR ID:
- 10592099
- Publisher / Repository:
- ACM
- Date Published:
- ISBN:
- 9798400704895
- Page Range / eLocation ID:
- 1745 to 1747
- Format(s):
- Medium: X
- Location:
- Washington D.C. DC USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
In the field of soft robotics, harnessing the nonlinear dynamics of soft and compliant bodies as a computational resource to enable embodied intelligence and control is known as morphological computation. Physical reservoir computing (PRC) is a true instance of morphological computation wherein a physical nonlinear dynamic system is used as a fixed reservoir to perform complex computational tasks. These dynamic reservoirs can be used to approximate nonlinear dynamical systems and even perform machine learning tasks. By numerical simulation, this study illustrates that an origami meta-material can also be used as a dynamic reservoir for pattern generation, output modulation, and input sensing. These results could pave the way for intelligently designed origami-based robots that interact with the environment through a distributed network of sensors and actuators. This embodied intelligence will enable the next generations of soft robots to autonomously coordinate and modulate their activities, such as locomotion gait generation and limb manipulation, while resisting external disturbances.
-
There has been immense growth in sensors, actuators, and smart devices in recent years, which enable us to better sense, actuate, and understand the physical world. Despite this growth, we have yet to achieve fully intelligent environments. This is, in part, due to the large number of different organizations creating smart devices with proprietary technologies and communication protocols that are not compatible with each other and require significant engineering to incorporate and adapt to specific applications. In this work, we present an easy-to-install and low-cost embedded platform that allows users to rapidly configure a mixture of sensors and actuators. The system is based on the commonly used Raspberry Pi ecosystem, is easily configurable, and does not require users to have prior knowledge of programming, which allows anyone, regardless of background, to use it. We also introduce a battery-powered wireless extension module that is suitable for mobile drone applications, where a cord-powered Raspberry Pi is not suitable. We demonstrate the impact our system has on enabling drones with flexible sensing modalities and creating smarter environments by integrating our platform into a variety of intelligent home applications.
-
This paper addresses the problem of dynamic allocation of robot resources to tasks with hierarchical representations and multiple types of execution constraints, with the goal of enabling single-robot multitasking capabilities. Although the vast majority of robot platforms are equipped with more than one sensor (cameras, lasers, sonars) and several actuators (wheels/legs, two arms), which would in principle allow the robot to concurrently work on multiple tasks, existing methods are limited to allocating robots in their entirety to only one task at a time. This approach employs only a subset of a robot's sensors and actuators, leaving other robot resources unused. Our aim is to enable a robot to make full use of its capabilities by having an individual robot multitask, distributing its sensors and actuators to multiple concurrent activities. We propose a new architectural framework based on Hierarchical Task Trees that supports multitasking through a new representation of robot behaviors that explicitly encodes the robot resources (sensors and actuators) and the environmental conditions needed for execution. This architecture was validated on a two-arm, mobile, PR2 humanoid robot, performing tasks with multiple types of execution constraints.
-
Characterizing the computational demand of Cyber-Physical Systems (CPS) is critical for guaranteeing that multiple hard real-time tasks may be scheduled on shared resources without missing deadlines. In a CPS involving repetition, such as industrial automation systems found in chemical process control or robotic manufacturing, sensors and actuators used as part of the industrial process may be conditionally enabled (and disabled) as a sequence of repeated steps is executed. In robotic manufacturing, for example, these steps may be the movement of a robotic arm through some trajectories followed by activation of end-effector sensors and actuators at the end of each completed motion. The conditional enabling of sensors and actuators produces a sequence of Monotonically Ascending Execution times (MAE) with lower WCET when the sensors are disabled and higher WCET when enabled. Since these systems may have several predefined steps to follow before repeating the entire sequence, each unique step may result in several consecutive sequences of MAE. The repetition of these unique sequences of MAE results in a repeating WCET sequence. In the absence of an efficient demand characterization technique for repeating WCET sequences composed of subsequences with monotonically increasing execution time, this work proposes a new task model to describe the behavior of real-world systems which generate large repeating WCET sequences with subsequences of monotonically increasing execution times. In comparison to the most applicable current model, the Generalized Multiframe model (GMF), an empirically and theoretically faster method for characterizing the demand is provided. The demand characterization algorithm is evaluated through a case study of a robotic arm and a simulation of 10,000 randomly generated tasks where, on average, the proposed approach is 231 and 179 times faster than the state-of-the-art in the case study and simulation, respectively.

