

Search for: All records

Creators/Authors contains: "Aloimonos, Yiannis"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. DeCroon, Guido (Ed.)
    Abstract: Imagine sitting at your desk, looking at objects on it. You do not know their exact distances from your eye in meters, but you can immediately reach out and touch them. Instead of an externally defined unit, your sense of distance is tied to your action's embodiment. In contrast, conventional robotics relies on precise calibration to external units, with which vision and control processes communicate. We introduce Embodied Visuomotor Representation, a methodology for inferring distance in a unit implied by action. With it, a robot without knowledge of its size, environmental scale, or strength can quickly learn to touch and clear obstacles within seconds of operation. Likewise, in simulation, an agent without knowledge of its mass or strength can successfully jump across a gap of unknown size after a few test oscillations. These behaviors mirror natural strategies observed in bees and gerbils, which also lack calibration in an external unit. A toy sketch of this action-based distance estimate follows this entry.
    Free, publicly-accessible full text available December 1, 2026
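    Below is a minimal, hypothetical 1-D sketch of the action-calibrated distance idea the abstract describes; it is not the paper's implementation. It assumes only that an image-size cue scales inversely with distance (with an unknown constant k) and that each motor "push" covers an unknown but repeatable length; the agent then expresses distance purely in pushes, never in meters.

    ```python
    # Toy illustration (hypothetical values throughout): distance measured
    # in units of the agent's own action, without metric calibration.

    def image_size(d, k=123.4):
        """Pretend camera cue: inversely proportional to distance; k is hidden."""
        return k / d

    d = 5.0            # true distance in meters, unknown to the agent
    push = 0.8         # meters covered per push, also unknown

    s_before = image_size(d)
    d -= push          # one exploratory push toward the target
    s_after = image_size(d)

    # 1/s measures distance in units of k; its change over one push is the
    # push length in the same units, so the ratio is distance in pushes.
    pushes_left = (1.0 / s_after) / (1.0 / s_before - 1.0 / s_after)
    print(f"target is ~{pushes_left:.2f} pushes away")  # (5.0 - 0.8) / 0.8 = 5.25
    ```

    The point of the ratio is that the unknown constants (k and the push length) cancel, mirroring the abstract's claim that the unit of distance is implied by the action itself.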
  2. Video Frame Interpolation aims to recover realistic missing frames between observed frames, generating a high-frame-rate video from a low-frame-rate video. However, without additional guidance, the large motion between frames makes this problem ill-posed. Event-based Video Frame Interpolation (EVFI) addresses this challenge by using sparse, high-temporal-resolution event measurements as motion guidance. This guidance allows EVFI methods to significantly outperform frame-only methods. However, to date, EVFI methods have relied on a limited set of paired event-frame training data, severely limiting their performance and generalization capabilities. In this work, we overcome the limited-data challenge by adapting pre-trained video diffusion models trained on internet-scale datasets to EVFI. We experimentally validate our approach on real-world EVFI datasets, including a new one that we introduce. Our method outperforms existing methods and generalizes across cameras far better than existing approaches. A sketch of a typical event-conditioning input follows this entry.
    Free, publicly-accessible full text available June 21, 2026
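    The abstract does not spell out how events are fed to the diffusion model, so the following is a hedged sketch of one common conditioning input in event-based vision: a time-binned voxel grid of events accumulated between the two observed frames. The function name, sensor resolution, and bin count are illustrative only.

    ```python
    import numpy as np

    def event_voxel_grid(events, t0, t1, bins=5, h=260, w=346):
        """Bin events, given as (N, 4) rows of (t, x, y, polarity in {-1, +1}),
        into a (bins, h, w) grid spanning the frame interval [t0, t1]."""
        grid = np.zeros((bins, h, w), dtype=np.float32)
        t, x, y, p = events.T
        # map each timestamp to a temporal bin within the interval
        b = np.clip(((t - t0) / (t1 - t0) * bins).astype(int), 0, bins - 1)
        # accumulate signed polarities at each (bin, row, col) location
        np.add.at(grid, (b, y.astype(int), x.astype(int)), p)
        return grid
    ```

    Such a grid preserves the events' high temporal resolution in a dense tensor that a pre-trained video diffusion backbone can consume as extra guidance channels.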
  3. Salakhutdinov, Ruslan; Kolter, Zico; Heller, Katherine; Weller, Adrian; Oliver, Nuria; Scarlett, Jonathan; Berkenkamp, Felix (Ed.)
    We propose VecKM, a local point cloud geometry encoder that is descriptive and efficient to compute. VecKM leverages a unique approach by vectorizing a kernel mixture to represent the local point cloud. The representation's descriptiveness is supported by two theorems that validate its ability to reconstruct and preserve the similarity of the local shape. Unlike existing encoders, which down-sample the local point cloud, VecKM constructs the local geometry encoding using all neighboring points, producing a more descriptive encoding. Moreover, VecKM is efficient to compute and scalable to large point cloud inputs: it reduces the memory cost from O(n² + nKd) to O(nd + np), and reduces the major runtime cost from computing nK MLPs to n MLPs, where n is the size of the point cloud, K is the neighborhood size, d is the encoding dimension, and p is a marginal factor. The efficiency is due to VecKM's unique factorizable property, which eliminates the need to explicitly group points into neighborhoods. In the normal estimation task, VecKM demonstrates not only 100× faster inference but also the highest accuracy and strongest robustness. In classification and segmentation tasks, integrating VecKM as a preprocessing module achieves consistently better performance than the PointNet, PointNet++, and point transformer baselines, and runs consistently faster, by up to 10 times. A simplified sketch of this factorization follows this entry.
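    The factorizable property mentioned above can be illustrated with random Fourier-style features: a sum of complex exponentials of relative positions factors into each point's conjugate feature times a plain neighborhood sum. The sketch below is a simplified variant, not the authors' code; the dense adjacency matrix is used only for readability, whereas VecKM's claimed memory savings come from never materializing per-neighborhood tensors.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 1024, 128
    X = rng.normal(size=(n, 3)).astype(np.float32)   # point cloud
    W = rng.normal(size=(3, d)).astype(np.float32)   # random frequencies

    F = np.exp(1j * (X @ W))                         # per-point features, (n, d)

    # Neighborhood indicator (radius graph); dense here for illustration only.
    A = np.linalg.norm(X[:, None] - X[None, :], axis=-1) < 0.5

    # sum over j in N(i) of exp(i W^T (x_j - x_i))
    #   = conj(F_i) * sum over j in N(i) of F_j,
    # so one matrix product replaces explicit (n, K, d) neighbor grouping.
    G = np.conj(F) * (A @ F)
    G /= np.abs(G).clip(min=1e-8)                    # normalize the encoding
    ```

    Because the neighborhood sum is a single product with F, the encoding touches every neighboring point without ever building an explicit neighbor list, which is the source of the runtime reduction from nK to n MLP evaluations.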
  4. Oysters are a vital keystone species in coastal ecosystems, providing significant economic, environmental, and cultural benefits. As the importance of oysters grows, so does the relevance of autonomous systems for their detection and monitoring. However, current monitoring strategies often rely on destructive methods. While manual identification of oysters from video footage is non-destructive, it is time-consuming, requires expert input, and is further complicated by the challenges of the underwater environment. To address these challenges, we propose a novel pipeline using stable diffusion to augment a collected real dataset with photorealistic synthetic data. This method enhances the dataset used to train a YOLOv10-based vision model. The model is then deployed and tested on an edge platform, the Aqua2 Autonomous Underwater Vehicle (AUV), achieving a state-of-the-art 0.657 mAP@50 for oyster detection. A sketch of this augment-then-train recipe follows this entry.
    Free, publicly-accessible full text available May 28, 2026
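    Below is a hedged sketch of the augment-then-train recipe, using the public diffusers and ultralytics APIs. The model checkpoints, prompt, file paths, and hyperparameters are placeholders rather than the authors' settings, and the paper's method for labeling the synthetic images is not shown here.

    ```python
    from diffusers import StableDiffusionPipeline
    from ultralytics import YOLO

    # 1) Generate photorealistic synthetic underwater imagery.
    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    for i in range(100):
        img = pipe("photorealistic oysters on a turbid seabed").images[0]
        img.save(f"dataset/images/train/synthetic_{i:04d}.jpg")
        # Bounding-box labels for synthetic frames must be produced separately.

    # 2) Train a YOLOv10 detector on the real + synthetic dataset.
    model = YOLO("yolov10n.pt")                      # variant is a guess
    model.train(data="oysters.yaml", epochs=100, imgsz=640)
    print(model.val())                               # reports mAP@50, among others
    ```

    For deployment on an embedded platform such as the Aqua2, the trained weights would typically be exported to an edge-friendly format (e.g., model.export(format="onnx")), though the paper's exact runtime is not specified here.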
  5. The rapidly increasing capabilities of autonomous mobile robots promise to make them ubiquitous in the coming decade. These robots will continue to enhance efficiency and safety in novel applications such as disaster management, environmental monitoring, bridge inspection, and agricultural inspection. To operate autonomously without constant human intervention, even in remote or hazardous areas, robots must sense, process, and interpret environmental data using only onboard sensing and computation. This capability is made possible by advancements in perception algorithms, allowing these robots to rely primarily on their perception capabilities for navigation tasks. However, tiny-robot autonomy is hindered mainly by limits on sensing, memory, and computing imposed by size, area, weight, and power constraints; the bottleneck lies in real-time perception under these constraints. To enable autonomy in robots less than 100 mm in body length, we draw inspiration from tiny organisms such as insects and hummingbirds, known for their sophisticated perception, navigation, and survival abilities despite their minimal sensors and neural systems. This work aims to provide insights into designing a compact and efficient minimal perception framework for tiny autonomous robots, from higher cognitive levels down to lower sensor levels.