In this paper, we examine the robustness of computer vision object detection frameworks in real-world traffic scenarios, with particular emphasis on adverse weather conditions. Conventional evaluation methods often fail to capture the complexities of dynamic traffic environments, an increasingly important consideration as autonomous vehicle technologies advance worldwide. Our investigation focuses on how these algorithms perform under adverse weather such as fog, rain, snow, and sun flare, conditions that substantially degrade detection precision. In particular, we show that an object detection framework that excels in clear weather may struggle in adverse conditions. Our study includes in-depth ablation studies on dual-modality architectures and covers applications such as traffic monitoring, vehicle tracking, and object tracking. The ultimate goal is to improve the safety and efficiency of transportation systems, recognizing the central role robust computer vision plays in future autonomous and intelligent transportation technologies.
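As a rough illustration of this kind of stress test (not the paper's actual evaluation protocol), the sketch below applies synthetic fog, rain, snow, and sun-flare corruptions to a traffic image and compares how many confident detections a pretrained detector returns. It assumes torchvision >= 0.13 and albumentations; the image filename is a placeholder.

```python
# A minimal robustness probe under synthetic weather, for illustration only.
# Assumes torchvision >= 0.13 and albumentations; "traffic_scene.jpg" is a placeholder.
import numpy as np
import torch
import albumentations as A
from PIL import Image
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

# One synthetic corruption per adverse condition named in the abstract.
corruptions = {
    "clear": A.NoOp(p=1.0),
    "fog":   A.RandomFog(p=1.0),
    "rain":  A.RandomRain(p=1.0),
    "snow":  A.RandomSnow(p=1.0),
    "flare": A.RandomSunFlare(p=1.0),
}

def confident_detections(img_rgb: np.ndarray, score_thresh: float = 0.5) -> int:
    """Run the detector once and count boxes above the score threshold."""
    tensor = torch.from_numpy(img_rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = model([tensor])[0]
    return int((out["scores"] > score_thresh).sum())

image = np.array(Image.open("traffic_scene.jpg").convert("RGB"))
for name, aug in corruptions.items():
    corrupted = aug(image=image)["image"]
    print(f"{name:>6}: {confident_detections(corrupted)} confident detections")
```

A drop in the detection count under a given corruption is only a coarse signal; a full evaluation would compare mAP against ground-truth labels per condition.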
Light-Weight Object Detection and Decision Making via Approximate Computing in Resource-Constrained Mobile Robots
Most of the current solutions for autonomous flight in indoor environments rely on purely geometric maps (e.g., point clouds). There has been, however, a growing interest in supplementing such maps with semantic information (e.g., object detections) using computer vision algorithms. Unfortunately, there is a disconnect between the relatively heavy computational requirements of these computer vision solutions and the limited computation capacity available on mobile autonomous platforms. In this paper, we propose to bridge this gap with a novel Markov Decision Process framework that adapts the parameters of the vision algorithms to the incoming video data rather than fixing them a priori. As a concrete example, we test our framework on an object detection and tracking task, showing significant benefits in terms of energy consumption without considerable loss in accuracy, using a combination of publicly available and novel datasets.
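As a rough illustration of the adaptation idea (not the authors' actual MDP formulation), the sketch below chooses the detector's input resolution per frame with tabular Q-learning, trading a confidence-based accuracy proxy against an assumed per-resolution energy cost. The detector, states, costs, and rewards are all invented stand-ins.

```python
# Illustrative sketch: adapt a vision parameter (input resolution) online instead of
# fixing it a priori. The detector and all numeric costs are synthetic placeholders.
import random
from collections import defaultdict, namedtuple

RESOLUTIONS = [160, 320, 640]                   # actions: detector input size (pixels)
ENERGY_COST = {160: 0.2, 320: 0.5, 640: 1.0}    # assumed relative energy per frame

Detection = namedtuple("Detection", "score")

def run_detector(frame, input_size):
    """Stand-in for the real vision module: higher resolution -> higher confidence."""
    base = {160: 0.5, 320: 0.7, 640: 0.85}[input_size]
    return [Detection(min(1.0, base + random.uniform(-0.1, 0.1)))
            for _ in range(random.randint(0, 6))]

def scene_state(num_detections, mean_conf):
    """Coarse state: (busy scene?, low confidence?)."""
    return (num_detections > 3, mean_conf < 0.6)

Q = defaultdict(float)                          # tabular Q-values over (state, action)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

def choose_resolution(state):
    if random.random() < EPS:                   # epsilon-greedy exploration
        return random.choice(RESOLUTIONS)
    return max(RESOLUTIONS, key=lambda a: Q[(state, a)])

def step(frame, state):
    action = choose_resolution(state)
    dets = run_detector(frame, action)
    mean_conf = sum(d.score for d in dets) / max(len(dets), 1)
    reward = mean_conf - ENERGY_COST[action]    # accuracy proxy minus energy proxy
    next_state = scene_state(len(dets), mean_conf)
    best_next = max(Q[(next_state, a)] for a in RESOLUTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    return next_state

state = (False, False)
for _ in range(500):                            # synthetic run; frames unused by the stub
    state = step(frame=None, state=state)
print({a: round(Q[((False, False), a)], 2) for a in RESOLUTIONS})
```

The point of the sketch is the structure of the decision loop (state, action, accuracy/energy trade-off), not the particular learning rule, which the paper defines differently.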
- Award ID(s): 1734454
- PAR ID: 10107917
- Date Published:
- Journal Name: IEEE/RSJ International Conference on Intelligent Robots and Systems
- Page Range / eLocation ID: 6776 to 6781
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In contrast to the vast literature on modeling, perceiving, and understanding agent-object (e.g., human-object, hand-object, robot-object) interaction in computer vision and robotics, very few past works have studied the task of object-object interaction, which also plays an important role in robotic manipulation and planning. There is a rich space of object-object interaction scenarios in our daily life, such as placing an object on a messy tabletop, fitting an object inside a drawer, or pushing an object using a tool. In this paper, we propose a unified affordance learning framework to learn object-object interaction for various tasks. By constructing four object-object interaction task environments using physical simulation (SAPIEN) and thousands of ShapeNet models with rich geometric diversity, we are able to conduct large-scale object-object affordance learning without the need for human annotations or demonstrations. At the core of our technical contribution, we propose an object-kernel point convolution network to reason about detailed interaction between two objects. Experiments on large-scale synthetic data and real-world data demonstrate the effectiveness of the proposed approach.
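As a loose illustration only (this is not the paper's object-kernel point convolution network), the sketch below conditions per-point features of a target object on a pooled global feature of an acting object to score per-point affordance; the encoder, head, and all layer sizes are invented.

```python
# Simplified object-object affordance scorer, for illustration of the general idea.
import torch
import torch.nn as nn

class PointEncoder(nn.Module):
    """Shared PointNet-style encoder: per-point features plus a pooled global feature."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )

    def forward(self, pts):                        # pts: (B, N, 3)
        per_point = self.mlp(pts)                  # (B, N, F)
        global_feat = per_point.max(dim=1).values  # (B, F) symmetric pooling
        return per_point, global_feat

class ObjectObjectAffordance(nn.Module):
    """Score each target-object point, conditioned on the acting object's summary."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.encoder = PointEncoder(feat_dim)
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, target_pts, actor_pts):
        per_point, _ = self.encoder(target_pts)        # features on target points
        _, actor_global = self.encoder(actor_pts)      # acting object summarized globally
        cond = actor_global.unsqueeze(1).expand_as(per_point)
        logits = self.head(torch.cat([per_point, cond], dim=-1))
        return torch.sigmoid(logits.squeeze(-1))       # per-point affordance in [0, 1]

scores = ObjectObjectAffordance()(torch.rand(1, 2048, 3), torch.rand(1, 1024, 3))
print(scores.shape)  # torch.Size([1, 2048])
```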
-
Drones are increasingly popular for collecting behaviour data of group‐living animals, offering inexpensive and minimally disruptive observation methods. Imagery collected by drones can be rapidly analysed using computer vision techniques to extract information, including behaviour classification, habitat analysis and identification of individual animals. While computer vision techniques can rapidly analyse drone‐collected data, the success of these analyses often depends on careful mission planning that considers downstream computational requirements, a critical factor frequently overlooked in current studies.

We present a comprehensive summary of research in the growing AI‐driven animal ecology (ADAE) field, which integrates data collection with automated computational analysis focused on aerial imagery for collective animal behaviour studies. We systematically analyse current methodologies, technical challenges and emerging solutions in this field, from drone mission planning to behavioural inference. We illustrate computer vision pipelines that infer behaviour from drone imagery and present the computer vision tasks used for each step. We map specific computational tasks to their ecological applications, providing a framework for future research design.

Our analysis reveals that AI‐driven animal ecology studies of collective animal behaviour using drone imagery focus on detection and classification computer vision tasks. While convolutional neural networks (CNNs) remain dominant for detection and classification, newer architectures like transformer‐based models and specialized video analysis networks (e.g. X3D, I3D, SlowFast) designed for temporal pattern recognition are gaining traction for pose estimation and behaviour inference. However, reported model accuracy varies widely by computer vision task, species, habitat and evaluation metric, complicating meaningful comparisons between studies.

Based on current trends, we conclude that semi‐autonomous drone missions will be increasingly used to study collective animal behaviour. While manual drone operation remains prevalent, autonomous drone manoeuvres, powered by edge AI, can scale and standardise collective animal behaviour studies while reducing the risk of disturbance and improving data quality. We propose guidelines for AI‐driven animal ecology drone studies adaptable to various computer vision tasks, species and habitats. This approach aims to collect high‐quality behaviour data while minimising disruption to the ecosystem.
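To make the detect, track, and infer-behaviour flow described above concrete, here is a self-contained toy sketch that runs on precomputed per-frame detections. The greedy IoU tracker and the speed-based behaviour rule are simplistic stand-ins for the CNN and temporal models the surveyed studies actually use.

```python
# Toy detect -> track -> behaviour pipeline on precomputed detections (illustrative only).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Detection:
    box: tuple  # (x1, y1, x2, y2) in pixels

@dataclass
class Track:
    track_id: int
    boxes: List[tuple] = field(default_factory=list)

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def track(frames: List[List[Detection]], iou_thresh: float = 0.3) -> List[Track]:
    """Greedy frame-to-frame association on box overlap."""
    tracks, next_id = [], 0
    for dets in frames:
        for d in dets:
            best = max(tracks, key=lambda t: iou(t.boxes[-1], d.box), default=None)
            if best is not None and iou(best.boxes[-1], d.box) >= iou_thresh:
                best.boxes.append(d.box)
            else:
                tracks.append(Track(next_id, [d.box]))
                next_id += 1
    return tracks

def behaviour(t: Track, move_thresh: float = 5.0) -> str:
    """Toy behaviour rule based on mean centroid displacement per frame."""
    cs = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in t.boxes]
    if len(cs) < 2:
        return "unknown"
    step = sum(abs(b[0] - a[0]) + abs(b[1] - a[1])
               for a, b in zip(cs, cs[1:])) / (len(cs) - 1)
    return "moving" if step > move_thresh else "resting"

frames = [[Detection((10, 10, 30, 30))], [Detection((14, 10, 34, 30))]]
print({t.track_id: behaviour(t) for t in track(frames)})
```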
-
Image data plays a pivotal role in the current data-driven era, particularly in applications such as computer vision, object recognition, and facial identification. Google Maps® stands out as a widely used platform that heavily relies on street view images. To fulfill the pressing need for an effective and distributed mechanism for image data collection, we present a framework that utilizes smart contract technology and open-source robots to gather street-view image sequences. The proposed framework also includes a protocol for maintaining these sequences using a private blockchain capable of retaining different versions of street views while ensuring the integrity of collected data. With this framework, Google Maps® data can be securely collected, stored, and published on a private blockchain. By conducting tests with actual robots, we demonstrate the feasibility of the framework and its capability to seamlessly upload privately maintained blockchain image sequences to Google Maps® using the Google Street View® Publish API.
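As a minimal sketch of the integrity idea described above (not the paper's actual protocol, and making no use of the Street View Publish API), each block below records SHA-256 digests of one version of an image sequence together with the previous block's hash, so later tampering with any stored version breaks verification.

```python
# Hash-chained versions of a street-view image sequence, for illustration only.
import hashlib
import json
import time
from dataclasses import dataclass, asdict
from typing import List

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

@dataclass
class Block:
    index: int
    prev_hash: str
    timestamp: float
    image_hashes: List[str]   # digests of the images in this version of the sequence

    def hash(self) -> str:
        return sha256(json.dumps(asdict(self), sort_keys=True).encode())

class SequenceChain:
    def __init__(self):
        self.blocks: List[Block] = []

    def add_version(self, images: List[bytes]) -> Block:
        prev = self.blocks[-1].hash() if self.blocks else "0" * 64
        block = Block(len(self.blocks), prev, time.time(), [sha256(i) for i in images])
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        """Check that every block still links to the current content of its predecessor."""
        return all(self.blocks[i].prev_hash == self.blocks[i - 1].hash()
                   for i in range(1, len(self.blocks)))

chain = SequenceChain()
chain.add_version([b"frame-0001", b"frame-0002"])   # placeholder image bytes
chain.add_version([b"frame-0001", b"frame-0002b"])  # a later, updated street view
assert chain.verify()
```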
-
Computer vision has shown promising potential in wearable robotics applications (e.g., human grasping target prediction and context understanding). However, in practice, the performance of computer vision algorithms is challenged by insufficient or biased training data, observation noise, cluttered backgrounds, etc. By leveraging Bayesian deep learning (BDL), we have developed a novel, reliable vision-based framework to assist upper limb prosthesis grasping during arm reaching. This framework can measure different types of uncertainty from the model and data for grasping target recognition in realistic and challenging scenarios. A probability calibration network was developed to fuse the uncertainty measures into one calibrated probability for online decision making. We formulated the problem as predicting the grasping target during arm reaching. Specifically, we developed a 3-D simulation platform to simulate and analyze the performance of vision algorithms under several common challenging scenarios encountered in practice. In addition, we integrated our approach into a shared control framework of a prosthetic arm and demonstrated its potential to assist human participants with fluent target reaching and grasping tasks.
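A simplified sketch of the uncertainty-plus-calibration idea (not the authors' network or calibration model): Monte Carlo dropout over a toy classifier yields a predictive mean and dispersion per candidate grasp target, and a tiny calibration layer maps those two quantities to a single decision probability. All dimensions are illustrative.

```python
# Illustrative MC-dropout uncertainty with a toy calibration layer (not the paper's model).
import torch
import torch.nn as nn

class GraspTargetNet(nn.Module):
    def __init__(self, in_dim=512, num_targets=5, p_drop=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(128, num_targets),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, n_samples=20):
    """Keep dropout active at inference time and average softmax outputs."""
    model.train()                        # enables dropout during inference
    probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_samples)])
    return probs.mean(0), probs.std(0)   # predictive mean and per-class dispersion

class Calibrator(nn.Module):
    """Maps (mean probability, uncertainty) of the chosen class to one calibrated probability."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, mean_p, std_p):
        return torch.sigmoid(self.fc(torch.stack([mean_p, std_p], dim=-1)))

features = torch.randn(1, 512)           # stand-in for visual features of the scene
mean, std = mc_dropout_predict(GraspTargetNet(), features)
top = mean.argmax(dim=-1)
conf = Calibrator()(mean[0, top], std[0, top])
print(f"target {top.item()}, calibrated confidence {conf.item():.2f}")
```

In practice the calibration layer would be trained on held-out data so that its output probability matches the observed success rate, rather than used with random weights as in this toy run.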