Distributed multi-target tracking is a canonical task for multi-robot systems, encompassing applications from environmental monitoring to disaster response to surveillance. In many situations, the distribution of unknown objects in a search area is irregular: objects tend to appear in clusters rather than being evenly distributed. In this paper, we develop a novel distributed multi-robot multi-target tracking algorithm for effectively tracking clustered targets from noisy measurements. Our algorithm contains two major components. First, both the instantaneous and cumulative target densities are estimated, providing the best guess of current target states and a long-term coarse distribution of clusters, respectively. Second, the power diagram is incorporated into Lloyd's algorithm to optimize the task-space assignment for each robot, trading off between tracking detected targets in clusters and searching for potential targets outside clusters. We demonstrate the efficacy of the proposed method and show through a set of simulations that it outperforms other candidates in tracking accuracy.
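As a rough illustration of the coverage component described above, the sketch below runs Lloyd-style iterations in which each robot moves toward the density-weighted centroid of its power-diagram (weighted Voronoi) cell. It is a minimal sketch under stated assumptions, not the paper's algorithm: the Gaussian-plus-uniform density stands in for the estimated instantaneous/cumulative target densities, and the per-robot power weights are simply held at zero rather than adapted.

```python
import numpy as np

def power_cells(grid_pts, robot_pos, weights):
    """Assign each grid point to the robot with the smallest power distance
    ||q - p_i||^2 - w_i (a power / weighted Voronoi diagram)."""
    d2 = ((grid_pts[:, None, :] - robot_pos[None, :, :]) ** 2).sum(axis=-1)  # (Q, R)
    return np.argmin(d2 - weights[None, :], axis=1)

def lloyd_step(grid_pts, density, robot_pos, weights):
    """One Lloyd iteration: move each robot toward the density-weighted
    centroid of its power cell."""
    owner = power_cells(grid_pts, robot_pos, weights)
    new_pos = robot_pos.copy()
    for i in range(len(robot_pos)):
        cell = owner == i
        mass = density[cell].sum()
        if mass > 0:  # keep position if the cell carries no density
            new_pos[i] = (density[cell, None] * grid_pts[cell]).sum(axis=0) / mass
    return new_pos

# Toy setup: 3 robots, one target cluster plus a uniform "search" floor.
xs, ys = np.meshgrid(np.linspace(0, 1, 60), np.linspace(0, 1, 60))
grid = np.stack([xs.ravel(), ys.ravel()], axis=1)
density = np.exp(-((grid - [0.7, 0.7]) ** 2).sum(axis=1) / 0.01) + 0.05
robots = np.random.default_rng(0).random((3, 2))
weights = np.zeros(3)  # fixed equal weights in this sketch; the paper's weights are not reproduced
for _ in range(20):
    robots = lloyd_step(grid, density, robots, weights)
print(robots)
```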
Lost and Found!: associating target persons in camera surveillance footage with smartphone identifiers
We demonstrate an application that finds target persons in surveillance video. Each visually detected participant is tagged with a smartphone ID, and the target person with the query ID is highlighted. This work is motivated by the fact that establishing associations between subjects observed in camera images and messages transmitted from their wireless devices can enable fast and reliable tagging. This is particularly helpful when target pedestrians need to be found in public surveillance footage without relying on facial recognition. The underlying system uses a multi-modal approach that leverages WiFi Fine Timing Measurements (FTM) and inertial sensor (IMU) data to associate each visually detected individual with a corresponding smartphone identifier. These smartphone measurements are combined strategically with RGB-D information from the camera to learn affinity matrices using a multi-modal deep learning network.
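As a hedged illustration of the association step described above, the sketch below matches camera detections to smartphone IDs by solving a linear assignment problem on an affinity matrix. The scores here are made up, standing in for the output of the learned multi-modal network; the function name and threshold are illustrative, not part of the actual system.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(affinity, min_score=0.5):
    """Match visually detected persons (rows) to smartphone IDs (columns)
    by maximizing total affinity; drop low-confidence pairs."""
    rows, cols = linear_sum_assignment(-affinity)  # negate to maximize affinity
    return [(r, c) for r, c in zip(rows, cols) if affinity[r, c] >= min_score]

# Hypothetical affinity scores (rows: camera detections, cols: phone IDs).
affinity = np.array([
    [0.91, 0.12, 0.30],
    [0.08, 0.85, 0.22],
    [0.25, 0.18, 0.40],   # weak match -> filtered out by the threshold
])
print(associate(affinity))  # -> [(0, 0), (1, 1)]
```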
- Award ID(s): 1901355
- PAR ID: 10283161
- Date Published:
- Journal Name: MobiSys 21
- Page Range / eLocation ID: 499 to 500
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Tracking subjects in videos is one of the most widely used functions in camera-based IoT applications such as security surveillance, smart city traffic safety enhancement, and vehicle-to-pedestrian communication. In the computer vision domain, tracking is usually achieved by first detecting subjects and then associating the detected bounding boxes across video frames. Typically, frames are transmitted to a remote site for processing, incurring high latency and network costs. To address this, we propose ViFiT, a transformer-based model that reconstructs vision bounding box trajectories from phone data (IMU and Fine Time Measurements), leveraging a transformer's ability to better model long-term time-series data. ViFiT is evaluated on the Vi-Fi Dataset, a large-scale multimodal dataset covering five diverse real-world scenes, including indoor and outdoor environments. Results demonstrate that ViFiT outperforms X-Translator, the state-of-the-art approach for cross-modal reconstruction based on an LSTM encoder-decoder architecture, and achieves a high frame reduction rate of 97.76% with IMU and Wi-Fi data. (A rough transformer sketch appears after this list.)
- Object detection plays a pivotal role in autonomous driving by enabling vehicles to perceive and comprehend their environment, and thereby make informed decisions for safe navigation. Camera data provides rich visual context and object recognition, while LiDAR data offers precise distance measurements and 3D mapping. Multi-modal object detection models that incorporate both data types are gaining prominence, as they are essential for the comprehensive perception and situational awareness needed in autonomous vehicles. Although graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) are promising hardware options for this application, the complex knowledge required to efficiently adapt and optimize multi-modal detection models for FPGAs presents a significant barrier to their use on this versatile and efficient platform. In this work, we evaluate the performance of camera- and LiDAR-based detection models on GPU and FPGA hardware, aiming to provide a specialized understanding for translating multi-modal detection models to the unique architectures of heterogeneous hardware platforms in autonomous driving systems. We focus on critical metrics covering both system and model performance. Based on our quantitative findings, we propose foundational insights and guidance for the design of camera- and LiDAR-based multi-modal detection models on diverse hardware platforms. (A minimal latency-measurement sketch appears after this list.)
- Keratoconus is a progressive corneal disease that may cause blindness if it is not detected at an early stage. In this paper, we propose a portable, low-cost, and robust keratoconus detection method based on smartphone camera images. A gadget has been designed and manufactured using 3-D printing to supplement keratoconus detection; a smartphone camera fitted with the gadget provides more accurate and robust detection performance. We adopt the Prewitt operator for edge detection and a support vector machine (SVM) to classify keratoconus eyes versus healthy eyes. Experimental results show that the proposed method can detect mild, moderate, advanced, and severe stages of keratoconus with 89% accuracy on average. (A small Prewitt-plus-SVM sketch appears after this list.)
- Multi-modal bio-sensing has recently been used as an effective research tool in affective computing, autism, clinical disorders, and virtual reality, among other areas. However, none of the existing bio-sensing systems support multi-modality in a wearable manner outside well-controlled laboratory environments with research-grade measurements. This work attempts to bridge this gap by developing a wearable multi-modal bio-sensing system capable of collecting, synchronizing, recording, and transmitting data from multiple bio-sensors (PPG, EEG, eye-gaze headset, body motion capture, GSR, etc.) while also providing task-modulation features, including visual-stimulus tagging. This study describes the development and integration of the various components of our system. We evaluate the developed sensors by comparing their measurements to those obtained with standard research-grade bio-sensors. We first evaluate the different sensor modalities of our headset, namely the earlobe-based PPG module with motion-noise canceling for heart-beat calculation, compared against ECG. We also compare the steady-state visually evoked potentials (SSVEP) measured by our shielded dry EEG sensors with the potentials obtained by commercially available dry EEG sensors, and investigate the effect of head movements on the accuracy and precision of our wearable eye-gaze system. Furthermore, we carry out two practical tasks to demonstrate the applications of using multiple sensor modalities to explore previously unanswerable questions in bio-sensing. Specifically, using bio-sensing we show which strategy works best for playing the Where is Waldo? visual-search game, how EEG differs between true and false target fixations in this game, and how loss/draw/win states can be predicted from bio-sensing modalities, while learning their limitations, in a Rock-Paper-Scissors game. (A simple stream-synchronization sketch appears after this list.)
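For the ViFiT item above, the sketch below shows a generic transformer encoder that maps per-frame phone features (IMU channels plus an FTM range) to bounding-box coordinates. It is not the ViFiT architecture; the class name, feature dimensions, and layer sizes are illustrative assumptions (PyTorch assumed).

```python
import torch
import torch.nn as nn

class PhoneToBoxTransformer(nn.Module):
    """Toy encoder mapping per-frame phone features (IMU + FTM) to
    bounding boxes (x, y, w, h); all dimensions are illustrative."""
    def __init__(self, in_dim=10, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.proj = nn.Linear(in_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 4)

    def forward(self, x):                              # x: (batch, frames, in_dim)
        return self.head(self.encoder(self.proj(x)))   # (batch, frames, 4)

# Usage: 8 clips of 30 frames, 9 IMU channels + 1 FTM range per frame.
model = PhoneToBoxTransformer(in_dim=10)
boxes = model(torch.randn(8, 30, 10))  # predicted (x, y, w, h) per frame
print(boxes.shape)
```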
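For the GPU/FPGA evaluation item above, a minimal sketch of one system-side metric: median per-frame detection latency and throughput, measured here on CPU with an off-the-shelf torchvision detector (torchvision >= 0.13 assumed). The model choice, input size, and run counts are assumptions, and device-specific details such as CUDA synchronization on a GPU or an FPGA toolchain are omitted.

```python
import time
import statistics
import torch
import torchvision

def measure_latency(model, frames, warmup=5, runs=30):
    """Median per-frame latency (ms) and throughput (FPS) for a detector.
    On a GPU, torch.cuda.synchronize() before each timestamp would be needed;
    this CPU-side sketch omits device-specific handling."""
    model.eval()
    times = []
    with torch.no_grad():
        for _ in range(warmup):
            model(frames)
        for _ in range(runs):
            t0 = time.perf_counter()
            model(frames)
            times.append(time.perf_counter() - t0)
    lat = statistics.median(times)
    return lat * 1e3, len(frames) / lat

# Example: a camera-only detector on one 320x320 RGB frame (no pretrained weights downloaded).
detector = torchvision.models.detection.ssdlite320_mobilenet_v3_large(
    weights=None, weights_backbone=None)
print(measure_latency(detector, [torch.rand(3, 320, 320)]))
```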
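For the keratoconus item above, a minimal sketch of the named building blocks, Prewitt edge detection followed by an SVM classifier, run on synthetic data. The edge-magnitude histogram feature and the binary healthy-vs-keratoconus labels are simplifying assumptions; the paper's actual features, 3-D printed gadget, and four-stage grading are not reproduced here.

```python
import numpy as np
from scipy import ndimage
from sklearn.svm import SVC

def edge_features(image, bins=32):
    """Prewitt edge magnitude of a grayscale eye image, summarized as a histogram."""
    gx = ndimage.prewitt(image, axis=0)
    gy = ndimage.prewitt(image, axis=1)
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(mag, bins=bins, range=(0.0, mag.max() + 1e-9), density=True)
    return hist

# Hypothetical training data: grayscale crops labeled healthy (0) vs keratoconus (1).
rng = np.random.default_rng(0)
images = rng.random((40, 64, 64))
labels = rng.integers(0, 2, size=40)
X = np.stack([edge_features(im) for im in images])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:5]))
```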
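For the wearable bio-sensing item above, a small sketch of one ingredient the abstract mentions: synchronizing multi-rate sensor streams onto a common clock via timestamp interpolation. The stream names, sample rates, and resampling strategy are illustrative assumptions, not the system's actual pipeline.

```python
import numpy as np

def resample_to_common_clock(streams, rate_hz=50.0):
    """Align multi-rate streams (each a (timestamps_s, values) pair) onto one
    common clock, restricted to the interval where all streams overlap."""
    start = max(ts[0] for ts, _ in streams.values())
    stop = min(ts[-1] for ts, _ in streams.values())
    clock = np.arange(start, stop, 1.0 / rate_hz)
    aligned = {name: np.interp(clock, ts, vals) for name, (ts, vals) in streams.items()}
    return clock, aligned

# Toy streams: PPG at 100 Hz and GSR at 4 Hz, each with its own start time.
t_ppg = np.arange(0.00, 10.0, 0.01)
t_gsr = np.arange(0.12, 10.0, 0.25)
streams = {
    "ppg": (t_ppg, np.sin(2 * np.pi * 1.2 * t_ppg)),  # ~72 bpm pulse wave
    "gsr": (t_gsr, 0.5 + 0.01 * t_gsr),               # slow skin-conductance drift
}
clock, aligned = resample_to_common_clock(streams)
print(clock.shape, aligned["ppg"].shape, aligned["gsr"].shape)
```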