

Title: Future Near-Collision Prediction from Monocular Video: Feasibility, Dataset, and Challenges
We explore the possibility of using a single monocular camera to forecast the time to collision between a suitcase-shaped robot being pushed by its user and other nearby pedestrians. We develop a purely image-based deep learning approach that directly estimates the time to collision without relying on explicit geometric depth estimates or velocity information to predict future collisions. While previous work has focused on detecting immediate collisions in the context of navigating Unmanned Aerial Vehicles, that detection was limited to a binary variable (i.e., collision or no collision). We propose a more fine-grained approach to collision forecasting that predicts the exact time to collision in milliseconds, which is more useful for collision avoidance in the context of dynamic path planning. To evaluate our method, we have collected a novel large-scale dataset of over 13,000 indoor video segments, each showing the trajectory of at least one person ending in close proximity (a near-collision) with the camera mounted on a mobile suitcase-shaped platform. Using this dataset, we conduct extensive experiments with different temporal windows as input, using a broad range of state-of-the-art convolutional neural networks (CNNs). Our results show that our proposed multi-stream CNN is the best model for predicting time to near-collision, with an average prediction error of 0.75 seconds across our test environments. The project webpage can be found at https://aashi7.github.io/NearCollision.html.
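The abstract describes the approach only at a high level, so the following is a minimal sketch of what a multi-stream CNN regressing time to near-collision from a short temporal window of frames could look like; the number of streams, layer sizes, frame resolution, and the choice not to share weights across streams are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' model): a multi-stream CNN that takes N
# consecutive RGB frames, encodes each frame with its own small conv stream,
# fuses the features, and regresses a single time-to-near-collision value.
import torch
import torch.nn as nn

class MultiStreamTTC(nn.Module):
    def __init__(self, num_frames=6):          # temporal window size (assumed)
        super().__init__()
        self.num_frames = num_frames
        # One lightweight conv stream per input frame (weights not shared here;
        # sharing them would be an equally plausible design choice).
        self.streams = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            for _ in range(num_frames)
        ])
        # Fuse per-frame features and regress time to near-collision (seconds).
        self.head = nn.Sequential(
            nn.Linear(64 * num_frames, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, frames):                  # frames: (B, N, 3, H, W)
        feats = [stream(frames[:, i]) for i, stream in enumerate(self.streams)]
        return self.head(torch.cat(feats, dim=1)).squeeze(1)

if __name__ == "__main__":
    model = MultiStreamTTC(num_frames=6)
    clip = torch.randn(2, 6, 3, 224, 224)       # two dummy 6-frame clips
    print(model(clip).shape)                    # torch.Size([2]) -> seconds
```

Training such a model would pair each clip with the annotated time until the near-collision and minimize a standard regression loss (e.g., L1 or mean squared error).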
Award ID(s):
1637927
NSF-PAR ID:
10304295
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
IEEE/RSJ International Conference on Intelligent Robots and Systems
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Despite its potential to overcome the design and processing barriers of traditional subtractive and formative manufacturing techniques, the use of laser powder bed fusion (LPBF) metal additive manufacturing is currently limited by its tendency to create flaws. A multitude of LPBF-related flaws, such as part-level deformation, cracking, and porosity, are linked to the spatiotemporal temperature distribution in the part during the process. The temperature distribution, also called the thermal history, is a function of several factors, including material properties, part geometry and orientation, processing parameters, and the placement of supports, among others. This broad range of factors is difficult and expensive to optimize through empirical testing alone. Consequently, fast and accurate models to predict the thermal history are valuable for mitigating flaw formation in LPBF-processed parts. In our prior work, we developed a graph theory-based approach for predicting the temperature distribution in LPBF parts. This mesh-free approach was compared with both non-proprietary and commercial finite element packages, and the thermal history predictions were experimentally validated with in-situ infrared thermal imaging data. It was found that the graph theory-derived thermal history predictions converged within 30-50% of the time of non-proprietary finite element analysis for a similar level of prediction error. However, these prior efforts were based on small prismatic and cylinder-shaped LPBF parts. In this paper, our objective was to scale the graph theory approach to predict the thermal history of large-volume, complex-geometry LPBF parts. To realize this objective, we developed and applied three computational strategies to predict the thermal history of a stainless steel (SAE 316L) impeller with an outside diameter of 155 mm and a vertical height of 35 mm (700 layers). The impeller was processed on a Renishaw AM250 LPBF system and required 16 h to complete. During the process, in-situ, layer-by-layer, steady-state surface temperature measurements of the impeller were obtained using a calibrated longwave infrared thermal camera. As an example of the outcome, on implementing one of the three strategies reported in this work, which did not reduce or simplify the part geometry, the thermal history of the impeller was predicted with an approximate mean absolute error of 6% (standard deviation 0.8%) and a root mean square error of 23 K (standard deviation 3.7 K). Moreover, the thermal history was simulated within 40 min using desktop computing, which is considerably less than the 16 h required to build the impeller. Furthermore, the graph theory thermal history predictions were compared with a proprietary LPBF thermal modeling software package and a non-proprietary finite element simulation. For a similar level of root mean square error (28 K), the graph theory approach converged in 17 min, vs. 4.5 h for the non-proprietary finite element analysis.
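The summary above names the graph theory approach but does not spell out its formulation. As a rough, hedged illustration of the general idea only, the sketch below diffuses heat over a graph built from points sampled inside a part, using the eigendecomposition of the graph Laplacian; the neighborhood radius, diffusivity constant, and toy point cloud are placeholder assumptions and do not reproduce the authors' calibrated LPBF model.

```python
# Rough illustration (not the authors' calibrated model): heat diffusion on a
# graph whose nodes are points sampled inside a part. Temperatures evolve as
# T(t) = V exp(-alpha * Lambda * t) V^T T(0), where L = V diag(Lambda) V^T is
# the (symmetric) graph Laplacian of the point cloud.
import numpy as np

def build_laplacian(points, radius):
    """Connect points closer than `radius` and return the graph Laplacian."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adj = ((d < radius) & (d > 0)).astype(float)
    return np.diag(adj.sum(axis=1)) - adj

def diffuse(laplacian, t0, alpha, t):
    """Propagate the initial temperature field t0 forward by time t."""
    eigvals, eigvecs = np.linalg.eigh(laplacian)     # symmetric -> eigh
    decay = np.exp(-alpha * eigvals * t)
    return eigvecs @ (decay * (eigvecs.T @ t0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.uniform(0.0, 10.0, size=(200, 3))      # toy point cloud (mm)
    L = build_laplacian(pts, radius=1.5)
    T0 = np.full(200, 300.0)                         # ambient temperature (K)
    T0[pts[:, 2] > 9.0] = 1600.0                     # "hot" top layer
    T1 = diffuse(L, T0, alpha=0.05, t=2.0)           # field 2 s later
    print(T1.min(), T1.max())
```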
  2. Consider the problem of determining the effect of a compound on a specific cell type. To answer this question, researchers traditionally need to run an experiment applying the drug of interest to that cell type. This approach is not scalable: given a large number of different actions (compounds) and a large number of different contexts (cell types), it is infeasible to run an experiment for every action-context pair. In such cases, one would ideally like to predict the outcome for every pair while only needing outcome data for a small _subset_ of pairs. This task, which we label "causal imputation", is a generalization of the causal transportability problem. To address this challenge, we extend the recently introduced _synthetic interventions_ (SI) estimator to handle more general data sparsity patterns. We prove that, under a latent factor model, our estimator provides valid estimates for the causal imputation task. We motivate this model by establishing a connection to the linear structural causal model literature. Finally, we consider the prominent CMAP dataset for predicting the effects of compounds on gene expression across cell types. We find that our estimator outperforms standard baselines, confirming its utility in biological applications.
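Because the synthetic interventions (SI) estimator is only named in the summary above, the sketch below shows a simplified version of the underlying idea rather than the paper's extended estimator: express the target context as a combination of donor contexts using outcomes observed under a shared action (e.g., control), then reuse those weights to impute the target's outcome under an action it never received. The plain least-squares fit and all variable names are illustrative assumptions.

```python
# Simplified sketch of a synthetic-interventions-style imputation (illustrative
# only; not the paper's exact estimator). Rows = outcomes (e.g., genes),
# columns = contexts (e.g., cell types).
import numpy as np

def impute_outcome(control, treated_donors, target_control):
    """
    control:        (g, d) donor-context outcomes under control
    treated_donors: (g, d) the same donors' outcomes under the new action
    target_control: (g,)   target-context outcomes under control
    Returns the imputed (g,) target-context outcome under the new action.
    """
    # Learn weights expressing the target as a combination of donors (control data).
    w, *_ = np.linalg.lstsq(control, target_control, rcond=None)
    # Transfer the weights to the donors' outcomes under the new action.
    return treated_donors @ w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    donors_ctrl = rng.normal(size=(50, 5))           # 50 genes, 5 donor cell types
    effect = rng.normal(size=(50, 1))                 # shared additive drug effect
    donors_trt = donors_ctrl + effect                 # donors under the compound
    target_ctrl = donors_ctrl @ rng.dirichlet(np.ones(5))
    print(impute_outcome(donors_ctrl, donors_trt, target_ctrl)[:3])
```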
  3. We present improvements over our previous approach to automatic winter hydrometeor classification by means of convolutional neural networks (CNNs), using more data and improved training techniques to achieve higher accuracy on a more complicated dataset than we had previously demonstrated. As an advancement of our previous proof-of-concept study, this work demonstrates the broader usefulness of deep CNNs by using a substantially larger and more diverse dataset, which we make publicly available, drawn from many more snow events. We describe the collection, processing, and sorting of this dataset of over 25,000 high-quality multiple-angle snowflake camera (MASC) image chips, split nearly evenly among five geometric classes: aggregate, columnar crystal, planar crystal, graupel, and small particle. Raw images were collected over 32 snowfall events between November 2014 and May 2016 near Greeley, Colorado, and were processed with an automated cropping and normalization algorithm to yield 224 × 224 pixel images containing possible hydrometeors. From the bulk set of over 8,400,000 extracted images, a smaller dataset of 14,793 images was sorted by image quality and recognizability (Q&R) using manual inspection. A presorting network trained on the Q&R dataset was then applied to all 8,400,000+ images to automatically collect a subset of 283,351 good snowflake images. Roughly 5,000 representative examples were then manually selected from this subset for each of the five geometric classes. With a greater emphasis on in-class variety than in our previous work, the final dataset yields trained networks that better capture the imperfect cases and diverse forms that occur within the broad categories studied, achieving an accuracy of 96.2% on a vastly more challenging dataset.
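The summary above does not state the network architecture or training recipe, so the following is only a hedged illustration of the classification step: fine-tuning a standard ImageNet-pretrained CNN on 224 × 224 image chips for the five geometric classes. The ResNet-18 backbone, optimizer, and hyperparameters are assumptions, not details from the paper.

```python
# Hedged illustration (not the paper's exact setup): fine-tune a pretrained CNN
# to classify 224x224 snowflake image chips into the five geometric classes.
import torch
import torch.nn as nn
from torchvision import models

CLASSES = ["aggregate", "columnar crystal", "planar crystal", "graupel", "small particle"]

def build_classifier():
    # Downloads ImageNet weights on first use; swap in weights=None to skip.
    net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    net.fc = nn.Linear(net.fc.in_features, len(CLASSES))   # 5-way output head
    return net

def train_step(net, images, labels, optimizer, criterion):
    """One gradient step on a batch of (B, 3, 224, 224) chips."""
    optimizer.zero_grad()
    loss = criterion(net(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    net = build_classifier()
    opt = torch.optim.SGD(net.parameters(), lr=1e-3, momentum=0.9)
    crit = nn.CrossEntropyLoss()
    dummy_x = torch.randn(4, 3, 224, 224)                  # stand-in image chips
    dummy_y = torch.randint(0, len(CLASSES), (4,))         # stand-in labels
    print(train_step(net, dummy_x, dummy_y, opt, crit))
```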
  4. Imaging photoplethysmography (iPPG) could greatly improve driver safety systems by enabling capabilities ranging from identifying driver fatigue to unobtrusive early heart failure detection. Unfortunately, the driving context poses unique challenges to iPPG, including illumination and motion. First, the drastic illumination variations present during driving can overwhelm the small intensity-based iPPG signals. Second, significant driver head motion during driving, as well as camera motion (e.g., vibration), makes it challenging to recover iPPG signals. To address these two challenges, we present two innovations. First, we demonstrate that we can reduce most outside light variations using narrow-band near-infrared (NIR) video recordings and obtain reliable heart rate estimates. Second, we present a novel optimization algorithm, which we call AutoSparsePPG, that leverages the quasi-periodicity of iPPG signals and achieves better performance than state-of-the-art methods. In addition, we release the first publicly available driving dataset that contains both NIR and RGB video recordings of a passenger's face together with simultaneous ground-truth pulse oximeter recordings.
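AutoSparsePPG itself is not described in enough detail above to reproduce, so the sketch below only illustrates the kind of conventional iPPG baseline such methods improve upon: average the pixel intensity of a cropped face region per frame, band-pass filter the resulting trace to plausible pulse frequencies, and read the heart rate from the dominant spectral peak. The frame rate and band edges are assumed values.

```python
# Baseline iPPG pipeline for comparison purposes only (AutoSparsePPG, the
# paper's method, is a sparsity-based optimization and is not shown here).
import numpy as np
from scipy import signal

def heart_rate_bpm(face_frames, fps=30.0, low_hz=0.7, high_hz=4.0):
    """
    face_frames: (T, H, W) cropped face region from (NIR) video
    Returns an estimated heart rate in beats per minute.
    """
    trace = face_frames.reshape(face_frames.shape[0], -1).mean(axis=1)
    trace = trace - trace.mean()
    # Band-pass to the plausible pulse band (~42-240 bpm).
    b, a = signal.butter(3, [low_hz, high_hz], btype="bandpass", fs=fps)
    filtered = signal.filtfilt(b, a, trace)
    freqs, power = signal.periodogram(filtered, fs=fps)
    return 60.0 * freqs[np.argmax(power)]

if __name__ == "__main__":
    fps = 30.0
    t = np.arange(0, 20, 1 / fps)
    pulse = 0.5 * np.sin(2 * np.pi * 1.2 * t)            # synthetic 72 bpm pulse
    noise = np.random.default_rng(2).normal(0, 0.2, (len(t), 8, 8))
    frames = 100 + pulse[:, None, None] + noise          # toy face-region video
    print(heart_rate_bpm(frames, fps=fps))               # ~72 bpm
```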