

Title: Single Frame Lidar and Stereo Camera Calibration Using Registration of 3D Planes
This work focuses on finding the extrinsic parameters (rotation and translation) between a lidar and a stereo camera setup. We place a planar checkerboard inside the Field-of-View (FOV) of both sensors and extract the 3D plane of the checkerboard from each sensor's data. The extracted planes serve as reference data sets for finding the relative transformation between the two sensors, which we estimate with our proposed Correntropy Similarity Matrix Iterative Closest Point (CoSM-ICP) algorithm. In this work, we use a single frame of point cloud data acquired from the lidar sensor and a single frame from the calibrated stereo camera point cloud to perform this operation. We evaluate our approach on a simulated dataset, since simulation gives us the freedom to test multiple sensor configurations, and our results verify the approach under each of these configurations.
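As a rough illustration of the calibration pipeline (not the proposed CoSM-ICP algorithm itself, whose details are in the paper), the sketch below segments the checkerboard plane from a single frame of each sensor's point cloud and registers the two plane patches with a standard point-to-plane ICP from Open3D; the file names and thresholds are placeholders.

```python
# Hedged sketch: plane extraction + registration for lidar-stereo extrinsic
# calibration. Standard ICP stands in for the paper's CoSM-ICP; paths and
# thresholds are illustrative placeholders.
import open3d as o3d

def extract_checkerboard_plane(cloud, dist_thresh=0.01):
    """Segment the dominant plane (assumed to be the checkerboard) via RANSAC."""
    _, inliers = cloud.segment_plane(distance_threshold=dist_thresh,
                                     ransac_n=3, num_iterations=1000)
    return cloud.select_by_index(inliers)

# Single frame from each sensor (placeholder file names).
lidar_cloud = o3d.io.read_point_cloud("lidar_frame.pcd")
stereo_cloud = o3d.io.read_point_cloud("stereo_frame.pcd")

lidar_plane = extract_checkerboard_plane(lidar_cloud)
stereo_plane = extract_checkerboard_plane(stereo_cloud)
lidar_plane.estimate_normals()
stereo_plane.estimate_normals()

# Register the stereo plane onto the lidar plane; the 4x4 result approximates
# the stereo-to-lidar extrinsic (rotation + translation).
reg = o3d.pipelines.registration.registration_icp(
    stereo_plane, lidar_plane, max_correspondence_distance=0.05,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane())
print("Estimated extrinsic:\n", reg.transformation)
```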
Award ID(s):
1846513 1919127
NSF-PAR ID:
10318770
Journal Name:
2021 Fifth IEEE International Conference on Robotic Computing (IRC)
Sponsoring Org:
National Science Foundation
More Like this
  1. Event cameras, inspired by biological vision systems, provide a natural and data-efficient representation of visual information. Visual information is acquired in the form of events that are triggered by local brightness changes. However, because most brightness changes are triggered by relative motion of the camera and the scene, the events recorded at a single sensor location seldom correspond to the same world point. To extract meaningful information from event cameras, it is helpful to register events that were triggered by the same underlying world point. In this work, we propose a new model of event data that captures its natural spatio-temporal structure. We start by developing a model for aligned event data, that is, a model for the data as though it had already been perfectly registered. In particular, we model the aligned data as a spatio-temporal Poisson point process. Based on this model, we develop a maximum likelihood approach to registering events that are not yet aligned: we find transformations of the observed events that make them as likely as possible under our model, and in particular we extract the camera rotation that leads to the best event alignment. We show new state-of-the-art accuracy for rotational velocity estimation on the DAVIS 240C dataset [??]. In addition, our method is faster and has lower computational complexity than several competing methods.
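The sketch below illustrates the general idea described above under heavy simplifications: it searches for a single in-plane angular velocity that best aligns the events, using the sharpness of the warped-event histogram as a stand-in for the paper's Poisson point-process likelihood; the event arrays, the planar warp, and the candidate range are hypothetical.

```python
# Hedged sketch: grid search over angular velocity that best aligns events.
# Histogram contrast stands in for the Poisson point-process likelihood.
import numpy as np

def warp_events(xy, t, omega):
    """Rotate pixel coordinates back by a small in-plane rotation omega*t (rad/s).
    A real implementation would use the full 3D rotation and camera intrinsics."""
    theta = -omega * (t - t[0])
    c, s = np.cos(theta), np.sin(theta)
    x, y = xy[:, 0], xy[:, 1]
    return np.stack([c * x - s * y, s * x + c * y], axis=1)

def alignment_score(xy, t, omega, bins=128):
    """Sharper (more peaked) warped-event histograms indicate better alignment."""
    w = warp_events(xy, t, omega)
    hist, _, _ = np.histogram2d(w[:, 0], w[:, 1], bins=bins)
    return np.var(hist)

def estimate_rotation(xy, t, candidates=np.linspace(-5.0, 5.0, 201)):
    """Return the candidate angular velocity (rad/s) with the best alignment."""
    scores = [alignment_score(xy, t, omega) for omega in candidates]
    return candidates[int(np.argmax(scores))]
```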
  2. With the rapid proliferation of small unmanned aircraft systems (UAS), the risk of mid-air collisions is growing, as is the risk associated with the malicious use of these systems. Airborne Detect-and-Avoid (ABDAA) and counter-UAS technologies have similar sensing requirements to detect and track airborne threats, albeit for different purposes: to avoid a collision or to neutralize a threat, respectively. These systems typically include a variety of sensors, such as electro-optical or infrared (EO/IR) cameras, RADAR, or LiDAR, and they fuse the data from these sensors to detect and track a given threat and to predict its trajectory. Camera imagery can be an effective method for detection as well as for pose estimation and threat classification, though a single camera cannot resolve range to a threat without additional information, such as knowledge of the threat geometry. To support ABDAA and counter-UAS applications, we consider a merger of two image-based sensing methods that mimic human vision: (1) a "peripheral vision" camera (i.e., with a fisheye lens) to provide a large field-of-view and (2) a "central vision" camera (i.e., with a perspective lens) to provide high resolution imagery of a specific target. Beyond the complementary ability of the two cameras to support detection and classification, the pair forms a heterogeneous stereo vision system that can support range resolution. This paper describes the initial development and testing of a peripheral-central vision system to detect, localize, and classify an airborne threat and finally to predict its path using knowledge of the threat class.
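A minimal sketch of the range-resolution idea described above, assuming each camera model (fisheye or perspective) can already convert a detection into a unit bearing vector in its own frame and that the extrinsics between the two cameras are known; the bearings and baseline below are made up.

```python
# Hedged sketch: midpoint triangulation for a heterogeneous stereo pair
# (fisheye "peripheral" + perspective "central" camera).
import numpy as np

def triangulate_midpoint(b1, b2, R, t):
    """b1: bearing in camera-1 frame, b2: bearing in camera-2 frame,
    (R, t): pose of camera 2 expressed in camera 1's frame."""
    b2_in_1 = R @ b2
    # Closest points along the two rays s*b1 and t + u*b2_in_1 (least squares).
    A = np.stack([b1, -b2_in_1], axis=1)          # 3x2 system
    s, u = np.linalg.lstsq(A, t, rcond=None)[0]
    p1 = s * b1
    p2 = t + u * b2_in_1
    return 0.5 * (p1 + p2)                        # 3D point in camera-1 frame

# Example with made-up bearings and a 0.5 m baseline along x.
b_fisheye = np.array([0.1, 0.05, 1.0]);  b_fisheye /= np.linalg.norm(b_fisheye)
b_central = np.array([-0.1, 0.05, 1.0]); b_central /= np.linalg.norm(b_central)
point = triangulate_midpoint(b_fisheye, b_central, np.eye(3), np.array([0.5, 0.0, 0.0]))
print("Range to threat:", np.linalg.norm(point), "m")
```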
  3. Recovering multi-person 3D poses and shapes with absolute scales from a single RGB image is a challenging task due to the inherent depth and scale ambiguity from a single view. Current works on 3D pose and shape estimation tend to focus mainly on estimating the 3D joint locations relative to the root joint, usually defined as the one closest to the shape centroid; for humans this is the pelvis joint. In this paper, we build upon an existing multi-person 3D mesh predictor network, ROMP, to create Absolute-ROMP. By adding absolute root joint localization in the camera coordinate frame, we are able to estimate multi-person 3D poses and shapes with absolute scales from a single RGB image. Such a single-shot approach allows the system to better learn and reason about the inter-person depth relationship, thus improving multi-person 3D estimation. In addition to this end-to-end network, we also train a CNN and transformer hybrid network, called TransFocal, to predict the focal length of the image's camera. Absolute-ROMP estimates the 3D mesh coordinates of all persons in the image and their root joint locations normalized by the focal length. We then use TransFocal to obtain the focal length and recover absolute depth information for all joints in the camera coordinate frame. We evaluate Absolute-ROMP on the root joint localization and root-relative 3D pose estimation tasks on publicly available multi-person 3D pose datasets, and we evaluate TransFocal on a dataset created from the Pano360 dataset. Both networks are applicable to in-the-wild images and videos due to their real-time performance.
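A minimal sketch of the denormalization step implied above, assuming root depths are predicted in focal-length-normalized form and a separately predicted focal length (standing in for TransFocal's output) rescales them to metric camera coordinates; the function and variable names are illustrative, not the papers' API.

```python
# Hedged sketch: recover absolute root positions from focal-length-normalized
# depths using a predicted focal length. Values below are hypothetical.
import numpy as np

def denormalize_roots(root_px, root_depth_norm, f_pred, principal_point):
    """root_px: (N,2) root pixel coords, root_depth_norm: (N,) normalized depths."""
    z_abs = root_depth_norm * f_pred                   # absolute depth per person
    cx, cy = principal_point
    x_abs = (root_px[:, 0] - cx) * z_abs / f_pred      # back-project to camera frame
    y_abs = (root_px[:, 1] - cy) * z_abs / f_pred
    return np.stack([x_abs, y_abs, z_abs], axis=1)

# Example: two detected people in a 1280x720 image, hypothetical values.
roots_px = np.array([[640.0, 360.0], [500.0, 400.0]])
depth_norm = np.array([3.2, 4.1])
print(denormalize_roots(roots_px, depth_norm, f_pred=900.0, principal_point=(640.0, 360.0)))
```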
  4. To obtain more consistent measurements through the course of a wheat growing season, we conceived and designed an autonomous robotic platform that performs collision avoidance while navigating in crop rows using spatial artificial intelligence (AI). The agronomists' main constraint is that the robot must not run over the wheat while driving. Accordingly, we trained a spatial deep learning model that helps the robot navigate autonomously in the field while avoiding collisions with the wheat. To train this model, we used publicly available databases of prelabeled images of wheat, along with images of wheat that we collected in the field. We used the MobileNet single shot detector (SSD) as our deep learning model to detect wheat in the field. To increase the frame rate for real-time robot response to field environments, we trained MobileNet SSD on the wheat images and used a new stereo camera, the Luxonis Depth AI Camera. Together, the newly trained model and camera could achieve a frame rate of 18–23 frames per second (fps), fast enough for the robot to process its surroundings once every 2–3 inches of driving. Once we had confirmed that the robot accurately detects its surroundings, we addressed the autonomous navigation of the robot. The new stereo camera allows the robot to determine its distance from the trained objects. In this work, we also developed a navigation and collision avoidance algorithm that utilizes this distance information to help the robot see its surroundings and maneuver in the field, thereby precisely avoiding collisions with the wheat crop. Extensive experiments were conducted to evaluate the performance of our proposed method. We also compared the quantitative results obtained by our proposed MobileNet SSD model with those of other state-of-the-art object detection models, such as the YOLO V5 and Faster region-based convolutional neural network (R-CNN) models. The detailed comparative analysis reveals the effectiveness of our method in terms of both model precision and inference speed.

     
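The sketch below shows one simple form the distance-based avoidance rule described above could take, with detections carrying stereo-derived range and bearing; the Detection type, speeds, and thresholds are illustrative, not the authors' code.

```python
# Hedged sketch: slow, stop, or steer away when wheat is detected too close.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    label: str
    distance_m: float   # range from the stereo depth camera
    bearing_deg: float  # angle off the robot's heading (negative = left)

def avoidance_command(detections: List[Detection],
                      stop_dist: float = 0.5,
                      slow_dist: float = 1.5) -> Tuple[float, float]:
    """Return (linear_speed, turn_rate) given detections in view."""
    wheat = [d for d in detections if d.label == "wheat"]
    if not wheat:
        return 0.5, 0.0                      # cruise straight
    nearest = min(wheat, key=lambda d: d.distance_m)
    if nearest.distance_m < stop_dist:
        return 0.0, 0.0                      # stop before contact
    if nearest.distance_m < slow_dist:
        # steer away from the side the nearest wheat is on
        turn = 0.3 if nearest.bearing_deg < 0 else -0.3
        return 0.2, turn
    return 0.5, 0.0

print(avoidance_command([Detection("wheat", 1.1, -12.0)]))
```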
  5. The Midnight Sun Golf Course in Fairbanks, Alaska is a legacy farm field that is part of the National Science Foundation (NSF) funded Permafrost Grown project. This 65-hectare (ha) parcel was initially cleared for agriculture but was converted to a golf course around 25 years ago. The land-use conversion was in part due to ice-rich permafrost thaw following clearing. We are studying the long-term effects of permafrost thaw following initial clearing for cultivation purposes. We are working with the current landowners to provide information regarding ongoing thermokarst development on the property and to conduct studies in reforested portions of the land area to understand the effects of land clearing and reforestation on permafrost-affected soils. In this regard, we have acquired very high resolution light detection and ranging (LiDAR) data and digital photography from a DJI M300 drone using a Zenmuse L1. The Zenmuse L1 integrates a Livox Lidar module, a high-accuracy inertial measurement unit (IMU), and a camera with a 1-inch CMOS sensor on a 3-axis stabilized gimbal. The drone was configured to fly in real-time kinematic (RTK) mode at an altitude of 60 meters above ground level using the DJI D-RTK 2 base station. Data were acquired with 50% sidelap and 70% frontlap. Additional ground control was established with a Leica GS18 global navigation satellite system (GNSS) receiver, and all data have been post-processed to World Geodetic System 1984 (WGS84) Universal Transverse Mercator (UTM) Zone 6 North using ellipsoid heights. Data outputs include a two-class classified LiDAR point cloud, a digital surface model, a digital terrain model, and an orthophoto mosaic. Image acquisition occurred on 10 September 2023. The input images are available for download at http://arcticdata.io/data/10.18739/A2PC2TB1T.
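As a hedged illustration of how a two-class point cloud like the one described above could be turned into a terrain product, the sketch below keeps ground-classified returns and grids their minimum elevation into a simple digital terrain model; the file name, class code, and cell size are assumptions, not details from the dataset documentation.

```python
# Hedged sketch: grid ground-classified LiDAR returns into a minimum-elevation DTM.
import numpy as np
import laspy

las = laspy.read("midnight_sun_lidar.las")             # placeholder file name
ground = np.asarray(las.classification) == 2           # ASPRS class 2 = ground (assumed)
x = np.asarray(las.x)[ground]
y = np.asarray(las.y)[ground]
z = np.asarray(las.z)[ground]

cell = 0.5  # metres, illustrative DTM cell size
cols = ((x - x.min()) / cell).astype(int)
rows = ((y - y.min()) / cell).astype(int)
dtm = np.full((rows.max() + 1, cols.max() + 1), np.nan)
for r, c, elev in zip(rows, cols, z):
    # keep the lowest ground return per cell as the terrain elevation
    if np.isnan(dtm[r, c]) or elev < dtm[r, c]:
        dtm[r, c] = elev
print("DTM shape:", dtm.shape)
```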