skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: CBMV: A Coalesced Bidirectional Matching Volume for Disparity Estimation
Recently, there has been a paradigm shift in stereo matching with learning-based methods achieving the best results on all popular benchmarks. The success of these methods is due to the availability of training data with ground truth; training learning-based systems on these datasets has allowed them to surpass the accuracy of conventional approaches based on heuristics and assumptions. Many of these assumptions, however, had been validated extensively and hold for the majority of possible inputs. In this paper, we generate a matching volume leveraging both data with ground truth and conventional wisdom. We accomplish this by coalescing diverse evidence from a bidirectional matching process via random forest classifiers. We show that the resulting matching volume estimation method achieves similar accuracy to purely data-driven alternatives on benchmarks and that it generalizes to unseen data much better. In fact, the results we submitted to the KITTI and ETH3D benchmarks were generated using a classifier trained on the Middlebury 2014 dataset.  more » « less
Award ID(s):
1637761 1527294
PAR ID:
10058045
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. For decades, atomistic modeling has played a crucial role in predicting the behavior of materials in numerous fields ranging from nanotechnology to drug discovery. The most accurate methods in this domain are rooted in first-principles quantum mechanical calculations such as density functional theory (DFT). Because these methods have remained computationally prohibitive, practitioners have traditionally focused on defining physically motivated closed-form expressions known as empirical interatomic potentials (EIPs) that approximately model the interactions between atoms in materials. In recent years, neural network (NN)-based potentials trained on quantum mechanical (DFT-labeled) data have emerged as a more accurate alternative to conventional EIPs. However, the generalizability of these models relies heavily on the amount of labeled training data, which is often still insufficient to generate models suitable for general-purpose applications. In this paper, we propose two generic strategies that take advantage of unlabeled training instances to inject domain knowledge from conventional EIPs to NNs in order to increase their generalizability. The first strategy, based on weakly supervised learning, trains an auxiliary classifier on EIPs and selects the best-performing EIP to generate energies to supplement the ground-truth DFT energies in training the NN. The second strategy, based on transfer learning, first pretrains the NN on a large set of easily obtainable EIP energies, and then fine-tunes it on ground-truth DFT energies. Experimental results on three benchmark datasets demonstrate that the first strategy improves baseline NN performance by 5% to 51% while the second improves baseline performance by up to 55%. Combining them further boosts performance. 
    more » « less
  2. Predicting flight trajectories is a research area that holds significant merit. In this paper, we propose a data-driven learning framework that leverages the predictive and feature extraction capabilities of the mixture models and seq2seq-based neural networks while addressing prevalent challenges caused by error propagation and dimensionality reduction. After training with this framework, the learned model can improve long-step prediction accuracy significantly given the past trajectories and the context information. The accuracy and effectiveness of the approach are evaluated by comparing the predicted trajectories with the ground truth. The results indicate that the proposed method has outperformed the state-of-the-art predicting methods on a terminal airspace flight trajectory dataset. The trajectories generated by the proposed method have a higher temporal resolution (1 time step per second vs 0.1 time step per second) and are closer to the ground truth. 
    more » « less
  3. Abstract Raindrop size distributions (DSD) and rain rate have been estimated from polarimetric radar data using different approaches with the accuracy depending on the errors both in the radar measurements and the estimation methods. Herein, a deep neural network (DNN) technique was utilized to improve the estimation of the DSD and rain rate by mitigating these errors. The performance of this approach was evaluated using measurements from a two-dimensional video disdrometer (2DVD) at the Kessler Atmospheric and Ecological Field Station in Oklahoma as ground truth with the results compared against conventional estimation methods for the period 2006–17. Physical parameters (mass-/volume-weighted diameter and liquid water content), rain rate, and polarimetric radar variables (including radar reflectivity and differential reflectivity) were obtained from the DSD data. Three methods—physics-based inversion, empirical formula, and DNN—were applied to two different temporal domains (instantaneous and rain-event average) with three diverse error assumptions (fitting, measurement, and model errors). The DSD retrievals and rain estimates from 18 cases were evaluated by calculating the bias and root-mean-squared error (RMSE). DNN produced the best performance for most cases, with up to a 5% reduction in RMSE when model errors existed. DSD and rain estimated from a nearby polarimetric radar using the empirical and DNN methods were well correlated with the disdrometer observations; the rain-rate estimate bias of the DNN was significantly reduced (3.3% in DNN vs 50.1% in empirical). These results suggest that DNN has advantages over the physics-based and empirical methods in retrieving rain microphysics from radar observations. 
    more » « less
  4. PurposeTo develop a strategy for training a physics‐guided MRI reconstruction neural network without a database of fully sampled data sets. MethodsSelf‐supervised learning via data undersampling (SSDU) for physics‐guided deep learning reconstruction partitions available measurements into two disjoint sets, one of which is used in the data consistency (DC) units in the unrolled network and the other is used to define the loss for training. The proposed training without fully sampled data is compared with fully supervised training with ground‐truth data, as well as conventional compressed‐sensing and parallel imaging methods using the publicly available fastMRI knee database. The same physics‐guided neural network is used for both proposed SSDU and supervised training. The SSDU training is also applied to prospectively two‐fold accelerated high‐resolution brain data sets at different acceleration rates, and compared with parallel imaging. ResultsResults on five different knee sequences at an acceleration rate of 4 shows that the proposed self‐supervised approach performs closely with supervised learning, while significantly outperforming conventional compressed‐sensing and parallel imaging, as characterized by quantitative metrics and a clinical reader study. The results on prospectively subsampled brain data sets, in which supervised learning cannot be used due to lack of ground‐truth reference, show that the proposed self‐supervised approach successfully performs reconstruction at high acceleration rates (4, 6, and 8). Image readings indicate improved visual reconstruction quality with the proposed approach compared with parallel imaging at acquisition acceleration. ConclusionThe proposed SSDU approach allows training of physics‐guided deep learning MRI reconstruction without fully sampled data, while achieving comparable results with supervised deep learning MRI trained on fully sampled data. 
    more » « less
  5. The ability to perceive 3D human bodies from a single image has a multitude of applications ranging from entertainment and robotics to neuroscience and healthcare. A fundamental challenge in human mesh recovery is in collecting the ground truth 3D mesh targets required for training, which requires burdensome motion capturing systems and is often limited to indoor laboratories. As a result, while progress is made on benchmark datasets collected in these restrictive settings, models fail to generalize to real-world "in-the-wild" scenarios due to distribution shifts. We propose Domain Adaptive 3D Pose Augmentation (DAPA), a data augmentation method that enhances the model's generalization ability in in-the-wild scenarios. DAPA combines the strength of methods based on synthetic datasets by getting direct supervision from the synthesized meshes, and domain adaptation methods by using ground truth 2D keypoints from the target dataset. We show quantitatively that finetuning with DAPA effectively improves results on benchmarks 3DPW and AGORA. We further demonstrate the utility of DAPA on a challenging dataset curated from videos of real-world parent-child interaction. 
    more » « less