Title: An objective comparison methodology of edge detection algorithms using a structure from motion task
This paper presents a task-oriented evaluation methodology for edge detectors. Performance is measured on the task of structure from motion. Eighteen real image sequences from 2 different scenes, varying in complexity and scenery type, are used. The task-level ground truth for each image sequence is manually specified in terms of the 3D motion and structure. An automated tool computes the accuracy of the motion and structure recovered from each set of edge maps. Parameter sensitivity and execution speed are also analyzed. Four edge detectors are compared. All implementations and data sets are publicly available.
Award ID(s):
9724422
PAR ID:
10346805
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Page Range / eLocation ID:
190 to 195
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper addresses how to fairly compare ROCs of ad hoc (or data-driven) detectors with tests derived from statistical models of digital media. We argue that the ways ROCs are typically drawn for each detector type correspond to different hypothesis testing problems with different optimality criteria, making the ROCs incomparable. To understand the problem and why it occurs, we model a source of natural images as a mixture of scene oracles and derive optimal detectors for the task of image steganalysis. Our goal is to guarantee that, when the data follows the statistical model adopted for the hypothesis test, the ROC of the optimal detector bounds the ROC of the ad hoc detector. While the results are applicable beyond the field of image steganalysis, we use this setup to point out possible inconsistencies when comparing both types of detectors and explain guidelines for their proper comparison. Experiments on an artificial cover source with a known model, using real steganographic algorithms and deep learning detectors, are used to confirm our claims.
  2. This paper addresses how to fairly compare ROCs of ad hoc (or data-driven) detectors with tests derived from statistical models of digital media. We argue that the ways ROCs are typically drawn for each detector type correspond to different hypothesis testing problems with different optimality criteria, making the ROCs incomparable. To understand the problem and why it occurs, we model a source of natural images as a mixture of scene oracles and derive optimal detectors for the task of image steganalysis. Our goal is to guarantee that, when the data follows the statistical model adopted for the hypothesis test, the ROC of the optimal detector bounds the ROC of the ad hoc detector. While the results are applicable beyond the field of image steganalysis, we use this setup to point out possible inconsistencies when comparing both types of detectors and explain guidelines for their proper comparison. Experiments on an artificial cover source with a known model, using real steganographic algorithms and deep learning detectors, are used to confirm our claims.
  3. Event-based cameras have been designed for scene motion perception: their high temporal resolution and spatial data sparsity convert the scene into a volume of boundary trajectories and make it possible to track and analyze the evolution of the scene in time. Analyzing this data is computationally expensive, and there is a substantial lack of theory on dense-in-time object motion to guide the development of new algorithms; hence, many works resort to the simple solution of discretizing the event stream and converting it to classical pixel maps, which allows conventional image processing methods to be applied. In this work we present a Graph Convolutional neural network for the task of scene motion segmentation by a moving camera. We convert the event stream into a 3D graph in (x,y,t) space and keep per-event temporal information. The difficulty of the task stems from the fact that, unlike in metric space, the shape of an object in (x,y,t) space depends on its motion and is not the same across the dataset. We discuss properties of the event data with respect to this 3D recognition problem, and show that our Graph Convolutional architecture is superior to PointNet++. We evaluate our method on the state-of-the-art event-based motion segmentation dataset, EV-IMO, and perform comparisons to a frame-based method proposed by its authors. Our ablation studies show that increasing the event slice width improves the accuracy, and show how subsampling and edge configurations affect the network performance.
  4. We investigate knowledge retrieval with multi-modal queries, i.e. queries containing information split across image and text inputs, a challenging task that differs from previous work on cross-modal retrieval. We curate a new dataset called ReMuQ for benchmarking progress on this task. ReMuQ requires a system to retrieve knowledge from a large corpus by integrating contents from both text and image queries. We introduce a retriever model “ReViz” that can directly process input text and images to retrieve relevant knowledge in an end-to-end fashion without being dependent on intermediate modules such as object detectors or caption generators. We introduce a new pretraining task that is effective for learning knowledge retrieval with multimodal queries and also improves performance on downstream tasks. We demonstrate superior performance in retrieval on two datasets (ReMuQ and OK-VQA) under zero-shot settings as well as further improvements when finetuned on these datasets. 
  5. Spectral computed tomography (SCT) makes use of the spectral dependence of X-ray attenuation in tissues and contrast agents to separate the attenuation data into more than two energy bins. Current SCT detectors are costly, and the measured data have a low signal-to-noise ratio due to the detector's narrow bin bandwidth and quantum noise. A new approach called coded aperture compressive X-ray SCT, which combines a conventional rotating X-ray CT system with a set of pixelated K-edge coded apertures, is introduced. In this method, the amplitude and spectra of the X-ray source are filtered by a particular pattern of K-edge filters at each view angle. Compressed sensing (CS) reconstruction algorithms are then used to recover the spectral CT image from the coded measurements. Simulation results for random coded apertures are shown, and their performance is compared to the use of uncoded measurements.