skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A New Approach for Pedestrian Density Estimation Using Moving Sensors and Computer Vision
An understanding of person dynamics is indispensable for numerous urban applications, including the design of transportation networks and planning for business development. Pedestrian counting often requires utilizing manual or technical means to count individuals in each location of interest. However, such methods do not scale to the size of a city and a new approach to fill this gap is here proposed. In this project, we used a large dense dataset of images of New York City along with computer vision techniques to construct a spatio-temporal map of relative person density. Due to the limitations of state-of-the-art computer vision methods, such automatic detection of person is inherently subject to errors. We model these errors as a probabilistic process, for which we provide theoretical analysis and thorough numerical simulations. We demonstrate that, within our assumptions, our methodology can supply a reasonable estimate of person densities and provide theoretical bounds for the resulting error.  more » « less
Award ID(s):
1828576
PAR ID:
10196757
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
ACM transactions on spatial algorithms and systems
ISSN:
2374-0353
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Quantitative analysis methods based on the usage of a scanning electron microscope (SEM), such as energy-dispersive X-ray spectroscopy, often require specimens to have a flat surface oriented normal to the electron beam. In-situ procedures for putting microscopic flat surfaces into this orientation generally rely on stereoscopic methods that measure the change in surface vector projections when the surface is tilted by some known angle. Although these methods have been used in the past, there is no detailed statistical analysis of the uncertainties involved in such methods, which leaves an uncertainty in how precisely a specimen can be oriented. Here, we present a first principles derivation of a specimen orientation method and apply our method to a flat sample to demonstrate it. Unlike previous works, we develop a computer vision program using the scale-invariant feature transform to automate and expedite the process of making measurements on our SEM images, thus enabling a detailed statistical analysis of the method with a large sample size. We find that our specimen orientation method is able to orient flat surfaces with high precision and can further provide insight into errors involved in the standard SEM rotation and tilt operations. 
    more » « less
  2. Scanned images of patent or historical documents often contain localized zigzag noise introduced by the digitizing process; yet when viewed as a whole image, global structures are apparent to humans, but not to machines. Existing denoising methods work well for natural images, but not for binary diagram images, which makes feature extraction difficult for computer vision and machine learning methods and algorithms. We propose a topological graph-based representation to tackle this denoising problem. The graph representation emphasizes the shapes and topology of diagram images, making it ideal for use in machine learning applications such as classification and matching of scientific diagram images. Our approach and algorithms provide essential structure and lay important foundation for computer vision such as scene graph-based applications, because topological relations and spatial arrangement among objects in images are captured and stored in our skeleton graph. In addition, while the parameters for almost all pixel-based methods are not adaptive, our method is robust in that it only requires one parameter and it is adaptive. Experimental comparisons with existing methods show the effectiveness of our approach. 
    more » « less
  3. Missing person searches are typically initiated with a description of a person that includes their age, race, clothing, and gender, possibly supported by a photo. Unmanned Aerial Systems (sUAS) imbued with Computer Vision (CV) capabilities, can be deployed to quickly search an area to find the missing person; however, the search task is far more difficult when a crowd of people is present, and only the person described in the missing person report must be identified. It is particularly challenging to perform this task on the potentially limited resources of an sUAS. We therefore propose AirSight, as a new model that hierarchically combines multiple CV models, exploits both onboard and off-board computing capabilities, and engages humans interactively in the search. For illustrative purposes, we use AirSight to show how a person's image, extracted from an aerial video can be matched to a basic description of the person. Finally, as a work-in-progress paper, we describe ongoing efforts in building an aerial dataset of partially occluded people and physically deploying AirSight on our sUAS. 
    more » « less
  4. Synopsis Identifying individual animals is crucial for many biological investigations. In response to some of the limitations of current identification methods, new automated computer vision approaches have emerged with strong performance. Here, we review current advances of computer vision identification techniques to provide both computer scientists and biologists with an overview of the available tools and discuss their applications. We conclude by offering recommendations for starting an animal identification project, illustrate current limitations, and propose how they might be addressed in the future. 
    more » « less
  5. The study of complex human interactions and group activities has become a focal point in human-centric computer vision. However, progress in related tasks is often hindered by the challenges of obtaining large-scale labeled datasets from real-world scenarios. To address the limitation, we introduce M3Act, a synthetic data generator for multi-view multi-group multi-person human atomic actions and group activities. Powered by Unity Engine, M3Act features multiple semantic groups, highly diverse and photorealistic images, and a comprehensive set of annotations, which facilitates the learning of human-centered tasks across singleperson, multi-person, and multi-group conditions. We demonstrate the advantages of M3Act across three core experiments. The results suggest our synthetic dataset can significantly improve the performance of several downstream methods and replace real-world datasets to reduce cost. Notably, M3Act improves the state-of-the-art MOTRv2 on DanceTrack dataset, leading to a hop on the leaderboard from 10th to 2nd place. Moreover, M3Act opens new research for controllable 3D group activity generation. We define multiple metrics and propose a competitive baseline for the novel task. Our code and data are available at our project page: http://cjerry1243.github.io/M3Act. 
    more » « less