

Search for: All records

Award ID contains: 1828576



  1. Limited access to high-quality data is an important barrier in the digital analysis of urban settings, including applications within computer vision and urban design. Diverse forms of data collected from sensors in areas of high activity in the urban environment, particularly at street intersections, are valuable resources for researchers interpreting the dynamics between vehicles, pedestrians, and the built environment. In this paper, we present a high-resolution audio, video, and LiDAR dataset of three urban intersections in Brooklyn, New York, totaling almost 8 unique hours. The data were collected with custom Reconfigurable Environmental Intelligence Platform (REIP) sensors that were designed with the ability to accurately synchronize multiple video and audio inputs. The resulting data are novel in that they are inclusively multimodal, multi-angular, high-resolution, and synchronized. We demonstrate four ways the data could be used: (1) to discover and locate occluded objects using multiple sensors and modalities, (2) to associate audio events with their respective visual representations using both video and audio modes, (3) to track the number of objects of each type in a scene over time, and (4) to measure pedestrian speed using multiple synchronized camera views. In addition to these use cases, our data are available for other researchers to carry out analyses related to applying machine learning to understanding the urban environment (in which existing datasets may be inadequate), such as pedestrian-vehicle interaction modeling and pedestrian attribute recognition. Such analyses can help inform decisions made in the context of urban sensing and smart cities, including accessibility-aware urban design and Vision Zero initiatives.

     
  2. While cities around the world are increasingly promoting streets and public spaces that prioritize pedestrians over vehicles, significant data gaps have made pedestrian mapping, analysis, and modeling challenging to carry out. Most cities, even in industrialized economies, still lack information about the location and connectivity of their sidewalks, making it difficult to implement research on pedestrian infrastructure and holding the technology industry back from developing accurate, location-based apps for pedestrians, wheelchair users, street vendors, and other sidewalk users. To address this gap, we have designed and implemented an end-to-end open-source tool, Tile2Net, for extracting sidewalk, crosswalk, and footpath polygons from orthorectified aerial imagery using semantic segmentation. The segmentation model, trained on aerial imagery from Cambridge, MA, Washington DC, and New York City, offers the first open-source scene classification model for pedestrian infrastructure from sub-meter resolution aerial tiles, which can be used to generate planimetric sidewalk data in North American cities. Tile2Net also generates pedestrian networks from the resulting polygons, which can be used to prepare datasets for pedestrian routing applications. The work offers a low-cost and scalable data collection methodology for systematically generating sidewalk network datasets wherever orthorectified aerial imagery is available, contributing to overdue efforts to equalize data opportunities for pedestrians, particularly in cities that lack the resources necessary to collect such data using more conventional methods.
  3. Sensor networks have dynamically expanded our ability to monitor and study the world. Their presence and need keep increasing, and new hardware configurations expand the range of physical stimuli that can be accurately recorded. Sensors also no longer simply record data; they process it and transform it into something useful before uploading it to the cloud. However, building sensor networks is costly and very time consuming. It is difficult to build upon other people’s work, and there are only a few open-source solutions for integrating different devices and sensing modalities. We introduce REIP, a Reconfigurable Environmental Intelligence Platform for fast sensor network prototyping. REIP’s first and most central tool, implemented in this work, is an open-source software framework, an SDK, with a flexible modular API for data collection and analysis using multiple sensing modalities. REIP is developed with the aim of being user-friendly, device-agnostic, and easily extensible, allowing for fast prototyping of heterogeneous sensor networks. Furthermore, our software framework is implemented in Python to lower the barrier to entry for future contributions. We demonstrate the potential and versatility of REIP in real-world applications, present performance studies, and benchmark the REIP SDK against similar systems.
  4. Video summarization aims to simplify large-scale video browsing by generating concise, short summaries that differ from but well represent the original video. Due to the scarcity of video annotations, recent progress in video summarization concentrates on unsupervised methods, among which GAN-based methods are most prevalent. This type of method includes a summarizer and a discriminator. The summarizer's output is accepted as the final summary only if the video reconstructed from it cannot be distinguished from the original by the discriminator. The primary problems of these GAN-based methods are twofold. First, a summary produced this way is a low-redundancy subset of the original video that contains high-priority events and entities, and this summarization criterion alone is not enough. Second, the training of the GAN framework is not stable. This paper proposes a novel Entity–relationship Aware video summarization method (ERA) to address the above problems. To be more specific, we introduce an Adversarial Spatio-Temporal network to construct the relationship among entities, which we think should also be given high priority in the summarization. The GAN training problem is solved by introducing the Wasserstein GAN and two newly proposed video-patch/score-sum losses. In addition, the score-sum loss can also relieve the model's sensitivity to varying video lengths, which is an inherent problem for most current video analysis tasks. Our method substantially lifts the performance on the target benchmark datasets and exceeds the current state of the art. We hope our straightforward yet effective approach will shed some light on future research in unsupervised video summarization. The code is available online.
  5. An understanding of person dynamics is indispensable for numerous urban applications, including the design of transportation networks and planning for business development. Pedestrian counting often requires manual or technical means to count individuals in each location of interest. However, such methods do not scale to the size of a city, and we propose a new approach to fill this gap. In this project, we used a large, dense dataset of images of New York City along with computer vision techniques to construct a spatio-temporal map of relative person density. Due to the limitations of state-of-the-art computer vision methods, such automatic detection of persons is inherently subject to errors. We model these errors as a probabilistic process, for which we provide theoretical analysis and thorough numerical simulations. We demonstrate that, within our assumptions, our methodology can supply a reasonable estimate of person densities and provide theoretical bounds for the resulting error.
  6. The sports data tracking systems available today are based on specialized hardware (high-definition cameras, speed radars, RFID) to detect and track targets on the field. While effective, implementing and maintaining these systems pose a number of challenges, including high cost and the need for close human monitoring. On the other hand, the sports analytics community has been exploring human computation and crowdsourcing in order to produce tracking data that is trustworthy, cheaper, and more accessible. However, state-of-the-art methods require a large number of users to perform the annotation, or place too much burden on a single user. We propose HistoryTracker, a methodology that facilitates the creation of tracking data for baseball games by warm-starting the annotation process using a vast collection of historical data. We show that HistoryTracker helps users to produce tracking data in a fast and reliable way.
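
Entry 5 above describes modeling person-detection errors as a probabilistic process. The following is a minimal illustrative sketch of that general idea, not the authors' model: it assumes (hypothetically) that each person is detected independently with probability `recall` and that spurious detections follow a Poisson distribution with mean `false_positive_rate`. All function and parameter names are invented for illustration.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_detected_count(true_count, recall, false_positive_rate, rng):
    """Simulate one noisy detector reading (assumed error model, see lead-in)."""
    # Each real person is found independently with probability `recall`.
    hits = sum(1 for _ in range(true_count) if rng.random() < recall)
    # Spurious detections are added as a Poisson-distributed count.
    return hits + poisson_draw(false_positive_rate, rng)


def estimate_true_count(observed_mean, recall, false_positive_rate):
    """Invert E[observed] = true_count * recall + false_positive_rate."""
    return max(0.0, (observed_mean - false_positive_rate) / recall)


def mean_reading(true_count, recall, false_positive_rate, n_readings, seed=0):
    """Average many simulated readings of the same scene."""
    rng = random.Random(seed)
    total = sum(
        simulate_detected_count(true_count, recall, false_positive_rate, rng)
        for _ in range(n_readings)
    )
    return total / n_readings
```

Under these assumptions, averaging many readings of the same location and passing the mean to `estimate_true_count` recovers an approximately unbiased count; in practice the recall and false-positive rate would themselves have to be estimated from labeled data.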