Limited access to high-quality data is a significant barrier to the digital analysis of urban settings, including applications in computer vision and urban design. Diverse forms of data collected from sensors in high-activity areas of the urban environment, particularly at street intersections, are valuable resources for researchers studying the dynamics between vehicles, pedestrians, and the built environment. In this paper, we present a high-resolution audio, video, and LiDAR dataset of three urban intersections in Brooklyn, New York, totaling almost 8 unique hours. The data were collected with custom Reconfigurable Environmental Intelligence Platform (REIP) sensors designed to accurately synchronize multiple video and audio inputs. The resulting data are novel in that they are multimodal, multi-angular, high-resolution, and synchronized. We demonstrate four ways the data could be utilized: (1) to discover and locate occluded objects using multiple sensors and modalities; (2) to associate audio events with their respective visual representations using both the video and audio modalities; (3) to track the number of objects of each type in a scene over time; and (4) to measure pedestrian speed using multiple synchronized camera views. Beyond these use cases, our data are available for other researchers to carry out analyses related to applying machine learning to understanding the urban environment (for which existing datasets may be inadequate), such as pedestrian-vehicle interaction modeling and pedestrian attribute recognition. Such analyses can help inform decisions made in the context of urban sensing and smart cities, including accessibility-aware urban design and Vision Zero initiatives.
A Second-Order Time-Stepping Scheme for Simulating Ensembles of Parameterized Flow Problems
Abstract We consider settings in which one must perform multiple flow simulations based on the Navier–Stokes equations, each having different initial condition data, boundary condition data, forcing functions, and/or coefficients such as the viscosity. For such settings, we propose a second-order time-accurate ensemble-based method that, to simulate the whole set of solutions, requires at each time step the solution of only a single linear system with multiple right-hand-side vectors. Rigorous analyses are given, proving conditional stability and establishing error estimates for the proposed algorithm. Numerical experiments illustrate the analyses.
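The key computational saving described in the abstract — one shared coefficient matrix per time step, solved against multiple right-hand-side vectors — can be sketched with standard dense linear algebra. The following is a minimal NumPy/SciPy illustration, not the paper's algorithm: the matrix `A`, problem size `n`, and ensemble size `J` are invented placeholders, not the paper's Navier–Stokes discretization.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n, J = 200, 8                     # n unknowns, J ensemble members

# Stand-in for the shared (ensemble-wide) coefficient matrix assembled
# at one time step; made diagonally dominant so the solve is stable.
A = rng.standard_normal((n, n)) + n * np.eye(n)

# One right-hand-side column per ensemble member (different data/forcing).
B = rng.standard_normal((n, J))

# Factor A once, then reuse the factorization for all J members.
lu, piv = lu_factor(A)            # O(n^3) factorization, done once
U = lu_solve((lu, piv), B)        # O(n^2) back-substitution per member

# Sanity check against solving each member's system independently.
U_naive = np.column_stack([np.linalg.solve(A, B[:, j]) for j in range(J)])
assert np.allclose(U, U_naive)
```

Because the factorization cost dominates, sharing it across the ensemble keeps the per-step work close to that of a single simulation.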
- PAR ID: 10161582
- Date Published:
- Journal Name: Computational Methods in Applied Mathematics
- Volume: 19
- Issue: 3
- ISSN: 1609-4840
- Page Range / eLocation ID: 681 to 701
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract Without imposing prior distributional knowledge on the multivariate time series of interest, we propose a nonparametric change-point detection approach to estimate the number of change points and their locations along the temporal axis. We develop a structural subsampling procedure such that the observations are encoded into multiple sequences of Bernoulli variables. A maximum likelihood approach, in conjunction with a newly developed search algorithm, is implemented to detect change points on each Bernoulli process separately. Aggregation statistics are then proposed to collectively synthesize the change-point results from all individual univariate time series into consistent and stable location estimates. We also study a weighting strategy to measure the degree of relevance of different subsampled groups. Simulation studies show that the proposed change-point methodology for multivariate time series has favorable performance compared with currently available state-of-the-art nonparametric methods under various settings with different degrees of complexity. Real-data analyses are performed on categorical, ordinal, and continuous time series from the fields of genetics, climate, and finance.
- Collecting massive amounts of image data is a common way to record the post-event condition of buildings, to be used by engineers and researchers to learn from that event. Key information needed to interpret the image data collected during these reconnaissance missions is the location within the building where each image was taken. However, image localization is difficult in an indoor environment, as GPS is generally unavailable because of weak or broken signals. To support rapid, seamless data collection during a reconnaissance mission, we develop and validate a fully automated technique that provides robust indoor localization while requiring no prior information about the condition or spatial layout of an indoor environment. The technique is intended for large-scale data collection across multiple floors within multiple buildings. A systematic method is designed to separate the reconnaissance data into individual buildings and individual floors. Then, for the data within each floor, an optimization problem is formulated to automatically overlay the path onto the structural drawings, providing robust results and, subsequently, yielding the image locations. The end-to-end technique only requires the data collector to wear an additional inexpensive motion camera; thus, it does not add time or effort to the current rapid reconnaissance protocol. Because no prior information about the condition or spatial layout of the indoor environment is needed, the technique can be adapted to a wide variety of building environments and does not require any preparation in post-event settings. The technique is validated using data collected from several real buildings.
- Training example order in SGD has long been known to affect convergence rate. Recent results show that accelerated rates are possible in a variety of cases for permutation-based sample orders, in which each example from the training set is used once before any example is reused. In this paper, we develop a broad condition on the sequence of examples used by SGD that is sufficient to prove tight convergence rates in both strongly convex and non-convex settings. We show that our approach suffices to recover, and in some cases improve upon, previous state-of-the-art analyses for four known example-selection schemes: (1) shuffle once, (2) random reshuffling, (3) random reshuffling with data echoing, and (4) Markov Chain Gradient Descent. Motivated by our theory, we propose two new example-selection approaches. First, using quasi-Monte-Carlo methods, we achieve unprecedented accelerated convergence rates for learning with data augmentation. Second, we greedily choose a fixed scan order to minimize the metric used in our condition and show that we can obtain more accurate solutions from the same number of epochs of SGD. We conclude by empirically demonstrating the utility of our approach for both convex linear-model and deep learning tasks. Our code is available at: https://github.com/EugeneLYC/qmc-ordering.
- Abstract Replicate lines under uniform selection often evolve in different ways. Previously, analyses using whole-genome sequence data for individual mice (Mus musculus) from 4 replicate High Runner lines and 4 nonselected control lines identified genomic regions that have responded consistently to selection for voluntary wheel-running behavior. Here, we ask whether the High Runner lines have evolved differently from each other, even though they reached selection limits at similar levels. We focus on 1 High Runner line (HR3) that became fixed for a mutation at a gene of major effect (Myh4Minimsc) that, in the homozygous condition, causes a 50% reduction in hindlimb muscle mass and many pleiotropic effects. We excluded HR3 from SNP analyses and identified 19 regions not consistently identified in analyses with all 4 lines. Repeating the analyses while dropping each of the other High Runner lines identified 12, 8, and 6 such regions (of these 45 regions, 37 were unique). These results suggest that each High Runner line indeed responded to selection somewhat uniquely, but also that HR3 is the most distinct. We then applied 2 additional analytical approaches when dropping HR3 only (based on haplotypes and nonstatistical tests involving fixation patterns). All 3 approaches identified 7 new regions (as compared with analyses using all 4 High Runner lines) that include genes associated with activity levels, dopamine signaling, hippocampus morphology, heart size, and body size, all of which differ between High Runner and control lines. Our results illustrate how multiple solutions and "private" alleles can obscure general signatures of selection involving "public" alleles.
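The change-point abstract above reduces multivariate detection to maximum-likelihood detection on individual Bernoulli sequences. As an illustration of that inner step only (the structural subsampling encoding and the aggregation statistics are the paper's own machinery and are not reproduced here), a single change point in a 0/1 sequence can be estimated by maximizing the two-segment Bernoulli profile log-likelihood over split points; the data below are synthetic, not from the paper:

```python
import numpy as np

def bernoulli_changepoint(x):
    """Estimate a single change point in a 0/1 sequence by maximizing
    the two-segment Bernoulli profile log-likelihood over split points."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    def seg_ll(seg):
        s, m = seg.sum(), len(seg)
        p = s / m
        if p == 0.0 or p == 1.0:   # degenerate segment: log-likelihood is 0
            return 0.0
        return s * np.log(p) + (m - s) * np.log(1.0 - p)

    # Score every interior split; the argmax is the estimated change point.
    scores = [seg_ll(x[:k]) + seg_ll(x[k:]) for k in range(1, n)]
    return int(np.argmax(scores)) + 1

# Synthetic Bernoulli sequence whose success rate jumps at index 150.
rng = np.random.default_rng(1)
x = np.concatenate([rng.binomial(1, 0.2, 150), rng.binomial(1, 0.7, 150)])
k_hat = bernoulli_changepoint(x)
```

The same scan generalizes to multiple change points via recursive splitting, which is where the paper's search algorithm and aggregation come in.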
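Two of the example-selection schemes named in the SGD ordering abstract, shuffle once and random reshuffling, are easy to state concretely. The sketch below contrasts them on a toy noiseless least-squares problem; the model, step size, and epoch count are illustrative choices for the sketch, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 64, 4
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true                       # noiseless targets: SGD can interpolate

def sgd(order_fn, epochs=30, lr=0.05):
    """Plain SGD on least squares; order_fn(epoch) gives the example order."""
    w = np.zeros(d)
    for e in range(epochs):
        for i in order_fn(e):
            g = (X[i] @ w - y[i]) * X[i]   # per-example gradient
            w -= lr * g
    return w

perm = rng.permutation(n)

def shuffle_once(e):                 # (1) one fixed permutation, reused
    return perm

def reshuffle(e):                    # (2) a fresh permutation each epoch
    return rng.permutation(n)

w_so = sgd(shuffle_once)
w_rr = sgd(reshuffle)
```

Both are permutation-based orders: every example is visited exactly once per epoch, which is what distinguishes them from with-replacement sampling in the convergence analyses.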