Event detection is gaining increasing attention in smart cities research. Large-scale mobility data serves as an important tool to uncover the dynamics of urban transportation systems, and more often than not the dataset is incomplete. In this article, we develop a method to detect extreme events in large traffic datasets and to impute missing data during regular conditions. Specifically, we propose a robust tensor recovery problem to recover low-rank tensors under fiber-sparse corruptions with partial observations, and use it to identify events and impute missing data under typical conditions. Our approach is scalable to large urban areas, taking full advantage of the spatio-temporal correlations in traffic patterns. We develop an efficient algorithm to solve the tensor recovery problem based on the alternating direction method of multipliers (ADMM) framework. Compared with existing ℓ1-norm-regularized tensor decomposition methods, our algorithm can exactly recover the values of uncorrupted fibers of a low-rank tensor and find the positions of corrupted fibers under mild conditions. Numerical experiments illustrate that our algorithm can achieve exact recovery and outlier detection even with missing data rates as high as 40% under 5% gross corruption, depending on the tensor size and the Tucker rank of the low-rank tensor. Finally, we apply our method to a real traffic dataset corresponding to downtown Nashville, TN, and successfully detect events such as severe car crashes, construction lane closures, and other large events that cause significant traffic disruptions.
This content will become publicly available on October 1, 2026
Physics-driven dynamic interpolation with application to pollution satellite images
Satellite images using multiple wavelength channels provide crucial measurements over large areas, aiding the understanding of pollution generation and transport. However, these images often contain missing data due to cloud cover and algorithm limitations. In this paper, we introduce a novel method for interpolating missing values in satellite images by incorporating pollution transport dynamics influenced by wind patterns. Our approach utilizes a fundamental physics equation to structure the covariance of missing data, improving accuracy by considering pollution transport dynamics. To address computational challenges associated with large datasets, we implement a gradient ascent algorithm. We demonstrate the effectiveness of our method through a case study, showcasing its potential for accurate interpolation in high-resolution, spatio-temporal air pollution datasets.
- Award ID(s):
- 2413823
- PAR ID:
- 10628316
- Publisher / Repository:
- Elsevier
- Date Published:
- Journal Name:
- Spatial Statistics
- Volume:
- 69
- Issue:
- C
- ISSN:
- 2211-6753
- Page Range / eLocation ID:
- 100923
- Subject(s) / Keyword(s):
- Advection–diffusion model; Gradient ascent; Remote sensing data; Spatio-temporal modeling
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
Chest X-ray (CXR) analysis plays an important role in patient treatment. As such, a multitude of machine learning models have been applied to CXR datasets in attempts at automated analysis. However, each patient has a differing number of images per angle, and multi-modal learning must deal with missing data for specific angles and times. Furthermore, the large dimensionality of multi-modal imaging data, with shapes that are inconsistent across the dataset, introduces challenges in training. In light of these issues, we propose the Fast Multi-Modal Support Vector Machine (FMMSVM), which incorporates modality-specific factorization to deal with missing CXRs at specific angles. Our model is able to adjust the fine-grained details in feature extraction, and we provide an efficient optimization algorithm scalable to a large number of features. In our experiments, FMMSVM shows clearly improved classification performance.
Modern plant phenotyping requires tools that are robust to noise and missing data, while being able to efficiently process large numbers of plants. Here, we studied the skeletonization of plant architectures from 3D point clouds, which is critical for many downstream tasks, including analyses of plant shape, morphology, and branching angles. Specifically, we developed an algorithm to improve skeletonization at branch points (forks) by leveraging the geometric properties of cylinders around branch points. We tested this algorithm on a diverse set of high-resolution 3D point clouds of tomato and tobacco plants, grown in five environments and across multiple developmental timepoints. Compared to existing methods for 3D skeletonization, our method efficiently and more accurately estimated branching angles even in areas with noisy, missing, or non-uniformly sampled data. Our method is also applicable to inorganic datasets, such as scans of industrial pipes or urban scenes containing networks of complex cylindrical shapes.
Global surface water classification layers, such as the European Joint Research Centre’s (JRC) Monthly Water History dataset, provide a starting point for accurate and large-scale analyses of trends in waterbody extents. On the local scale, there is an opportunity to increase the accuracy and temporal frequency of these surface water maps by using locally trained classifiers and gap-filling missing values via imputation in all available satellite images. We developed the Surface Water IMputation (SWIM) classification framework using R and the Google Earth Engine computing platform to improve water classification compared to the JRC study. The novel contributions of the SWIM classification framework include (1) a cluster-based algorithm to improve classification sensitivity to a variety of surface water conditions and produce approximately unbiased estimation of surface water area, (2) a method to gap-fill every available Landsat image for a region of interest to generate submonthly classifications at the highest possible temporal frequency, and (3) an outlier detection method for identifying images that contain classification errors due to failures in cloud masking. Validation and several case studies demonstrate the SWIM classification framework outperforms the JRC dataset in spatiotemporal analyses of small waterbody dynamics with previously unattainable sensitivity and temporal frequency. Most importantly, this study shows that reliable surface water classifications can be obtained for all pixels in every available Landsat image, even those containing cloud cover, after performing gap-fill imputation. By using this technique, the SWIM framework supports monitoring water extent on a submonthly basis, which is especially applicable to assessing the impact of short-term flood and drought events. Additionally, our results contribute to addressing the challenges of training machine learning classifiers with biased ground truth data and identifying images that contain regions of anomalous classification errors.
Urban and environmental researchers seek to obtain building features (e.g., building shapes, counts, and areas) at large scales. However, blurriness, occlusions, and noise from prevailing satellite images severely hinder the performance of image segmentation, super-resolution, or deep-learning-based translation networks. In this article, we combine globally available satellite images and spatial geometric feature datasets to create a generative modeling framework that significantly improves accuracy in per-building feature estimation and generates visually plausible building footprints. Our approach compensates for the degradation present in satellite images with a novel deep network setup that includes segmentation, generative modeling, and adversarial learning for instance-level building features. Our method has proven its robustness through large-scale prototypical experiments covering heterogeneous scenarios from dense urban to sparse rural. Results show better quality over advanced segmentation networks for urban and environmental planning, and show promise for future continental-scale urban applications.
