skip to main content


Title: Interactive Visual Exploration of Spatio-Temporal Urban Data Sets using Urbane
The recent explosion in the number and size of spatio-temporal data sets from urban environments and social sensors creates new opportunities for data-driven approaches to understand and improve cities. Visual analytics systems like Urbane aim to empower domain experts to explore multiple data sets, at different time and space resolutions. Since these systems rely on computationally-intensive spatial aggregation queries that slice and summarize the data over different regions, an important challenge is how to attain interactivity. While traditional pre-aggregation approaches support interactive exploration, they are unsuitable in this setting because they do not support ad-hoc query constraints or polygons of arbitrary shapes. To address this limitation, we have recently proposed Raster Join, an approach that converts a spatial aggregation query into a set of drawing operations on a canvas and leverages the rendering pipeline of the graphics hardware (GPU). By doing so, Raster Join evaluates queries on the fly at interactive speeds on commodity laptops and desktops. In this demonstration, we showcase the efficiency of Raster Join by integrating it with Urbane and enabling interactivity. Demo visitors will interact with Urbane to filter and visualize several urban data sets over multiple resolutions.  more » « less
Award ID(s):
1730396
NSF-PAR ID:
10062987
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
SIGMOD '18
Page Range / eLocation ID:
1693 to 1696
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Advances in technology coupled with the availability of low-cost sensors have resulted in the continuous generation of large time series from several sources. In order to visually explore and compare these time series at different scales, analysts need to execute online analytical processing (OLAP) queries that include constraints and group-by's at multiple temporal hierarchies. Effective visual analysis requires these queries to be interactive. However, while existing OLAP cube-based structures can support interactive query rates, the exponential memory requirement to materialize the data cube is often unsuitable for large data sets. Moreover, none of the recent space-efficient cube data structures allow for updates. Thus, the cube must be re-computed whenever there is new data, making them impractical in a streaming scenario. We propose Time Lattice, a memory-efficient data structure that makes use of the implicit temporal hierarchy to enable interactive OLAP queries over large time series. Time Lattice is a subset of a fully materialized cube and is designed to handle fast updates and streaming data. We perform an experimental evaluation which shows that the space efficiency of the data structure does not hamper its performance when compared to the state of the art. In collaboration with signal processing and acoustics research scientists, we use the Time Lattice data structure to design the Noise Profiler, a web-based visualization framework that supports the analysis of noise from cities. We demonstrate the utility of Noise Profiler through a set of case studies. 
    more » « less
  2. Challenges in interactive visualizations over satellite data collections stem primarily from their inherent data volumes. Enabling interactive visualizations of such data results in both processing and I/O (network and disk) on the server side. These are further exacerbated by multiple, concurrent requests issued by different clients. Hotspots may also arise when multiple users are interested in a particular geographical extent. We propose a novel methodology to support interactive visualizations over voluminous satellite imagery. Our system, codenamed Glance, generates models that once installed on the client side, substantially alleviate resource requirements on the server side. Our system dynamically generates imagery during zoom-in operations. Glance also supports image refinements using partial high-resolution information when available. Glance is based broadly on a deep Generative Adversarial Network, and our model is space-efficient to facilitate memory-residency at the clients. We supplement Glance with a module to estimate rendering errors when using the model to generate imagery as opposed to a resource-intensive query-and-retrieve operation to the server. Benchmarks to profile our methodology show substantive improvements in interactivity with up to 23x reduction in time lags without utilizing GPU and 297x-6627x reduction while harnessing GPU. Further, the perceptual quality of the images from our generative model is robust with PSNR values ranging from 32.2-40.5, depending on the scenario and upscale factor. 
    more » « less
  3. Statistical knowledge and domain expertise are key to extract actionable insights out of data, yet such skills rarely coexist together. In Machine Learning, high-quality results are only attainable via mindful data preprocessing, hyperparameter tuning and model selection. Domain experts are often overwhelmed by such complexity, de-facto inhibiting a wider adoption of ML techniques in other fields. Existing libraries that claim to solve this problem, still require well-trained practitioners. Those frameworks involve heavy data preparation steps and are often too slow for interactive feedback from the user, severely limiting the scope of such systems. In this paper we present Alpine Meadow, a first Interactive Automated Machine Learning tool. What makes our system unique is not only the focus on interactivity, but also the combined systemic and algorithmic design approach; on one hand we leverage ideas from query optimization, on the other we devise novel selection and pruning strategies combining cost-based Multi-Armed Bandits and Bayesian Optimization. We evaluate our system on over 300 datasets and compare against other AutoML tools, including the current NIPS winner, as well as expert solutions. Not only is Alpine Meadow able to significantly outperform the other AutoML systems while --- in contrast to the other systems --- providing interactive latencies, but also outperforms in 80% of the cases expert solutions over data sets we have never seen before. 
    more » « less
  4. SkinnerDB uses reinforcement learning for reliable join ordering, exploiting an adaptive processing engine with specialized join algorithms and data structures. It maintains no data statistics and uses no cost or cardinality models. Also, it uses no training workloads nor does it try to link the current query to seemingly similar queries in the past. Instead, it uses reinforcement learning to learn optimal join orders from scratch during the execution of the current query. To that purpose, it divides the execution of a query into many small time slices. Different join orders are tried in different time slices. SkinnerDB merges result tuples generated according to different join orders until a complete query result is obtained. By measuring execution progress per time slice, it identifies promising join orders as execution proceeds. Along with SkinnerDB, we introduce a new quality criterion for query execution strategies. We upper-bound expected execution cost regret, i.e., the expected amount of execution cost wasted due to sub-optimal join order choices. SkinnerDB features multiple execution strategies that are optimized for that criterion. Some of them can be executed on top of existing database systems. For maximal performance, we introduce a customized execution engine, facilitating fast join order switching via specialized multi-way join algorithms and tuple representations. We experimentally compare SkinnerDB’s performance against various baselines, including MonetDB, Postgres, and adaptive processing methods. We consider various benchmarks, including the join order benchmark, TPC-H, and JCC-H, as well as benchmark variants with user-defined functions. Overall, the overheads of reliable join ordering are negligible compared to the performance impact of the occasional, catastrophic join order choice. 
    more » « less
  5. In this work, we present a multi-view framework to classify spatio-temporal phenomena at multiple resolutions. This approach utilizes the complementarity of features across different resolutions and improves the corresponding models by enforcing consistency of their predictions on unlabeled data. Unlike traditional multi-view learning problems, the key challenge in our case is that there is a many-to-one correspondence between instances across different resolutions, which needs to be explicitly modeled. Experiments on the real-world application of mapping urban areas using spatial raster datasets from satellite observations show the benefits of the proposed multi-view framework. 
    more » « less