

Title: Glance: A Generative Approach to Interactive Visualization of Voluminous Satellite Imagery
Challenges in interactive visualizations over satellite data collections stem primarily from their inherent data volumes. Enabling interactive visualizations of such data incurs both processing and I/O (network and disk) overheads on the server side. These are further exacerbated by multiple, concurrent requests issued by different clients. Hotspots may also arise when multiple users are interested in a particular geographical extent. We propose a novel methodology to support interactive visualizations over voluminous satellite imagery. Our system, codenamed Glance, generates models that, once installed on the client side, substantially alleviate resource requirements on the server side. Our system dynamically generates imagery during zoom-in operations. Glance also supports image refinements using partial high-resolution information when available. Glance is based broadly on a deep Generative Adversarial Network, and our model is space-efficient to facilitate memory-residency at the clients. We supplement Glance with a module to estimate rendering errors when using the model to generate imagery as opposed to a resource-intensive query-and-retrieve operation to the server. Benchmarks to profile our methodology show substantive improvements in interactivity, with up to a 23x reduction in time lags without utilizing a GPU and a 297x-6627x reduction while harnessing a GPU. Further, the perceptual quality of the images from our generative model is robust, with PSNR values ranging from 32.2 to 40.5, depending on the scenario and upscale factor.
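The abstract reports image quality as PSNR (peak signal-to-noise ratio) but does not say how it was computed. As a rough illustration only, the sketch below uses the standard definition, PSNR = 10 * log10(MAX^2 / MSE), and assumes 8-bit pixel values (peak of 255) stored as nested lists. The function name `psnr` and its parameters are hypothetical and are not taken from Glance.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Return PSNR in dB between two equally sized images.

    Images are nested lists of pixel intensities (rows of values).
    max_value is the assumed peak intensity (255 for 8-bit imagery);
    the abstract does not state which peak value the authors used.
    """
    ref = [p for row in reference for p in row]
    gen = [p for row in generated for p in row]
    if not ref or len(ref) != len(gen):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    truth = [[10, 20], [30, 40]]
    upscaled = [[11, 21], [31, 41]]  # every pixel off by one
    print(f"PSNR: {psnr(truth, upscaled):.2f} dB")
```

Under this definition, a higher value means the generated image is closer to the reference.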
Award ID(s):
1931363
NSF-PAR ID:
10352215
Journal Name:
IEEE International Conference on Big Data (Big Data)
Page Range / eLocation ID:
359 to 367
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider a hierarchical inference system with multiple clients connected to a server via a shared communication resource. When necessary, clients with low-accuracy machine learning models can offload classification tasks to a server for processing on a high-accuracy model. We propose a distributed online offloading algorithm which maximizes the accuracy subject to a shared resource utilization constraint thus indirectly realizing accuracy-delay tradeoffs possible given an underlying network scheduler. The proposed algorithm, named Lyapunov-EXP4, introduces a loss structure based on Lyapunov-drift minimization techniques to the bandits with expert advice framework. We prove that the algorithm converges to a near-optimal threshold policy on the confidence of the clients’ local inference without prior knowledge of the system’s statistics and efficiently solves a constrained bandit problem with sublinear regret. We further consider settings where clients may employ multiple thresholds, allowing more aggressive optimization of overall accuracy at a possible loss in fairness. Extensive simulation results on real and synthetic data demonstrate convergence of Lyapunov-EXP4, and show the 
  2. Deep learning models are prone to forgetting information learned in the past when trained on new data. This problem becomes even more pronounced in the context of federated learning (FL), where data is decentralized and subject to independent changes for each user. Continual Learning (CL) studies this so-called catastrophic forgetting phenomenon primarily in centralized settings, where the learner has direct access to the complete training dataset. However, applying CL techniques to FL is not straightforward due to privacy concerns and resource limitations. This paper presents a framework for federated class incremental learning that utilizes a generative model to synthesize samples from past distributions instead of storing part of past data. Then, clients can leverage the generative model to mitigate catastrophic forgetting locally. The generative model is trained on the server using data-free methods at the end of each task without requesting data from clients. Therefore, it reduces the risk of data leakage as opposed to training it on the client's private data. We demonstrate significant improvements for the CIFAR-100 dataset compared to existing baselines.
  3. Interactive visual analytics over distributed systems housing voluminous datasets is hindered by three main factors: disk I/O, network I/O, and data processing overhead. Requests over geospatial data are prone to erratic query load and hotspots due to users’ simultaneous interest in a small sub-domain of the overall data space at a time. Interactive analytics in a distributed setting is further hindered in cases of voluminous datasets with large/high-dimensional data objects, such as multi-spectral satellite imagery. The size of the data objects prohibits efficient caching mechanisms that could significantly reduce response latencies. Additionally, extracting information from these large data objects incurs significant data processing overheads and often entails resource-intensive computational methods. Here, we present our framework, ARGUS, that extracts low-dimensional representations (embeddings) of high-dimensional satellite images during ingestion and houses them in the cache for use in model-driven analysis relating to wildfire detection. These embeddings are versatile and are used to perform model-based extraction of analytical information for a set of different scenarios, to reduce the high computational costs that are involved with typical transformations over high-dimensional datasets. The models for each such analytical process are trained in a distributed manner in a connected, multi-task learning fashion, along with the encoder network that generates the original embeddings.
  4. Ranzato, M.; Beygelzimer, A.; Liang, P.S.; Vaughan, J.W.; Dauphin, Y. (Eds.)
    Federated Learning (FL) is a distributed learning framework in which the local data never leaves clients’ devices to preserve privacy, and the server trains models on the data by accessing only the gradients of those local data. Without further privacy mechanisms such as differential privacy, this leaves the system vulnerable to an attacker who inverts those gradients to reveal clients’ sensitive data. However, a gradient is often insufficient to reconstruct the user data without any prior knowledge. By exploiting a generative model pretrained on the data distribution, we demonstrate that data privacy can be easily breached. Further, when such prior knowledge is unavailable, we investigate the possibility of learning the prior from a sequence of gradients seen in the process of FL training. We experimentally show that the prior in the form of a generative model is learnable from iterative interactions in FL. Our findings demonstrate that additional mechanisms are necessary to prevent privacy leakage in FL.
  5. We study the performance of a cloud-based GPU-accelerated inference server to speed up event reconstruction in neutrino data batch jobs. Using detector data from the ProtoDUNE experiment and employing the standard DUNE grid job submission tools, we attempt to reprocess the data by running several thousand concurrent grid jobs, a rate we expect to be typical of current and future neutrino physics experiments. We process most of the dataset with the GPU version of our processing algorithm and the remainder with the CPU version for timing comparisons. We find that a 100-GPU cloud-based server is able to easily meet the processing demand, and that using the GPU version of the event processing algorithm is two times faster than processing these data with the CPU version when comparing to the newest CPUs in our sample. The amount of data transferred to the inference server during the GPU runs can overwhelm even the highest-bandwidth network switches, however, unless care is taken to observe network facility limits or otherwise distribute the jobs to multiple sites. We discuss the lessons learned from this processing campaign and several avenues for future improvements.