Title: Spatialyze: A Geospatial Video Analytics System with Spatial-Aware Optimizations
Videos that are shot using commodity hardware such as phones and surveillance cameras record various metadata such as time and location. We encounter such geospatial videos on a daily basis, and such videos have been growing significantly in volume. Yet, we do not have data management systems that allow users to interact with such data effectively. In this paper, we describe Spatialyze, a new framework for end-to-end querying of geospatial videos. Spatialyze comes with a domain-specific language where users can construct geospatial video analytic workflows using a 3-step, declarative, build-filter-observe paradigm. Internally, Spatialyze leverages the declarative nature of such workflows, the temporal-spatial metadata stored with videos, and the physical behavior of real-world objects to optimize the execution of workflows. Our results using real-world videos and workflows show that Spatialyze can reduce execution time by up to 5.3×, while maintaining up to 97.1% accuracy compared to unoptimized execution.
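To make the build-filter-observe paradigm concrete, the sketch below shows the shape such a declarative workflow could take. Every name in it (GeospatialWorld, Detection, add_video, filter, observe, fake_detector) is a hypothetical stand-in rather than the actual Spatialyze API; the point is only that predicates are recorded declaratively and nothing executes until the observe step.

    # Minimal sketch of a 3-step build-filter-observe workflow.
    # Hypothetical names, not the actual Spatialyze API.
    from dataclasses import dataclass
    from typing import Callable, Iterator, List


    @dataclass
    class Detection:
        object_type: str           # e.g. "car", "pedestrian"
        frame: int                 # frame index in the source video
        distance_to_camera: float  # meters, derived from the video's spatial metadata


    class GeospatialWorld:
        def __init__(self) -> None:
            self._videos: List[str] = []
            self._predicates: List[Callable[[Detection], bool]] = []

        # Build: register geospatial videos (file path plus GPS/time metadata).
        def add_video(self, path: str) -> "GeospatialWorld":
            self._videos.append(path)
            return self

        # Filter: record predicates declaratively; nothing runs yet, which is what
        # lets an optimizer reorder or prune work using the spatial metadata.
        def filter(self, predicate: Callable[[Detection], bool]) -> "GeospatialWorld":
            self._predicates.append(predicate)
            return self

        # Observe: only now run detection/tracking and apply every predicate.
        def observe(self, detect: Callable[[str], Iterator[Detection]]) -> List[Detection]:
            return [d for path in self._videos for d in detect(path)
                    if all(p(d) for p in self._predicates)]


    # Stand-in "detector" so the example runs without any real video files.
    def fake_detector(path: str) -> Iterator[Detection]:
        yield Detection("car", frame=12, distance_to_camera=8.0)
        yield Detection("pedestrian", frame=40, distance_to_camera=3.5)


    nearby_cars = (GeospatialWorld()
                   .add_video("drive.mp4")
                   .filter(lambda d: d.object_type == "car")
                   .filter(lambda d: d.distance_to_camera < 10.0)
                   .observe(fake_detector))
    print(nearby_cars)  # -> [Detection(object_type='car', frame=12, distance_to_camera=8.0)]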
Award ID(s):
1955488 2027575
PAR ID:
10575729
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
PVLDB
Date Published:
Journal Name:
Proceedings of the VLDB Endowment
Volume:
17
Issue:
9
ISSN:
2150-8097
Page Range / eLocation ID:
2136 to 2148
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Dataflow systems have an increasing need to support a wide range of tasks in data-centric applications using the latest techniques such as machine learning. These tasks often involve custom functions with complex internal states. Consequently, users need enhanced debugging support to understand runtime behaviors and investigate the internal states of dataflows. Traditional forward debuggers only let users follow the chronological order of operations in an execution, so a user cannot easily investigate a past runtime behavior after an unexpected result is produced. In this paper, we present a novel time-travel debugging paradigm called IcedTea, which supports reverse debugging. In particular, during a dataflow's execution, which is inherently distributed across multiple operators, the user can periodically interact with the job and retrieve the global states of the operators. After the execution, the system allows the user to roll back the dataflow state to any of these past interactions. The user can then use step instructions to repeat the past execution and understand how data was processed in the original run. We give a full specification of this paradigm, study how to reduce its runtime overhead, and develop techniques to support debugging instructions responsively. Our experiments on real-world datasets and workflows show that IcedTea supports responsive time-travel debugging with low time and space overhead.
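A toy sketch of the checkpoint/rollback/step cycle described above is given below; the ReplayableOperator class and its methods are illustrative stand-ins, not IcedTea's implementation.

    # Toy sketch of checkpoint / rollback / step for one dataflow operator.
    # Illustrative only; this is not IcedTea's implementation.
    import copy
    from typing import Any, Dict, List


    class ReplayableOperator:
        def __init__(self) -> None:
            self.state: Dict[str, Any] = {"count": 0, "total": 0}
            self._checkpoints: List[Dict[str, Any]] = []
            self._inputs: List[int] = []  # input log, so past steps can be re-executed

        def _apply(self, value: int) -> None:
            self.state["count"] += 1
            self.state["total"] += value

        def process(self, value: int) -> None:
            self._inputs.append(value)
            self._apply(value)

        def checkpoint(self) -> int:
            # Snapshot the operator state at an interaction during the run.
            self._checkpoints.append(copy.deepcopy(self.state))
            return len(self._checkpoints) - 1

        def rollback(self, checkpoint_id: int) -> None:
            # After the run, restore the state captured at a past interaction.
            self.state = copy.deepcopy(self._checkpoints[checkpoint_id])

        def step(self, input_index: int) -> None:
            # Re-execute one logged input to see how it changed the state.
            self._apply(self._inputs[input_index])


    op = ReplayableOperator()
    op.process(5)
    cp = op.checkpoint()        # capture global state mid-execution
    op.process(7)               # ... execution continues to completion ...
    op.rollback(cp)             # time-travel back to the captured state
    op.step(input_index=1)      # repeat the next input from the original run
    print(op.state)             # {'count': 2, 'total': 12}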
  2. Scientific workflows have become ubiquitous across scientific fields, and their execution methods and systems continue to be the subject of research and development. Most experimental evaluations of these workflows rely on workflow instances, which can be either real-world or synthetic, to ensure relevance to current application domains or explore hypothetical/future scenarios. The WfCommons project addresses this need by providing data and tools to support such evaluations. In this paper, we present an overview of WfCommons and describe two recent developments. Firstly, we introduce a workflow execution "tracer" for NextFlow, which significantly enhances the set of real-world instances available in WfCommons. Secondly, we describe a workflow instance "translator" that enables the execution of any real-world or synthetic WfCommons workflow instance using Dask. Our contributions aim to provide researchers and practitioners with more comprehensive resources for evaluating scientific workflows. 
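Conceptually, such a translator walks the instance's task graph and submits each task to a Dask cluster, with parent futures passed in as dependencies. The sketch below assumes a simplified instance layout (a workflow object whose tasks list their parents' names) and a placeholder run_task function; it is not the WfCommons translator's actual code.

    # Illustrative sketch: execute a workflow instance's tasks on Dask by passing
    # parent futures as arguments, so each task runs only after its parents finish.
    import json
    from dask.distributed import Client


    def run_task(name: str, *parent_results: str) -> str:
        # Placeholder for launching the task's real command or function.
        return f"{name} finished"


    def execute_instance(instance_path: str) -> None:
        with open(instance_path) as f:
            instance = json.load(f)

        client = Client()  # connects to (or starts) a local Dask cluster
        futures = {}
        # Assumes tasks appear after their parents (topological order) and that
        # each task record carries a "name" and a list of "parents" names.
        for task in instance["workflow"]["tasks"]:
            parents = [futures[p] for p in task.get("parents", [])]
            futures[task["name"]] = client.submit(run_task, task["name"], *parents)

        client.gather(list(futures.values()))  # block until the whole workflow completes
        client.close()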
  3. Next-generation stream processing systems for community-scale IoT applications must handle complex non-functional needs, e.g., scalability of input, reliability/timeliness of communication, and privacy/security of captured data. In many IoT settings, efficiently batching complex workflows remains challenging in resource-constrained environments. High data rates, combined with applications' real-time processing needs, point to the need for efficient edge stream processing techniques. In this work, we focus on designing scalable edge stream processing workflows in real-world IoT deployments where performance and privacy are key concerns. Initial efforts have revealed that privacy policy execution/enforcement at the edge for intensive workloads is prohibitively expensive. Thus, we leverage intelligent batching techniques to enhance the performance and throughput of streaming in IoT smart spaces. We introduce BatchIT, a processing middleware based on a smart batching strategy that optimizes the trade-off between batching delay and the end-to-end delay requirements of IoT applications. Through experiments with a deployed system, we demonstrate that BatchIT outperforms several approaches, including micro-batching and EdgeWise, while reducing computation overhead.
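The trade-off that such a batching strategy navigates can be sketched as follows: buffer events to amortize per-batch overhead, but flush early once waiting any longer would cut into the application's end-to-end delay budget. The toy policy below is only an illustration, not BatchIT's actual algorithm.

    # Toy deadline-aware batcher: flushes on size OR when the oldest buffered
    # event has waited as long as the end-to-end budget allows.
    import time
    from typing import Any, Callable, List


    class DeadlineAwareBatcher:
        def __init__(self, flush: Callable[[List[Any]], None], max_batch: int = 64,
                     e2e_budget_s: float = 1.0, processing_estimate_s: float = 0.2):
            self.flush = flush
            self.max_batch = max_batch
            # Longest an event may sit in the buffer while still leaving enough of
            # the end-to-end budget for downstream processing.
            self.max_wait_s = e2e_budget_s - processing_estimate_s
            self.buffer: List[Any] = []
            self.oldest_arrival = 0.0

        def submit(self, event: Any) -> None:
            if not self.buffer:
                self.oldest_arrival = time.monotonic()
            self.buffer.append(event)
            waited = time.monotonic() - self.oldest_arrival
            if len(self.buffer) >= self.max_batch or waited >= self.max_wait_s:
                self.flush(self.buffer)
                self.buffer = []


    batcher = DeadlineAwareBatcher(flush=lambda b: print(f"processing {len(b)} events"))
    for i in range(200):
        batcher.submit({"sensor": "door-1", "reading": i})
    if batcher.buffer:
        batcher.flush(batcher.buffer)  # drain whatever is left at shutdown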
  4. AQME, automated quantum mechanical environments, is a free and open-source Python package for the rapid deployment of automated workflows using cheminformatics and quantum chemistry. AQME workflows integrate tasks performed across multiple computational chemistry packages and data formats, preserving all computational protocols, data, and metadata for machine and human users to access and reuse. AQME has a modular structure of independent modules that can be implemented in any sequence, allowing users to use all or only the desired parts of the program. The code has been developed for researchers with basic familiarity with the Python programming language. The CSEARCH module interfaces to molecular mechanics and semi-empirical QM (SQM) conformer generation tools (e.g., RDKit and the Conformer-Rotamer Ensemble Sampling Tool, CREST) starting from various initial structure formats. The CMIN module enables geometry refinement with SQM and neural network potentials, such as ANI. The QPREP module interfaces with multiple QM programs, such as Gaussian, ORCA, and PySCF. The QCORR module processes QM results, storing structural, energetic, and property data while also enabling automated error handling (e.g., convergence errors, wrong number of imaginary frequencies, isomerization) and job resubmission. The QDESCP module provides easy access to QM ensemble-averaged molecular descriptors and computed properties, such as NMR spectra. Overall, AQME provides automated, transparent, and reproducible workflows to produce, analyze, and archive computational chemistry results. SMILES inputs can be used, and many aspects of tedious human manipulation can be avoided. Installation and execution on Windows, macOS, and Linux platforms have been tested, and the code has been developed to support access through Jupyter Notebooks, the command line, and job submission (e.g., Slurm) scripts. Examples of pre-configured workflows are available in various formats, and hands-on video tutorials illustrate their use. This article is categorized under: Data Science > Chemoinformatics; Data Science > Computer Algorithms and Programming; Software > Quantum Chemistry
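Because the modules are independent, a workflow can be assembled from only the pieces needed. The two-step sketch below chains CSEARCH conformer generation into QPREP input preparation; the keyword arguments shown are assumptions based on the description above, so the exact signatures in the AQME documentation may differ.

    # Sketch: generate conformers with CSEARCH (RDKit backend), then prepare
    # Gaussian inputs with QPREP. Argument names are approximate; see the AQME docs.
    from aqme.csearch import csearch
    from aqme.qprep import qprep

    # Conformer search starting from a SMILES string.
    csearch(smi="CC(=O)O", name="acetic_acid", program="rdkit")

    # Turn the resulting structures into QM input files for Gaussian.
    qprep(files="CSEARCH/*.sdf", program="gaussian",
          qm_input="B3LYP/6-31G(d) opt freq")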
  5. Sudeepa Roy and Jun Yang (Eds.)
    Data we encounter in the real world, such as printed menus, business documents, and nutrition labels, is often ad hoc. Valuable insights can be gathered from this data when it is combined with additional information. Recent advances in computer vision and augmented reality have made it possible to understand and enrich such data. Joining real-world data with remote data stores and surfacing those enhanced results in place, within an augmented reality interface, can lead to better and more informed decision-making. However, building end-user applications that perform these joins with minimal human effort is not straightforward. It requires a diverse set of expertise, including machine learning, database systems, computer vision, and data visualization. To address this complexity, we present Quill, a framework for developing end-to-end applications that model augmented reality applications as a join between real-world data and remote data stores. Using an intuitive domain-specific language, Quill accelerates the development of end-user applications that join real-world data with remote data stores. Through experiments on applications from multiple domains, we show that Quill not only expedites development but also allows developers to build applications that are more performant than those built using standard developer tools, thanks to its ability to optimize declarative specifications. We also perform a user-focused study to investigate how easy (or difficult) it is to develop augmented reality applications with Quill compared to other existing tools. Our results show that Quill allows developers with less technical background to build and deploy applications than would be required to build the same applications using existing developer tools.
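The kind of application described here can be pictured as a three-stage join pipeline: recognize data in the camera view, match it against a remote store, and surface the enriched rows in place. Every name in the sketch below is a hypothetical stand-in, not Quill's domain-specific language.

    # Schematic of an augmented-reality join: real-world data (text recognized in
    # a camera frame) is matched against a remote table and rendered back in place.
    from typing import Dict, List


    def recognize_text(camera_frame: bytes) -> List[str]:
        # Stand-in for a computer-vision text recognizer.
        return ["Pad Thai", "Green Curry"]


    def query_remote(table: str, keys: List[str]) -> Dict[str, Dict[str, int]]:
        # Stand-in for a lookup against a remote data store (e.g., nutrition facts).
        nutrition = {"Pad Thai": {"calories": 620}, "Green Curry": {"calories": 480}}
        return {k: nutrition[k] for k in keys if k in nutrition}


    def render_overlay(matches: Dict[str, Dict[str, int]]) -> None:
        # Stand-in for surfacing the enriched results in the AR interface.
        for item, facts in matches.items():
            print(f"{item}: {facts['calories']} kcal")


    frame = b"..."                      # a captured camera frame
    items = recognize_text(frame)       # extract real-world data from the frame
    render_overlay(query_remote("nutrition", items))   # join and display in place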