skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 5:00 PM ET until 11:00 PM ET on Friday, June 21 due to maintenance. We apologize for the inconvenience.


Search for: All records

Creators/Authors contains: "Mitra, Subrata"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. State-of-the-art video database management systems (VDBMSs) often use lightweight proxy models to accelerate object retrieval and aggregate queries. The key assumption underlying these systems is that the proxy model is an order of magnitude faster than the heavyweight oracle model. However, recent advances in computer vision have invalidated this assumption. Inference time of recently proposed oracle models is on par with or even lower than the proxy models used in state-of-the-art (SoTA) VDBMSs. This paper presents Seiden, a VDBMS that leverages this radical shift in the runtime gap between the oracle and proxy models. Instead of relying on a proxy model, Seiden directly applies the oracle model over a subset of frames to build a query-agnostic index, and samples additional frames to answer the query using an exploration-exploitation scheme during query processing. By leveraging the temporal continuity of the video and the output of the oracle model on the sampled frames, Seiden delivers faster query processing and better query accuracy than SoTA VDBMSs. Our empirical evaluation shows that Seiden is on average 6.6 x faster than SoTA VDBMSs across diverse queries and datasets.

     
    more » « less
  2. Deep neural networks (DNNs) are increasingly popular owing to their ability to solve complex problems such as image recognition, autonomous driving, and natural language processing. Their growing complexity coupled with the use of larger volumes of training data (to achieve acceptable accuracy) has warranted the use of GPUs and other accelerators. Such accelerators are typically expensive, with users having to pay a high upfront cost to acquire them. For infrequent use, users can, instead, leverage the public cloud to mitigate the high acquisition cost. However, with the wide diversity of hardware instances (particularly GPU instances) available in public cloud, it becomes challenging for a user to make an appropriate choice from a cost/performance standpoint. In this work, we try to address this problem by (i) introducing a comprehensive distributed deep learning (DDL) profiler Stash, which determines the various execution stalls that DDL suffers from, and (ii) using Stash to extensively characterize various public cloud GPU instances by running popular DNN models on them. Specifically, it estimates two types of communication stalls, namely, interconnect and network stalls, that play a dominant role in DDL execution time. Stash is implemented on top of prior work, DS-analyzer, that computes only the CPU and disk stalls. Using our detailed stall characterization, we list the advantages and shortcomings of public cloud GPU instances for users to help them make an informed decision(s). Our characterization results indicate that the more expensive GPU instances may not be the most performant for all DNN models and that AWS can sometimes sub-optimally allocate hardware interconnect resources. Specifically, the intra-machine interconnect can introduce communication overheads of up to 90% of DNN training time and the network-connected instances can suffer from up to 5× slowdown compared to training on a single instance. Furthermore, (iii) we also model the impact of DNN macroscopic features such as the number of layers and the number of gradients on communication stalls, and finally, (iv) we briefly discuss a cost comparison with existing work. 
    more » « less
  3. null (Ed.)