-
Free, publicly-accessible full text available August 7, 2025
-
The effective reporting of climate hazards, such as flash floods, hurricanes, and earthquakes, is critical. To quickly and correctly assess the situation and deploy resources, emergency services often rely on citizen reports that must be timely, comprehensive, and accurate. The pervasive availability and use of smartphone cameras allow the transmission of dynamic incident information from citizens in near-real-time. While high-quality reporting is beneficial, generating such reports can place an additional burden on citizens who are already suffering from the stress of a climate-related disaster. Furthermore, reporting methods are often challenging to use, due to their length and complexity. In this paper, we explore reducing the friction of climate hazard reporting by automating parts of the form-filling process. By building on existing computer vision and natural language models, we demonstrate the automated generation of a full-form hazard impact assessment report from a single photograph. Our proposed data pipeline can be integrated with existing systems and used with geospatial data solutions, such as flood hazard maps.
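As a toy illustration of the automated form-filling idea, the sketch below maps hypothetical vision-model outputs (a caption and a set of object detections) onto a minimal report schema. The keyword table, field names, and model outputs are illustrative assumptions, not the paper's pipeline.

```python
# Illustrative sketch only: the keyword table, report schema, and model
# outputs are hypothetical stand-ins, not the paper's actual pipeline.

def fill_hazard_report(detections, caption, timestamp, gps):
    """Populate report fields from vision-model outputs and photo metadata."""
    HAZARD_KEYWORDS = {          # crude caption -> hazard-type mapping
        "flood": "flash flood",
        "hurricane": "hurricane",
        "collapse": "earthquake",
    }
    hazard = "unknown"
    for keyword, label in HAZARD_KEYWORDS.items():
        if keyword in caption.lower():
            hazard = label
            break
    return {
        "hazard_type": hazard,
        "description": caption,
        "objects_in_scene": sorted(detections),
        "reported_at": timestamp,
        "location": gps,
    }

report = fill_hazard_report(
    detections={"car", "water", "person"},
    caption="A street flooded with water, cars partially submerged",
    timestamp="2024-06-01T14:32:00Z",
    gps=(29.76, -95.37),
)
```

In a real deployment, the caption and detections would come from pretrained captioning and detection models, and the schema from the agency's existing report form.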
-
Unlike traditional object stores, Augmented Reality (AR) query workloads possess several unique characteristics, such as spatial and visual information. Such workloads are often keyed on a variety of attributes simultaneously, such as device orientation and position, the scene in view, and spatial anchors. The natural mode of user interaction in these devices triggers queries implicitly based on the field in the user's view at any instant, generating data queries in excess of the device frame rate. Ensuring a smooth user experience in such a scenario requires a systemic solution exploiting the unique characteristics of AR workloads. For exploration in such contexts, we are presented with a view-maintenance or cache-prefetching problem: how do we download the smallest subset from the server to the mixed reality device such that latency and device space constraints are met? We present a novel data platform, DreamStore, that treats AR queries as first-class queries and builds view-maintenance and large-scale analytics infrastructure around this design choice. Through performance experiments on large-scale and query-intensive AR workloads on DreamStore, we show the advantages and the capabilities of our proposed platform.
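The cache-prefetching question posed above (the smallest subset to download under latency and device-space constraints) can be approximated with a greedy knapsack-style heuristic. The scoring model and record fields below are illustrative assumptions, not DreamStore's actual policy.

```python
def select_prefetch(candidates, budget_bytes):
    """Greedily pick objects with the highest predicted-visibility score
    per byte until the device cache budget is exhausted (a knapsack-style
    heuristic, not DreamStore's actual policy)."""
    ranked = sorted(candidates, key=lambda o: o["score"] / o["size"],
                    reverse=True)
    chosen, used = [], 0
    for obj in ranked:
        if used + obj["size"] <= budget_bytes:
            chosen.append(obj["id"])
            used += obj["size"]
    return chosen

candidates = [
    {"id": "anchor_a", "score": 0.9, "size": 100},  # likely in view soon
    {"id": "anchor_b", "score": 0.5, "size": 10},   # cheap to cache
    {"id": "anchor_c", "score": 0.4, "size": 500},  # large, unlikely
]
chosen = select_prefetch(candidates, budget_bytes=150)
```

In practice the score would be predicted from the device's pose and motion rather than supplied directly.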
-
Along with textual content, visual features play an essential role in the semantics of visually rich documents. Information extraction (IE) tasks perform poorly on these documents if these visual cues are not taken into account. In this paper, we present Artemis - a visually aware, machine-learning-based IE method for heterogeneous visually rich documents. Artemis represents a visual span in a document by jointly encoding its visual and textual context for IE tasks. Our main contribution is two-fold. First, we develop a deep-learning model that identifies the local context boundary of a visual span with minimal human-labeling. Second, we describe a deep neural network that encodes the multimodal context of a visual span into a fixed-length vector by taking its textual and layout-specific features into account. It identifies the visual span(s) containing a named entity by leveraging this learned representation followed by an inference task. We evaluate Artemis on four heterogeneous datasets from different domains over a suite of information extraction tasks. Results show that it outperforms state-of-the-art text-based methods by up to 17 points in F1-score.
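A minimal sketch of the fixed-length multimodal encoding idea: a hashed bag-of-words over the span's text concatenated with normalized layout features. The feature choices and dimensions are assumptions for illustration; Artemis itself learns this representation with a deep network.

```python
import hashlib

def embed_visual_span(text, bbox, page_size, dim=8):
    """Encode a visual span as a fixed-length vector: a hashed bag-of-words
    over its text plus normalized layout (position and size) features.
    A hand-built stand-in for Artemis's learned encoder."""
    text_vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        text_vec[bucket] += 1.0
    x0, y0, x1, y1 = bbox
    width, height = page_size
    layout = [x0 / width, y0 / height,
              (x1 - x0) / width, (y1 - y0) / height]
    return text_vec + layout

# A span on a US-letter page (612 x 792 pt), e.g. an invoice total field.
vec = embed_visual_span("Total Due: $42.00",
                        bbox=(400, 700, 560, 720),
                        page_size=(612, 792))
```

The fixed length (here 8 text buckets + 4 layout features) is what lets a downstream classifier consume spans of arbitrary text length.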
-
Classifying heterogeneous visually rich documents is a challenging task. The difficulty increases even more if the maximum allowed inference turnaround time is constrained by a threshold. The increased overhead in inference cost, compared to the limited gain in classification capabilities, makes current multi-scale approaches infeasible in such scenarios. There are two major contributions of this work. First, we propose a spatial pyramid model to extract highly discriminative multi-scale feature descriptors from a visually rich document by leveraging the inherent hierarchy of its layout. Second, we propose a deterministic routing scheme for accelerating end-to-end inference by utilizing the spatial pyramid model. A depth-wise separable multi-column convolutional network is developed to enable our method. We evaluated the proposed approach on four publicly available, benchmark datasets of visually rich documents. Results suggest that our proposed approach demonstrates robust performance compared to the state-of-the-art methods in both classification accuracy and total inference turnaround time.
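The deterministic routing idea (accept the answer at a coarse pyramid level when the classifier is already confident, descend to finer levels otherwise) might be sketched as follows. The stub classifiers and the confidence threshold are hypothetical; in the paper each level is a column of the convolutional network.

```python
def route_and_classify(features_by_level, classifiers, threshold=0.9):
    """Deterministic early-exit routing over pyramid levels, coarse to
    fine: return the first level's answer whose confidence clears the
    threshold, otherwise fall through to the finest level's answer."""
    label, conf = None, 0.0
    for feats, clf in zip(features_by_level, classifiers):
        label, conf = clf(feats)
        if conf >= threshold:
            break
    return label, conf

# Stub classifiers standing in for the per-level convolutional columns.
coarse = lambda feats: ("invoice", 0.55)  # unsure at the coarse level
fine = lambda feats: ("letter", 0.96)     # confident at the finer level
label, conf = route_and_classify([[0.1], [0.2]], [coarse, fine])
```

Because easy documents exit at the coarse level, average inference cost drops while hard documents still reach the full-resolution classifier.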
-
Physical and digital documents often contain visually rich information. With such information, there is no strict ordering or positioning in the document where the data values must appear. Along with textual cues, these documents often also rely on salient visual features to define distinct semantic boundaries and augment the information they disseminate. When performing information extraction (IE), traditional techniques fall short, as they use a text-only representation and do not consider the visual cues inherent to the layout of these documents. We propose VS2, a generalized approach for information extraction from heterogeneous visually rich documents. There are two major contributions of this work. First, we propose a robust segmentation algorithm that decomposes a visually rich document into a bag of visually isolated but semantically coherent areas, called logical blocks. Document type agnostic low-level visual and semantic features are used in this process. Our second contribution is a distantly supervised search-and-select method for identifying the named entities within these documents by utilizing the context boundaries defined by these logical blocks. Experimental results on three heterogeneous datasets suggest that the proposed approach significantly outperforms its text-only counterparts on all datasets. Comparing it against the state-of-the-art methods also reveals that VS2 performs comparably or better on all datasets.
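A crude stand-in for the segmentation step: group text lines into logical blocks wherever the vertical whitespace gap exceeds a threshold. VS2's actual algorithm uses document-type-agnostic low-level visual and semantic features; the single-feature gap heuristic here is only illustrative.

```python
def segment_blocks(lines, gap_threshold=20):
    """Group text lines into logical blocks whenever the vertical gap
    between consecutive lines exceeds a threshold (a crude stand-in for
    VS2's visual-and-semantic segmentation)."""
    lines = sorted(lines, key=lambda line: line["y"])
    blocks, current = [], [lines[0]]
    for prev, line in zip(lines, lines[1:]):
        if line["y"] - prev["y"] > gap_threshold:
            blocks.append(current)
            current = []
        current.append(line)
    blocks.append(current)
    return blocks

# Hypothetical OCR output: y is the line's vertical position in points.
lines = [
    {"y": 0,  "text": "Invoice"},
    {"y": 15, "text": "No. 17"},
    {"y": 60, "text": "Total"},
    {"y": 75, "text": "$42.00"},
]
blocks = segment_blocks(lines)
```

Each resulting block would then serve as the context boundary within which the search-and-select step looks for named entities.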
-
Sampling is often used to reduce query latency for interactive big data analytics. The established parallel data processing paradigm relies on function shipping, where a coordinator dispatches queries to worker nodes and then collects the results. The commoditization of high-performance networking makes data shipping possible, where the coordinator directly reads data in the workers’ memory using RDMA while workers process other queries. In this work, we explore when to use function shipping or data shipping for interactive query processing with sampling. Whether function shipping or data shipping should be preferred depends on the amount of data transferred, the current CPU utilization, and the sampling method. The results show that data shipping is up to 6.5× faster when performing clustered sampling with heavily-utilized workers.
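A back-of-the-envelope version of the shipping decision might look like the following. The latency model (CPU-utilization inflation for function shipping, transfer time plus local processing for data shipping) is an illustrative assumption, not the paper's cost model.

```python
def choose_shipping(sample_bytes, bandwidth_bps, worker_cpu_util,
                    process_time_s):
    """Pick function shipping or data shipping for one sampled query.
    Function shipping runs on the worker, so its latency inflates as the
    worker's CPU saturates; data shipping pays an RDMA-style transfer and
    processes locally. Toy model, not the paper's actual cost model."""
    function_latency = process_time_s / max(1e-6, 1.0 - worker_cpu_util)
    data_latency = sample_bytes / bandwidth_bps + process_time_s
    return "data" if data_latency < function_latency else "function"

# 1 MB sample over a ~100 Gb/s link (12.5e9 bytes/s), busy vs idle worker.
busy = choose_shipping(1_000_000, 12_500_000_000, worker_cpu_util=0.9,
                       process_time_s=0.01)
idle = choose_shipping(1_000_000, 12_500_000_000, worker_cpu_util=0.0,
                       process_time_s=0.01)
```

Consistent with the abstract's finding, the toy model prefers data shipping once workers are heavily utilized and function shipping when they are idle.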
-
Data is becoming increasingly personal. Individuals regularly interact with a wide variety of structured data, from SQLite databases on phones, to HR spreadsheets, to personal sensors, to open government data appearing in news articles. Although these workloads are important, many of the classical challenges associated with scale and Big Data do not apply. This panel brings together experts in a variety of fields to explore the new opportunities and challenges presented by "Small Data".