

Title: Tapis v3 Streams API: Time‐series and data‐driven event support in science gateway infrastructure
Summary

The explosion of IoT devices and sensors in recent years has led to a demand for efficiently storing, processing, and analyzing time‐series data. Geoscience researchers use time‐series data stores such as Hydroserver, the Virtual Observatory and Ecological Informatics System (VOEIS), and the Cloud‐Hosted Real‐time Data Service (CHORDS). Many of these tools require a great deal of infrastructure to deploy and expertise to manage and scale. The Tapis framework, an NSF‐funded project, provides science‐as‐a‐service APIs that allow researchers to achieve faster scientific results by eliminating the need to set up a complex infrastructure stack. The University of Hawai'i (UH) and the Texas Advanced Computing Center (TACC) have collaborated to develop an open source Tapis Streams API that builds on the concepts of the CHORDS time‐series data service to support research. This new hosted service allows storing, processing, annotating, archiving, and querying time‐series data in the Tapis multi‐user and multi‐tenant collaborative platform. The Streams API provides a hosted, production‐level middleware service that enables new data‐driven event workflow capabilities, which researchers and Tapis‐powered science gateways may leverage for handling spatially indexed time‐series datasets.

 
NSF-PAR ID:
10449300
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Concurrency and Computation: Practice and Experience
Volume:
33
Issue:
19
ISSN:
1532-0626
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The explosion of IoT devices and sensors in recent years has led to a demand for efficiently storing, processing, and analyzing time-series data. Geoscience researchers use time-series data stores such as Hydroserver, VOEIS, and CHORDS. Many of these tools require a great deal of infrastructure to deploy and expertise to manage and scale. The Tapis (formerly known as Agave) platform as a service relieves researchers of responsibility for the infrastructure so that they can focus on the science. The University of Hawaii (UH) and the Texas Advanced Computing Center (TACC) have collaborated to develop a new API integration that combines Tapis with the CHORDS time-series data service to support projects at both institutions for storing, annotating, and querying time-series data. This new Streams API leverages the strengths of both the Tapis platform and the CHORDS service to enable capabilities for supporting time-series data streams that are not available in either tool alone. These new capabilities may be leveraged by Tapis-powered science gateways that need to handle spatially indexed time-series datasets for their researchers, as they have been at UH and TACC. 
  2. The Tapis Streams API is a production-grade service that provides REST APIs for storing, processing, and analyzing real-time streaming data. This paper focuses on improvements made to the Tapis 1.0 Streams API to bring it up to date and make it easily accessible. The newer version, the Tapis 1.2 Streams API, adopts the latest version of InfluxDB, InfluxDB 2.X, which has built-in security features and supports next-generation data analytics and processing with the data processing language Flux. This paper also discusses the measures implemented in the Tapis 1.2 Streams API to mitigate the security risks of unauthorized access to a data stream by users who do not own it. Additionally, new data Channel Actions supporting third-party notifications and webhooks have been released. Lastly, the paper discusses Tapis UI, a self-contained serverless application for accessing Tapis services via REST calls. Tapis UI is a lightweight browser-only client application that allows interactive access to Streams resources and real-time streaming data. 
  3. In the last decade, the use of hosted Software-as-a-Service (SaaS) application programming interfaces (APIs) across both academia and industry has exploded, and simultaneously, microservice architectures have replaced monolithic application platforms for the flexibility and maintainability they offer. These SaaS APIs rely on small, independent, and reusable microservices that can be assembled relatively easily into more complex applications. As a result, developers can focus on their own unique functionality and surround it with fully functional, distributed processes developed by other specialists, which they access through APIs. The Tapis framework, an NSF-funded project, provides SaaS APIs that allow researchers to achieve faster scientific results by eliminating the need to set up a complex infrastructure stack. In this paper, we describe the best practices followed to create Tapis APIs using Python, with the Streams API as an example implementation illustrating authorization and authentication with the Tapis Security Kernel, Tenants, and Tokens APIs, and leveraging the OpenAPI v3 specification for the API definitions and Docker containerization. Finally, we discuss our deployment strategy with Kubernetes, an emerging orchestration technology, and the early-adopter use cases of the Streams API service. 
  4. On August 9-10, 2023, the Thomas J. O’Keefe Institute for Sustainable Supply of Strategic Minerals at Missouri University of Science and Technology (Missouri S&T) hosted the third annual workshop on ‘Resilient Supply of Critical Minerals’. The workshop was funded by the National Science Foundation (NSF) and was attended by 218 participants. 128 participants attended in-person in the Havener Center on the Missouri S&T campus in Rolla, Missouri, USA. Another 90 participants attended online via Zoom. Fourteen participants (including nine students) received travel support through the NSF grant to attend the conference in Rolla. Additionally, the online participation fee was waived for another six students and early career researchers to attend the workshop virtually. Out of the 218 participants, 190 stated their sectors of employment during registration showing that 87 participants were from academia (32 students), 62 from the private sector and 41 from government agencies. Four topical sessions were covered: A. The Critical Mineral Potential of the USA: Evaluation of existing, and exploration for new resources. B. Mineral Processing and Recycling: Maximizing critical mineral recovery from existing production streams. C. Critical Mineral Policies: Toward effective and responsible governance. D. Resource Sustainability: Ethical and environmentally sustainable supply of critical minerals. Each topical session was composed of two keynote lectures and complemented by oral and poster presentations by the workshop participants. Additionally, a panel discussion with panelists from academia, the private sector and government agencies was held that discussed ‘How to grow the American critical minerals workforce’. 
The 2023 workshop was followed by a post-workshop field trip to the lead-zinc mining operations of the Doe Run Company in southeast Missouri that was attended by 18 workshop participants from academia (n=10; including 4 students), the private sector (n=4), and government institutions (n=4). Discussions during the workshop led to the following suggestions to increase the domestic supply of critical minerals: (i) Research to better understand the geologic critical mineral potential of the USA, including primary reserves/resources, historic mine wastes, and mineral exploration potential. (ii) Development of novel extraction techniques targeted at the recovery of critical minerals as co-products from existing production streams, mine waste materials, and recyclables. (iii) Faster and more transparent permitting processes for mining and mineral processing operations. (iv) A more environmentally sustainable and ethical approach to mining and mineral processing. (v) Development of a highly skilled critical minerals workforce. This workshop report provides a detailed summary of the workshop discussions and describes a way forward for this workshop series for 2024 and beyond. 
  5. Artificial intelligence applications within the geosciences are becoming increasingly common, yet there are still many challenges involved in adapting established techniques to geoscience data sets. Applications in the realm of volcanic hazards assessment show great promise for addressing such challenges. Here, we describe a Jupyter Notebook we developed that ingests real-time Global Navigation Satellite System (GNSS) data streams from the EarthCube CHORDS (Cloud-Hosted Real-time Data Services for the geosciences) portal TZVOLCANO, applies unsupervised learning algorithms to perform automated data quality control (“noise reduction”), and explores autonomous detection of unusual volcanic activity using a neural network. The TZVOLCANO CHORDS portal streams real-time GNSS positioning data in 1 s intervals from the TZVOLCANO network, which monitors the active volcano Ol Doinyo Lengai in Tanzania, through UNAVCO’s real-time GNSS data services. UNAVCO’s real-time data services provide near-real-time positions processed by the Trimble Pivot system. The positioning data (latitude, longitude and height) are imported into the Jupyter Notebook presented in this paper in user-defined time spans. The positioning data are then collected in sets by the Jupyter Notebook and processed to extract a useful calculated variable in preparation for the machine learning algorithms; we choose the vector magnitude for further processing. Unsupervised K-means and Gaussian Mixture machine learning algorithms are then utilized to locate and remove data points (“filter”) that are likely caused by noise and unrelated to volcanic signals. We find that both the K-means and Gaussian Mixture machine learning algorithms perform well at identifying regions of high noise within tested GNSS data sets. The filtered data are then used to train an artificial intelligence neural network that predicts volcanic deformation. 
Our Jupyter Notebook has promise to be used for detecting potentially hazardous volcanic activity in the form of rapid vertical or horizontal displacement of the Earth’s surface. 