Title: A Serverless Electroencephalogram Data Retrieval and Preprocessing Framework
Electroencephalogram (EEG) research continues to rely heavily on data silos used in isolated physical lab environments. However, as a part of the digital transformation, the EEG community has begun its exploration of the public cloud to determine how it can be best utilized to increase collaboration and accelerate research outcomes. The growing number of online repositories for data and tools has provided additional computational resources but the process of downloading data and software along with the installation and configuration requirements is cumbersome and prone to error. To break away from this research paradigm, we present a novel application of cloud technologies to provide reusable EEG data acquisition and preprocessing software as a service (SaaS) that eliminates data and software downloading prerequisites. We utilize the Amazon Web Services (AWS) cloud platform and serverless technologies to create a distributed, highly scalable and extensible solution for EEG signal data preprocessing that is more conducive to effective collaboration and data reproducibility with the potential to expedite neurotechnology breakthroughs.
Award ID(s):
2219634
PAR ID:
10634659
Author(s) / Creator(s):
;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-3458-6
Page Range / eLocation ID:
221 to 226
Subject(s) / Keyword(s):
Cloud computing, Electric potential, Web services, Software as a service, Collaboration, Distributed databases, Electroencephalography
Format(s):
Medium: X
Location:
Bellevue, WA, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Electrical signatures characteristic of complex neurological activity and neuropsychiatric disease are embedded in electroencephalography (EEG) signal data. To firmly establish new correlations between these brain electrical pulses and cognition, behavior, and disorders, researchers must achieve adequate statistical power to validate and mitigate uncertainties in their findings. This necessitates the usage of extensive studies involving large volumes of raw EEG data files from multiple subjects, data which must be preprocessed before conducting further analysis. While conventional processing and analysis of these raw data have been performed using isolated physical lab environments and stovepiped applications, there is a growing necessity for processing and analysis solutions that enable distributed processing of large data collections. This study presents a novel microservices approach as an alternative and complementary solution for retrieving and preprocessing EEG signal data. The approach leverages serverless technologies to deliver a highly scalable solution for processing massive amounts of EEG data. Deployed within a public cloud environment, we assess the efficacy of this method when employing various container orchestration configurations. This work demonstrates the capability for substantial enhancements in processing speeds, particularly when dealing with extensive EEG datasets. 
  2. The healthcare industry has experienced a remarkable digital transformation through the adoption of IoT technologies, resulting in a significant increase in the volume and variety of medical data generated. Challenges in processing, analyzing, and sharing healthcare data persist. Traditional cloud computing approaches, while useful for processing healthcare data, have drawbacks, including delays in data transfer, data privacy concerns, and the risk of data unavailability. In this paper, we propose a software-defined 5G and AI-enabled distributed edge-cloud collaboration platform to classify healthcare data at the edge devices, facilitate real-time service delivery, and create AI/ML-based models for identifying patients' potential medical conditions. In our architecture, we have incorporated a federated learning scheme based on homomorphic encryption to provide privacy in data sharing and processing. The proposed framework ensures secure and efficient data communication and processing, ultimately fostering effective collaboration among healthcare institutions. The models will be validated by performing a comparative time analysis, and the interplay between edge and cloud computing will be investigated to support real-time healthcare applications.
  3. Obeid, I.; Selesnick, I. (Ed.)
    The Neural Engineering Data Consortium at Temple University has been providing key data resources to support the development of deep learning technology for electroencephalography (EEG) applications [1-4] since 2012. We currently have over 1,700 subscribers to our resources and provide data, software and documentation from our web site [5]. In this poster, we introduce additions to our resources that have been developed within the past year to facilitate software development and big data machine learning research. Major resources released in 2019 include:
    ● Data: The most current release of our open source EEG data is v1.2.0 of TUH EEG and includes the addition of 3,874 sessions and 1,960 patients from mid-2015 through 2016.
    ● Software: We have recently released a package, PyStream, that demonstrates how to correctly read an EDF file and access samples of the signal. This software demonstrates how to properly decode channels based on their labels and how to implement montages. Most existing open source packages to read EDF files do not directly address the problem of channel labels [6].
    ● Documentation: We have released two documents that describe our file formats and data representations: (1) electrodes and channels [6]: describes how to map channel labels to physical locations of the electrodes, and includes a description of every channel label appearing in the corpus; (2) annotation standards [7]: describes our annotation file format and how to decode the data structures used to represent the annotations.
    Additional significant updates to our resources include:
    ● NEDC TUH EEG Seizure (v1.6.0): This release includes the expansion of the training dataset from 4,597 files to 4,702. Calibration sequences have been manually annotated and added to our existing documentation. Numerous corrections were made to existing annotations based on user feedback.
    ● IBM TUSZ Pre-Processed Data (v1.0.0): A preprocessed version of the TUH Seizure Detection Corpus using two methods [8], both of which use an FFT sliding window approach (STFT). In the first method, FFT log magnitudes are used. In the second method, the FFT values are normalized across frequency buckets and correlation coefficients are calculated. The eigenvalues are calculated from this correlation matrix. The eigenvalues and the correlation matrix's upper triangle are used to generate features.
    ● NEDC TUH EEG Artifact Corpus (v1.0.0): This corpus was developed to support modeling of non-seizure signals for problems such as seizure detection. We have been using the data to build better background models. Five artifact events have been labeled: (1) eye movements (EYEM), (2) chewing (CHEW), (3) shivering (SHIV), (4) electrode pop, electrostatic artifacts, and lead artifacts (ELPP), and (5) muscle artifacts (MUSC). The data is cross-referenced to TUH EEG v1.1.0 so you can match patient numbers, sessions, etc.
    ● NEDC Eval EEG (v1.3.0): In this release of our standardized scoring software, the False Positive Rate (FPR) definition of the Time-Aligned Event Scoring (TAES) metric has been updated [9]. The standard definition is the number of false positives divided by the number of false positives plus the number of true negatives: #FP / (#FP + #TN).
    We also recently introduced the ability to download our data from an anonymous rsync server. The rsync command [10] effectively synchronizes both a remote directory and a local directory and copies the selected folder from the server to the desktop. It is available as part of most, if not all, Linux and Mac distributions (unfortunately, there is not an acceptable port of this command for Windows). To use the rsync command to download the content from our website, both a username and password are needed. An automated registration process on our website grants both. An example of a typical rsync command to access our data on our website is:
    rsync -auxv nedc_tuh_eeg@www.isip.piconepress.com:~/data/tuh_eeg/
    Rsync is a more robust option for downloading data. We have also experimented with Google Drive and Dropbox, but these types of technology are not suitable for such large amounts of data. All of the resources described in this poster are open source and freely available at https://www.isip.piconepress.com/projects/tuh_eeg/downloads/. We will demonstrate how to access and utilize these resources during the poster presentation and collect community feedback on the most needed additions to enable significant advances in machine learning performance.
  4. M, Murugappan (Ed.)
    Scalp Electroencephalography (EEG) is one of the most popular noninvasive modalities for studying real-time neural phenomena. While traditional EEG studies have focused on identifying group-level statistical effects, the rise of machine learning has prompted a shift in computational neuroscience towards spatio-temporal predictive analyses. We introduce a novel open-source viewer, the EEG Prediction Visualizer (EPViz), to aid researchers in developing, validating, and reporting their predictive modeling outputs. EPViz is a lightweight and standalone software package developed in Python. Beyond viewing and manipulating the EEG data, EPViz allows researchers to load a PyTorch deep learning model, apply it to EEG features, and overlay the output channel-wise or subject-level temporal predictions on top of the original time series. These results can be saved as high-resolution images for use in manuscripts and presentations. EPViz also provides valuable tools for clinician-scientists, including spectrum visualization, computation of basic data statistics, and annotation editing. Finally, we have included a built-in EDF anonymization module to facilitate sharing of clinical data. Taken together, EPViz fills a much needed gap in EEG visualization. Our user-friendly interface and rich collection of features may also help to promote collaboration between engineers and clinicians. 
  5. This Innovative Practice Work-In-Progress paper presents a collaborative virtual computer lab (CVCL) environment to support collaborative learning in cloud-based virtual computer labs. With advances in cloud computing and virtualization technologies, a new paradigm of virtual computer labs has emerged, where students carry out labs on virtualized resources remotely through the Internet. Virtual computer labs bring advantages, such as anywhere, anytime, on-demand access to specialized software and hardware. However, current implementations also make it difficult for students to collaborate, because students are assigned separate virtual working spaces in a remote-access environment and there is a lack of support for sharing and collaboration. To address this issue, we develop a CVCL environment that allows students to reserve virtual computer labs with multiple participants and supports remote real-time collaboration among the participants during a lab. The CVCL environment will implement several well-defined collaborative lab models, including shared remote collaboration, virtual study room, and virtual tutoring center. This paper describes the overall architecture and main features of the CVCL environment and shows preliminary results.
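
Note on item 3: the abstract quotes the standard false positive rate as #FP / (#FP + #TN). As a minimal sketch of that standard definition only, the arithmetic can be written as follows. The function name and the example counts are hypothetical, and this is not the updated TAES definition, which the abstract does not spell out.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Return #FP / (#FP + #TN), the standard false positive rate."""
    total_negatives = false_positives + true_negatives
    if total_negatives == 0:
        # With no negative examples the ratio has no defined value.
        raise ValueError("FPR is undefined when #FP + #TN is zero")
    return false_positives / total_negatives


if __name__ == "__main__":
    # Illustrative counts only; not taken from any NEDC evaluation.
    print(false_positive_rate(false_positives=5, true_negatives=95))
```

Raising an error for the empty case, rather than returning zero, is a design choice in this sketch: it keeps an evaluation with no negative examples from being read as a perfect score.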