Title: SRI-EEG: State-Based Recurrent Imputation for EEG Artifact Correction
Electroencephalogram (EEG) signals are often used as an input modality for Brain Computer Interfaces (BCIs). While EEG signals can be beneficial for numerous types of interaction scenarios in the real world, high levels of noise limit their usage to strictly noise-controlled environments such as a research laboratory. Even in a controlled environment, EEG is susceptible to noise, particularly from user motion, making it highly challenging to use EEG, and consequently BCI, as a ubiquitous user interaction modality. In this work, we address the EEG noise/artifact correction problem. Our goal is to detect physiological artifacts in the EEG signal and automatically replace them with imputed values, enabling robust EEG sensing with significantly less manual effort than usual. We present a novel EEG state-based imputation model built upon a recurrent neural network, which we call SRI-EEG, and evaluate the proposed method on three publicly available EEG datasets. From quantitative and qualitative comparisons with six conventional and neural network based approaches, we demonstrate that our method achieves comparable performance to the state-of-the-art methods on the EEG artifact correction task.
Award ID(s):
1845587
NSF-PAR ID:
10332222
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Frontiers in Computational Neuroscience
Volume:
16
ISSN:
1662-5188
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Obeid, Iyad ; Selesnick, Ivan ; Picone, Joseph (Ed.)
    The Neural Engineering Data Consortium has recently developed a new subset of its popular open source EEG corpus – TUH EEG (TUEG) [1]. The TUEG Corpus is the world’s largest open source corpus of EEG data and currently has over 3,300 subscribers. There are several valuable subsets of this data, including the TUH Seizure Detection Corpus (TUSZ) [2], which was featured in the Neureka 2020 Epilepsy Challenge [3]. In this poster, we present a new subset of the TUEG Corpus – the TU Artifact Corpus. This corpus contains 310 EEG files in which every artifact has been annotated. This data can be used to evaluate artifact reduction technology. Since TUEG is comprised of actual clinical data, the set of artifacts appearing in the data is rich and challenging. EEG artifacts are defined as waveforms that are not of cerebral origin and may be affected by numerous external and/or physiological factors. These extraneous signals are often mistaken for seizures due to their morphological similarity in amplitude and frequency [4]. Artifacts often lead to raised false alarm rates in machine learning systems, which poses a major challenge for machine learning research. Most state-of-the-art systems use some form of artifact reduction technology to suppress these events. The corpus was annotated using a five-way classification that was developed to meet the needs of our constituents. Brief descriptions of each form of the artifact are provided in Ochal et al. [4]. The five basic tags are:
    • Chewing (CHEW): An artifact resulting from the tensing and relaxing of the jaw muscles. Chewing is a subset of the muscle artifact class. Chewing has the same characteristic high frequency sharp waves with 0.5 sec baseline periods between bursts. This artifact is generally diffuse throughout the different regions of the brain. However, it might have a higher level of activity in one hemisphere. Classification of a muscle artifact as chewing often depends on whether the accompanying patient report mentions any chewing, since other muscle artifacts can appear superficially similar to chewing artifact.
    • Electrode (ELEC): An electrode artifact encompasses various electrode related artifacts. Electrode pop is an artifact characterized by channels using the same electrode “spiking” with an electrographic phase reversal. Electrostatic is an artifact caused by movement or interference of electrodes and/or the presence of dissimilar metals. A lead artifact is caused by the movement of electrodes from the patient’s head and/or poor connection of electrodes. This results in disorganized and high amplitude slow waves.
    • Eye Movement (EYEM): A spike-like waveform created during patient eye movement. This artifact is usually found on all of the frontal polar electrodes with occasional echoing on the frontal electrodes.
    • Muscle (MUSC): A common artifact with high frequency, sharp waves corresponding to patient movement. These waveforms tend to have a frequency above 30 Hz with no specific pattern, often occurring because of agitation in the patient.
    • Shiver (SHIV): A specific and sustained sharp wave artifact that occurs when a patient shivers, usually seen on all or most channels. Shivering is a relatively rare subset of the muscle artifact class.
    Since these artifacts can overlap in time, a concatenated label format was implemented as a compromise between the limitations of our annotation tool and the complexity needed in an annotation data structure used to represent these overlapping events. We distribute an XML format that easily handles overlapping events. Our annotation tool [5], like most annotation tools of this type, is limited to displaying and manipulating a flat or linear annotation. Therefore, we encode overlapping events as a series of concatenated names using symbols such as:
    • EYEM+CHEW: eye movement and chewing
    • EYEM+SHIV: eye movement and shivering
    • CHEW+SHIV: chewing and shivering
    An example of an overlapping annotation is shown below in Figure 1. This release is an update of TUAR v1.0.0, which was a partially annotated database. In v1.0.0, a similar five-way system was used as well as an additional “null” tag. The “null” tag covers anything that was not annotated, including instances of artifact. Only a limited number of artifacts were annotated in v1.0.0. In this updated version, every instance of an artifact is annotated; ultimately, this provides the user with confidence that any part of the record that is not annotated with one of the five classes does not contain an artifact. No new files, patients, or sessions were added in v2.0.0. However, the data was reannotated with these standards. The total number of files remains the same, but the number of artifact events increases significantly. Complete statistics will be provided on the corpus once annotation is complete and the data is released. This is expected to occur in early July – just after the IEEE SPMB submission deadline. The TUAR Corpus is an open-source database that is currently available for use by any registered member of our consortium. To register and receive access, please follow the instructions provided at this web page: https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml. The data is located here: https://www.isip.piconepress.com/projects/tuh_eeg/downloads/tuh_eeg_artifact/v2.0.0/.
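The concatenated label convention described in this abstract (for example EYEM+CHEW) can be decoded with a few lines of code. The following is a minimal sketch, assuming labels are plain strings joined by "+" and drawn from the five v2.0.0 tags. The function names `split_label` and `count_tags` are hypothetical and are not part of any NEDC tool, and the corpus's actual annotation file layout is not modeled here.

```python
from collections import Counter
from typing import FrozenSet, Iterable

# The five basic tags listed in the abstract (v2.0.0 naming).
BASIC_TAGS = frozenset({"CHEW", "ELEC", "EYEM", "MUSC", "SHIV"})


def split_label(label: str) -> FrozenSet[str]:
    """Split a concatenated label such as 'EYEM+CHEW' into its basic tags.

    Hypothetical helper: raises ValueError for any part that is not one
    of the five basic tags.
    """
    parts = [part.strip().upper() for part in label.split("+")]
    unknown = [part for part in parts if part not in BASIC_TAGS]
    if unknown:
        raise ValueError(f"unknown artifact tag(s) in {label!r}: {unknown}")
    return frozenset(parts)


def count_tags(labels: Iterable[str]) -> Counter:
    """Count how often each basic tag occurs, crediting every tag in an overlap."""
    counts: Counter = Counter()
    for label in labels:
        counts.update(split_label(label))
    return counts
```

Counting each tag inside an overlapping label separately is a design choice of this sketch; a scoring tool might instead treat each combination as its own class.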
  2. A brain-computer interface (BCI) actively translates brain signals into executable actions by establishing direct communication between the human brain and external devices. Brain activity recorded through electroencephalography (EEG) is generally contaminated with both physiological and nonphysiological artifacts, which significantly hinders BCI performance. Artifact subspace reconstruction (ASR) is a well-known statistical technique that automatically removes artifact components by determining the rejection threshold based on the initial reference EEG segment in multichannel EEG recordings. In real-world applications, the fixed threshold may limit the efficacy of the artifact correction, especially when the quality of the reference data is poor. This study proposes an adaptive online ASR technique that integrates Hebbian/anti-Hebbian neural networks into the ASR algorithm, namely, principal subspace projection ASR (PSP-ASR) and principal subspace whitening ASR (PSW-ASR), which segmentwise self-organize the artifact subspace by updating the synaptic weights according to the Hebbian and anti-Hebbian learning rules. The effectiveness of the proposed algorithm is compared to the conventional ASR approaches on a benchmark EEG dataset and three BCI frameworks, including steady-state visual evoked potential (SSVEP), rapid serial visual presentation (RSVP), and motor imagery (MI), by evaluating the root-mean-square error (RMSE), the signal-to-noise ratio (SNR), the Pearson correlation, and classification accuracy. The results demonstrated that the PSW-ASR algorithm effectively removed the EEG artifacts and retained the activity-specific brain signals compared to the PSP-ASR, standard ASR (Init-ASR), and moving-window ASR (MW-ASR) methods, thereby enhancing the SSVEP, RSVP, and MI BCI performances. 
Finally, our empirical results from the PSW-ASR algorithm suggested the choice of an aggressive cutoff range of c = 1-10 for activity-specific BCI applications and a moderate range of for the benchmark dataset and general BCI applications. 
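The three signal-level metrics named in this abstract (RMSE, SNR, and Pearson correlation) can be sketched as follows. This is an illustrative sketch only: it assumes a clean reference signal is available for comparison and uses one common SNR definition (reference power over residual power, in dB), which may differ from the definition used in the study. The function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Root-mean-square error between a clean reference and a corrected signal."""
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def snr_db(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """SNR in dB, assuming SNR = 10 * log10(reference power / residual power)."""
    signal_power = sum(r * r for r in reference)
    residual_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / residual_power)


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

A lower RMSE, a higher SNR, and a correlation closer to 1 all indicate that the corrected signal stays closer to the reference.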
  3. Obeid, I. ; Selesnick, I. (Ed.)
    The Neural Engineering Data Consortium at Temple University has been providing key data resources to support the development of deep learning technology for electroencephalography (EEG) applications [1-4] since 2012. We currently have over 1,700 subscribers to our resources and have been providing data, software and documentation from our web site [5] since 2012. In this poster, we introduce additions to our resources that have been developed within the past year to facilitate software development and big data machine learning research. Major resources released in 2019 include:
    ● Data: The most current release of our open source EEG data is v1.2.0 of TUH EEG and includes the addition of 3,874 sessions and 1,960 patients from mid-2015 through 2016.
    ● Software: We have recently released a package, PyStream, that demonstrates how to correctly read an EDF file and access samples of the signal. This software demonstrates how to properly decode channels based on their labels and how to implement montages. Most existing open source packages to read EDF files do not directly address the problem of channel labels [6].
    ● Documentation: We have released two documents that describe our file formats and data representations: (1) electrodes and channels [6]: describes how to map channel labels to physical locations of the electrodes, and includes a description of every channel label appearing in the corpus; (2) annotation standards [7]: describes our annotation file format and how to decode the data structures used to represent the annotations.
    Additional significant updates to our resources include:
    ● NEDC TUH EEG Seizure (v1.6.0): This release includes the expansion of the training dataset from 4,597 files to 4,702. Calibration sequences have been manually annotated and added to our existing documentation. Numerous corrections were made to existing annotations based on user feedback.
    ● IBM TUSZ Pre-Processed Data (v1.0.0): A preprocessed version of the TUH Seizure Detection Corpus using two methods [8], both of which use an FFT sliding window approach (STFT). In the first method, FFT log magnitudes are used. In the second method, the FFT values are normalized across frequency buckets and correlation coefficients are calculated. The eigenvalues are calculated from this correlation matrix. The eigenvalues and correlation matrix's upper triangle are used to generate features.
    ● NEDC TUH EEG Artifact Corpus (v1.0.0): This corpus was developed to support modeling of non-seizure signals for problems such as seizure detection. We have been using the data to build better background models. Five artifact events have been labeled: (1) eye movements (EYEM), (2) chewing (CHEW), (3) shivering (SHIV), (4) electrode pop, electrostatic artifacts, and lead artifacts (ELPP), and (5) muscle artifacts (MUSC). The data is cross-referenced to TUH EEG v1.1.0 so you can match patient numbers, sessions, etc.
    ● NEDC Eval EEG (v1.3.0): In this release of our standardized scoring software, the False Positive Rate (FPR) definition of the Time-Aligned Event Scoring (TAES) metric has been updated [9]. The standard definition is the number of false positives divided by the number of false positives plus the number of true negatives: #FP / (#FP + #TN).
    We also recently introduced the ability to download our data from an anonymous rsync server. The rsync command [10] effectively synchronizes both a remote directory and a local directory and copies the selected folder from the server to the desktop. It is available as part of most, if not all, Linux and Mac distributions (unfortunately, there is not an acceptable port of this command for Windows). To use the rsync command to download the content from our website, both a username and password are needed. An automated registration process on our website grants both. An example of a typical rsync command to access our data on our website is:
    rsync -auxv nedc_tuh_eeg@www.isip.piconepress.com:~/data/tuh_eeg/
    Rsync is a more robust option for downloading data. We have also experimented with Google Drive and Dropbox, but these types of technology are not suitable for such large amounts of data. All of the resources described in this poster are open source and freely available at https://www.isip.piconepress.com/projects/tuh_eeg/downloads/. We will demonstrate how to access and utilize these resources during the poster presentation and collect community feedback on the most needed additions to enable significant advances in machine learning performance.
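The two preprocessing methods summarized for the IBM TUSZ Pre-Processed Data can be illustrated with a short NumPy sketch. Everything below is an assumption made for illustration rather than the released pipeline: the window and hop lengths, the Hann window, z-scoring each channel's spectrum as the "normalization across frequency buckets", computing correlations between channels, and excluding the diagonal from the upper triangle. The function names are hypothetical.

```python
import numpy as np


def stft_magnitudes(x: np.ndarray, win: int = 256, hop: int = 128) -> np.ndarray:
    """Sliding-window FFT magnitudes.

    x has shape (channels, samples); the result has shape
    (frames, channels, win // 2 + 1). Window and hop sizes are assumed.
    """
    n_frames = 1 + (x.shape[1] - win) // hop
    window = np.hanning(win)
    frames = np.stack([x[:, i * hop:i * hop + win] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=-1))


def log_magnitude_features(mag: np.ndarray, eps: float = 1e-10) -> np.ndarray:
    """Method 1: log of the FFT magnitudes (eps avoids log(0))."""
    return np.log(mag + eps)


def correlation_features(mag: np.ndarray) -> np.ndarray:
    """Method 2: per frame, eigenvalues plus upper triangle of a correlation matrix."""
    features = []
    for frame in mag:  # frame has shape (channels, bins)
        # Assumed normalization: z-score each channel across its frequency bins.
        centered = frame - frame.mean(axis=1, keepdims=True)
        z = centered / (frame.std(axis=1, keepdims=True) + 1e-10)
        corr = np.corrcoef(z)            # (channels, channels), between-channel correlation
        eig = np.linalg.eigvalsh(corr)   # eigenvalues of the symmetric matrix, ascending
        upper = corr[np.triu_indices_from(corr, k=1)]  # strictly above the diagonal
        features.append(np.concatenate([eig, upper]))
    return np.array(features)
```

For C channels, each frame in this sketch yields C eigenvalues plus C(C-1)/2 correlation coefficients.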
  4. Objective: Accurate implementation of real-time non-invasive Brain-Machine / Computer Interfaces (BMI / BCI) requires handling physiological and non-physiological artifacts associated with the measurement modalities. For example, scalp electroencephalographic (EEG) measurements are often considered prone to excessive motion artifacts and other types of artifacts that contaminate the EEG recordings. Although the magnitude of such artifacts heavily depends on the task and the setup, complete minimization or isolation of such artifacts is generally not possible. Approach: We present an adaptive de-noising framework with robustness properties, using a Volterra based non-linear mapping to characterize and handle the motion artifact contamination in EEG measurements. We asked healthy able-bodied subjects to walk on a treadmill at gait speeds of 1-to-4 mph, while we tracked the motion of select EEG electrodes with an infrared video-based motion tracking system. We also placed Inertial Measurement Unit (IMU) sensors on the forehead and feet of the subjects for assessing the overall head movement and segmenting the gait. Main Results: We discuss in detail the characteristics of the motion artifacts and propose a real-time compatible solution to filter them. We report the effective handling of both the fundamental frequency of contamination (synchronized to the walking speed) and its harmonics. Event-Related Spectral Perturbation (ERSP) analysis for walking shows that the gait dependency of artifact contamination is also eliminated on all target frequencies. Significance: The real-time compatibility and generalizability of our adaptive filtering framework allows for the effective use of non-invasive BMI/BCI systems and greatly expands the implementation type and application domains to other types of problems where signal denoising is desirable. 
Combined with our previous efforts of filtering ocular artifacts, the presented technique allows for a comprehensive adaptive filtering framework to increase the EEG Signal to Noise Ratio (SNR). We believe the implementation will benefit all non-invasive neural measurement modalities, including studies discussing neural correlates of movement and other internal states, not necessarily of BMI focus. 
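To give a sense of what a Volterra based adaptive filter looks like, here is a generic second-order Volterra noise canceller with a normalized LMS update. It is a textbook-style sketch, not the framework described in this abstract: the memory length, step size, the use of a single motion reference channel (for instance from an IMU), and all function names are assumptions made for illustration.

```python
from typing import List, Sequence


def volterra_features(ref: Sequence[float], n: int, memory: int) -> List[float]:
    """Linear and second-order terms built from the last `memory` reference samples."""
    taps = [ref[n - k] if n - k >= 0 else 0.0 for k in range(memory)]
    quad = [taps[i] * taps[j] for i in range(memory) for j in range(i, memory)]
    return taps + quad


def volterra_nlms(contaminated: Sequence[float], ref: Sequence[float],
                  memory: int = 3, mu: float = 0.1, eps: float = 1e-6) -> List[float]:
    """Subtract an adaptively estimated artifact from a contaminated channel.

    `ref` is a motion reference assumed to be correlated with the artifact
    and uncorrelated with the underlying EEG. Returns the cleaned signal.
    """
    n_features = memory + memory * (memory + 1) // 2
    weights = [0.0] * n_features
    cleaned = []
    for n, sample in enumerate(contaminated):
        phi = volterra_features(ref, n, memory)
        artifact_estimate = sum(w * p for w, p in zip(weights, phi))
        error = sample - artifact_estimate  # the error doubles as the cleaned sample
        norm = eps + sum(p * p for p in phi)
        # Normalized LMS update of the Volterra kernel weights.
        weights = [w + mu * error * p / norm for w, p in zip(weights, phi)]
        cleaned.append(error)
    return cleaned
```

Because the update processes one sample at a time with a fixed number of weights, a filter of this kind is compatible with real-time use, which is the property the abstract emphasizes.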
  5. Abstract

    Objective. Invasive brain–computer interfaces (BCIs) have shown promise in restoring motor function to those paralyzed by neurological injuries. These systems also have the ability to restore sensation via cortical electrostimulation. Cortical stimulation produces strong artifacts that can obscure neural signals or saturate recording amplifiers. While front-end hardware techniques can alleviate this problem, residual artifacts generally persist and must be suppressed by back-end methods. Approach. We have developed a technique based on pre-whitening and null projection (PWNP) and tested its ability to suppress stimulation artifacts in electroencephalogram (EEG), electrocorticogram (ECoG) and microelectrode array (MEA) signals from five human subjects. Main results. In EEG signals contaminated by narrow-band stimulation artifacts, the PWNP method achieved average artifact suppression between 32 and 34 dB, as measured by an increase in signal-to-interference ratio. In ECoG and MEA signals contaminated by broadband stimulation artifacts, our method suppressed artifacts by 78%–80% and 85%, respectively, as measured by a reduction in interference index. When compared to independent component analysis, which is considered the state-of-the-art technique for artifact suppression, our method achieved superior results, while being significantly easier to implement. Significance. PWNP can potentially act as an efficient method of artifact suppression to enable simultaneous stimulation and recording in bi-directional BCIs to biomimetically restore motor function.

     