Title: Expanding the Scope of Artifact Evaluation at HPC Conferences: Experience of SC21
A scientific paper consists of a constellation of artifacts that extend beyond the document itself: software, hardware, evaluation data and documentation, raw survey results, mechanized proofs, models, test suites, benchmarks, and so on. In some cases, the quality of these artifacts is as important as that of the document itself. Based on the success of the Artifact Evaluation efforts at other systems conferences, the 2021 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC21) organized a comprehensive Artifact Description/Artifact Evaluation (AD/AE) review and competition as part of the SC21 Reproducibility Initiative. This paper summarizes the key findings of the AD/AE effort.
Award ID(s): 1846418
PAR ID: 10325827
Author(s) / Creator(s):
Date Published:
Journal Name: Proceedings of Practical Reproducible Evaluation in Computer Systems
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Bi-directional brain-computer interfaces (BCIs) require simultaneous stimulation and recording to achieve closed-loop operation. It is therefore important that the interface be able to distinguish between neural signals of interest and stimulation artifacts. Current bi-directional BCIs address this problem by temporally multiplexing stimulation and recording. This approach, however, is suboptimal in many BCI applications. Alternative artifact mitigation methods can be devised by investigating the mechanics of artifact propagation. To characterize stimulation artifact behaviors, we collected and analyzed electrocorticography (ECoG) data from eloquent cortex mapping. Ratcheting and phase-locking of stimulation artifacts were observed, as well as dipole-like properties. Artifacts as large as ±1,100 μV appeared as far as 15-37 mm away from the stimulating channel when stimulating at 10 mA. Analysis also showed that the majority of the artifact power was concentrated at the stimulation pulse train frequency (50 Hz) and its super-harmonics (100, 150, 200 Hz). Lower frequencies (0-32 Hz) experienced minimal artifact contamination. These findings could inform the design of future bi-directional ECoG-based BCIs. 
  2. This paper presents a novel system architecture to suppress in-band artifacts (IBAs) generated from out-of-band (OOB) interferers, including reciprocal mixing by the local oscillator's (LO) spurs and phase noise (PN), third-order intermodulation (IM3) artifacts, and harmonic down-conversion (HDC) artifacts. Theory and design procedure are explained, and measurement results from a prototype taped out in a 45nm RF SOI process are presented. The receiver was designed for the frequency range of 1.2-2.4GHz and achieved a noise figure (NF) of 3.1-6.2dB, blocker -1dB compression point (B1dB) of -10.3dBm, and OOB third-order input-referred intercept point (IIP3) of 9.3dBm on average, before artifact suppression. Measurements were performed on 16-quadrature amplitude modulated (16QAM) signals with modulated and unmodulated OOB interferers to show artifact suppression for various kinds of IBA. For each IBA, artifact suppression performance was assessed across frequency and interferer power. Interference tolerance improvement of up to 38dB was achieved. Additionally, reconstruction of the artifacts for the cases of spur and HDC was demonstrated, showing simultaneous recovery of two signals, providing a form of carrier aggregation.
  3. Obeid, Iyad; Selesnick, Ivan; Picone, Joseph (Ed.)
    The Neural Engineering Data Consortium has recently developed a new subset of its popular open source EEG corpus – TUH EEG (TUEG) [1]. The TUEG Corpus is the world’s largest open source corpus of EEG data and currently has over 3,300 subscribers. There are several valuable subsets of this data, including the TUH Seizure Detection Corpus (TUSZ) [2], which was featured in the Neureka 2020 Epilepsy Challenge [3]. In this poster, we present a new subset of the TUEG Corpus – the TU Artifact Corpus. This corpus contains 310 EEG files in which every artifact has been annotated. This data can be used to evaluate artifact reduction technology. Since TUEG is comprised of actual clinical data, the set of artifacts appearing in the data is rich and challenging.

    EEG artifacts are defined as waveforms that are not of cerebral origin and may be affected by numerous external and/or physiological factors. These extraneous signals are often mistaken for seizures due to their morphological similarity in amplitude and frequency [4]. Artifacts often lead to raised false alarm rates in machine learning systems, which poses a major challenge for machine learning research. Most state-of-the-art systems use some form of artifact reduction technology to suppress these events.

    The corpus was annotated using a five-way classification that was developed to meet the needs of our constituents. Brief descriptions of each form of the artifact are provided in Ochal et al. [4]. The five basic tags are:

    • Chewing (CHEW): An artifact resulting from the tensing and relaxing of the jaw muscles. Chewing is a subset of the muscle artifact class. Chewing has the same characteristic high frequency sharp waves with 0.5 sec baseline periods between bursts. This artifact is generally diffuse throughout the different regions of the brain. However, it might have a higher level of activity in one hemisphere. Classification of a muscle artifact as chewing often depends on whether the accompanying patient report mentions any chewing, since other muscle artifacts can appear superficially similar to chewing artifact.
    • Electrode (ELEC): An electrode artifact encompasses various electrode related artifacts. Electrode pop is an artifact characterized by channels using the same electrode “spiking” with an electrographic phase reversal. Electrostatic is an artifact caused by movement or interference of electrodes and/or the presence of dissimilar metals. A lead artifact is caused by the movement of electrodes from the patient’s head and/or poor connection of electrodes. This results in disorganized and high amplitude slow waves.
    • Eye Movement (EYEM): A spike-like waveform created during patient eye movement. This artifact is usually found on all of the frontal polar electrodes with occasional echoing on the frontal electrodes.
    • Muscle (MUSC): A common artifact with high frequency, sharp waves corresponding to patient movement. These waveforms tend to have a frequency above 30 Hz with no specific pattern, often occurring because of agitation in the patient.
    • Shiver (SHIV): A specific and sustained sharp wave artifact that occurs when a patient shivers, usually seen on all or most channels. Shivering is a relatively rare subset of the muscle artifact class.

    Since these artifacts can overlap in time, a concatenated label format was implemented as a compromise between the limitations of our annotation tool and the complexity needed in an annotation data structure used to represent these overlapping events. We distribute an XML format that easily handles overlapping events. Our annotation tool [5], like most annotation tools of this type, is limited to displaying and manipulating a flat or linear annotation. Therefore, we encode overlapping events as a series of concatenated names using symbols such as:

    • EYEM+CHEW: eye movement and chewing
    • EYEM+SHIV: eye movement and shivering
    • CHEW+SHIV: chewing and shivering

    An example of an overlapping annotation is shown below in Figure 1.

    This release is an update of TUAR v1.0.0, which was a partially annotated database. In v1.0.0, a similar five-way system was used as well as an additional “null” tag. The “null” tag covers anything that was not annotated, including instances of artifact. Only a limited number of artifacts were annotated in v1.0.0. In this updated version, every instance of an artifact is annotated; ultimately, this provides the user with confidence that any part of the record that is not annotated with one of the five classes does not contain an artifact. No new files, patients, or sessions were added in v2.0.0. However, the data was reannotated with these standards. The total number of files remains the same, but the number of artifact events increases significantly. Complete statistics will be provided on the corpus once annotation is complete and the data is released. This is expected to occur in early July – just after the IEEE SPMB submission deadline.

    The TUAR Corpus is an open-source database that is currently available for use by any registered member of our consortium. To register and receive access, please follow the instructions provided at this web page: https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml. The data is located here: https://www.isip.piconepress.com/projects/tuh_eeg/downloads/tuh_eeg_artifact/v2.0.0/.
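The concatenated label convention in this abstract can be decoded mechanically. The following is a minimal sketch in Python, assuming labels are joined with `+` exactly as in the examples given; the names `TAGS`, `split_label`, and `describe` are hypothetical and are not part of any TUAR tooling.

```python
# Hypothetical helper: decode a concatenated artifact label.
# The five tag names come from the abstract; everything else is illustrative.
TAGS = {
    "CHEW": "chewing",
    "ELEC": "electrode",
    "EYEM": "eye movement",
    "MUSC": "muscle",
    "SHIV": "shivering",
}


def split_label(label: str) -> list[str]:
    """Split a label such as 'EYEM+CHEW' into its component tags."""
    parts = [p.strip().upper() for p in label.split("+")]
    unknown = [p for p in parts if p not in TAGS]
    if unknown:
        raise ValueError(f"unknown artifact tag(s): {unknown}")
    return parts


def describe(label: str) -> str:
    """Return a readable description of a single or concatenated label."""
    return " and ".join(TAGS[p] for p in split_label(label))


print(describe("EYEM+CHEW"))  # eye movement and chewing
```

Rejecting unknown tags rather than passing them through is a design choice for this sketch: it surfaces typos in annotation files early instead of silently creating a sixth class.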
  4. Obeid, Iyad Selesnick (Ed.)
    The Temple University Hospital EEG Corpus (TUEG) [1] is the largest publicly available EEG corpus of its type and currently has over 5,000 subscribers (we currently average 35 new subscribers a week). Several valuable subsets of this corpus have been developed including the Temple University Hospital EEG Seizure Corpus (TUSZ) [2] and the Temple University Hospital EEG Artifact Corpus (TUAR) [3]. TUSZ contains manually annotated seizure events and has been widely used to develop seizure detection and prediction technology [4]. TUAR contains manually annotated artifacts and has been used to improve machine learning performance on seizure detection tasks [5]. In this poster, we will discuss recent improvements made to both corpora that are creating opportunities to improve machine learning performance. Two major concerns that were raised when v1.5.2 of TUSZ was released for the Neureka 2020 Epilepsy Challenge were: (1) the subjects contained in the training, development (validation) and blind evaluation sets were not mutually exclusive, and (2) high frequency seizures were not accurately annotated in all files. Regarding (1), there were 50 subjects in dev, 50 subjects in eval, and 592 subjects in train. There was one subject common to dev and eval, five subjects common to dev and train, and 13 subjects common between eval and train. Though this does not substantially influence performance for the current generation of technology, it could be a problem down the line as technology improves. Therefore, we have rebuilt the partitions of the data so that this overlap was removed. This required augmenting the evaluation and development data sets with new subjects that had not been previously annotated so that the size of these subsets remained approximately the same. Since these annotations were done by a new group of annotators, special care was taken to make sure the new annotators followed the same practices as the previous generations of annotators. 
Part of our quality control process was to have the new annotators review all previous annotations. This rigorous training coupled with a strict quality control process where annotators review a significant amount of each other’s work ensured that there is high interrater agreement between the two groups (kappa statistic greater than 0.8) [6]. In the process of reviewing this data, we also decided to split long files into a series of smaller segments to facilitate processing of the data. Some subscribers found it difficult to process long files using Python code, which tends to be very memory intensive. We also found it inefficient to manipulate these long files in our annotation tool. In this release, the maximum duration of any single file is limited to 60 mins. This increased the number of edf files in the dev set from 1012 to 1832. Regarding (2), as part of discussions of several issues raised by a few subscribers, we discovered some files only had low frequency epileptiform events annotated (defined as events that ranged in frequency from 2.5 Hz to 3 Hz), while others had events annotated that contained significant frequency content above 3 Hz. Though there were not many files that had this type of activity, it was enough of a concern to necessitate reviewing the entire corpus. An example of an epileptiform seizure event with frequency content higher than 3 Hz is shown in Figure 1. Annotating these additional events slightly increased the number of seizure events. In v1.5.2, there were 673 seizures, while in v1.5.3 there are 1239 events. One of the fertile areas for technology improvements is artifact reduction. Artifacts and slowing constitute the two major error modalities in seizure detection [3]. This was a major reason we developed TUAR. It can be used to evaluate artifact detection and suppression technology as well as multimodal background models that explicitly model artifacts. 
An issue with TUAR was the practicality of the annotation tags used when there are multiple simultaneous events. An example of such an event is shown in Figure 2. In this section of the file, there is an overlap of eye movement, electrode artifact, and muscle artifact events. We previously annotated such events using a convention that included annotating background along with any artifact that is present. The artifacts present would either be annotated with a single tag (e.g., MUSC) or a coupled artifact tag (e.g., MUSC+ELEC). When multiple channels have background, the tags become crowded and difficult to identify. This is one reason we now support a hierarchical annotation format using XML – annotations can be arbitrarily complex and support overlaps in time. Our annotators also reviewed specific eye movement artifacts (e.g., eye flutter, eyeblinks). Eye movements are often mistaken for seizures due to their similar morphology [7][8]. We have improved our understanding of ocular events and it has allowed us to annotate artifacts in the corpus more carefully. In this poster, we will present statistics on the newest releases of these corpora and discuss the impact these improvements have had on machine learning research. We will compare TUSZ v1.5.3 and TUAR v2.0.0 with previous versions of these corpora. We will release v1.5.3 of TUSZ and v2.0.0 of TUAR in Fall 2021 prior to the symposium.

ACKNOWLEDGMENTS
Research reported in this publication was most recently supported by the National Science Foundation’s Industrial Innovation and Partnerships (IIP) Research Experience for Undergraduates award number 1827565. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the official views of any of these organizations.

REFERENCES
[1] I. Obeid and J. Picone, “The Temple University Hospital EEG Data Corpus,” in Augmentation of Brain Function: Facts, Fiction and Controversy. Volume I: Brain-Machine Interfaces, 1st ed., vol. 10, M. A. Lebedev, Ed. Lausanne, Switzerland: Frontiers Media S.A., 2016, pp. 394–398. https://doi.org/10.3389/fnins.2016.00196.
[2] V. Shah et al., “The Temple University Hospital Seizure Detection Corpus,” Frontiers in Neuroinformatics, vol. 12, pp. 1–6, 2018. https://doi.org/10.3389/fninf.2018.00083.
[3] A. Hamid et al., “The Temple University Artifact Corpus: An Annotated Corpus of EEG Artifacts,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1–3. https://ieeexplore.ieee.org/document/9353647.
[4] Y. Roy, R. Iskander, and J. Picone, “The Neureka™ 2020 Epilepsy Challenge,” NeuroTechX, 2020. [Online]. Available: https://neureka-challenge.com/. [Accessed: 01-Dec-2021].
[5] S. Rahman, A. Hamid, D. Ochal, I. Obeid, and J. Picone, “Improving the Quality of the TUSZ Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1–5. https://ieeexplore.ieee.org/document/9353635.
[6] V. Shah, E. von Weltin, T. Ahsan, I. Obeid, and J. Picone, “On the Use of Non-Experts for Generation of High-Quality Annotations of Seizure Events,” Available: https://www.isip.piconepress.com/publications/unpublished/journals/2019/elsevier_cn/ira. [Accessed: 01-Dec-2021].
[7] D. Ochal, S. Rahman, S. Ferrell, T. Elseify, I. Obeid, and J. Picone, “The Temple University Hospital EEG Corpus: Annotation Guidelines,” Philadelphia, Pennsylvania, USA, 2020. https://www.isip.piconepress.com/publications/reports/2020/tuh_eeg/annotations/.
[8] D. Strayhorn, “The Atlas of Adult Electroencephalography,” EEG Atlas Online, 2014. [Online]. Availabl
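The partition problem raised in this abstract (subjects shared between the training, development, and evaluation sets) is the kind of issue a short script can detect. The following is a minimal sketch in Python, assuming each partition is available as a set of subject identifiers; the function name `partition_overlaps` and the sample IDs are made up for illustration and do not come from TUSZ.

```python
from itertools import combinations


def partition_overlaps(
    partitions: dict[str, set[str]],
) -> dict[tuple[str, str], set[str]]:
    """Return the subjects shared by each pair of partitions.

    Pairs with no shared subjects are omitted, so an empty result
    means the partitions are mutually exclusive.
    """
    overlaps = {}
    for (name_a, subj_a), (name_b, subj_b) in combinations(partitions.items(), 2):
        shared = subj_a & subj_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# Made-up subject IDs, for illustration only.
partitions = {
    "train": {"s001", "s002", "s003"},
    "dev": {"s003", "s004"},
    "eval": {"s004", "s005"},
}

for pair, shared in partition_overlaps(partitions).items():
    print(pair, sorted(shared))
```

Running a check like this before each release would flag any subject that appears in more than one partition, which is the condition the rebuilt partitions are meant to rule out.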
  5. Abstract Objective. Invasive brain–computer interfaces (BCIs) have shown promise in restoring motor function to those paralyzed by neurological injuries. These systems also have the ability to restore sensation via cortical electrostimulation. Cortical stimulation produces strong artifacts that can obscure neural signals or saturate recording amplifiers. While front-end hardware techniques can alleviate this problem, residual artifacts generally persist and must be suppressed by back-end methods. Approach. We have developed a technique based on pre-whitening and null projection (PWNP) and tested its ability to suppress stimulation artifacts in electroencephalogram (EEG), electrocorticogram (ECoG) and microelectrode array (MEA) signals from five human subjects. Main results. In EEG signals contaminated by narrow-band stimulation artifacts, the PWNP method achieved average artifact suppression between 32 and 34 dB, as measured by an increase in signal-to-interference ratio. In ECoG and MEA signals contaminated by broadband stimulation artifacts, our method suppressed artifacts by 78%–80% and 85%, respectively, as measured by a reduction in interference index. When compared to independent component analysis, which is considered the state-of-the-art technique for artifact suppression, our method achieved superior results, while being significantly easier to implement. Significance. PWNP can potentially act as an efficient method of artifact suppression to enable simultaneous stimulation and recording in bi-directional BCIs to biomimetically restore motor function.