

Title: Detecting Smartwatch-Based Behavior Change in Response to a Multi-Domain Brain Health Intervention
In this study, we introduce and validate a computational method to detect lifestyle change that occurs in response to a multi-domain healthy brain aging intervention. To detect behavior change, digital behavior markers are extracted from smartwatch sensor data and a permutation-based change detection algorithm quantifies the change in marker-based behavior from a pre-intervention, 1-week baseline. To validate the method, we verify that changes are successfully detected from synthetic data with known pattern differences. Next, we employ this method to detect overall behavior change for n = 28 brain health intervention subjects and n = 17 age-matched control subjects. For these individuals, we observe a monotonic increase in behavior change from the baseline week with a slope of 0.7460 for the intervention group and a slope of 0.0230 for the control group. Finally, we utilize a random forest algorithm to perform leave-one-subject-out prediction of intervention versus control subjects based on digital marker delta values. The random forest predicts whether the subject is in the intervention or control group with an accuracy of 0.87. This work has implications for capturing objective, continuous data to inform our understanding of intervention adoption and impact.
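The abstract's core technique, permutation-based change detection against a one-week baseline, can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the marker values (daily step counts) and the function name are invented, and the test statistic here is a simple absolute difference in means.

```python
# Hedged sketch of a permutation-based change detector: the observed
# baseline-vs-window difference is ranked against differences obtained
# under random relabeling of the pooled samples.
import numpy as np

def permutation_change_score(baseline, window, n_permutations=1000, seed=0):
    """Return the permutation p-value for a shift in mean marker value."""
    rng = np.random.default_rng(seed)
    observed = abs(np.mean(window) - np.mean(baseline))
    pooled = np.concatenate([baseline, window])
    n_base = len(baseline)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling of the pooled values
        perm_diff = abs(np.mean(pooled[:n_base]) - np.mean(pooled[n_base:]))
        if perm_diff >= observed:
            count += 1
    return count / n_permutations

# Example: daily step counts, baseline week vs. a post-intervention week.
baseline = np.array([6000, 6200, 5900, 6100, 6050, 5950, 6150], dtype=float)
post = np.array([8200, 8400, 8100, 8350, 8300, 8250, 8450], dtype=float)
p = permutation_change_score(baseline, post)
```

A small p-value indicates the windowed behavior is unlikely under the baseline distribution, i.e., a detected change.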
Award ID(s): 1954372
NSF-PAR ID: 10418199
Journal Name: ACM Transactions on Computing for Healthcare
Volume: 3
Issue: 3
ISSN: 2691-1957
Page Range / eLocation ID: 1 to 18
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    This research explored kindergarten students' learning of simple particle models to explain the properties and behavior of matter in the solid, liquid, and gas states and during phase transitions (evaporation, melting, freezing, and condensation). Science instruction for young learners tends to focus on the concrete and directly observable. This focus on the here and now of experience can foreclose opportunities for young learners to explain their world in terms of the mechanisms posited by modern science. The current research examined how kindergarten children's models of matter develop as they engage with technology‐mediated, model‐based inquiry lessons. Data were collected from two intervention groups who engaged in a series of modeling lessons with the aid of digital tools for visualizing and explaining particle behavior across varied material phenomena. The two intervention groups engaged in a common set of modeling activities but differed in the use of one digital tool. Intervention students were interviewed in the week before the beginning of the intervention and the week following the completion of the intervention to assess the development of their models of matter. To provide a baseline for comparison, we assessed a third group of kindergarten children who did not receive any instruction on matter in the same time frame as the intervention students. Data were coded using cognitive science techniques of verbal protocol analysis. Repeated measures ANOVAs were conducted to explore changes across the pre‐ and post‐intervention interviews. We found that children from both intervention groups showed significant gains in the use of particle models to explain material phenomena, while the comparison group showed only small gains in the use of particle models.

     
  2. Obeid, Iyad; Picone, Joseph; Selesnick, Ivan (Ed.)
    The Neural Engineering Data Consortium (NEDC) is developing a large open source database of high-resolution digital pathology images known as the Temple University Digital Pathology Corpus (TUDP) [1]. Our long-term goal is to release one million images. We expect to release the first 100,000 image corpus by December 2020. The data is being acquired at the Department of Pathology at Temple University Hospital (TUH) using a Leica Biosystems Aperio AT2 scanner [2] and consists entirely of clinical pathology images. More information about the data and the project can be found in Shawki et al. [3]. We currently have a National Science Foundation (NSF) planning grant [4] to explore how best the community can leverage this resource. One goal of this poster presentation is to stimulate community-wide discussions about this project and determine how this valuable resource can best meet the needs of the public. The computing infrastructure required to support this database is extensive [5] and includes two HIPAA-secure computer networks, dual petabyte file servers, and Aperio’s eSlide Manager (eSM) software [6]. We currently have digitized over 50,000 slides from 2,846 patients and 2,942 clinical cases. There is an average of 12.4 slides per patient and 10.5 slides per case with one report per case. 
The data is organized by tissue type as shown below:

Filenames:
tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/0s15_12345_0a001_00123456_lvl0001_s000.svs
tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/0s15_12345_00123456.docx

Explanation:
- tudp: root directory of the corpus
- v1.0.0: version number of the release
- svs: the image data type
- gastro: the type of tissue
- 000001: six-digit sequence number used to control directory complexity
- 00123456: eight-digit patient MRN
- 2015_03_05: the date the specimen was captured
- 0s15_12345: the clinical case name
- 0s15_12345_0a001_00123456_lvl0001_s000.svs: the actual image filename, consisting of a repeat of the case name, a site code (e.g., 0a001), the type and depth of the cut (e.g., lvl0001) and a token number (e.g., s000)
- 0s15_12345_00123456.docx: the filename for the corresponding case report

We currently recognize fifteen tissue types in the first installment of the corpus. The raw image data is stored in Aperio’s “.svs” format, which is a multi-layered compressed JPEG format [3,7]. Pathology reports containing a summary of how a pathologist interpreted the slide are also provided in a flat text file format. A more complete summary of the demographics of this pilot corpus will be presented at the conference. Another goal of this poster presentation is to share our experiences with the larger community since many of these details have not been adequately documented in scientific publications. There are quite a few obstacles in collecting this data that have slowed down the process and need to be discussed publicly. Our backlog of slides dates back to 1997, meaning there are a lot that need to be sifted through and discarded for peeling or cracking. Additionally, during scanning a slide can get stuck, stalling a scan session for hours, resulting in a significant loss of productivity.
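The directory convention above is regular enough to parse mechanically. The sketch below splits a TUDP path into its documented components; the field order follows the "Explanation" given in the text, while the helper name and returned dictionary keys are my own.

```python
# Illustrative parser for the TUDP path convention described in the corpus
# documentation. Assumes the documented nine-level layout:
# .../tudp/<version>/<type>/<tissue>/<seq>/<mrn>/<date>/<case>/<filename>
def parse_tudp_path(path):
    """Split a TUDP .svs path into its documented components."""
    parts = path.split("/")
    root, version, dtype, tissue, seq, mrn, date, case, fname = parts[-9:]
    return {
        "root": root, "version": version, "data_type": dtype,
        "tissue": tissue, "sequence": seq, "patient_mrn": mrn,
        "capture_date": date, "case": case, "filename": fname,
    }

info = parse_tudp_path(
    "tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/"
    "0s15_12345/0s15_12345_0a001_00123456_lvl0001_s000.svs"
)
```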
Over the past two years, we have accumulated significant experience with how to scan a diverse inventory of slides using the Aperio AT2 high-volume scanner. We have been working closely with the vendor to resolve many problems associated with the use of this scanner for research purposes. This scanning project began in January of 2018 when the scanner was first installed. The scanning process was slow at first since there was a learning curve with how the scanner worked and how to obtain samples from the hospital. From its start date until May of 2019, ~20,000 slides were scanned. In the past six months, from May to November, we have tripled that number and now hold ~60,000 slides in our database. This dramatic increase in productivity was due to additional undergraduate staff members and an emphasis on efficient workflow. The Aperio AT2 scans 400 slides a day, requiring at least eight hours of scan time. The efficiency of these scans can vary greatly. When our team first started, approximately 5% of slides failed the scanning process due to focal point errors. We have been able to reduce that to 1% through a variety of means: (1) best practices regarding daily and monthly recalibrations, (2) tweaking the software, such as the tissue finder parameter settings, and (3) experience with how to clean and prep slides so they scan properly. Nevertheless, this is not a completely automated process, making it very difficult to reach our production targets. With a staff of three undergraduate workers spending a total of 30 hours per week, we find it difficult to scan more than 2,000 slides per week using a single scanner (400 slides per night x 5 nights per week). The main limitation in achieving this level of production is the lack of a completely automated scanning process; it takes a couple of hours to sort, clean and load slides. We have streamlined all other aspects of the workflow required to database the scanned slides so that there are no additional bottlenecks.
To bridge the gap between hospital operations and research, we are using Aperio’s eSM software. Our goal is to provide pathologists access to high quality digital images of their patients’ slides. eSM is a secure website that holds the images with their metadata labels, patient report, and path to where the image is located on our file server. Although eSM includes significant infrastructure to import slides into the database using barcodes, TUH does not currently support barcode use. Therefore, we manage the data using a mixture of Python scripts and manual import functions available in eSM. The database and associated tools are based on proprietary formats developed by Aperio, making this another important point of community-wide discussion on how best to disseminate such information. Our near-term goal for the TUDP Corpus is to release 100,000 slides by December 2020. We hope to continue data collection over the next decade until we reach one million slides. We are creating two pilot corpora using the first 50,000 slides we have collected. The first corpus consists of 500 slides with a marker stain and another 500 without it. This set was designed to let people debug their basic deep learning processing flow on these high-resolution images. We discuss our preliminary experiments on this corpus and the challenges in processing these high-resolution images using deep learning in [3]. We are able to achieve a mean sensitivity of 99.0% for slides with pen marks, and 98.9% for slides without marks, using a multistage deep learning algorithm. While this dataset was very useful in initial debugging, we are in the midst of creating a new, more challenging pilot corpus using actual tissue samples annotated by experts. The task will be to detect ductal carcinoma (DCIS) or invasive breast cancer tissue. There will be approximately 1,000 images per class in this corpus. 
Based on the number of features annotated, we can train on a two-class problem of DCIS or benign, or increase the difficulty by increasing the classes to include DCIS, benign, stroma, pink tissue, non-neoplastic, etc. Those interested in the corpus or in participating in community-wide discussions should join our listserv, nedc_tuh_dpath@googlegroups.com, to be kept informed of the latest developments in this project. You can learn more from our project website: https://www.isip.piconepress.com/projects/nsf_dpath.
  3. Obeid, Iyad; Selesnick, Ivan (Ed.)
    The Temple University Hospital EEG Corpus (TUEG) [1] is the largest publicly available EEG corpus of its type and currently has over 5,000 subscribers (we currently average 35 new subscribers a week). Several valuable subsets of this corpus have been developed including the Temple University Hospital EEG Seizure Corpus (TUSZ) [2] and the Temple University Hospital EEG Artifact Corpus (TUAR) [3]. TUSZ contains manually annotated seizure events and has been widely used to develop seizure detection and prediction technology [4]. TUAR contains manually annotated artifacts and has been used to improve machine learning performance on seizure detection tasks [5]. In this poster, we will discuss recent improvements made to both corpora that are creating opportunities to improve machine learning performance. Two major concerns that were raised when v1.5.2 of TUSZ was released for the Neureka 2020 Epilepsy Challenge were: (1) the subjects contained in the training, development (validation) and blind evaluation sets were not mutually exclusive, and (2) high frequency seizures were not accurately annotated in all files. Regarding (1), there were 50 subjects in dev, 50 subjects in eval, and 592 subjects in train. There was one subject common to dev and eval, five subjects common to dev and train, and 13 subjects common between eval and train. Though this does not substantially influence performance for the current generation of technology, it could be a problem down the line as technology improves. Therefore, we have rebuilt the partitions of the data so that this overlap was removed. This required augmenting the evaluation and development data sets with new subjects that had not been previously annotated so that the size of these subsets remained approximately the same. Since these annotations were done by a new group of annotators, special care was taken to make sure the new annotators followed the same practices as the previous generations of annotators. 
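The partition problem described above (subjects shared between train, dev, and eval) reduces to checking pairwise intersections of subject-ID sets. The sketch below is my own bookkeeping example, with invented subject IDs, not the NEDC tooling.

```python
# Hedged sketch of a split-overlap audit: given subject IDs per split,
# report any subject appearing in more than one split. Mutually exclusive
# splits should produce an empty result.
def cross_split_overlap(splits):
    """Return {(split_a, split_b): shared_subject_ids} for every split pair."""
    names = sorted(splits)
    overlaps = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = set(splits[a]) & set(splits[b])
            if shared:
                overlaps[(a, b)] = shared
    return overlaps

splits = {"train": {"s01", "s02", "s03"},
          "dev": {"s03", "s04"},
          "eval": {"s05"}}
bad = cross_split_overlap(splits)  # subject s03 leaks from train into dev
```

Rebuilding the partitions so this function returns an empty dictionary is exactly the repair the corpus maintainers describe.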
Part of our quality control process was to have the new annotators review all previous annotations. This rigorous training coupled with a strict quality control process where annotators review a significant amount of each other’s work ensured that there is high interrater agreement between the two groups (kappa statistic greater than 0.8) [6]. In the process of reviewing this data, we also decided to split long files into a series of smaller segments to facilitate processing of the data. Some subscribers found it difficult to process long files using Python code, which tends to be very memory intensive. We also found it inefficient to manipulate these long files in our annotation tool. In this release, the maximum duration of any single file is limited to 60 minutes. This increased the number of EDF files in the dev set from 1012 to 1832. Regarding (2), as part of discussions of several issues raised by a few subscribers, we discovered some files only had low frequency epileptiform events annotated (defined as events that ranged in frequency from 2.5 Hz to 3 Hz), while others had events annotated that contained significant frequency content above 3 Hz. Though there were not many files that had this type of activity, it was enough of a concern to necessitate reviewing the entire corpus. An example of an epileptiform seizure event with frequency content higher than 3 Hz is shown in Figure 1. Annotating these additional events slightly increased the number of seizure events. In v1.5.2, there were 673 seizures, while in v1.5.3 there are 1239 events. One of the fertile areas for technology improvements is artifact reduction. Artifacts and slowing constitute the two major error modalities in seizure detection [3]. This was a major reason we developed TUAR. It can be used to evaluate artifact detection and suppression technology as well as multimodal background models that explicitly model artifacts.
An issue with TUAR was the practicality of the annotation tags used when there are multiple simultaneous events. An example of such an event is shown in Figure 2. In this section of the file, there is an overlap of eye movement, electrode artifact, and muscle artifact events. We previously annotated such events using a convention that included annotating background along with any artifact that is present. The artifacts present would either be annotated with a single tag (e.g., MUSC) or a coupled artifact tag (e.g., MUSC+ELEC). When multiple channels have background, the tags become crowded and difficult to identify. This is one reason we now support a hierarchical annotation format using XML – annotations can be arbitrarily complex and support overlaps in time. Our annotators also reviewed specific eye movement artifacts (e.g., eye flutter, eyeblinks). Eye movements are often mistaken as seizures due to their similar morphology [7][8]. We have improved our understanding of ocular events and it has allowed us to annotate artifacts in the corpus more carefully. In this poster, we will present statistics on the newest releases of these corpora and discuss the impact these improvements have had on machine learning research. We will compare TUSZ v1.5.3 and TUAR v2.0.0 with previous versions of these corpora. We will release v1.5.3 of TUSZ and v2.0.0 of TUAR in Fall 2021 prior to the symposium. ACKNOWLEDGMENTS Research reported in this publication was most recently supported by the National Science Foundation’s Industrial Innovation and Partnerships (IIP) Research Experience for Undergraduates award number 1827565. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the official views of any of these organizations. REFERENCES [1] I. Obeid and J. Picone, “The Temple University Hospital EEG Data Corpus,” in Augmentation of Brain Function: Facts, Fiction and Controversy. 
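A hierarchical, time-stamped format sidesteps the crowded-tag problem because each event carries its own span, so simultaneous artifacts no longer compete for a single label. The element and attribute names below are invented for illustration; they are not the actual TUAR XML schema.

```python
# Illustrative sketch of an overlap-friendly annotation record, in the
# spirit of the hierarchical XML format mentioned above. Assumed schema:
# <annotations channel="..."><event label=... start=... stop=.../></annotations>
import xml.etree.ElementTree as ET

def build_annotation(channel, events):
    """events: list of (label, start_s, stop_s); overlaps in time are allowed."""
    root = ET.Element("annotations", channel=channel)
    for label, start, stop in events:
        ET.SubElement(root, "event", label=label,
                      start=str(start), stop=str(stop))
    return ET.tostring(root, encoding="unicode")

xml_text = build_annotation("FP1-F7", [
    ("musc", 10.0, 14.5),
    ("elec", 12.0, 13.0),   # overlaps the muscle artifact in time
])
```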
Volume I: Brain-Machine Interfaces, 1st ed., vol. 10, M. A. Lebedev, Ed. Lausanne, Switzerland: Frontiers Media S.A., 2016, pp. 394–398. https://doi.org/10.3389/fnins.2016.00196. [2] V. Shah et al., “The Temple University Hospital Seizure Detection Corpus,” Frontiers in Neuroinformatics, vol. 12, pp. 1–6, 2018. https://doi.org/10.3389/fninf.2018.00083. [3] A. Hamid et al., “The Temple University Artifact Corpus: An Annotated Corpus of EEG Artifacts,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1–3. https://ieeexplore.ieee.org/document/9353647. [4] Y. Roy, R. Iskander, and J. Picone, “The Neureka 2020 Epilepsy Challenge,” NeuroTechX, 2020. [Online]. Available: https://neureka-challenge.com/. [Accessed: 01-Dec-2021]. [5] S. Rahman, A. Hamid, D. Ochal, I. Obeid, and J. Picone, “Improving the Quality of the TUSZ Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1–5. https://ieeexplore.ieee.org/document/9353635. [6] V. Shah, E. von Weltin, T. Ahsan, I. Obeid, and J. Picone, “On the Use of Non-Experts for Generation of High-Quality Annotations of Seizure Events,” Available: https://www.isip.piconepress.com/publications/unpublished/journals/2019/elsevier_cn/ira. [Accessed: 01-Dec-2021]. [7] D. Ochal, S. Rahman, S. Ferrell, T. Elseify, I. Obeid, and J. Picone, “The Temple University Hospital EEG Corpus: Annotation Guidelines,” Philadelphia, Pennsylvania, USA, 2020. https://www.isip.piconepress.com/publications/reports/2020/tuh_eeg/annotations/. [8] D. Strayhorn, “The Atlas of Adult Electroencephalography,” EEG Atlas Online, 2014. [Online].
  4. Abstract

    Context. Large multi-site neuroimaging datasets have significantly advanced our quest to understand brain-behavior relationships and to develop biomarkers of psychiatric and neurodegenerative disorders. Yet, such data collections come at a cost, as the inevitable differences across samples may lead to biased or erroneous conclusions. Objective. We aim to validate the estimation of individual brain network dynamics fingerprints and appraise sources of variability in large resting-state functional magnetic resonance imaging (rs-fMRI) datasets by providing a novel point of view based on data-driven dynamical models. Approach. Previous work has investigated this critical issue in terms of effects on static measures, such as functional connectivity and brain parcellations. Here, we utilize dynamical models (hidden Markov models, HMMs) to examine how diverse scanning factors in multi-site fMRI recordings affect our ability to infer the brain’s spatiotemporal wandering between large-scale networks of activity. Specifically, we leverage a stable HMM trained on the Human Connectome Project (homogeneous) dataset, which we then apply to a heterogeneous dataset of traveling subjects scanned under a multitude of conditions. Main Results. Building upon this premise, we first replicate previous work on the emergence of non-random sequences of brain states. We next highlight how these time-varying brain activity patterns are robust subject-specific fingerprints. Finally, we suggest these fingerprints may be used to assess which scanning factors induce high variability in the data. Significance. These results demonstrate that we can (i) use large-scale datasets to train models that can then be used to interrogate subject-specific data, (ii) recover the unique trajectories of brain activity changes in each individual, but also (iii) urge caution, as our ability to infer such patterns is affected by how, where and when we do so.
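One common way to turn an inferred HMM state sequence into a subject fingerprint is fractional occupancy: the fraction of time a subject spends in each hidden state. The sketch below uses toy state sequences rather than fMRI-derived ones, and the comparison metric (Euclidean distance) is one plausible choice, not necessarily the paper's.

```python
# Hedged sketch of a fractional-occupancy fingerprint: summarize each
# scan's HMM state sequence as time-spent-per-state, then compare scans.
import numpy as np

def fractional_occupancy(state_seq, n_states):
    """Fraction of time points assigned to each hidden state."""
    counts = np.bincount(state_seq, minlength=n_states)
    return counts / len(state_seq)

# Two scans of the same toy "subject" share a state-usage profile;
# a third scan from a different subject does not.
subj_a_run1 = np.array([0, 0, 1, 0, 2, 0, 0, 1])
subj_a_run2 = np.array([0, 1, 0, 0, 0, 2, 0, 0])
subj_b_run1 = np.array([2, 2, 1, 2, 2, 1, 2, 1])

fo_a1 = fractional_occupancy(subj_a_run1, 3)
fo_a2 = fractional_occupancy(subj_a_run2, 3)
fo_b1 = fractional_occupancy(subj_b_run1, 3)

# For a usable fingerprint, same-subject runs should sit closer together
# than cross-subject pairs.
same = np.linalg.norm(fo_a1 - fo_a2)
cross = np.linalg.norm(fo_a1 - fo_b1)
```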

     
  5. Abstract

    Background

    The language of the science curriculum is complex, even in the early grades. To communicate their scientific observations, children must produce complex syntax, particularly complement clauses (e.g., "I think it will float"; "We noticed that it vibrates"). Complex syntax is often challenging for children with developmental language disorder (DLD), and thus their learning and communication of science may be compromised.

    Aims

    We asked whether recast therapy delivered in the context of a science curriculum led to gains in complement clause use and scientific content knowledge. To understand the efficacy of recast therapy, we compared changes in science and language knowledge in children who received treatment for complement clauses embedded in a first‐grade science curriculum to two active control conditions (vocabulary + science, phonological awareness + science).

    Methods & Procedures

    This 2‐year single‐site three‐arm parallel randomized controlled trial was conducted in Delaware, USA. Children with DLD, not yet in first grade and with low accuracy on complement clauses, were eligible. Thirty‐three 4–7‐year‐old children participated in the summers of 2018 and 2019 (2020 was cancelled due to COVID‐19). We assigned participants to arms using 1:1:1 pseudo‐random allocation (avoiding placing siblings together). The intervention consisted of 39 small‐group sessions of recast therapy, robust vocabulary instruction or phonological awareness intervention during eight science units over 4 weeks, followed by two science units (1 week) taught without language intervention. Pre‐/post‐measures were collected 3 weeks before and after camp by unmasked assessors.

    Outcomes & Results

    Primary outcome measures were accuracy on a 20-item probe of complement clause production and performance on ten 10-item unit tests (eight science + language, two science only). Complete data were available for 31 children (10 grammar, 21 active control); two others were lost to follow-up. Both groups made similar gains on science unit tests for science + language content (pre versus post, d = 2.9, p < 0.0001; group, p = 0.24). The grammar group performed significantly better at post-test than the active control group (d = 2.5, p = 0.049) on complement clause probes and marginally better on science-only unit tests (d = 2.5, p = 0.051).

    Conclusions & Implications

    Children with DLD can benefit from language intervention embedded in curricular content and learn both language and science targets taught simultaneously. Tentative findings suggest that treatment for grammar targets may improve academic outcomes.

    What this paper adds

    What is already known on the subject

    We know that recast therapy focused on morphology is effective but very time consuming. Treatment for complex syntax in young children has preliminary efficacy data available. Prior research provides mixed evidence as to children’s ability to learn language targets in conjunction with other information.

    What this study adds

    This study provides additional data supporting the efficacy of intensive complex syntax recast therapy for children ages 4–7 with Developmental Language Disorder. It also provides data that children can learn language targets and science curricular content simultaneously.

    What are the clinical implications of this work?

    As SLPs, we have to talk about something to deliver language therapy; we should consider talking about curricular content. Recast therapy focused on syntactic frames is effective with young children.

     