
Title: Efficient Label Gathering for Machine Training: Results from Muon Hunter 2
In 2017, the Muon Hunter project on the Zooniverse.org citizen science platform successfully gathered more than two million classification labels for nearly 140,000 camera images from VERITAS. The aim was to select and parameterize muon events for use in training convolutional neural networks. The success of this project proved that crowdsourcing labels for IACT image analysis is a viable avenue for further development of advanced machine-learning algorithms. These algorithms could potentially lend themselves to improving class separation between gamma-ray and hadronic event types. Nonetheless, it took two months to gather these labels from volunteers, which could be a bottleneck for future applications of this method. Here we present Muon Hunters 2.0: the follow-on project that demonstrates the development of unsupervised clustering techniques to gather muon labels more efficiently from volunteer classifiers.
Award ID(s):
1835530
NSF-PAR ID:
10208291
Journal Name:
International Cosmic Ray Conference 2019 Proceedings of Science
Page Range or eLocation-ID:
https://pos.sissa.it/358/678/pdf
Sponsoring Org:
National Science Foundation
More Like this
  1. Obeid, I. (Ed.)
    The Neural Engineering Data Consortium (NEDC) is developing the Temple University Digital Pathology Corpus (TUDP), an open source database of high-resolution images from scanned pathology samples [1], as part of its National Science Foundation-funded Major Research Instrumentation grant titled “MRI: High Performance Digital Pathology Using Big Data and Machine Learning” [2]. The long-term goal of this project is to release one million images. We have currently scanned over 100,000 images and are in the process of annotating breast tissue data for our first official corpus release, v1.0.0. This release contains 3,505 annotated images of breast tissue including 74 patients with cancerous diagnoses (out of a total of 296 patients). In this poster, we will present an analysis of this corpus and discuss the challenges we have faced in efficiently producing high quality annotations of breast tissue. It is well known that state of the art algorithms in machine learning require vast amounts of data. Fields such as speech recognition [3], image recognition [4] and text processing [5] are able to deliver impressive performance with complex deep learning models because they have developed large corpora to support training of extremely high-dimensional models (e.g., billions of parameters). Other fields that do not have access to such data resources must rely on techniques in which existing models can be adapted to new datasets [6]. A preliminary version of this breast corpus release was tested in a pilot study using a baseline machine learning system, ResNet18 [7], that leverages several open-source Python tools. The pilot corpus was divided into three sets: train, development, and evaluation. Portions of these slides were manually annotated [1] using the nine labels in Table 1 [8] to identify five to ten examples of pathological features on each slide.
Not every pathological feature is annotated, meaning excluded areas can include focuses particular to these labels that are not used for training. A summary of the number of patches within each label is given in Table 2. To maintain a balanced training set, 1,000 patches of each label were used to train the machine learning model. Throughout all sets, only annotated patches were involved in model development. The performance of this model in identifying all the patches in the evaluation set can be seen in the confusion matrix of classification accuracy in Table 3. The highest performing labels were background, 97% correct identification, and artifact, 76% correct identification. A correlation exists between labels with more than 6,000 development patches and accurate performance on the evaluation set. Additionally, these results indicated a need to further refine the annotation of invasive ductal carcinoma (“indc”), inflammation (“infl”), nonneoplastic features (“nneo”), normal (“norm”) and suspicious (“susp”). This pilot experiment motivated changes to the corpus that will be discussed in detail in this poster presentation. To increase the accuracy of the machine learning model, we modified how we addressed underperforming labels. One common source of error arose with how non-background labels were converted into patches. Large areas of background within other labels were isolated within a patch resulting in connective tissue misrepresenting a non-background label. In response, the annotation overlay margins were revised to exclude benign connective tissue in non-background labels. Corresponding patient reports and supporting immunohistochemical stains further guided annotation reviews. The microscopic diagnoses given by the primary pathologist in these reports detail the pathological findings within each tissue site, but not within each specific slide. 
The microscopic diagnoses informed revisions specifically targeting annotated regions classified as cancerous, ensuring that the labels “indc” and “dcis” were used only in situations where a micropathologist diagnosed it as such. Further differentiation of cancerous and precancerous labels, as well as the location of their focus on a slide, could be accomplished with supplemental immunohistochemically (IHC) stained slides. When distinguishing whether a focus is a nonneoplastic feature versus a cancerous growth, pathologists employ antigen targeting stains to the tissue in question to confirm the diagnosis. For example, a nonneoplastic feature of usual ductal hyperplasia will display diffuse staining for cytokeratin 5 (CK5) and no diffuse staining for estrogen receptor (ER), while a cancerous growth of ductal carcinoma in situ will have negative or focally positive staining for CK5 and diffuse staining for ER [9]. Many tissue samples contain cancerous and non-cancerous features with morphological overlaps that cause variability between annotators. The informative fields IHC slides provide could play an integral role in machine model pathology diagnostics. Following the revisions made on all the annotations, a second experiment was run using ResNet18. Compared to the pilot study, an increase of model prediction accuracy was seen for the labels indc, infl, nneo, norm, and null. This increase is correlated with an increase in annotated area and annotation accuracy. Model performance in identifying the suspicious label decreased by 25% due to the decrease of 57% in the total annotated area described by this label. A summary of the model performance is given in Table 4, which shows the new prediction accuracy and the absolute change in error rate compared to Table 3. The breast tissue subset we are developing includes 3,505 annotated breast pathology slides from 296 patients. The average size of a scanned SVS file is 363 MB. The annotations are stored in an XML format. 
A CSV version of the annotation file is also available which provides a flat, or simple, annotation that is easy for machine learning researchers to access and interface to their systems. Each patient is identified by an anonymized medical reference number. Within each patient’s directory, one or more sessions are identified, also anonymized to the first of the month in which the sample was taken. These sessions are broken into groupings of tissue taken on that date (in this case, breast tissue). A deidentified patient report stored as a flat text file is also available. Within these slides there are a total of 16,971 annotated regions with an average of 4.84 annotations per slide. Among those annotations, 8,035 are non-cancerous (normal, background, null, and artifact), 6,222 are carcinogenic signs (inflammation, nonneoplastic and suspicious), and 2,714 are cancerous labels (ductal carcinoma in situ and invasive ductal carcinoma). The individual patients are split up into three sets: train, development, and evaluation. Of the 74 cancerous patients, 20 were allotted to each of the development and evaluation sets, while the remaining 34 were allotted to train. The remaining 222 patients were split up to preserve the overall distribution of labels within the corpus. This was done in hope of creating control sets for comparable studies. Overall, the development and evaluation sets each have 80 patients, while the training set has 136 patients. In a related component of this project, slides from the Fox Chase Cancer Center (FCCC) Biosample Repository (https://www.foxchase.org/research/facilities/genetic-research-facilities/biosample-repository-facility) are being digitized in addition to slides provided by Temple University Hospital. This data includes 18 different types of tissue including approximately 38.5% urinary tissue and 16.5% gynecological tissue.
These slides and the metadata provided with them are already anonymized and include diagnoses in a spreadsheet with sample and patient ID. We plan to release over 13,000 unannotated slides from the FCCC Corpus simultaneously with v1.0.0 of TUDP. Details of this release will also be discussed in this poster. Few digitally annotated databases of pathology samples like TUDP exist due to the extensive data collection and processing required. The breast corpus subset should be released by November 2021. By December 2021 we should also release the unannotated FCCC data. We are currently annotating urinary tract data as well. We expect to release about 5,600 processed TUH slides in this subset. We have an additional 53,000 unprocessed TUH slides digitized. Corpora of this size will stimulate the development of a new generation of deep learning technology. In clinical settings where resources are limited, an assistive diagnoses model could support pathologists’ workload and even help prioritize suspected cancerous cases. ACKNOWLEDGMENTS This material is supported by the National Science Foundation under grants nos. CNS-1726188 and 1925494. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. REFERENCES [1] N. Shawki et al., “The Temple University Digital Pathology Corpus,” in Signal Processing in Medicine and Biology: Emerging Trends in Research and Applications, 1st ed., I. Obeid, I. Selesnick, and J. Picone, Eds. New York City, New York, USA: Springer, 2020, pp. 67–104. https://www.springer.com/gp/book/9783030368432. [2] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning.” Major Research Instrumentation (MRI), Division of Computer and Network Systems, Award No. 1726188, January 1, 2018 – December 31, 2021. https://www.isip.piconepress.com/projects/nsf_dpath/.
[3] A. Gulati et al., “Conformer: Convolution-augmented Transformer for Speech Recognition,” in Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), 2020, pp. 5036–5040. https://doi.org/10.21437/interspeech.2020-3015. [4] C.-J. Wu et al., “Machine Learning at Facebook: Understanding Inference at the Edge,” in Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2019, pp. 331–344. https://ieeexplore.ieee.org/document/8675201. [5] I. Caswell and B. Liang, “Recent Advances in Google Translate,” Google AI Blog: The latest from Google Research, 2020. [Online]. Available: https://ai.googleblog.com/2020/06/recent-advances-in-google-translate.html. [Accessed: 01-Aug-2021]. [6] V. Khalkhali, N. Shawki, V. Shah, M. Golmohammadi, I. Obeid, and J. Picone, “Low Latency Real-Time Seizure Detection Using Transfer Deep Learning,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2021, pp. 1–7. https://www.isip.piconepress.com/publications/conference_proceedings/2021/ieee_spmb/eeg_transfer_learning/. [7] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning,” Philadelphia, Pennsylvania, USA, 2020. https://www.isip.piconepress.com/publications/reports/2020/nsf/mri_dpath/. [8] I. Hunt, S. Husain, J. Simons, I. Obeid, and J. Picone, “Recent Advances in the Temple University Digital Pathology Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2019, pp. 1–4. https://ieeexplore.ieee.org/document/9037859. [9] A. P. Martinez, C. Cohen, K. Z. Hanley, and X. (Bill) Li, “Estrogen Receptor and Cytokeratin 5 Are Reliable Markers to Separate Usual Ductal Hyperplasia From Atypical Ductal Hyperplasia and Low-Grade Ductal Carcinoma In Situ,” Arch. Pathol. Lab. Med., vol. 140, no. 7, pp. 686–689, Apr. 2016.
https://doi.org/10.5858/arpa.2015-0238-OA.
  2. This WIP presentation is intended to share and gather feedback on the development of an observation protocol for K-12 integrated STEM instruction, the STEM-OP. Specifically, the STEM-OP is being developed for use in K-12 science and/or engineering settings where integrated STEM instruction takes place. While the importance of integrated STEM education is established through national policy documents, there remains disagreement on models and effective approaches for integrated STEM instruction. Our broad definition of integrated STEM includes the use of two or more STEM disciplines to solve a real-world problem or design challenge that supports student development of 21st century skills. This issue is confounded by the lack of observation protocols sensitive to integrated STEM teaching and learning that can be used to inform research of the effectiveness of new models and strategies. Existing instruments most commonly used by researchers, such as the Reformed Teaching Observation Protocol (RTOP), were designed prior to the development of the Next Generation Science Standards and the integration of engineering into science standards. These instruments were also designed for use in reform-based science classrooms, not engineering or integrated STEM learning environments. While engineering-focused observation protocols do exist for K-12 classrooms, they do not evaluate beyond an engineering focus, making them limited tools to evaluate integrated STEM instruction. In order to facilitate the implementation of integrated STEM in K-12 classrooms and the development of the nascent integrated STEM education literature, our research team is developing a new integrated STEM observation protocol for use in K-12 science and engineering classrooms. This valid and reliable instrument will be designed for use in a variety of educational contexts and by different education stakeholders to increase the quality of K-12 STEM education.
At the end of this project, the STEM-OP will be made available through an online platform that will include an embedded training program to facilitate its broad use. In the first year of this four-year project, we are working on the initial development of the STEM-OP through video analysis and exploratory factor analysis. We are utilizing existing classroom video from a previous project with approximately 2,000 unique classroom videos representing a variety of grade levels (4-9), science content (life, earth, and physical science), engineering design challenges, and school demographics (urban, suburban). The development of the STEM-OP is guided by published frameworks that focus on providing quality K-12 integrated STEM and engineering education, such as the Framework for Quality K-12 Engineering Education. Our anticipated results at the time of the ASEE meeting will include a review of our item development process and finalized items included on the draft STEM-OP. Additionally, we anticipate being able to share findings from the exploratory factor analysis (EFA) on our video-coded data, which will identify distinct instructional dimensions responsible for integrated STEM instruction. We value the opportunity to gather feedback from the engineering education community as the integration of engineering design and practices is integral to quality integrated STEM instruction.
  3. As our nation’s need for engineering professionals grows, a sharp rise in P-12 engineering education programs and related research has taken place (Brophy, Klein, Portsmore, & Rogers, 2008; Purzer, Strobel, & Cardella, 2014). The associated research has focused primarily on students’ perceptions and motivations, teachers’ beliefs and knowledge, and curricula and program success. The existing research has expanded our understanding of new K-12 engineering curriculum development and teacher professional development efforts, but empirical data remain scarce on how racial and ethnic diversity of student population influences teaching methods, course content, and overall teachers’ experiences. In particular, Hynes et al. (2017) note in their systematic review of P-12 research that little attention has been paid to teachers’ experiences with respect to racially and ethnically diverse engineering classrooms. The growing attention and resources being committed to diversity and inclusion issues (Lichtenstein, Chen, Smith, & Maldonado, 2014; McKenna, Dalal, Anderson, & Ta, 2018; NRC, 2009) underscore the importance of understanding teachers’ experiences with complementary research-based recommendations for how to implement engineering curricula in racially diverse schools to engage all students. Our work examines the experiences of three high school teachers as they teach an introductory engineering course in geographically and distinctly different racially diverse schools across the nation. The study is situated in the context of a new high school level engineering education initiative called Engineering for Us All (E4USA). The National Science Foundation (NSF) funded initiative was launched in 2018 as a partnership among five universities across the nation to ‘demystify’ engineering for high school students and teachers.
The program aims to create an all-inclusive high school level engineering course(s), a professional development platform, and a learning community to support student pathways to higher education institutions. An introductory engineering course was developed and professional development was provided to nine high school teachers to instruct and assess engineering learning during the first year of the project. This study investigates participating teachers’ implementation of the course in high schools across the nation to understand the extent to which their experiences vary as a function of student demographic (race, ethnicity, socioeconomic status) and resource level of the school itself. Analysis of these experiences was undertaken using a collective case-study approach (Creswell, 2013) involving in-depth analysis of a limited number of cases “to focus on fewer "subjects," but more "variables" within each subject” (Campbell & Ahrens, 1998, p. 541). This study will document distinct experiences of high school teachers as they teach the E4USA curriculum. Participants were purposively sampled for the cases in order to gather an information-rich data set (Creswell, 2013). The study focuses on three of the nine teachers participating in the first cohort to implement the E4USA curriculum. Teachers were purposefully selected because of the demographic makeup of their students. The participating teachers teach in Arizona, Maryland and Tennessee with predominantly Hispanic, African-American, and Caucasian student bodies, respectively. To better understand similarities and differences among teaching experiences of these teachers, a rich data set is collected consisting of: 1) semi-structured interviews with teachers at multiple stages during the academic year, 2) reflective journal entries shared by the teachers, and 3) multiple observations of classrooms. The interview data will be analyzed with an inductive approach outlined by Miles, Huberman, and Saldaña (2014). 
All teachers’ interview transcripts will be coded together to identify common themes across participants. Participants’ reflections will be analyzed similarly, seeking to characterize their experiences. Observation notes will be used to triangulate the findings. Descriptions for each case will be written emphasizing the aspects that relate to the identified themes. Finally, we will look for commonalities and differences across cases. The results section will describe the cases at the individual participant level followed by a cross-case analysis. This study takes into consideration how high school teachers’ experiences could be an important tool to gain insight into engineering education problems at the P-12 level. Each case will provide insights into how student body diversity impacts teachers’ pedagogy and experiences. The cases illustrate “multiple truths” (Arghode, 2012) with regard to high school level engineering teaching and embody diversity from the perspective of high school teachers. We will highlight themes across cases in the context of frameworks that represent teacher experience conceptualizing race, ethnicity, and diversity of students. We will also present salient features from each case that connect to potential recommendations for advancing P-12 engineering education efforts. These findings will impact how diversity support is practiced at the high school level and will demonstrate specific novel curricular and pedagogical approaches in engineering education to advance P-12 mentoring efforts.
  4. K-12 teachers serve a critical role in their students’ development of interest in engineering, especially as engineering content is emphasized in curriculum standards. However, teachers may not be comfortable teaching engineering in their classrooms as it can require a different set of skills from those in which they were trained. Professional development activities focused on engineering content can help teachers feel more comfortable teaching the subject in their classrooms and can increase their knowledge of engineering and thus their engineering teaching self-efficacy. There are many different types of professional development activities teachers might experience, each one with a set of established best practices. VT PEERS (Virginia Tech Partnering with Educators and Engineers in Rural Communities) is a program designed to provide recurrent hands-on engineering activities to middle school students in or near rural Appalachia. The project partners middle school teachers, university affiliates, and local industry partners throughout the state region to develop and implement engineering activities that align with state defined standards of learning (SOLs). Throughout this partnership, teachers co-facilitate engineering activities in their classrooms throughout the year with the other partners, and teachers have the opportunity to participate in a two-day collaborative workshop every year. VT PEERS held a workshop during the summer of 2019, after the second year of the partnership, to discuss the successes and challenges experienced throughout the program. Three focus groups, one for each grade level involved (grades 6-8), were held during the summit for teachers and industry partners to discuss their experiences. None of the teachers involved in the partnership have formal training in engineering.
The transcripts of these focus groups were the focus of the exploratory qualitative data analyses to answer the following research question: How do middle-school teachers develop teaching engineering self-efficacy through professional development activities? Deductive coding of the focus group transcripts was completed using the four sources of self-efficacy: mastery experience, vicarious experience, verbal persuasion and physiological states. The analysis revealed that vicarious experiences can be particularly valuable to increasing teachers’ teaching engineering self-efficacy. For example, teachers valued the ability to play the role of a student in an engineering lesson and being able to share ideas about teaching engineering lessons with other teachers. This information can be useful to develop engineering-focused professional development activities for teachers. Additionally, as teachers gather information from their teaching engineering vicarious experiences, they can inform their own teaching practices and practice reflective teaching as they teach lessons.
  5. The lack of diversity and inclusion has been a major challenge affecting engineering programs all over the United States. This problem has been persistent over the years and has been difficult to address despite the considerable attention, enriched conversations, and money that have been put towards addressing it. One of the reasons behind this lack of diversity could be the presence of exclusionary behaviors, such as bias and discrimination, that permeate the culture of engineering. To address this “wicked” problem, a deeper understanding of current culture and of potential change strategies toward integrating inclusion and diversity is necessary. Our larger NSF funded research project seeks to achieve this understanding through design thinking. While design thinking has been documented to successfully achieve desired outcomes for numerous other problems, its effectiveness as a tool to understand and solve the “wicked problem” of transformation of disciplinary culture related to diversity and inclusion in engineering is not yet known. This Work-in-Progress paper will address the effectiveness of using a design thinking approach by answering the research question: How did stakeholder participants perceive the impact of design sessions on their understanding and value of diversity and inclusion in the professional formation of biomedical engineers? To address this research question, our research team is coordinating six design sessions within each of two engineering schools: Electrical and Computer Engineering (ECE) and Biomedical Engineering (BME) at a large Midwest University. Currently, we have completed the initial phases of the design sessions in the BME school, and hence this paper focuses on insights from preliminary data analysis of BME Design sessions. BME design sessions were conducted with 15 key stakeholders from the program including students, faculty, staff and administrators. Each of the six design sessions was two hours long.
The research team facilitated the inspiration and ideation phase of the design thinking process throughout. Facilitation involved providing prompts and activities to guide the stakeholders through the design thinking processes of problem identification, problem scoping, and prototype solution generation related to diversity and inclusion within the school culture. A mixed-methods approach involving both qualitative and quantitative data analysis is being used to evaluate the efficacy of design thinking as a tool to address diversity and inclusion in professional formation of engineers. Artifacts such as journey maps, culture maps, and design notebooks generated by our stakeholders throughout the design sessions will be qualitatively analyzed to evaluate the role and effectiveness of design thinking in shaping a more diverse and inclusive culture within BME and, eventually, ECE. Following the design sessions, participants were interviewed one-on-one to understand how their thoughts about diversity and inclusion in professional formation of biomedical engineers may have changed, and to gather participants’ self-assessment of the design process. Coupled with the interviews, an online survey was administered to assess the participants’ ranking of the solutions generated at the conclusion of the design sessions in terms of their novelty, importance and feasibility for implementation within their school. This Work-in-Progress paper will discuss relevant findings from initial quantitative analyses of the data collected from the post-design session surveys and is an interim report evaluating participants’ perceptions of the impact of these design sessions on their understanding of diversity and inclusion in professional formation of biomedical engineers.
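
Item 1 above (the TUDP breast corpus abstract) notes that, to keep the training set balanced, 1,000 patches of each label were used to train the model. The Python sketch below illustrates one way such a per-label draw could be done. It is not the authors' code: the function name `balanced_sample`, the `(patch_id, label)` input format, and the toy patch identifiers are all hypothetical, and only the label abbreviations ("indc", "infl", "norm") come from the abstract.

```python
import random
from collections import defaultdict


def balanced_sample(patches, per_label, seed=0):
    """Draw the same number of patches for every label.

    `patches` is an iterable of (patch_id, label) pairs (an assumed format).
    A label with fewer than `per_label` patches raises ValueError instead of
    being silently under-sampled.
    """
    by_label = defaultdict(list)
    for patch_id, label in patches:
        by_label[label].append(patch_id)

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = {}
    for label, ids in sorted(by_label.items()):
        if len(ids) < per_label:
            raise ValueError(f"label {label!r} has only {len(ids)} patches")
        # Sampling without replacement: no patch is picked twice.
        sample[label] = rng.sample(ids, per_label)
    return sample


# Toy data with hypothetical patch identifiers and unequal label counts.
toy_patches = (
    [(f"indc_{i}", "indc") for i in range(12)]
    + [(f"infl_{i}", "infl") for i in range(20)]
    + [(f"norm_{i}", "norm") for i in range(50)]
)
train = balanced_sample(toy_patches, per_label=10)
print({label: len(ids) for label, ids in train.items()})
```

With real data, `per_label=1000` would correspond to the figure quoted in the abstract. Raising an error for under-represented labels is a design choice made here for illustration; the abstract does not say how such labels were handled.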