Many scientific domains gather sufficient labels to train machine algorithms through human-in-the-loop techniques provided by the Zooniverse.org citizen science platform. As the range of projects, task types and data rates increases, acceleration of model training is of paramount concern to focus volunteer effort where most needed. The application of Transfer Learning (TL) between Zooniverse projects holds promise as a solution. However, understanding the effectiveness of TL approaches that pretrain on large-scale generic image sets vs. images with similar characteristics, possibly from similar tasks, is an open challenge. We apply a generative segmentation model on two Zooniverse project-based data sets: (1) the identification of fat droplets in liver cells (FatChecker; FC) and (2) the identification of kelp beds in satellite images (Floating Forests; FF) through transfer learning from the first project. We compare and contrast its performance with a TL model based on the COCO image set, and subsequently with baseline counterparts. We find that both the FC and COCO TL models perform better than the baseline cases when using >75% of the original training sample size. The COCO-based TL model generally performs better than the FC-based one, likely due to its generalized features. Our investigations provide important insights into the usage of TL approaches on multi-domain data hosted across different Zooniverse projects, enabling future projects to accelerate task completion.
                    
                            
                            Efficient Label Gathering for Machine Training: Results from Muon Hunter 2
                        
                    
    
            In 2017, the Muon Hunter project on the Zooniverse.org citizen science platform successfully gathered more than two million classification labels for nearly 140,000 camera images from VERITAS. The aim was to select and parameterize muon events for use in training convolutional neural networks. The success of this project proved that crowdsourcing labels for IACT image analysis is a viable avenue for further development of advanced machine-learning algorithms. These algorithms could potentially lend themselves to improving class separation between gamma-ray and hadronic event types. Nonetheless, it took two months to gather these labels from volunteers, which could be a bottleneck for future applications of this method. Here we present Muon Hunters 2.0: the follow-on project that demonstrates the development of unsupervised clustering techniques to gather muon labels more efficiently from volunteer classifiers.
        
    
- Award ID(s): 1835530
- PAR ID: 10208291
- Date Published:
- Journal Name: International Cosmic Ray Conference 2019 Proceedings of Science
- Page Range / eLocation ID: https://pos.sissa.it/358/678/pdf
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            To teach STEM content to K-12 students and to recruit talented and diverse K-12 students into STEM, many outreach programs at universities in the United States rely on STEM undergraduates. While the design of such outreach typically focuses on the K-12 students who are taught or recruited, an important but often overlooked consideration is the effect of the outreach on the professional development of the STEM undergraduates themselves. This proposed EAGER project seeks to determine which outreach programs in the United States provided the most transformative professional development of the participating STEM undergraduates. This project then seeks to capture the essence of what practices in those programs provided transformative professional development. Next, the project seeks to disseminate these practices to a network of institutions doing outreach. Supporting this project is the NSF EArly-concept Grant for Exploratory Research (EAGER) program. In this first year of the project, we performed a systematic review of literature and university websites with follow-up survey data to identify outreach programs that may be transformative for STEM undergraduates. This review yielded a matrix of about 100 college-based outreach programs. We then invited these programs to attend one of the following workshops: a March workshop held at Tufts University in Boston or an April workshop held at the University of Nebraska in Lincoln. Nine institutions sent representatives to the Boston workshop, and five institutions sent representatives to the Lincoln workshop. In addition, we held conference calls to gather information from an additional six institutions. The purpose of the workshops and conference calls was two-fold: (1) determine best practices for outreach that used STEM undergraduates, and (2) determine what in those programs provided the most transformative development of the participating STEM undergraduates.
This paper presents preliminary results from these workshops and conference calls.
- 
            IceCube is a cubic-kilometer Cherenkov telescope operating at the South Pole. The main goal of IceCube is the detection of astrophysical neutrinos and the identification of their sources. High-energy muon neutrinos are observed via the secondary muons produced in charged-current interactions with nuclei in the ice. Currently, the best performing muon track directional reconstruction is based on a maximum likelihood method using the arrival time distribution of Cherenkov photons registered by the experiment's photomultipliers. A known systematic shortcoming of the prevailing method is that it assumes a continuous energy loss along the muon track. However, at energies >1 TeV the light yield from muons is dominated by stochastic showers. This paper discusses a generalized ansatz where the expected arrival time distribution is parametrized by a stochastic muon energy loss pattern. This more realistic parametrization of the loss profile leads to an improvement of the muon angular resolution of up to 20% for through-going tracks and up to a factor of 2 for starting tracks over existing algorithms. Additionally, the procedure to estimate the directional reconstruction uncertainty has been improved to be more robust against numerical errors.
- 
            This WIP presentation is intended to share and gather feedback on the development of an observation protocol for K-12 integrated STEM instruction, the STEM-OP. Specifically, the STEM-OP is being developed for use in K-12 science and/or engineering settings where integrated STEM instruction takes place. While the importance of integrated STEM education is established through national policy documents, there remains disagreement on models and effective approaches for integrated STEM instruction. Our broad definition of integrated STEM includes the use of two or more STEM disciplines to solve a real-world problem or design challenge that supports student development of 21st century skills. This issue is confounded by the lack of observation protocols sensitive to integrated STEM teaching and learning that can be used to inform research of the effectiveness of new models and strategies. Existing instruments most commonly used by researchers, such as the Reformed Teaching Observation Protocol (RTOP), were designed prior to the development of the Next Generation Science Standards and the integration of engineering into science standards. These instruments were also designed for use in reform-based science classrooms, not engineering or integrated STEM learning environments. While engineering-focused observation protocols do exist for K-12 classrooms, they do not evaluate beyond an engineering focus, making them limited tools to evaluate integrated STEM instruction. In order to facilitate the implementation of integrated STEM in K-12 classrooms and the development of the nascent integrated STEM education literature, our research team is developing a new integrated STEM observation protocol for use in K-12 science and engineering classrooms. This valid and reliable instrument will be designed for use in a variety of educational contexts and by different education stakeholders to increase the quality of K-12 STEM education. 
At the end of this project, the STEM-OP will be made available through an online platform that will include an embedded training program to facilitate its broad use. In the first year of this four-year project, we are working on the initial development of the STEM-OP through video analysis and exploratory factor analysis. We are utilizing existing classroom video from a previous project with approximately 2,000 unique classroom videos representing a variety of grade levels (4-9), science content (life, earth, and physical science), engineering design challenges, and school demographics (urban, suburban). The development of the STEM-OP is guided by published frameworks that focus on providing quality K-12 integrated STEM and engineering education, such as the Framework for Quality K-12 Engineering Education. Our anticipated results at the time of the ASEE meeting will include a review of our item development process and finalized items included on the draft STEM-OP. Additionally, we anticipate being able to share findings from the exploratory factor analysis (EFA) on our video-coded data, which will identify distinct instructional dimensions responsible for integrated STEM instruction. We value the opportunity to gather feedback from the engineering education community as the integration of engineering design and practices is integral to quality integrated STEM instruction.
- 
            Many algorithms for analyzing parallel programs, for example to detect deadlocks or data races or to calculate the execution cost, are based on a model variously known as a cost graph, computation graph or dependency graph, which captures the parallel structure of threads in a program. In modern parallel programs, computation graphs are highly dynamic and depend greatly on the program inputs and execution details. As such, most analyses that use these graphs are either dynamic analyses or are specialized static analyses that gather a subset of dependency information for a specific purpose. This paper introduces graph types, which compactly represent all of the graphs that could arise from program execution. Graph types are inferred from a parallel program using a graph type system and inference algorithm, which we present drawing on ideas from Hindley-Milner type inference, affine logic and region type systems. We have implemented the inference algorithm over a subset of OCaml, extended with parallelism primitives, and we demonstrate how graph types can be used to accelerate the development of new graph-based static analyses by presenting proof-of-concept analyses for deadlock detection and cost analysis.
 An official website of the United States government