Search for: All records

Creators/Authors contains: "Fortson, Lucy"

« Prev Next »

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Artificial Intelligence and the Future of Citizen Science

https://doi.org/10.5334/cstp.812

Fortson, Lucy; Crowston, Kevin; Kloetzer, Laure; Ponti, Marisa (December 2024, Citizen Science: Theory and Practice)

Artificial Intelligence (AI) and citizen science (CS) are two approaches to tackling data challenges related to scale and complexity. CS by its very definition relies on the joint effort of typically a distributed group of non-expert people to solve problems in a manner that relies on human intelligence. As AI capabilities increasingly augment or complement human intelligence, if not replicate it, there is a growing effort to understand the role that AI can play in CS and vice versa. With this growing interest as context, this special collection, The Future of AI and Citizen Science, illustrates the many ways that CS practitioners are integrating AI into their efforts, as well as identifies current limitations. In this spirit, our editorial briefly introduces the special collection papers to demonstrate and assess some uses of AI in CS; then, we contextualize these uses in terms of key challenges; and conclude with future directions that use AI with CS in both innovative and ethical ways.
more » « less
Full Text Available
Understanding Confusion: A Case Study of Training a Machine Model to Predict and Interpret Consensus From Volunteer Labels

https://doi.org/10.5334/cstp.731

Sankar, Ramanakumar; Mantha, Kameswara; Nesmith, Cooper; Fortson, Lucy; Brueshaber, Shawn; Hansen-Koharcheck, Candice; Orton, Glenn (December 2024, Citizen Science: Theory and Practice)
Fortson, Lucy; Crowston, Kevin; Kloetzer, Laure; Ponti, Marisa (Ed.)
Citizen science has become a valuable and reliable method for interpreting and processing big datasets, and is vital in the era of ever-growing data volumes. However, there are inherent difficulties in the generating labels from citizen scientists, due to the inherent variability between the members of the crowd, leading to variability in the results. Sometimes, this is useful — such as with serendipitous discoveries, which corresponds to rare/unknown classes in the data — but it might also be due to ambiguity between classes. The primary issue is then to distinguish between the intrinsic variability in the dataset and the uncertainty in the citizen scientists’ responses, and leveraging that to extract scientifically useful relationships. In this paper, we explore using a neural network to interpret volunteer confusion across the dataset, to increase the purity of the downstream analysis. We focus on the use of learned features from the network to disentangle feature similarity across the classes, and the ability of the machines’ “attention” in identifying features that lead to confusion. We use data from Jovian Vortex Hunter, a citizen science project to study vortices in Jupiter’s atmosphere, and find that the latent space from the model helps effectively identify different sources of image-level features that lead to low volunteer consensus. Furthermore, the machine’s attention highlights features corresponding to specific classes. This provides meaningful image-level feature-class relationships, which is useful in our analysis for identifying vortex-specific features to better understand vortex evolution mechanisms. Finally, we discuss the applicability of this method to other citizen science projects.
more » « less
Full Text Available
Through the Citizen Scientists’ Eyes: Insights into Using Citizen Science with Machine Learning for Effective Identification of Unknown-Unknowns in Big Data

https://doi.org/10.5334/cstp.740

Mantha, Kameswara Bharadwaj; Roberts, Hayley; Fortson, Lucy; Lintott, Chris; Dickinson, Hugh; Keel, William; Sankar, Ramanakumar; Krawczyk, Coleman; Simmons, Brooke; Walmsley, Mike; et al (December 2024, Citizen Science: Theory and Practice)
Fortson, Lucy; Crowston, Kevin; Kloetzer, Laure; Ponti, Marisa (Ed.)
In the era of rapidly growing astronomical data, the gap between data collection and analysis is a significant barrier, especially for teams searching for rare scientific objects. Although machine learning (ML) can quickly parse large data sets, it struggles to robustly identify scientifically interesting objects, a task at which humans excel. Human-in-the-loop (HITL) strategies that combine the strengths of citizen science (CS) and ML offer a promising solution, but first, we need to better understand the relationship between human- and machine-identified samples. In this work, we present a case study from the Galaxy Zoo: Weird & Wonderful project, where volunteers inspected ~200,000 astronomical images—processed by an ML-based anomaly detection model—to identify those with unusual or interesting characteristics. Volunteer-selected images with common astrophysical characteristics had higher consensus, while rarer or more complex ones had lower consensus. This suggests low-consensus choices shouldn’t be dismissed in further explorations. Additionally, volunteers were better at filtering out uninteresting anomalies, such as image artifacts, which the machine struggled with. We also found that a higher ML-generated anomaly score that indicates images’ low-level feature anomalousness was a better predictor of the volunteers’ consensus choice. Combining a locus of high volunteer-consensus images within the ML learnt feature space and anomaly score, we demonstrated a decision boundary that can effectively isolate images with unusual and potentially scientifically interesting characteristics. Using this case study, we lay important guidelines for future research studies looking to adapt and operationalize human-machine collaborative frameworks for efficient anomaly detection in big data.
more » « less
Full Text Available
Application of Long Short-Term Memory (LSTM) Deep Learning Networks on Very-High-Energy Gamma-Ray Classification with VERITAS

https://doi.org/10.22323/1.444.0692

Oie, Grant; Fortson, Lucy; Sankar, Ramana; Mantha, Kameswara; Ribeiro, Deivid (July 2023, Pos proceedings of science)

Full Text Available
Long-term monitoring of the radio-galaxy M87 in gamma-rays: joint analysis of MAGIC, VERITAS and Fermi-LAT data

https://doi.org/10.22323/1.444.0572

Molero_Gonzalez, Miguel; Fortson, Lucy; Nievas, Mireia; Pueschel, Elisa; Ribeiro, Deivid; Vázquez, Mónica; Batista, Pedro Ivo (August 2023, Pos proceedings of science)

Full Text Available
Transfer learning for galaxy feature detection: Finding giant star-forming clumps in low-redshift galaxies using Faster Region-based Convolutional Neural Network

https://doi.org/10.1093/rasti/rzae013

Popp, Jürgen_J; Dickinson, Hugh; Serjeant, Stephen; Walmsley, Mike; Adams, Dominic; Fortson, Lucy; Mantha, Kameswara; Mehta, Vihang; Dawson, James_M; Kruk, Sandor; et al (April 2024, RAS Techniques and Instruments)

Abstract Giant star-forming clumps (GSFCs) are areas of intensive star-formation that are commonly observed in high-redshift (z ≳ 1) galaxies but their formation and role in galaxy evolution remain unclear. Observations of low-redshift clumpy galaxy analogues are rare but the availability of wide-field galaxy survey data makes the detection of large clumpy galaxy samples much more feasible. Deep Learning (DL), and in particular Convolutional Neural Networks (CNNs), have been successfully applied to image classification tasks in astrophysical data analysis. However, one application of DL that remains relatively unexplored is that of automatically identifying and localizing specific objects or features in astrophysical imaging data. In this paper, we demonstrate the use of DL-based object detection models to localize GSFCs in astrophysical imaging data. We apply the Faster Region-based Convolutional Neural Network object detection framework (FRCNN) to identify GSFCs in low-redshift (z ≲ 0.3) galaxies. Unlike other studies, we train different FRCNN models on observational data that was collected by the Sloan Digital Sky Survey and labelled by volunteers from the citizen science project ‘Galaxy Zoo: Clump Scout’. The FRCNN model relies on a CNN component as a ‘backbone’ feature extractor. We show that CNNs, that have been pre-trained for image classification using astrophysical images, outperform those that have been pre-trained on terrestrial images. In particular, we compare a domain-specific CNN – ‘Zoobot’ – with a generic classification backbone and find that Zoobot achieves higher detection performance. Our final model is capable of producing GSFC detections with a completeness and purity of ≥0.8 while only being trained on ∼5000 galaxy images.
more » « less
Galaxy Zoo DESI: Detailed morphology measurements for 8.7M galaxies in the DESI Legacy Imaging Surveys

https://doi.org/10.1093/mnras/stad2919

Walmsley, Mike; Géron, Tobias; Kruk, Sandor; Scaife, Anna M. M.; Lintott, Chris; Masters, Karen L.; Dawson, James M.; Dickinson, Hugh; Fortson, Lucy; Garland, Izzy L.; et al (September 2023, Monthly Notices of the Royal Astronomical Society)

ABSTRACT We present detailed morphology measurements for 8.67 million galaxies in the DESI Legacy Imaging Surveys (DECaLS, MzLS, and BASS, plus DES). These are automated measurements made by deep learning models trained on Galaxy Zoo volunteer votes. Our models typically predict the fraction of volunteers selecting each answer to within 5–10 per cent for every answer to every GZ question. The models are trained on newly collected votes for DESI-LS DR8 images as well as historical votes from GZ DECaLS. We also release the newly collected votes. Extending our morphology measurements outside of the previously released DECaLS/SDSS intersection increases our sky coverage by a factor of 4 (5000–19 000 deg2) and allows for full overlap with complementary surveys including ALFALFA and MaNGA.
more » « less
Citizen science, computing, and conservation: How can “Crowd AI” change the way we tackle large-scale ecological challenges?

https://doi.org/10.15346/hc.v8i2.123

Palmer, Meredith S.; Huebner, Sarah E.; Willi, Marco; Fortson, Lucy; Packer, Craig (July 2021, Human Computation)
null (Ed.)
Camera traps - remote cameras that capture images of passing wildlife - have become a ubiquitous tool in ecology and conservation. Systematic camera trap surveys generate ‘Big Data’ across broad spatial and temporal scales, providing valuable information on environmental and anthropogenic factors affecting vulnerable wildlife populations. However, the sheer number of images amassed can quickly outpace researchers’ ability to manually extract data from these images (e.g., species identities, counts, and behaviors) in timeframes useful for making scientifically-guided conservation and management decisions. Here, we present ‘Snapshot Safari’ as a case study for merging citizen science and machine learning to rapidly generate highly accurate ecological Big Data from camera trap surveys. Snapshot Safari is a collaborative cross-continental research and conservation effort with 1500+ cameras deployed at over 40 eastern and southern Africa protected areas, generating millions of images per year. As one of the first and largest-scale camera trapping initiatives, Snapshot Safari spearheaded innovative developments in citizen science and machine learning. We highlight the advances made and discuss the issues that arose using each of these methods to annotate camera trap data. We end by describing how we combined human and machine classification methods (‘Crowd AI’) to create an efficient integrated data pipeline. Ultimately, by using a feedback loop in which humans validate machine learning predictions and machine learning algorithms are iteratively retrained on new human classifications, we can capitalize on the strengths of both methods of classification while mitigating the weaknesses. Using Crowd AI to quickly and accurately ‘unlock’ ecological Big Data for use in science and conservation is revolutionizing the way we take on critical environmental issues in the Anthropocene era.
more » « less
Full Text Available
Practical galaxy morphology tools from deep supervised representation learning

https://doi.org/10.1093/mnras/stac525

Walmsley, Mike; Scaife, Anna M; Lintott, Chris; Lochner, Michelle; Etsebeth, Verlon; Géron, Tobias; Dickinson, Hugh; Fortson, Lucy; Kruk, Sandor; Masters, Karen L; et al (May 2022, Monthly Notices of the Royal Astronomical Society)

ABSTRACT Astronomers have typically set out to solve supervised machine learning problems by creating their own representations from scratch. We show that deep learning models trained to answer every Galaxy Zoo DECaLS question learn meaningful semantic representations of galaxies that are useful for new tasks on which the models were never trained. We exploit these representations to outperform several recent approaches at practical tasks crucial for investigating large galaxy samples. The first task is identifying galaxies of similar morphology to a query galaxy. Given a single galaxy assigned a free text tag by humans (e.g. ‘#diffuse’), we can find galaxies matching that tag for most tags. The second task is identifying the most interesting anomalies to a particular researcher. Our approach is 100 per cent accurate at identifying the most interesting 100 anomalies (as judged by Galaxy Zoo 2 volunteers). The third task is adapting a model to solve a new task using only a small number of newly labelled galaxies. Models fine-tuned from our representation are better able to identify ring galaxies than models fine-tuned from terrestrial images (ImageNet) or trained from scratch. We solve each task with very few new labels; either one (for the similarity search) or several hundred (for anomaly detection or fine-tuning). This challenges the longstanding view that deep supervised methods require new large labelled data sets for practical use in astronomy. To help the community benefit from our pretrained models, we release our fine-tuning code zoobot. Zoobot is accessible to researchers with no prior experience in deep learning.
more » « less
Full Text Available
Investigating Clumpy Galaxies in the Sloan Digital Sky Survey Stripe 82 Using the Galaxy Zoo

https://doi.org/10.3847/1538-4357/abed5b

Mehta, Vihang; Scarlata, Claudia; Fortson, Lucy; Dickinson, Hugh; Adams, Dominic; Chevallard, Jacopo; Charlot, Stéphane; Beck, Melanie; Kruk, Sandor; Simmons, Brooke (May 2021, The Astrophysical Journal)

Abstract Giant, star-forming clumps are a common feature prevalent among high-redshift star-forming galaxies and play a critical role in shaping their chaotic morphologies and yet, their nature and role in galaxy evolution remains to be fully understood. A majority of the effort to study clumps has been focused at high redshifts, and local clump studies have often suffered from small sample sizes. In this work, we present an analysis of clump properties in the local universe, and for the first time, performed with a statistically significant sample. With the help of the citizen science-powered Galaxy Zoo: Hubble project, we select a sample of 92 z < 0.06 clumpy galaxies in Sloan Digital Sky Survey Stripe 82 galaxies. Within this sample, we identify 543 clumps using a contrast-based image analysis algorithm and perform photometry as well as estimate their stellar population properties. The overall properties of our z < 0.06 clump sample are comparable to the high-redshift clumps. However, contrary to the high-redshift studies, we find no evidence of a gradient in clump ages or masses as a function of their galactocentric distances. Our results challenge the inward migration scenario for clump evolution for the local universe, potentially suggesting a larger contribution of ex situ clumps and/or longer clump migration timescales.
more » « less
Full Text Available

« Prev Next »