Title: Cosmic-CoNN: A Cosmic-Ray Detection Deep-learning Framework, Data Set, and Toolkit

Rejecting cosmic rays (CRs) is essential for the scientific interpretation of CCD-captured data, but detecting CRs in single-exposure images has remained challenging. Conventional CR detectors require experimental parameter tuning for different instruments, and recent deep-learning methods only produce instrument-specific models that suffer from performance loss on telescopes not included in the training data. We present Cosmic-CoNN, a generic CR detector deployed for 24 telescopes at the Las Cumbres Observatory, which has been made possible by the three contributions in this work: (1) We build a large and diverse ground-based CR data set leveraging thousands of images from a global telescope network. (2) We propose a novel loss function and a neural network optimized for telescope imaging data to train generic CR-detection models. At 95% recall, our model achieves a precision of 93.70% on Las Cumbres imaging data and maintains a consistent performance on new ground-based instruments never used for training. Specifically, the Cosmic-CoNN model trained on the Las Cumbres CR data set maintains high precisions of 92.03% and 96.69% on Gemini GMOS-N/S 1 × 1 and 2 × 2 binning images, respectively. (3) We build a suite of tools including an interactive CR mask visualization and editing interface, console commands, and Python APIs to make automatic, robust CR detection widely accessible by the community of astronomers. Our data set, open-source code base, and trained models are available at

Publication Date:
Journal Name: The Astrophysical Journal
Page Range or eLocation-ID: Article No. 73
DOI PREFIX: 10.3847
Sponsoring Org: National Science Foundation
More Like this
  1. Context. Fast radio bursts (FRBs) are extremely energetic pulses of millisecond duration and unknown origin. To understand the phenomenon that emits these pulses, targeted and un-targeted searches have been performed for multiwavelength counterparts, including the optical. Aims. The objective of this work is to search for optical transients at the positions of eight well-localized (< 1″) FRBs after the arrival of the burst on different timescales (typically at one day, several months, and one year after FRB detection). We then compare this with known optical light curves to constrain progenitor models. Methods. We used the Las Cumbres Observatory Global Telescope (LCOGT) network to promptly take images with its network of 23 telescopes working around the world. We used a template subtraction technique to analyze all the images collected at differing epochs. We have divided the difference images into two groups: In one group we use the image of the last epoch as a template, and in the other group we use the image of the first epoch as a template. We then searched for optical transients at the localizations of the FRBs in the template subtracted images. Results. We have found no optical transients and have therefore set limiting magnitudes to the optical counterparts. Typical limits in apparent and absolute magnitudes for our LCOGT data are ∼22 and −19 mag in the r band, respectively. We have compared our limiting magnitudes with light curves of super-luminous supernovae (SLSNe), Type Ia supernovae (SNe Ia), supernovae associated with gamma-ray bursts (GRB-SNe), a kilonova, and tidal disruption events (TDEs). Conclusions. Assuming that the FRB emission coincides with the time of explosion of these transients, we rule out associations with SLSNe (at the ∼99.9% confidence level) and the brightest subtypes of SNe Ia, GRB-SNe, and TDEs (at a similar confidence level). However, we cannot exclude scenarios where FRBs are directly associated with the faintest of these subtypes or with kilonovae.
  2. ABSTRACT We present and study a large suite of high-resolution cosmological zoom-in simulations, using the FIRE-2 treatment of mechanical and radiative feedback from massive stars, together with explicit treatment of magnetic fields, anisotropic conduction and viscosity (accounting for saturation and limitation by plasma instabilities at high β), and cosmic rays (CRs) injected in supernovae shocks (including anisotropic diffusion, streaming, adiabatic, hadronic and Coulomb losses). We survey systems from ultrafaint dwarf ($M_{\ast }\sim 10^{4}\, \mathrm{M}_{\odot }$, $M_{\rm halo}\sim 10^{9}\, \mathrm{M}_{\odot }$) through Milky Way/Local Group (MW/LG) masses, systematically vary uncertain CR parameters (e.g. the diffusion coefficient κ and streaming velocity), and study a broad ensemble of galaxy properties [masses, star formation (SF) histories, mass profiles, phase structure, morphologies, etc.]. We confirm previous conclusions that magnetic fields, conduction, and viscosity on resolved ($\gtrsim 1\,$ pc) scales have only small effects on bulk galaxy properties. CRs have relatively weak effects on all galaxy properties studied in dwarfs ($M_{\ast } \ll 10^{10}\, \mathrm{M}_{\odot }$, $M_{\rm halo} \lesssim 10^{11}\, \mathrm{M}_{\odot }$), or at high redshifts (z ≳ 1–2), for any physically reasonable parameters. However, at higher masses ($M_{\rm halo} \gtrsim 10^{11}\, \mathrm{M}_{\odot }$) and z ≲ 1–2, CRs can suppress SF and stellar masses by factors ∼2–4, given reasonable injection efficiencies and relatively high effective diffusion coefficients $\kappa \gtrsim 3\times 10^{29}\, {\rm cm^{2}\, s^{-1}}$. At lower κ, CRs take too long to escape dense star-forming gas and lose their energy to collisional hadronic losses, producing negligible effects on galaxies and violating empirical constraints from spallation and γ-ray emission. At much higher κ CRs escape too efficiently to have appreciable effects even in the CGM. But around $\kappa \sim 3\times 10^{29}\, {\rm cm^{2}\, s^{-1}}$, CRs escape the galaxy and build up a CR-pressure-dominated halo which maintains approximate virial equilibrium and supports relatively dense, cool (T ≪ $10^{6}$ K) gas that would otherwise rain on to the galaxy. CR ‘heating’ (from collisional and streaming losses) is never dominant.
  3. ABSTRACT We study the impact of cosmic rays (CRs) on the structure of virial shocks, using a large suite of high-resolution cosmological FIRE-2 simulations accounting for CR injection by supernovae. In Milky Way-mass, low-redshift (z ≲ 1−2) haloes, which are expected to form ‘hot haloes’ with slowly cooling gas in quasi-hydrostatic equilibrium (with a stable virial shock), our simulations without CRs do exhibit clear virial shocks. The cooler phase condensing out from inflows becomes pressure confined to overdense clumps, embedded in low-density, volume-filling hot gas with volume-weighted cooling time longer than inflow time. The gas thus transitions sharply from cool free-falling inflow, to hot and thermal-pressure supported at approximately the virial radius (≈$R_{\rm vir}$), and the shock is quasi-spherical. With CRs, we previously argued that haloes in this particular mass and redshift range build up CR-pressure-dominated gaseous haloes. Here, we show that when CR pressure dominates over thermal pressure, there is no significant virial shock. Instead, inflowing gas is gradually decelerated by the CR pressure gradient and the gas is relatively subsonic out to and even beyond $R_{\rm vir}$. Rapid cooling also maintains subvirial temperatures in the inflowing gas within ∼$R_{\rm vir}$.
  4. Abstract Background

    Microbiomes are now recognized as the main drivers of ecosystem function ranging from the oceans and soils to humans and bioreactors. However, a grand challenge in microbiome science is to characterize and quantify the chemical currencies of organic matter (i.e., metabolites) that microbes respond to and alter. Critical to this has been the development of Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS), which has drastically increased molecular characterization of complex organic matter samples, but challenges users with hundreds of millions of data points where readily available, user-friendly, and customizable software tools are lacking.


    Here, we build on years of analytical experience with diverse sample types to develop MetaboDirect, an open-source, command-line-based pipeline for the analysis (e.g., chemodiversity analysis, multivariate statistics), visualization (e.g., Van Krevelen diagrams, elemental and molecular class composition plots), and presentation of direct injection high-resolution FT-ICR MS data sets after molecular formula assignment has been performed. When compared to other available FT-ICR MS software, MetaboDirect is superior in that it requires a single line of code to launch a fully automated framework for the generation and visualization of a wide range of plots, with minimal coding experience required. Among the tools evaluated, MetaboDirect is also uniquely able to automatically generate biochemical transformation networks (ab initio) based on mass differences (mass difference network-based approach) that provide an experimental assessment of metabolite connections within a given sample or a complex metabolic system, thereby providing important information about the nature of the samples and the set of microbial reactions or pathways that gave rise to them. Finally, for more experienced users, MetaboDirect allows users to customize plots, outputs, and analyses.


    Application of MetaboDirect to FT-ICR MS-based metabolomic data sets from a marine phage-bacterial infection experiment and a Sphagnum leachate microbiome incubation experiment showcases the exploration capabilities of the pipeline that will enable the research community to evaluate and interpret their data in greater depth and in less time. It will further advance our knowledge of how microbial communities influence and are influenced by the chemical makeup of the surrounding system. The source code and User’s guide of MetaboDirect are freely available through ( and (, respectively.

  5. Abstract

    In pursuit of scientific discovery, vast collections of unstructured structural and functional images are acquired; however, only an infinitesimally small fraction of this data is rigorously analyzed, with an even smaller fraction ever being published. One method to accelerate scientific discovery is to extract more insight from costly scientific experiments already conducted. Unfortunately, data from scientific experiments tend only to be accessible by the originator who knows the experiments and directives. Moreover, there are no robust methods to search unstructured databases of images to deduce correlations and insight. Here, we develop a machine learning approach to create image similarity projections to search unstructured image databases. To improve these projections, we develop and train a model to include symmetry-aware features. As an exemplar, we use a set of 25,133 piezoresponse force microscopy images collected on diverse materials systems over five years. We demonstrate how this tool can be used for interactive recursive image searching and exploration, highlighting structural similarities at various length scales. This tool justifies continued investment in federated scientific databases with standardized metadata schemas where the combination of filtering and recursive interactive searching can uncover synthesis-structure-property relations. We provide a customizable open-source package ( of this interactive tool for researchers to use with their data.
