Title: Automated Detection of Antenna Malfunctions in Large‐N Interferometers: A Case Study With the Hydrogen Epoch of Reionization Array
Abstract

We present a framework for identifying and flagging malfunctioning antennas in large radio interferometers. We outline two distinct categories of metrics designed to detect outliers along known failure modes of large arrays: cross‐correlation metrics, based on all antenna pairs, and auto‐correlation metrics, based solely on individual antennas. We define and motivate the statistical framework for all metrics used, and present tailored visualizations that aid us in clearly identifying new and existing systematics. We implement these techniques using data from 105 antennas in the Hydrogen Epoch of Reionization Array (HERA) as a case study. Finally, we provide a detailed algorithm for implementing these metrics as flagging tools on real data sets.
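
The abstract does not state which outlier statistic the metrics use, so the following is only a minimal sketch of one common choice: a median/MAD-based modified z-score applied to a single per-antenna metric. The function names, the threshold of 5, and the example values are hypothetical and are not taken from the paper or from any HERA software.

```python
import numpy as np


def robust_z_scores(values):
    """Return modified z-scores: distance from the median in units of scaled MAD."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        # All values (or at least half of them) are identical; report no outliers.
        return np.zeros_like(values)
    # The factor 1.4826 makes the MAD comparable to a Gaussian standard deviation.
    return (values - median) / (1.4826 * mad)


def flag_outlier_antennas(metric_by_antenna, threshold=5.0):
    """Return the antenna numbers whose metric is a robust outlier.

    metric_by_antenna: dict mapping antenna number -> scalar metric value.
    threshold: assumed cut on |modified z-score| (hypothetical default).
    """
    antennas = sorted(metric_by_antenna)
    z = robust_z_scores([metric_by_antenna[a] for a in antennas])
    return [a for a, zi in zip(antennas, z) if abs(zi) > threshold]


# Hypothetical per-antenna auto-correlation power, normalized to about 1.
example = {0: 1.02, 1: 0.98, 2: 1.01, 3: 0.99, 4: 1.00, 5: 3.50}
flagged = flag_outlier_antennas(example)
print(flagged)
```

Median-based statistics are used in this sketch because a few badly malfunctioning antennas would otherwise inflate a mean and standard deviation and mask themselves.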

 
Award ID(s):
1836019
NSF-PAR ID:
10446848
Author(s) / Creator(s):
Publisher / Repository:
DOI PREFIX: 10.1029
Date Published:
Journal Name:
Radio Science
Volume:
57
Issue:
1
ISSN:
0048-6604
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The Canadian Hydrogen Intensity Mapping Experiment (CHIME) is a drift scan radio telescope operating across the 400–800 MHz band. CHIME is located at the Dominion Radio Astrophysical Observatory near Penticton, BC, Canada. The instrument is designed to map neutral hydrogen over the redshift range 0.8–2.5 to constrain the expansion history of the universe. This goal drives the design features of the instrument. CHIME consists of four parallel cylindrical reflectors, oriented north–south, each 100 m × 20 m and outfitted with a 256-element dual-polarization linear feed array. CHIME observes a two-degree-wide stripe covering the entire meridian at any given moment, observing three-quarters of the sky every day owing to Earth’s rotation. An FX correlator utilizes field-programmable gate arrays and graphics processing units to digitize and correlate the signals, with different correlation products generated for cosmological, fast radio burst, pulsar, very long baseline interferometry, and 21 cm absorber back ends. For the cosmology back end, the $N_\mathrm{feed}^2$ correlation matrix is formed for 1024 frequency channels across the band every 31 ms. A data receiver system applies calibration and flagging and, for our primary cosmological data product, stacks redundant baselines and integrates for 10 s. We present an overview of the instrument, its performance metrics based on the first 3 yr of science data, and we describe the current progress in characterizing CHIME’s primary beam response. We also present maps of the sky derived from CHIME data; we are using versions of these maps for a cosmological stacking analysis, as well as for investigation of Galactic foregrounds.

     
  2. Abstract

    Motivation

    Single-cell RNA sequencing (scRNAseq) technologies allow for measurements of gene expression at a single-cell resolution. This provides researchers with a tremendous advantage for detecting heterogeneity, delineating cellular maps or identifying rare subpopulations. However, a critical complication remains: the low number of single-cell observations due to limitations by rarity of subpopulation, tissue degradation or cost. This absence of sufficient data may cause inaccuracy or irreproducibility of downstream analysis. In this work, we present Automated Cell-Type-informed Introspective Variational Autoencoder (ACTIVA): a novel framework for generating realistic synthetic data using a single-stream adversarial variational autoencoder conditioned with cell-type information. Within a single framework, ACTIVA can enlarge existing datasets and generate specific subpopulations on demand, as opposed to two separate models [such as single-cell GAN (scGAN) and conditional scGAN (cscGAN)]. Data generation and augmentation with ACTIVA can enhance scRNAseq pipelines and analysis, such as benchmarking new algorithms, studying the accuracy of classifiers and detecting marker genes. ACTIVA will facilitate analysis of smaller datasets, potentially reducing the number of patients and animals necessary in initial studies.

    Results

    We train and evaluate models on multiple public scRNAseq datasets. In comparison to GAN-based models (scGAN and cscGAN), we demonstrate that ACTIVA generates cells that are more realistic, are harder for classifiers to identify as synthetic, and have better pair-wise correlation between genes. Data augmentation with ACTIVA significantly improves classification of rare subtypes (more than 45% improvement compared with not augmenting and 4% better than cscGAN), all while reducing run-time by an order of magnitude in comparison to both models.

    Availability and implementation

    The codes and datasets are hosted on Zenodo (https://doi.org/10.5281/zenodo.5879639). Tutorials are available at https://github.com/SindiLab/ACTIVA.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  3. SUMMARY

    Crustal seismic velocity models provide essential information for many applications including earthquake source properties, simulations of ground motion and related derivative products. We present a systematic workflow for assessing the accuracy of velocity models with full-waveform simulations. The framework is applied to four regional seismic velocity models for southern California: CVM-H15.11, CVM-S4.26, CVM-S4.26.M01 that includes a shallow geotechnical layer, and the model of Berg et al. For each model, we perform 3-D viscoelastic wave propagation simulations for 48 virtual seismic noise sources (down to 2 s) and 44 moderate-magnitude earthquakes (down to 2 s generally and 0.5 s for some cases) assuming a minimum shear wave velocity of 200 m s⁻¹. The synthetic waveforms are compared with observations associated with both earthquake records and noise cross-correlation data sets. We measure, at multiple period bands for well-isolated seismic phases, traveltime delays and normalized zero-lag cross-correlation coefficients between the synthetic and observed data. The obtained measurements are summarized using the mean absolute deviation of time delay and the mean correlation coefficient. These two metrics provide reliable statistical representations of model quality with consistent results in all data sets. In addition to assessing the overall (average) performance of different models in the entire study area, we examine spatial variations of the models’ quality. All examined models show good phase and waveform agreements for surface waves at periods longer than 5 s, and discrepancies at shorter periods reflecting small-scale heterogeneities and near-surface structures. The model performing best overall is CVM-S4.26.M01. The largest misfits for both body and surface waves are in basin structures and around large fault zones. Inaccuracies generated in these areas may affect tomography and model simulation results at other regions. The seismic velocity models for southern California can be improved by adding better resolved structural representations of the shallow crust and volumes around the main faults.
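
The two summary statistics named in this abstract can be illustrated with a short sketch. It assumes the standard textbook definitions: a zero-lag cross-correlation normalized by the energies of the two traces, and the time-delay summary read as a mean absolute deviation about the mean. The abstract does not give the exact formulas, so the function names and definitions below are assumptions rather than the authors' implementation.

```python
import numpy as np


def zero_lag_cc(synthetic, observed):
    """Normalized zero-lag cross-correlation coefficient of two equal-length traces."""
    s = np.asarray(synthetic, dtype=float)
    o = np.asarray(observed, dtype=float)
    if s.shape != o.shape:
        raise ValueError("traces must have the same length")
    denom = np.sqrt(np.sum(s ** 2) * np.sum(o ** 2))
    if denom == 0:
        raise ValueError("at least one trace has zero energy")
    # Value lies in [-1, 1]; 1 means identical waveform shape at zero time shift.
    return float(np.sum(s * o) / denom)


def mean_absolute_deviation(time_delays):
    """Mean absolute deviation of traveltime delays about their mean (assumed definition)."""
    d = np.asarray(time_delays, dtype=float)
    return float(np.mean(np.abs(d - np.mean(d))))


# Hypothetical example: a synthetic trace compared with a scaled copy of itself.
t = np.linspace(0.0, 1.0, 200)
trace = np.sin(2 * np.pi * 5 * t)
print(zero_lag_cc(trace, 0.5 * trace))
print(mean_absolute_deviation([0.2, -0.1, 0.4, 0.1]))
```

Because the coefficient is normalized by trace energy, it measures waveform shape agreement independently of amplitude, which is why it pairs naturally with a separate time-delay statistic.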

     
  4. Abstract

    Climate change is already having profound effects on biodiversity, but climate change adaptation has yet to be fully incorporated into area‐based management tools used to conserve biodiversity, such as protected areas. One main obstacle is the lack of consensus regarding how impacts of climate change can be included in spatial conservation plans. We propose a climate‐smart framework that prioritizes the protection of climate refugia (areas of low climate exposure and high biodiversity retention) using climate metrics. We explore four aspects of climate‐smart conservation planning: (1) climate model ensembles; (2) multiple emission scenarios; (3) climate metrics; and (4) approaches to identifying climate refugia. We illustrate this framework in the Western Pacific Ocean, but it is equally applicable to terrestrial systems. We found that all aspects of climate‐smart conservation planning considered affected the configuration of spatial plans. The choices of climate metrics and of approaches to identifying refugia have large effects on the resulting climate‐smart spatial plans, whereas the choices of climate models and emission scenarios have smaller effects. As the configuration of spatial plans depended on the climate metrics used, a spatial plan based on a single measure of climate change (e.g., warming) will not necessarily be robust against other measures of climate change (e.g., ocean acidification). We therefore recommend using the climate metrics most relevant for the biodiversity and region considered, based on a single or multiple climate drivers. To include the uncertainty associated with different climate futures, we recommend using multiple climate models (i.e., an ensemble) and emission scenarios. Finally, we show that the approaches we used to identify climate refugia feature trade‐offs between: (1) the degree to which they are climate‐smart, and (2) their efficiency in meeting conservation targets. Hence, the choice of approach will depend on the relative value that stakeholders place on climate adaptation. By using this framework, protected areas can be designed with improved longevity and thus safeguard biodiversity against current and future climate change. We hope that the proposed climate‐smart framework helps transition conservation planning toward climate‐smart approaches.

     
  5. Abstract

    We present a method for constructing the average waveform shape (hereafter called “empirical wavelet”) of seismic shear waves on an event‐by‐event basis for the purpose of constructing a high‐quality travel time data set with information about waveform quality and shape. A global data set was assembled from 360 earthquakes between 1994 and 2017. The empirical wavelet approach permits documentation of the degree of similarity of every observed wave with the empirical wavelet. We adapt the empirical wavelet to all pulse widths, thus identifying broadened (e.g., attenuated) pulses. Several measures of goodness of fit of the empirical wavelet to each record are documented, as well as signal‐to‐noise ratios, permitting users of the data set to employ flexible weighting schemes. We demonstrate the approach on transversely polarized SH waves and build a global travel time data set for the waves S, SS, SSS, Sdiff, ScS, and ScSScS. Onset arrival times of the waves were determined through a correlation scheme with best‐fitting empirical wavelets. Over 250,000 travel times were picked, from over 1.4 million records, all of which were human‐checked for accuracy via a Portable Document Format (PDF) catalog file making system. Many events were specifically selected to bolster southern hemisphere coverage. Coverage maps show that, while the northern hemisphere is more densely sampled, the southern hemisphere coverage is robust. The travel time data set, empirical wavelets, and all measurement metrics are publicly available and well suited for global tomography, as well as forward modeling experiments.

     