Title: Bits Missing: Finding Exotic Pulsars Using bfloat16 on NVIDIA GPUs
Abstract

The Fourier domain acceleration search (FDAS) is an effective technique for detecting faint binary pulsars in large radio astronomy data sets. This paper quantifies the sensitivity impact of reducing numerical precision in the graphics processing unit (GPU)-accelerated FDAS pipeline of the AstroAccelerate (AA) software package. The prior implementation used IEEE-754 single-precision in the entire binary pulsar detection pipeline, spending a large fraction of the runtime computing GPU-accelerated fast Fourier transforms. AA has been modified to use bfloat16 (and IEEE-754 double-precision to provide a “gold standard” comparison) within the Fourier domain convolution section of the FDAS routine. Approximately 20,000 synthetic pulsar filterbank files representing binary pulsars were generated using SIGPROC with a range of physical parameters. They have been processed using bfloat16, single-precision, and double-precision convolutions. All bfloat16 peaks are within 3% of the predicted signal-to-noise ratio of their corresponding single-precision peaks. Of 14,971 “bright” single-precision fundamental peaks above a power of 44.982 (our experimentally measured highest noise value), 14,602 (97.53%) have a peak in the same acceleration and frequency bin in the bfloat16 output plane, while in the remaining 369 the nearest peak is located in the adjacent acceleration bin. There is no bin drift measured between the single- and double-precision results. The bfloat16 version of FDAS achieves a speedup of approximately 1.6× compared to single-precision. A comparison between AA and the PRESTO software package is presented using observations collected with the GMRT of PSR J1544+4937, a 2.16 ms black widow pulsar in a 2.8 hr compact orbit.
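As a rough illustration of the precision reduction the abstract describes, the sketch below emulates bfloat16 rounding of single-precision values in NumPy. It is not taken from AstroAccelerate: the function name `to_bfloat16` is hypothetical, inputs are assumed to be finite, and round-to-nearest-even is an assumed rounding mode. bfloat16 keeps the 8 exponent bits of single precision but only 7 explicit mantissa bits, so the relative rounding error is bounded by about 2^-8 (roughly 0.4%).

```python
import numpy as np

def to_bfloat16(x):
    """Round finite float32 values to the nearest bfloat16 value.

    Hypothetical helper for illustration only. The result is returned as
    float32 so it can be compared directly with the unrounded input.
    """
    bits = np.atleast_1d(np.asarray(x, dtype=np.float32)).view(np.uint32)
    # Round to nearest, ties to even, on the 16 low bits that are dropped.
    rounding_bias = ((bits >> 16) & 1) + np.uint32(0x7FFF)
    rounded = (bits + rounding_bias) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32)

# Powers around the noise threshold quoted in the abstract (44.982).
powers = np.array([44.982, 50.0, 123.456], dtype=np.float32)
approx = to_bfloat16(powers)
rel_err = np.abs(approx - powers) / powers

print(approx)         # first element is 45.0: bfloat16 spacing is 0.25 in [32, 64)
print(rel_err.max())  # stays below 2**-8
```

At this precision the threshold value 44.982 itself rounds to 45.0, which gives a feel for how coarse the representable grid is near the noise floor. The sketch only covers the rounding of individual values, not the accumulated error of a full convolution.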

 
Award ID(s):
2020265
NSF-PAR ID:
10398612
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
DOI PREFIX: 10.3847
Date Published:
Journal Name:
The Astrophysical Journal Supplement Series
Volume:
265
Issue:
1
ISSN:
0067-0049
Format(s):
Medium: X Size: Article No. 13
Sponsoring Org:
National Science Foundation
More Like this
  1. ABSTRACT

    With unparalleled rotational stability, millisecond pulsars (MSPs) serve as ideal laboratories for numerous astrophysical studies, many of which require precise knowledge of the distance and/or velocity of the MSP. Here, we present the astrometric results for 18 MSPs of the ‘MSPSRπ’ project focusing exclusively on astrometry of MSPs, which includes the re-analysis of three previously published sources. On top of a standardized data reduction protocol, more complex strategies (i.e. normal and inverse-referenced 1D interpolation) were employed where possible to further improve astrometric precision. We derived astrometric parameters using sterne, a new Bayesian astrometry inference package that allows the incorporation of prior information based on pulsar timing where applicable. We measured significant (>3σ) parallax-based distances for 15 MSPs, including 0.81 ± 0.02 kpc for PSR J1518+4904 – the most significant model-independent distance ever measured for a double neutron star system. For each MSP with a well-constrained distance, we estimated its transverse space velocity and radial acceleration. Among the estimated radial accelerations, the updated ones of PSR J1012+5307 and PSR J1738+0333 impose new constraints on dipole gravitational radiation and the time derivative of Newton’s gravitational constant. Additionally, significant angular broadening was detected for PSR J1643−1224, which offers an independent check of the postulated association between the HII region Sh 2-27 and the main scattering screen of PSR J1643−1224. Finally, the upper limit of the death line of γ-ray-emitting pulsars is refined with the new radial acceleration of the hitherto least energetic γ-ray pulsar PSR J1730−2304.

     
  2. Abstract

    Spider pulsars are compact binary systems composed of a millisecond pulsar and a low-mass companion. The relativistic magnetically dominated pulsar wind impacts onto the companion, ablating it and slowly consuming its atmosphere. The interaction forms an intrabinary shock, a proposed site of particle acceleration. We perform global fully kinetic particle-in-cell simulations of the intrabinary shock, assuming that the pulsar wind consists of plane-parallel stripes of alternating polarity and that the shock wraps around the companion. We find that particles are efficiently accelerated via shock-driven reconnection. We extract first-principles synchrotron spectra and light curves, which are in good agreement with X-ray observations: (1) the synchrotron spectrum is nearly flat, Fν ∝ const; (2) when the pulsar spin axis is nearly aligned with the orbital angular momentum, the light curve displays two peaks, just before and after the pulsar eclipse (pulsar superior conjunction), separated in phase by ∼0.8 rad; (3) the peak flux exceeds the one at the inferior conjunction by a factor of 10.

     
  3. Abstract

    We study sky maps and light curves of gamma-ray emission from neutron stars in compact binaries, and in isolation. We briefly review some gamma-ray emission models, and reproduce sky maps from a standard isolated pulsar in the Separatrix Layer model. We consider isolated pulsars with several variations of a dipole magnetic field, including superpositions, and predict their gamma-ray emission. Our results provide new heuristics on what can and cannot be inferred about the magnetic field configuration of pulsars from high-energy observations. We find that typical double-peak light curves can be produced by pulsars with significant multipole structure beyond a single dipole. For binary systems, we also present a simple approximation that is useful for rapid explorations of binary magnetic field structure. Finally, we predict the gamma-ray emission pattern from a compact black hole-neutron star binary moments before merger by applying the Separatrix Layer model to data simulated in full general relativity; we find that face-on observers receive little emission, equatorial observers see one broad peak, and more generic observers typically see two peaks.
  4. ABSTRACT

    We present the discovery of 37 pulsars from ∼ 20 yr old archival data of the Parkes Multibeam Pulsar Survey using a new FFT-based search pipeline optimized for discovering narrow-duty cycle pulsars. When developing our pulsar search pipeline, we noticed that the signal-to-noise ratios of folded and optimized pulsars often exceeded that achieved in the spectral domain by a factor of two or greater, in particular for narrow duty cycle ones. Based on simulations, we verified that this is a feature of search codes that sum harmonics incoherently and found that many promising pulsar candidates are revealed when hundreds of candidates per beam even with modest spectral signal-to-noise ratios of S/N∼5–6 in higher-harmonic folds (up to 32 harmonics) are folded. Of these candidates, 37 were confirmed as new pulsars and a further 37 would have been new discoveries if our search strategies had been used at the time of their initial analysis. While 19 of these newly discovered pulsars have also been independently discovered in more recent pulsar surveys, 18 are exclusive to only the Parkes Multibeam Pulsar Survey data. Some of the notable discoveries include: PSRs J1635−47 and J1739−31, which show pronounced high-frequency emission; PSRs J1655−40 and J1843−08 belong to the nulling/intermittent class of pulsars; and PSR J1636−51 is an interesting binary system in a ∼0.75 d orbit and shows hints of eclipsing behaviour – unusual given the 340 ms rotation period of the pulsar. Our results highlight the importance of reprocessing archival pulsar surveys and using refined search techniques to increase the normal pulsar population.

     
  5. Abstract

    Background: Bioinformatic workflows frequently make use of automated genome assembly and protein clustering tools. At the core of most of these tools, a significant portion of execution time is spent in determining optimal local alignment between two sequences. This task is performed with the Smith-Waterman algorithm, which is a dynamic programming based method. With the advent of modern sequencing technologies and increasing size of both genome and protein databases, a need for faster Smith-Waterman implementations has emerged. Multiple SIMD strategies for the Smith-Waterman algorithm are available for CPUs. However, with the move of HPC facilities towards accelerator based architectures, a need for an efficient GPU accelerated strategy has emerged. Existing GPU based strategies have either been optimized for a specific type of characters (Nucleotides or Amino Acids) or for only a handful of application use-cases.

    Results: In this paper, we present ADEPT, a new sequence alignment strategy for GPU architectures that is domain independent, supporting alignment of sequences from both genomes and proteins. Our proposed strategy uses GPU specific optimizations that do not rely on the nature of sequence. We demonstrate the feasibility of this strategy by implementing the Smith-Waterman algorithm and comparing it to similar CPU strategies as well as the fastest known GPU methods for each domain. ADEPT’s driver enables it to scale across multiple GPUs and allows easy integration into software pipelines which utilize large scale computational systems. We have shown that the ADEPT based Smith-Waterman algorithm demonstrates a peak performance of 360 GCUPS and 497 GCUPS for protein based and DNA based datasets respectively on a single GPU node (8 GPUs) of the Cori Supercomputer. Overall ADEPT shows 10x faster performance in a node-to-node comparison against a corresponding SIMD CPU implementation.

    Conclusions: ADEPT demonstrates a performance that is either comparable to or better than existing GPU strategies. We demonstrated the efficacy of ADEPT in supporting existing bioinformatics software pipelines by integrating ADEPT in MetaHipMer, a high-performance de novo metagenome assembler, and PASTIS, a high-performance protein similarity graph construction pipeline. Our results show 10% and 30% boost of performance in MetaHipMer and PASTIS respectively.