Title: Data Reduction Integrated Python Protocol for the Arecibo Pisces-Perseus Supercluster Survey (DRIPP for APPSS)
Developments in open-source high-level programming languages enable undergraduate students to make vital contributions to modern astronomical surveys. The Arecibo Pisces-Perseus Supercluster Survey (APPSS) currently uses data analysis software written in Interactive Data Language (IDL). We discuss the conversion of this software to the Python programming language, which uses freely available standard libraries, and the conversion of the data to a standard form of the Single-Dish FITS (SDFITS) format. The Data Reduction Integrated Python Protocol (DRIPP) provides user-guided data reduction with an interface similar to that of the former IDL software. Converting to DRIPP would give researchers more accessible data processing capabilities for APPSS (or any similar radio spectral survey). This work has been supported by NSF AST-1637339.
Award ID(s):
1637339
PAR ID:
10168217
Journal Name:
American Astronomical Society meeting
Volume:
235
ISSN:
2152-887X
Page Range / eLocation ID:
279.11
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The Arecibo Pisces-Perseus Supercluster Survey (APPSS) is an observing project undertaken by the Undergraduate ALFALFA Team that aims to detect HI in galaxies in the Pisces-Perseus neighborhood and to analyze the dynamics and properties of those galaxies. The galaxies targeted in APPSS are suspected from their optical properties (color, morphology, surface brightness) to lie in the Pisces-Perseus Supercluster (PPS) but are below the detection threshold of the ALFALFA blind HI survey. Here we present results for galaxies targeted in a strip across the PPS region in declination from 30° to 32°. This region is along the main filament of the supercluster and includes objects such as the Pisces Cluster. The data were recorded with the L-Band Wide receiver of the Arecibo Observatory. Data reduction was done using routines derived for the APPSS in IDL. After baselining the spectra and sifting out radio interference, we fit either a Gaussian or a two-horned profile to the 21-centimeter line of each galaxy to measure the HI line flux density, velocity, and velocity width. From these parameters we calculate distances, hydrogen gas masses, and rotational velocities. As expected, the galaxies analyzed in this slice of declination have consistently lower masses than the ALFALFA detections, thus extending the sampling of galaxies within the PPS. The combined ALFALFA and APPSS HI line detections will be used for future applications of the Baryonic Tully-Fisher Relation in this region. This research has been supported by NSF grant NSF/AST-1714828 to M.P. Haynes and by the Brinson Foundation for the Arecibo Pisces-Perseus Supercluster Survey (APPSS).
  2. Aldrich, Jonathan; Salvaneschi, Guido (Ed.)
    Tensor processing infrastructures such as deep learning frameworks and specialized hardware accelerators have revolutionized how computationally intensive code from domains such as deep learning and image processing is executed and optimized. These infrastructures provide powerful and expressive abstractions while ensuring high performance. However, to utilize them, code must be written specifically using the APIs / ISAs of such software frameworks or hardware accelerators. Importantly, given the fast pace of innovation in these domains, code written today quickly becomes legacy as new frameworks and accelerators are developed, and migrating such legacy code manually is a considerable effort. To enable developers to leverage such DSLs while preserving their current programming paradigm, we present Tenspiler, a verified-lifting-based compiler that uses program synthesis to translate sequential programs written in general-purpose programming languages (e.g., C++ or Python code that does not leverage any specialized framework or accelerator) into tensor operations. Central to Tenspiler is our carefully crafted yet simple intermediate language, named TensIR, that expresses tensor operations. TensIR enables efficient lifting, verification, and code generation. Unlike classical pattern-matching-based compilers, Tenspiler uses program synthesis to translate input code into TensIR, which is then compiled to the target API / ISA. Currently, Tenspiler supports six DSLs, spanning a broad spectrum of software and hardware environments. Furthermore, we show that new backends can be easily supported by Tenspiler by adding simple pattern-matching rules for TensIR. Using 10 real-world code benchmark suites, our experimental evaluation shows that by translating code to be executed on 6 different software frameworks and hardware devices, Tenspiler offers on average 105× kernel and 9.65× end-to-end execution time improvement over the fully-optimized sequential implementation of the same benchmarks.
  3. Testing is an integral but often neglected part of the software development process. Classical test generation tools such as EvoSuite generate behavioral test suites by optimizing for coverage, but tend to produce tests that are hard to understand. Language models trained on code can generate code that is highly similar to that written by humans, but current models are trained to generate each file separately, as is standard practice in natural language processing, and thus fail to consider the code-under-test context when producing a test file. In this work, we propose the Aligned Code And Tests Language Model (CAT-LM), a GPT-style language model with 2.7 billion parameters, trained on a corpus of Python and Java projects. We utilize a novel pretraining signal that explicitly considers the mapping between code and test files when available. We also drastically increase the maximum sequence length of inputs to 8,192 tokens, 4x more than typical code generation models, to ensure that the code context is available to the model when generating test code. We analyze its usefulness for realistic applications, showing that sampling with filtering (e.g., by compilability, coverage) allows it to efficiently produce tests that achieve coverage similar to ones written by developers while resembling their writing style. By utilizing the code context, CAT-LM generates more valid tests than even much larger language models trained with more data (CodeGen 16B and StarCoder) and substantially outperforms a recent test-specific model (TeCo) at test completion. Overall, our work highlights the importance of incorporating software-specific insights when training language models for code and paves the way to more powerful automated test generation.
  4. The Undergraduate ALFALFA team is currently focusing on the analysis of the Pisces-Perseus Supercluster to test current supercluster formation models. The primary goal of our research is to reduce L-band HI data from the Arecibo telescope. We use IDL programs written by our collaborators to reduce the data and find potential sources whose mass can be estimated by the baryonic Tully-Fisher relation, which relates the luminosity to the rotational velocity profile of spiral galaxies. Thus far we have reduced data and estimated HI masses for several galaxies in the supercluster region. We will give examples of data reduction and preliminary results for both the fall 2015 and 2016 observing seasons. We will also describe the data reduction process, the process of learning the associated software, and the use of virtual observatory tools such as the SDSS databases, Aladin, TOPCAT and others. This research was supported by the NSF grant AST-1211005. (Student Poster Presentation)
  5. The Undergraduate ALFALFA team is currently focusing on the analysis of the Pisces-Perseus Supercluster to test current supercluster formation models. The primary goal of our research is to reduce L-band HI data from the Arecibo telescope. We use IDL programs written by our collaborators to reduce the data and find potential sources whose mass can be estimated by the baryonic Tully-Fisher relation, which relates the luminosity to the rotational velocity profile of spiral galaxies. Thus far we have reduced data and estimated HI masses for several galaxies in the supercluster region. We will give examples of data reduction and preliminary results for both the fall 2015 and 2016 observing seasons. We will also describe the data reduction process, the process of learning the associated software, and the use of virtual observatory tools such as the SDSS databases, Aladin, TOPCAT and others. This research was supported by the NSF grant AST-1211005.