Title: An optimized transient detection pipeline for the ASKAP Variables and Slow Transients (VAST) survey
ABSTRACT In this paper, we present an optimized version of the detection pipeline for the ASKAP Variables and Slow Transients (VAST) survey, offering a significant performance improvement. The key to this optimization is the replacement of the original w-projection algorithm integrated into the Common Astronomy Software Applications (CASA) package with the w-stacking algorithm implemented in the WSClean software. Our experiments demonstrate that this optimization improves the overall processing efficiency of the pipeline by approximately a factor of 3. Moreover, the residual images generated by the optimized pipeline exhibit lower noise levels and fewer artefact sources, suggesting that our optimized pipeline not only enhances detection accuracy but also improves imaging fidelity. The optimized VAST detection pipeline is integrated into the Data Activated Liu Graph Engine (DALiuGE) execution framework, which is specifically designed for SKA-scale big-data processing. Experimental results show that the performance and scalability advantages of the DALiuGE-based pipeline over traditional MPI or Bash implementations grow with data size. In summary, the optimized transient detection pipeline significantly reduces runtime, increases operational efficiency, and decreases implementation costs, offering a practical optimization path for other ASKAP imaging pipelines as well.
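The w-stacking idea behind this speed-up — slice visibilities into discrete w-planes, grid and inverse-Fourier-transform each plane, then apply the w-dependent phase term before accumulating — can be sketched in a few lines of NumPy. This is a toy illustration under simplifying assumptions (nearest-neighbour gridding, uniform w-slabs, no weighting or normalization), not the WSClean implementation:

```python
import numpy as np

def w_stack_image(uvw, vis, n_pix, cell_rad, n_wplanes):
    """Toy w-stacking imager: grid visibilities into w-planes,
    FFT each plane, apply the w-dependent phase correction, sum."""
    u, v, w = uvw.T
    w_edges = np.linspace(w.min(), w.max() + 1e-9, n_wplanes + 1)
    # image-plane direction cosines for the phase correction term
    l = (np.arange(n_pix) - n_pix // 2) * cell_rad
    L, M = np.meshgrid(l, l, indexing="ij")
    n_term = np.sqrt(np.maximum(0.0, 1.0 - L**2 - M**2)) - 1.0
    du = 1.0 / (n_pix * cell_rad)  # uv-cell size for the grid
    image = np.zeros((n_pix, n_pix))
    for k in range(n_wplanes):
        sel = (w >= w_edges[k]) & (w < w_edges[k + 1])
        if not sel.any():
            continue
        # nearest-neighbour gridding of this w-slab's visibilities
        grid = np.zeros((n_pix, n_pix), dtype=complex)
        iu = (np.round(u[sel] / du).astype(int) + n_pix // 2) % n_pix
        iv = (np.round(v[sel] / du).astype(int) + n_pix // 2) % n_pix
        np.add.at(grid, (iu, iv), vis[sel])
        plane = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid)))
        # phase-correct at the slab's central w and accumulate
        w_mid = 0.5 * (w_edges[k] + w_edges[k + 1])
        image += (plane * np.exp(2j * np.pi * w_mid * n_term)).real
    return image
```

Each w-plane costs one 2D FFT, which is what lets w-stacking trade the per-visibility convolution cost of w-projection for a handful of transforms.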
Award ID(s):
1816492
PAR ID:
10545358
Publisher / Repository:
Monthly Notices of the Royal Astronomical Society
Date Published:
Journal Name:
Monthly Notices of the Royal Astronomical Society
Volume:
526
Issue:
2
ISSN:
0035-8711
Page Range / eLocation ID:
1809 to 1821
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Topology optimization has emerged as a versatile design tool embraced across diverse domains. This popularity has led to great efforts in the development of education-centric topology optimization codes with various focuses, such as targeting beginners seeking user-friendliness and catering to experienced users emphasizing computational efficiency. In this study, we introduce FEniTop, a novel 2D and 3D topology optimization software package developed in Python and built upon an open-source finite element library, designed to harmonize usability with computational efficiency and post-processing for fabrication. FEniTop employs a modular architecture, offering a unified input script for defining topology optimization problems and six replaceable modules to streamline subsequent optimization tasks. By enabling users to express problems in the weak form, FEniTop eliminates the need for matrix manipulations, thereby simplifying the modeling process. The software also integrates automatic differentiation to mitigate the intricacies associated with chain rules in finite element analysis and sensitivity analysis. Furthermore, FEniTop provides access to a comprehensive array of readily available solvers and preconditioners, bolstering flexibility in problem-solving. FEniTop is designed for scalability, furnishing robust support for parallel computing that seamlessly adapts to diverse computing platforms, spanning from laptops to distributed computing clusters. It also facilitates effortless transitions across spatial dimensions, mesh geometries, element types and orders, and quadrature degrees. Apart from these computational benefits, FEniTop facilitates the automated exportation of optimized designs, compatible with open-source post-processing software. This functionality allows for visualizing optimized designs across diverse mesh geometries and element shapes, automatically smoothing 3D designs, and converting smoothed designs into STereoLithography (STL) files for 3D printing.
To illustrate the capabilities of FEniTop, we present five representative examples showcasing topology optimization across 2D and 3D geometries, structured and unstructured meshes, solver switching, and complex boundary conditions. We also assess the parallel computational efficiency of FEniTop by examining its performance across diverse computing platforms, process counts, problem sizes, and solver configurations. Finally, we demonstrate a physical 3D-printed model utilizing the STL file derived from the design optimized by FEniTop. These examples showcase not only FEniTop's rich functionality but also its parallel computing performance. The open-source code is given in Appendix B and will be available to download at https://github.com/missionlab/fenitop.
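The density-update step at the heart of SIMP-style topology optimization loops like the one described above can be sketched generically. The following optimality-criteria update (plain NumPy with the textbook exponent 1/2 and a move limit — a generic sketch, not FEniTop's actual module API) bisects on the Lagrange multiplier of the volume constraint:

```python
import numpy as np

def oc_update(rho, dC, volfrac, move=0.2):
    """Optimality-criteria density update for SIMP topology optimization:
    scale densities by the compliance sensitivities dC (dC <= 0) and
    bisect on the volume constraint's Lagrange multiplier until the
    mean density matches the target volume fraction."""
    lo, hi = 1e-9, 1e9
    while (hi - lo) / (hi + lo) > 1e-6:
        lam = 0.5 * (lo + hi)
        # candidate update rho * (-dC / lam)^(1/2), then move limits
        cand = rho * np.sqrt(np.maximum(-dC, 0.0) / lam)
        new = np.clip(cand, np.maximum(rho - move, 0.0),
                      np.minimum(rho + move, 1.0))
        if new.mean() > volfrac:
            lo = lam   # too much material: raise the multiplier
        else:
            hi = lam
    return new
```

In a full loop this update alternates with a finite element solve and sensitivity (and usually density-filter) evaluation until the design converges.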
  2. Convolutional neural network (CNN)-based object detection has achieved very high accuracy; e.g., single-shot multi-box detectors (SSDs) can efficiently detect and localize various objects in an input image. However, they require a large amount of computation and memory, which makes efficient inference difficult on resource-constrained hardware devices such as drones or unmanned aerial vehicles (UAVs). Drone/UAV detection is an important task for applications including surveillance, defense, and multi-drone self-localization and formation control. In this article, we designed and co-optimized an algorithm and hardware for energy-efficient drone detection on resource-constrained FPGA devices. We trained an SSD object detection algorithm with a custom drone dataset. For inference, we employed low-precision quantization and adapted the width of the SSD CNN model. To improve throughput, we used dual-data-rate operations for DSPs to effectively double the throughput with limited DSP counts. For different SSD algorithm models, we analyzed accuracy, or mean average precision (mAP), and evaluated the corresponding FPGA hardware utilization, DRAM communication, and throughput optimization. We evaluated the FPGA hardware on a custom drone dataset, Pascal VOC, and COCO2017. Our proposed design achieves a high mAP of 88.42% on the multi-drone dataset, with a high energy efficiency of 79 GOPS/W and a throughput of 158 GOPS using the Xilinx Zynq ZU3EG FPGA device on the Open Vision Computer version 3 (OVC3) platform. Our design achieves 1.1 to 8.7× higher energy efficiency than prior works that used the same Pascal VOC dataset and the same FPGA device, at a low power consumption of 2.54 W. For the COCO dataset, our MobileNet-V1 implementation achieved an mAP of 16.8 and an energy efficiency of 4.9 FPS/W, which is ~1.9× higher than prior FPGA works or other commercial hardware platforms.
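The low-precision quantization step mentioned above can be illustrated with a generic symmetric uniform quantizer, a common scheme for shrinking CNN weights for FPGA inference. The 8-bit width and per-tensor scale here are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def quantize_sym(w, n_bits=8):
    """Symmetric uniform quantization of a weight tensor to signed
    integers, returning the int8 codes and the per-tensor scale."""
    qmax = 2 ** (n_bits - 1) - 1          # 127 for 8 bits
    max_abs = float(np.abs(w).max())
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale
```

Because the scheme is symmetric around zero, multiply-accumulates can run entirely in integer arithmetic on the FPGA's DSP slices, with a single rescale at the output.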
  3. Abstract Background: Live imaging is the gold standard for determining how cells give rise to organs. However, tracking many cells across whole organs over large developmental time windows is extremely challenging. In this work, we provide a comparably simple method for confocal live imaging of entire Arabidopsis thaliana first leaves across early development. Our imaging method works for both wild-type leaves and the complex curved leaves of the jaw-1D mutant. Results: We find that dissecting the cotyledons, affixing a coverslip above the samples, and mounting samples with perfluorodecalin yields optimal imaging series for robust cellular- and organ-level analysis. We provide details of our complementary image processing steps in the MorphoGraphX software for segmenting, tracking lineages, and measuring a suite of cellular properties. We also provide MorphoGraphX image processing scripts we developed to automate analysis of segmented images and data presentation. Conclusions: Our imaging techniques and processing steps combine into a robust imaging pipeline. With this pipeline we are able to examine important nuances in the cellular growth and differentiation of jaw-D versus WT leaves that have not been demonstrated before. Our pipeline is approachable and easy to use for live imaging of leaf development.
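The per-cell measurements described above rest on segmented, labeled images. As a toy illustration of the labeling step only — a generic flood-fill connected-component pass in plain NumPy, not MorphoGraphX's segmentation algorithm — consider:

```python
import numpy as np

def label_cells(mask):
    """Label connected regions (4-connectivity) of a binary cell mask,
    returning a label image and the region count — the kind of labeling
    that precedes per-cell area or lineage measurements."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue               # pixel already belongs to a region
        current += 1
        stack = [start]
        while stack:               # iterative flood fill from `start`
            r, c = stack.pop()
            if (0 <= r < mask.shape[0] and 0 <= c < mask.shape[1]
                    and mask[r, c] and not labels[r, c]):
                labels[r, c] = current
                stack.extend([(r + 1, c), (r - 1, c),
                              (r, c + 1), (r, c - 1)])
    return labels, current
```

Once each cell carries a unique label, quantities such as cell area or growth between time points reduce to per-label reductions over the label image.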
  4. Mackelprang, Rachel (Ed.)
    ABSTRACT Increasing data volumes on high-throughput sequencing instruments such as the NovaSeq 6000 lead to long computational bottlenecks for common metagenomics data preprocessing tasks such as adaptor and primer trimming and host removal. Here, we test whether faster, recently developed computational tools (Fastp and Minimap2) can replace widely used choices (Atropos and Bowtie2), obtaining dramatic accelerations with additional sensitivity and minimal loss of specificity for these tasks. Furthermore, the taxonomic tables resulting from downstream processing provide biologically comparable results. However, we demonstrate that for taxonomic assignment, Bowtie2's specificity is still required. We suggest that periodic reevaluation of pipeline components, together with improvements to standardized APIs to chain them together, will greatly enhance the efficiency of common bioinformatics tasks while also facilitating incorporation of further optimized steps running on GPUs, FPGAs, or other architectures. We also note that a detailed exploration of available algorithms and pipeline components is an important step that should be taken before optimization of less efficient algorithms on advanced or nonstandard hardware. IMPORTANCE In shotgun metagenomics studies that seek to relate changes in microbial DNA across samples, processing the data on a computer often takes longer than obtaining the data from the sequencing instrument. Recently developed software packages that perform individual steps in the pipeline of data processing in principle offer speed advantages, but in practice they may contain pitfalls that prevent their use, for example, they may make approximations that introduce unacceptable errors in the data. Here, we show that differences in choices of these components can speed up overall data processing by 5-fold or more on the same hardware while maintaining a high degree of correctness, greatly reducing the time taken to interpret results.
This is an important step for using the data in clinical settings, where the time taken to obtain the results may be critical for guiding treatment. 
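A trimming-plus-host-removal pass with the faster tools tested above can be expressed as two command lines. The sketch below only builds the commands (file names and the output directory are placeholders); the flags are the tools' standard idioms: fastp's paired-end inputs/outputs, minimap2's `-ax sr` short-read preset, and samtools flag 12 (read unmapped and mate unmapped) to keep only non-host pairs:

```python
def preprocess_cmds(r1, r2, host_ref, outdir):
    """Build command lines for a Fastp + Minimap2 preprocessing pass:
    adaptor/quality trimming, then host-read removal."""
    trimmed1 = f"{outdir}/trimmed_R1.fq.gz"
    trimmed2 = f"{outdir}/trimmed_R2.fq.gz"
    # paired-end trimming with automatic adapter detection
    fastp_cmd = ["fastp", "-i", r1, "-I", r2,
                 "-o", trimmed1, "-O", trimmed2,
                 "--detect_adapter_for_pe"]
    # map against the host genome; keep only pairs where both mates
    # are unmapped (SAM flag 12) and write them back out as FASTQ
    host_filter_cmd = (
        f"minimap2 -ax sr {host_ref} {trimmed1} {trimmed2} "
        f"| samtools fastq -f 12 "
        f"-1 {outdir}/clean_R1.fq.gz -2 {outdir}/clean_R2.fq.gz -")
    return fastp_cmd, host_filter_cmd
```

Running the two commands in sequence (e.g. via `subprocess.run`) reproduces the trim-then-filter stage of the pipeline on a single sample.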
  5. Recent rapid advances in deep pre-trained language models and the introduction of large datasets have powered research in embedding-based neural retrieval. While many excellent research papers have emerged, most come with their own implementations, typically optimized for particular research goals rather than for efficiency or code organization. In this paper, we introduce Tevatron, a neural retrieval toolkit that is optimized for efficiency, flexibility, and code simplicity. Tevatron enables model training and evaluation for a variety of ranking components such as dense retrievers, sparse retrievers, and rerankers. It also provides a standardized pipeline that includes text processing, model training, corpus/query encoding, and search. In addition, Tevatron incorporates well-studied methods for improving retriever effectiveness such as hard negative mining and knowledge distillation. We provide an overview of Tevatron in this paper, demonstrating its effectiveness and efficiency on multiple IR and QA datasets. We highlight Tevatron's flexible design, which enables easy generalization across datasets, model architectures, and accelerator platforms (GPUs and TPUs). Overall, we believe that Tevatron can serve as a solid software foundation for research on neural retrieval systems, including their design, modeling, and optimization.
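At the core of the dense-retrieval setting such toolkits support is a simple ranking function: score each corpus embedding against the query embedding by inner product and return the top-k hits. A minimal NumPy sketch (toy vectors, not Tevatron's API; real systems use a learned encoder and an approximate nearest-neighbour index):

```python
import numpy as np

def dense_retrieve(query_emb, corpus_embs, k=3):
    """Rank corpus passages against a query by inner-product similarity
    (the scoring function of a dense retriever) and return the top-k
    passage indices with their scores, best first."""
    scores = corpus_embs @ query_emb          # one score per passage
    top = np.argsort(-scores)[:k]             # indices sorted by score
    return top, scores[top]
```

Training a dense retriever amounts to learning the encoder that produces these embeddings, e.g. with contrastive losses over hard negatives, which is where techniques like hard negative mining and knowledge distillation enter.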