Title: Calibration schemes with O(N log N) scaling for large-N radio interferometers built on a regular grid
ABSTRACT Future generations of radio interferometers targeting the 21 cm signal at cosmological distances with N ≫ 1000 antennas could face a significant computational challenge in building correlators with the traditional architecture, whose computational resource requirement scales as $\mathcal{O}(N^2)$ with array size. The fundamental output of such correlators is the set of cross-correlation products of all antenna pairs in the array. The FFT-correlator architecture reduces this scaling to $\mathcal{O}(N\log{N})$ by computing cross-correlation products through a spatial Fourier transform. However, the output of the FFT-correlator is meaningful only when the input antenna voltages are gain- and phase-calibrated. Traditionally, interferometric calibration has used the $\mathcal{O}(N^2)$ cross-correlations produced by a standard correlator. This paper proposes two real-time calibration schemes that could work in parallel with an FFT-correlator as a self-contained $\mathcal{O}(N\log{N})$ correlator system that can be scaled to large-N redundant arrays. We compare the performance and scalability of these two calibration schemes and find that they result in antenna gains whose variance decreases as 1/log N as the array size increases.
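As a rough illustration of why a spatial Fourier transform can stand in for pairwise correlation on a regular grid, the sketch below compares the two routes for a one-dimensional, evenly spaced array. It is not the paper's calibration scheme: it assumes already-calibrated (unit-gain) voltages, and the function names `redundant_sums_direct` and `redundant_sums_fft`, along with the array shapes, are hypothetical choices made for this example.

```python
import numpy as np


def redundant_sums_direct(v):
    """Pairwise route, O(N^2) per time sample.

    v has shape (n_times, n_ant): complex voltages from antennas on a
    regular 1D grid. For each separation d, sum v[i + d] * conj(v[i])
    over all antenna pairs with that separation, then average over time.
    """
    n_ant = v.shape[1]
    out = np.zeros(n_ant, dtype=complex)
    for d in range(n_ant):
        products = v[:, d:] * np.conj(v[:, : n_ant - d])
        out[d] = np.mean(np.sum(products, axis=1))
    return out


def redundant_sums_fft(v):
    """FFT route, O(N log N) per time sample.

    Zero-pad the antenna axis to 2N so that the circular correlation
    implied by the FFT equals the linear one, average the squared
    magnitude of the spatial transform over time, and transform back.
    """
    n_ant = v.shape[1]
    spectrum = np.fft.fft(v, n=2 * n_ant, axis=1)
    power = np.mean(np.abs(spectrum) ** 2, axis=0)
    return np.fft.ifft(power)[:n_ant]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    v = rng.standard_normal((64, 16)) + 1j * rng.standard_normal((64, 16))
    print(np.allclose(redundant_sums_direct(v), redundant_sums_fft(v)))
```

The FFT route only returns the sum over each group of redundant baselines, which is why uncalibrated per-antenna gains would corrupt its output: the individual pair products needed to solve for those gains are never formed.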
Journal Name:
Monthly Notices of the Royal Astronomical Society
Page Range or eLocation-ID:
66 to 81
Sponsoring Org:
National Science Foundation
More Like this

  1. Next-generation aperture arrays are expected to consist of hundreds to thousands of antenna elements with substantial digital signal processing to handle large operating bandwidths of a few tens to hundreds of MHz. Conventionally, FX correlators are used as the primary signal processing unit of the interferometer. These correlators have computational costs that scale as $\mathcal{O}(N^2)$ for large arrays. An alternative imaging approach is implemented in the E-field Parallel Imaging Correlator (EPIC) that was recently deployed on the Long Wavelength Array station at the Sevilleta National Wildlife Refuge (LWA-SV) in New Mexico. EPIC uses a novel architecture that produces electric field or intensity images of the sky at the angular resolution of the array with full or partial polarization and the full spectral resolution of the channelizer. By eliminating the intermediate cross-correlation data products, the computational costs can be significantly lowered in comparison to a conventional FX or XF correlator from $\mathcal{O}(N^2)$ to $\mathcal{O}(N \log N)$ for dense (but otherwise arbitrary) array layouts. EPIC can also lower the output data rates by directly yielding polarimetric image products for science analysis. We have optimized EPIC and have now commissioned it at LWA-SV as a commensal all-sky imaging back-end that can potentially detect and localize sources of impulsive radio emission on millisecond timescales. In this article, we review the architecture of EPIC, describe code optimizations that improve performance, and present initial validations from commissioning observations. Comparisons between EPIC measurements and simultaneous beam-formed observations of bright sources show spectral-temporal structures in good agreement.
  2. We study the large charge sector of the defect CFT defined by the half-BPS Wilson loop in planar $\mathcal{N} = 4$ supersymmetric Yang-Mills theory. Specifically, we consider correlation functions of two large charge insertions and several light insertions in the double-scaling limit where the ’t Hooft coupling $\lambda$ and the large charge $J$ are sent to infinity, with the ratio $J/\sqrt{\lambda}$ held fixed. They are holographically dual to the expectation values of light vertex operators on a classical string solution with large angular momentum, which we evaluate in the leading large $J$ limit. We also compute the two-point function of large charge insertions by evaluating the on-shell string action, supplemented by the boundary terms that generalize the one introduced by Drukker, Gross and Ooguri for the Wilson loop without insertions. For a special class of correlation functions, we reproduce the string results from field theory by using supersymmetric localization. The results are given by correlation functions in an “emergent” matrix model whose matrix size is proportional to $J$ and whose spectral curve coincides with that of the classical string. Similar matrix models appeared in the study of extremal correlators in rank-1 $\mathcal{N} = 2$ superconformal field theories, but our results hold also for non-extremal cases.
  3. Large-scale deep neural networks (DNNs) are both compute and memory intensive. As the size of DNNs continues to grow, it is critical to improve the energy efficiency and performance while maintaining accuracy. For DNNs, the model size is an important factor affecting performance, scalability and energy efficiency. Weight pruning achieves good compression ratios but suffers from three drawbacks: 1) the irregular network structure after pruning, which affects performance and throughput; 2) the increased training complexity; and 3) the lack of rigorous guarantee of compression ratio and inference accuracy. To overcome these limitations, this paper proposes CirCNN, a principled approach to represent weights and process neural networks using block-circulant matrices. CirCNN utilizes the Fast Fourier Transform (FFT)-based fast multiplication, simultaneously reducing the computational complexity (both in inference and training) from $O(n^2)$ to $O(n \log n)$ and the storage complexity from $O(n^2)$ to $O(n)$, with negligible accuracy loss. Compared to other approaches, CirCNN is distinct due to its mathematical rigor: the DNNs based on CirCNN can converge to the same "effectiveness" as DNNs without compression. We propose the CirCNN architecture, a universal DNN inference engine that can be implemented in various hardware/software platforms with configurable network architecture (e.g., layer type, size, scales, etc.). In CirCNN architecture: 1) Due to the recursive property, FFT can be used as the key computing kernel, which ensures universal and small-footprint implementations. 2) The compressed but regular network structure avoids the pitfalls of the network pruning and facilitates high performance and throughput with highly pipelined and parallel design. To demonstrate the performance and energy efficiency, we test CirCNN in FPGA, ASIC and embedded processors. Our results show that CirCNN architecture achieves very high energy efficiency and performance with a small hardware footprint. Based on the FPGA implementation and ASIC synthesis results, CirCNN achieves 6 - 102X energy efficiency improvements compared with the best state-of-the-art results.
  4. Deep neural networks (DNNs) have emerged as the most powerful machine learning technique in numerous artificial intelligence applications. However, the large sizes of DNNs make them both computation and memory intensive, thereby limiting the hardware performance of dedicated DNN accelerators. In this paper, we propose a holistic framework for energy-efficient, high-performance, highly compressed DNN hardware design. First, we propose block-circulant matrix-based DNN training and inference schemes, which theoretically guarantee Big-O complexity reduction in both computational cost (from $O(n^2)$ to $O(n \log n)$) and storage requirement (from $O(n^2)$ to $O(n)$) of DNNs. Second, we optimize the hardware architecture, especially the key fast Fourier transform (FFT) module, to improve the overall performance in terms of energy efficiency, computation performance and resource cost. Third, we propose a design flow to perform hardware-software co-optimization with the purpose of achieving good balance between test accuracy and hardware performance of DNNs. Based on the proposed design flow, two block-circulant matrix-based DNNs on two different datasets are implemented and evaluated on FPGA. The fixed-point quantization and the proposed block-circulant matrix-based inference scheme enable the network to achieve as high as 3.5 TOPS computation performance and 3.69 TOPS/W energy efficiency while the memory is saved by 108X to 116X with negligible accuracy degradation.
  5. We study the four-point function of the lowest-lying half-BPS operators in the $\mathcal{N} = 4$ $SU(N)$ super-Yang-Mills theory and its relation to the flat-space four-graviton amplitude in type IIB superstring theory. We work in a large-$N$ expansion in which the complexified Yang-Mills coupling $\tau$ is fixed. In this expansion, non-perturbative instanton contributions are present, and the $SL(2, \mathbb{Z})$ duality invariance of correlation functions is manifest. Our results are based on a detailed analysis of the sphere partition function of the mass-deformed SYM theory, which was previously computed using supersymmetric localization. This partition function determines a certain integrated correlator in the undeformed $\mathcal{N} = 4$ SYM theory, which in turn constrains the four-point correlator at separated points. In a normalization where the two-point functions are proportional to $N^2 - 1$ and are independent of $\tau$ and $\overline{\tau}$, we find that the terms of order $\sqrt{N}$ and $1/\sqrt{N}$ in the large-$N$ expansion of the four-point correlator are proportional to the non-holomorphic Eisenstein series $E\left(\frac{3}{2}, \tau, \overline{\tau}\right)$ and $E\left(\frac{5}{2}, \tau, \overline{\tau}\right)$, respectively. In the flat space limit, these terms match the corresponding terms in the type IIB S-matrix arising from $R^4$ and $D^4 R^4$ contact interactions, which, for the $R^4$ case, represents a check of AdS/CFT at finite string coupling. Furthermore, we present striking evidence that these results generalize so that, at order $N^{\frac{1}{2}-m}$ with integer $m \geq 0$, the expansion of the integrated correlator we study is a linear sum of non-holomorphic Eisenstein series with half-integer index, which are manifestly $SL(2, \mathbb{Z})$ invariant.
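
The FFT-based multiplication that the two block-circulant DNN entries above (items 3 and 4) rely on can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CirCNN implementation: the helper names (`circulant`, `block_circulant_dense`, `block_circulant_matvec`) and the `(p, q, b)` storage layout are hypothetical. Each `b × b` block is assumed to be circulant and defined by its first column, so only `b` numbers per block are stored instead of `b²`.

```python
import numpy as np


def circulant(c):
    """Dense b x b circulant matrix whose first column is c."""
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]


def block_circulant_dense(w):
    """Expand w of shape (p, q, b) into the full (p*b, q*b) weight matrix.

    Used here only as a reference for checking the fast routine.
    """
    p, q, _ = w.shape
    return np.block([[circulant(w[i, j]) for j in range(q)] for i in range(p)])


def block_circulant_matvec(w, x):
    """Multiply the block-circulant matrix defined by w with vector x.

    A circulant block times a vector is a circular convolution, so each
    block product becomes an elementwise product in the Fourier domain.
    """
    p, q, b = w.shape
    x_hat = np.fft.fft(x.reshape(q, b), axis=1)        # (q, b)
    w_hat = np.fft.fft(w, axis=2)                      # (p, q, b)
    y_hat = np.sum(w_hat * x_hat[None, :, :], axis=1)  # (p, b)
    return np.real(np.fft.ifft(y_hat, axis=1)).reshape(p * b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((2, 3, 4))   # 2 x 3 grid of 4 x 4 circulant blocks
    x = rng.standard_normal(12)
    print(np.allclose(block_circulant_dense(w) @ x, block_circulant_matvec(w, x)))
```

The dense expansion exists only to verify the result; in practice the point of the representation is that the full matrix is never materialized.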