Next-generation aperture arrays are expected to consist of hundreds to thousands of antenna elements with substantial digital signal processing to handle large operating bandwidths of a few tens to hundreds of MHz. Conventionally, FX correlators are used as the primary signal processing unit of the interferometer. These correlators have computational costs that scale as $\mathcal{O}(N^2)$ for large arrays. An alternative imaging approach is implemented in the E-field Parallel Imaging Correlator (EPIC) that was recently deployed on the Long Wavelength Array station at the Sevilleta National Wildlife Refuge (LWA-SV) in New Mexico. EPIC uses a novel architecture that produces electric field or intensity images of the sky at the angular resolution of the array with full or partial polarization and the full spectral resolution of the channelizer. By eliminating the intermediate cross-correlation data products, the computational costs can be significantly lowered in comparison to a conventional FX or XF correlator from $\mathcal{O}(N^2)$ to $\mathcal{O}(N \log N)$ for dense (but otherwise arbitrary) array layouts. EPIC can also lower the output data rates by directly yielding polarimetric image products for science analysis. We have optimized EPIC and have now commissioned it at LWA-SV as a commensal all-sky imaging backend that can potentially …
Calibration schemes with O(N log N) scaling for large-N radio interferometers built on a regular grid
ABSTRACT Future generations of radio interferometers targeting the 21 cm signal at cosmological distances with N ≫ 1000 antennas could face a significant computational challenge in building correlators with the traditional architecture, whose computational resource requirement scales as $\mathcal{O}(N^2)$ with array size. The fundamental output of such correlators is the cross-correlation products of all antenna pairs in the array. The FFT-correlator architecture reduces the computational resources scaling to $\mathcal{O}(N\log{N})$ by computing cross-correlation products through a spatial Fourier transform. However, the output of the FFT-correlator is meaningful only when the input antenna voltages are gain- and phase-calibrated. Traditionally, interferometric calibration has used the $\mathcal{O}(N^2)$ cross-correlations produced by a standard correlator. This paper proposes two real-time calibration schemes that could work in parallel with an FFT-correlator as a self-contained $\mathcal{O}(N\log{N})$ correlator system that can be scaled to large-N redundant arrays. We compare the performance and scalability of these two calibration schemes and find that they result in antenna gains whose variance decreases as 1/log N with increase in the size of the array.
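The idea of obtaining cross-correlation products through a spatial Fourier transform can be illustrated with a minimal sketch. It assumes an idealized, already calibrated one-dimensional array of N equally spaced antennas, and the function names and the `(T, N)` voltage layout are hypothetical choices for illustration, not the paper's implementation. On such a grid, all antenna pairs with the same separation measure the same baseline, so the quantity compared here is the time-averaged sum of products over each redundant baseline.

```python
import numpy as np

def direct_redundant_visibilities(v):
    """O(N^2) reference: v has shape (T, N) = (time samples, antennas).

    Returns, for each separation d = 0..N-1, the time-averaged sum of
    v[m + d] * conj(v[m]) over all antenna pairs with that separation.
    """
    _, n_ant = v.shape
    out = np.zeros(n_ant, dtype=complex)
    for d in range(n_ant):
        # On a regular grid every pair (m + d, m) shares the same baseline d.
        pair_products = v[:, d:] * np.conj(v[:, : n_ant - d])
        out[d] = pair_products.sum(axis=1).mean()
    return out

def fft_redundant_visibilities(v):
    """O(N log N) per time sample: spatial FFT, square, inverse FFT."""
    _, n_ant = v.shape
    # Zero-pad to 2N so the circular correlation does not wrap around.
    spectrum = np.fft.fft(v, n=2 * n_ant, axis=1)
    power = np.abs(spectrum) ** 2
    # Averaging in time before the inverse transform is valid by linearity.
    corr = np.fft.ifft(power.mean(axis=0))
    return corr[:n_ant]
```

Both functions return the same array up to floating-point error. The sketch applies no gains, which is the situation the abstract says must be arranged by calibration before the FFT output is meaningful.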
Award ID(s): 1701536
Publication Date:
NSF-PAR ID: 10208133
Journal Name: Monthly Notices of the Royal Astronomical Society
Volume: 500
Issue: 1
Page Range or eLocation-ID: 66 to 81
ISSN: 0035-8711
Sponsoring Org: National Science Foundation
More Like this

Abstract: We study the large charge sector of the defect CFT defined by the half-BPS Wilson loop in planar $\mathcal{N} = 4$ supersymmetric Yang-Mills theory. Specifically, we consider correlation functions of two large charge insertions and several light insertions in the double-scaling limit where the ’t Hooft coupling $\lambda$ and the large charge $J$ are sent to infinity, with the ratio $J/\sqrt{\lambda}$ held fixed. They are holographically dual to the expectation values of light vertex operators on a classical string solution with large angular momentum, which we evaluate in the leading large $J$ limit. We also compute the two-point function of large charge insertions by evaluating the on-shell string action, supplemented by the boundary terms that generalize the one introduced by Drukker, Gross and Ooguri for the Wilson loop without insertions. For a special class of correlation functions, we reproduce the string results from field theory by using supersymmetric localization. The results are given by correlation functions in an “emergent” matrix model whose matrix size is proportional to $J$ and whose spectral curve coincides with that of the classical string. Similar matrix models appeared in the study of extremal correlators in rank-1 $\mathcal{N}$ …

Large-scale deep neural networks (DNNs) are both compute and memory intensive. As the size of DNNs continues to grow, it is critical to improve the energy efficiency and performance while maintaining accuracy. For DNNs, the model size is an important factor affecting performance, scalability and energy efficiency. Weight pruning achieves good compression ratios but suffers from three drawbacks: 1) the irregular network structure after pruning, which affects performance and throughput; 2) the increased training complexity; and 3) the lack of a rigorous guarantee of compression ratio and inference accuracy. To overcome these limitations, this paper proposes CirCNN, a principled approach to represent weights and process neural networks using block-circulant matrices. CirCNN utilizes Fast Fourier Transform (FFT)-based fast multiplication, simultaneously reducing the computational complexity (both in inference and training) from $O(n^2)$ to $O(n \log n)$ and the storage complexity from $O(n^2)$ to $O(n)$, with negligible accuracy loss. Compared to other approaches, CirCNN is distinct due to its mathematical rigor: the DNNs based on CirCNN can converge to the same "effectiveness" as DNNs without compression. We propose the CirCNN architecture, a universal DNN inference engine that can be implemented in various hardware/software platforms with configurable network architecture (e.g., layer type, size, scales, …
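The complexity reduction quoted above rests on a standard identity: multiplying by a circulant matrix is a circular convolution, which the FFT evaluates in $O(n \log n)$. The following is a minimal sketch of a block-circulant matrix-vector product under assumed conventions: each block is defined by its first column, the array `w` of shape `(p, q, b)` holds one defining vector per block, and all function names are hypothetical. It is not CirCNN's implementation, whose indexing convention the abstract does not state.

```python
import numpy as np

def circulant(c):
    """Dense b x b circulant matrix with first column c: C[i, j] = c[(i - j) % b]."""
    b = len(c)
    idx = (np.arange(b)[:, None] - np.arange(b)[None, :]) % b
    return c[idx]

def block_circulant_dense(w):
    """Expand w of shape (p, q, b) into the dense (p*b, q*b) weight matrix."""
    p, q, _ = w.shape
    return np.block([[circulant(w[i, j]) for j in range(q)] for i in range(p)])

def block_circulant_matvec(w, x):
    """Compute the same product without forming the dense matrix.

    Each block contributes ifft(fft(w_ij) * fft(x_j)); the sum over j is
    taken in the frequency domain so only p inverse FFTs are needed.
    """
    p, q, b = w.shape
    x_hat = np.fft.fft(x.reshape(q, b), axis=1)   # shape (q, b)
    w_hat = np.fft.fft(w, axis=2)                 # shape (p, q, b)
    y_hat = np.einsum("pqb,qb->pb", w_hat, x_hat)
    # Real inputs give a real result; drop the rounding-level imaginary part.
    return np.fft.ifft(y_hat, axis=1).real.reshape(p * b)
```

With `p = 2`, `q = 3`, `b = 4`, the dense matrix has 96 entries while `w` stores 24, which is the $O(n^2)$ to $O(n)$ storage reduction per block.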

Deep neural networks (DNNs) have emerged as the most powerful machine learning technique in numerous artificial intelligence applications. However, the large sizes of DNNs make them both computation and memory intensive, thereby limiting the hardware performance of dedicated DNN accelerators. In this paper, we propose a holistic framework for energy-efficient, high-performance, highly compressed DNN hardware design. First, we propose block-circulant matrix-based DNN training and inference schemes, which theoretically guarantee Big-O complexity reduction in both computational cost (from $O(n^2)$ to $O(n \log n)$) and storage requirement (from $O(n^2)$ to $O(n)$) of DNNs. Second, we optimize the hardware architecture, especially the key fast Fourier transform (FFT) module, to improve the overall performance in terms of energy efficiency, computation performance and resource cost. Third, we propose a design flow to perform hardware-software co-optimization with the purpose of achieving a good balance between test accuracy and hardware performance of DNNs. Based on the proposed design flow, two block-circulant matrix-based DNNs on two different datasets are implemented and evaluated on FPGA. The fixed-point quantization and the proposed block-circulant matrix-based inference scheme enable the network to achieve as high as 3.5 TOPS computation performance and 3.69 TOPS/W energy efficiency while the memory is saved by 108X …

Abstract: We study the four-point function of the lowest-lying half-BPS operators in the $\mathcal{N} = 4$ SU($N$) super-Yang-Mills theory and its relation to the flat-space four-graviton amplitude in type IIB superstring theory. We work in a large $N$ expansion in which the complexified Yang-Mills coupling $\tau$ is fixed. In this expansion, non-perturbative instanton contributions are present, and the SL(2, ℤ) duality invariance of correlation functions is manifest. Our results are based on a detailed analysis of the sphere partition function of the mass-deformed SYM theory, which was previously computed using supersymmetric localization. This partition function determines a certain integrated correlator in the undeformed $\mathcal{N} = 4$ SYM theory, which in turn constrains the four-point correlator at separated points. In a normalization where the two-point functions are proportional to $N^2 - 1$ and are independent of $\tau$ and $\overline{\tau}$, we find that the terms of order $\sqrt{N}$ and $1/\sqrt{N}$ in the large $N$ expansion of the four-point correlator are proportional to the non-holomorphic Eisenstein series $E\left(\frac{3}{2},\tau,\overline{\tau}\right)$ and …