
Title: Optimizing the LO Distribution Architecture of mm-Wave Massive MIMO Receivers
Wireless networks at millimeter wavelengths have significant implementation difficulties. The path loss at these frequencies naturally leads us to consider antenna arrays with many elements. In these arrays, local oscillator (LO) generation is particularly challenging since the LO specifications affect the system architecture, signal processing design, and circuit implementation. We thoroughly analyze the effect of LO architecture design choices on the performance of a mm-wave massive MIMO uplink. This investigation focuses on the tradeoffs involved in centralized and distributed LO generation, correlated and uncorrelated phase noise sources, and the bandwidths of PLLs and carrier recovery loops. We show that, from both a performance and implementation complexity standpoint, the optimal LO architecture uses several distributed subarrays locked to a single intermediate-frequency reference in the low GHz range. Additionally, we show that the choice of PLL and carrier recovery loop bandwidths strongly affects the performance; for typical system parameters, loop bandwidths on the order of tens of MHz achieve SINRs suitable for high-order constellations. Finally, we present system simulations incorporating a complete model of the LO generation system and consider the case of a 128-element array with 16-way spatial multiplexing and a 2 GHz channel bandwidth at a 75 GHz carrier. Using our optimization procedure, we show that the system can support 16-way spatial multiplexing with 64-QAM modulation.
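The distinction between correlated and uncorrelated phase noise sources can be illustrated with a toy Monte Carlo sketch. This is not the paper's LO model: it assumes a static Gaussian phase error per symbol, ideal equal-gain combining of one unit symbol, and no PLL or carrier recovery loop. The function name `combined_error_power`, the 0.1 rad standard deviation, and the trial count are hypothetical choices made for illustration only.

```python
import cmath
import random


def combined_error_power(n_elements, sigma_rad, correlated, trials=2000, seed=0):
    """Mean squared error of a unit symbol after averaging n_elements branches.

    Toy model (assumption): each branch rotates the symbol by a Gaussian
    phase error with standard deviation sigma_rad, and the branches are
    combined with equal weights.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        if correlated:
            # One shared LO: every element sees the same phase error.
            phi = rng.gauss(0.0, sigma_rad)
            phases = [phi] * n_elements
        else:
            # Independent LOs: each element draws its own phase error.
            phases = [rng.gauss(0.0, sigma_rad) for _ in range(n_elements)]
        combined = sum(cmath.exp(1j * p) for p in phases) / n_elements
        total += abs(combined - 1.0) ** 2
    return total / trials


N_ELEMENTS = 128   # array size taken from the abstract
SIGMA_RAD = 0.1    # hypothetical phase-noise standard deviation

err_corr = combined_error_power(N_ELEMENTS, SIGMA_RAD, correlated=True)
err_uncorr = combined_error_power(N_ELEMENTS, SIGMA_RAD, correlated=False)
print("correlated error power:  ", err_corr)
print("uncorrelated error power:", err_uncorr)
```

In this simplified setting the uncorrelated errors average down across the array while the shared error does not. A shared error is, however, a single common rotation that a carrier recovery loop could in principle track, which is one reason the tradeoff in a full system is less one-sided than this sketch suggests.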
Sponsoring Org: National Science Foundation
More Like this
  1. This paper presents a two-layer RF/analog weighting MIMO transceiver that comprises fully-connected (FC) multi-stream beamforming tiles in the RF-domain first layer, followed by a fully connected analog- or digital-domain baseband layer. The architecture mitigates the complexity versus spectral-efficiency tradeoffs of existing hybrid MIMO architectures and enables MIMO stream/user scalability, superior energy-efficiency, and spatial-processing flexibility. Moreover, multi-layer architectures with FC tiles inherently enable the co-existence of MIMO with carrier-aggregation and full-duplex beamforming. A compact, reconfigurable bidirectional circuit architecture is introduced, including a new Cartesian-combining/splitting beamforming receiver/transmitter, dual-band bidirectional beamforming network, dual-band frequency translation chains, and baseband Cartesian beamforming with an improved programmable gain amplifier design. A 28/37 GHz band, two-layer, eight-element, four-stream (with two FC-tiles) hybrid MIMO transceiver prototype is designed in 65-nm CMOS to demonstrate the above features. The prototype achieves accurate beam/null-steering capability, excellent area/power efficiency, and state-of-the-art TX/RX mode performance in two simultaneous bands while demonstrating multi-antenna (up to eight) multi-stream (up to four) over-the-air spatial multiplexing operation using the proposed energy-efficient two-layer hybrid beamforming scheme.
  2. Millimeter wave (mmW) communications is viewed as the key enabler of 5G cellular networks due to vast spectrum availability that could boost peak rate and capacity. Due to increased propagation loss in the mmW band, transceivers with massive antenna arrays are required to meet a link budget, but their power consumption and cost become limiting factors for commercial systems. Radio designs based on hybrid digital and analog array architectures and the usage of radio frequency (RF) signal processing via phase shifters have emerged as potential solutions to improve radio energy efficiency and deliver performance close to that of conventional digital antenna arrays. In this paper, we provide an overview of state-of-the-art mmW massive antenna array designs and a comparison among three array architectures, namely digital array, partially-connected hybrid array (sub-array), and fully-connected hybrid array. The comparison of performance, power, and area for these three architectures is performed for three representative 5G downlink use cases, which cover a range of pre-beamforming signal-to-noise ratios (SNR) and multiplexing regimes. This is the first study to comprehensively model and quantitatively analyze all design aspects and criteria including: 1) optimal linear precoder, 2) impact of quantization error in digital-to-analog converter (DAC) and phase shifters, 3) RF signal distribution network, 4) power and area estimation based on state-of-the-art mmW circuits including baseband digital precoding, digital signal distribution network, high-speed DACs, oscillators, mixers, phase shifters, RF signal distribution network, and power amplifiers. Our simulation results show that the fully-digital array architecture is the most power and area efficient when compared against optimized designs for sub-array and hybrid array architectures. Our analysis shows that the digital array architecture benefits greatly from multi-user multiplexing. The analysis also reveals that sub-array architecture performance is limited by reduced beamforming gain due to array partitioning, while the system bottleneck of the fully-connected hybrid architecture is the excessively complicated and power-hungry RF signal distribution network.
  3. Controller design and software implementation are usually done in isolated design spaces using their respective COTS design tools. However, this separation of concerns can lead to long debugging and integration phases. This is because assumptions made about the implementation platform during the design phase (e.g., related to timing) might not hold in practice, thereby leading to unacceptable control performance. In order to address this, several control/architecture co-design techniques have been proposed in the literature. However, their adoption in practice has been hampered by the lack of design flows using commercial tools. To the best of our knowledge, this is the first article that implements such a co-design method using commercially available design tools in an automotive setting, with the aim of minimally disrupting existing design flows practiced in the industry. The goal of such co-design is to jointly determine controller and platform parameters in order to avoid any design-implementation gap, thereby minimizing implementation-time testing and debugging. Our setting involves distributed implementations of control algorithms on automotive electronic control units (ECUs) communicating via a FlexRay bus. The co-design method and its associated toolchain, Co-Flex, jointly determine controller and FlexRay parameters (that impact signal delays) in order to optimize specified design metrics. Co-Flex seamlessly integrates the modeling and analysis of control systems in MATLAB/Simulink with platform modeling and configuration in SIMTOOLS/SIMTARGET, which is used for configuring FlexRay bus parameters. It automates the generation of multiple Pareto-optimal design options with respect to the quality of control and the resource usage, from which an engineer can choose. In this article, we outline a step-by-step software development process based on Co-Flex tools for distributed control applications. While our exposition is automotive specific, this design flow can easily be extended to other domains.

  4. Next-generation aperture arrays are expected to consist of hundreds to thousands of antenna elements with substantial digital signal processing to handle large operating bandwidths of a few tens to hundreds of MHz. Conventionally, FX correlators are used as the primary signal processing unit of the interferometer. These correlators have computational costs that scale as $\mathcal{O}(N^2)$ for large arrays. An alternative imaging approach is implemented in the E-field Parallel Imaging Correlator (EPIC) that was recently deployed on the Long Wavelength Array station at the Sevilleta National Wildlife Refuge (LWA-SV) in New Mexico. EPIC uses a novel architecture that produces electric field or intensity images of the sky at the angular resolution of the array with full or partial polarization and the full spectral resolution of the channelizer. By eliminating the intermediate cross-correlation data products, the computational costs can be significantly lowered in comparison to a conventional FX or XF correlator from $\mathcal{O}(N^2)$ to $\mathcal{O}(N \log N)$ for dense (but otherwise arbitrary) array layouts. EPIC can also lower the output data rates by directly yielding polarimetric image products for science analysis. We have optimized EPIC and have now commissioned it at LWA-SV as a commensal all-sky imaging back-end that can potentially detect and localize sources of impulsive radio emission on millisecond timescales. In this article, we review the architecture of EPIC, describe code optimizations that improve performance, and present initial validations from commissioning observations. Comparisons between EPIC measurements and simultaneous beam-formed observations of bright sources show spectral-temporal structures in good agreement.
  5. R-tree is a foundational data structure used in spatial databases and scientific databases. With the advancement of networks and computer architectures, in-memory data processing for R-tree in distributed systems has become a common platform. We have observed new performance challenges in processing R-trees as the volume of multidimensional datasets grows. Specifically, an R-tree server can be heavily overloaded while the network and client CPU are lightly loaded, and vice versa. In this article, we present the design and implementation of Catfish, an RDMA-enabled R-tree for low latency and high throughput by adaptively utilizing the available network bandwidth and computing resources to balance the workloads between clients and servers. We design and implement two basic mechanisms of using RDMA for a client-server R-tree data processing system. First, in the fast messaging design, we use RDMA writes to send R-tree requests to the server and let server threads process R-tree requests to achieve low query latency. Second, in the RDMA offloading design, we use RDMA reads to offload tree traversal from the server to the client, which relieves the server when it is overloaded. We further develop an adaptive scheme to effectively switch an R-tree search between fast messaging and RDMA offloading, maximizing the overall performance. Our experiments show that the adaptive solution of Catfish on InfiniBand significantly outperforms R-tree that uses only fast messaging or only RDMA offloading in both latency and throughput. Catfish can also deliver up to one order of magnitude higher performance than the traditional schemes using TCP/IP on 1 and 40 Gbps Ethernet. We make a strong case to use RDMA to effectively balance workloads in distributed systems for low latency and high throughput.
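
The scaling quoted in item 4, $\mathcal{O}(N^2)$ for a conventional correlator versus $\mathcal{O}(N \log N)$ for EPIC, can be made concrete with a small arithmetic sketch. Big-O notation hides constant factors, so the numbers below only indicate how the gap grows with array size. They are not measured costs, and the base-2 logarithm and the helper names `fx_cost`, `epic_cost`, and `cost_ratio` are assumptions made for illustration.

```python
import math


def fx_cost(n_antennas):
    """Idealized pairwise-correlation cost, proportional to N^2 (constants dropped)."""
    return n_antennas * n_antennas


def epic_cost(n_antennas):
    """Idealized direct-imaging cost, proportional to N log2 N (constants dropped)."""
    return n_antennas * math.log2(n_antennas)


def cost_ratio(n_antennas):
    """How many times larger the N^2 term is than the N log2 N term."""
    return fx_cost(n_antennas) / epic_cost(n_antennas)


for n in (256, 1024, 4096):
    print(n, round(cost_ratio(n), 1))
```

Under these assumptions the ratio simplifies to N / log2 N, so it keeps growing as elements are added. This is why the advantage matters most for arrays with hundreds to thousands of antennas.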