skip to main content

Title: Neural Network DPD via Backpropagation through a Neural Network Model of the PA
We demonstrate digital predistortion (DPD) using a novel, neural-network (NN) method to combat the nonlinearities in power amplifiers (PAs), which limit the power efficiency of mobile devices, increase the error vector magnitude, and cause inadequate spectral containment. DPD is commonly done with polynomial-based methods that use an indirect-learning architecture (ILA) which can be computationally intensive, especially for mobile devices, and overly sensitive to noise. Our approach using NNs avoids the problems associated with ILAs by first training a NN to model the PA then training a predistorter by backpropagating through the PA NN model. The NN DPD effectively learns the unique PA distortions, which may not easily fit a polynomial-based model, and hence may offer a favorable tradeoff between computation overhead and DPD performance. We demonstrate the performance of our NN method using two different power amplifier systems and investigate the complexity tradeoffs.
Authors:
; ; ;
Award ID(s):
1717218
Publication Date:
NSF-PAR ID:
10202730
Journal Name:
2019 IEEE 53rd Asilomar Conference on Signals, Systems and Computers
Page Range or eLocation-ID:
358 to 362
Sponsoring Org:
National Science Foundation
More Like this
  1. The primary source of nonlinear distortion in wireless transmitters is the power amplifier (PA). Conventional digital predistortion (DPD) schemes use high-order polynomials to accurately approximate and compensate for the nonlinearity of the PA. This is not practical for scaling to tens or hundreds of PAs in massive multiple-input multiple-output (MIMO) systems. There is more than one candidate precoding matrix in a massive MIMO system because of the excess degrees-of-freedom (DoFs), and each precoding matrix requires a different DPD polynomial order to compensate for the PA nonlinearity. This paper proposes a low-order DPD method achieved by exploiting massive DoFs of next-generation front ends. We propose a novel indirect learning structure which adapts the channel and PA distortion iteratively by cascading adaptive zero forcing precoding and DPD. Our solution uses a 3rd order polynomial to achieve the same performance as the conventional DPD using an 11th order polynomial for a 10010 massive MIMO configuration. Experimental results show a 70% reduction in computational complexity, enabling ultra-low latency communications.
  2. Abstract

    In the last decade, much work in atmospheric science has focused on spatial verification (SV) methods for gridded prediction, which overcome serious disadvantages of pixelwise verification. However, neural networks (NN) in atmospheric science are almost always trained to optimize pixelwise loss functions, even when ultimately assessed with SV methods. This establishes a disconnect between model verification during versus after training. To address this issue, we develop spatially enhanced loss functions (SELF) and demonstrate their use for a real-world problem: predicting the occurrence of thunderstorms (henceforth, “convection”) with NNs. In each SELF we use either a neighborhood filter, which highlights convection at scales larger than a threshold, or a spectral filter (employing Fourier or wavelet decomposition), which is more flexible and highlights convection at scales between two thresholds. We use these filters to spatially enhance common verification scores, such as the Brier score. We train each NN with a different SELF and compare their performance at many scales of convection, from discrete storm cells to tropical cyclones. Among our many findings are that (i) for a low or high risk threshold, the ideal SELF focuses on small or large scales, respectively; (ii) models trained with a pixelwise loss function performmore »surprisingly well; and (iii) nevertheless, models trained with a spectral filter produce much better-calibrated probabilities than a pixelwise model. We provide a general guide to using SELFs, including technical challenges and the final Python code, as well as demonstrating their use for the convection problem. To our knowledge this is the most in-depth guide to SELFs in the geosciences.

    Significance Statement

    Gridded predictions, in which a quantity is predicted at every pixel in space, should be verified with spatially aware methods rather than pixel by pixel. Neural networks (NN), which are often used for gridded prediction, are trained to minimize an error value called the loss function. NN loss functions in atmospheric science are almost always pixelwise, which causes the predictions to miss rare events and contain unrealistic spatial patterns. We use spatial filters to enhance NN loss functions, and we test our novel spatially enhanced loss functions (SELF) on thunderstorm prediction. We find that different SELFs work better for different scales (i.e., different-sized thunderstorm complexes) and that spectral filters, one of the two filter types, produce unexpectedly well calibrated thunderstorm probabilities.

    « less
  3. Abstract Deep neural networks (DNNs) have substantial computational requirements, which greatly limit their performance in resource-constrained environments. Recently, there are increasing efforts on optical neural networks and optical computing based DNNs hardware, which bring significant advantages for deep learning systems in terms of their power efficiency, parallelism and computational speed. Among them, free-space diffractive deep neural networks (D 2 NNs) based on the light diffraction, feature millions of neurons in each layer interconnected with neurons in neighboring layers. However, due to the challenge of implementing reconfigurability, deploying different DNNs algorithms requires re-building and duplicating the physical diffractive systems, which significantly degrades the hardware efficiency in practical application scenarios. Thus, this work proposes a novel hardware-software co-design method that enables first-of-its-like real-time multi-task learning in D 2 2NNs that automatically recognizes which task is being deployed in real-time. Our experimental results demonstrate significant improvements in versatility, hardware efficiency, and also demonstrate and quantify the robustness of proposed multi-task D 2 NN architecture under wide noise ranges of all system components. In addition, we propose a domain-specific regularization algorithm for training the proposed multi-task architecture, which can be used to flexibly adjust the desired performance for each task.
  4. The energy and latency demands of critical workload execution, such as object detection, in embedded systems vary based on the physical system state and other external factors. Many recent mobile and autonomous System-on-Chips (SoC) embed a diverse range of accelerators with unique power and performance characteristics. The execution flow of the critical workloads can be adjusted to span into multiple accelerators so that the trade-off between performance and energy fits to the dynamically changing physical factors. In this study, we propose running neural network (NN) inference on multiple accelerators of an SoC. Our goal is to enable an energy-performance trade-off with an by distributing layers in a NN between a performance- and a power-efficient accelerator. We first provide an empirical modeling methodology to characterize execution and inter-layer transition times. We then find an optimal layers-to-accelerator mapping by representing the trade-off as a linear programming optimization constraint. We evaluate our approach on the NVIDIA Xavier AGX SoC with commonly used NN models. We use the Z3 SMT solver to find schedules for different energy consumption targets, with up to 98% prediction accuracy.
  5. Abstract. Advances in ambient environmental monitoring technologies are enabling concerned communities and citizens to collect data to better understand their local environment and potential exposures. These mobile, low-cost tools make it possible to collect data with increased temporal and spatial resolution, providing data on a large scale with unprecedented levels of detail. This type of data has the potential to empower people to make personal decisions about their exposure and support the development of local strategies for reducing pollution and improving health outcomes. However, calibration of these low-cost instruments has been a challenge. Often, a sensor package is calibrated via field calibration. This involves colocating the sensor package with a high-quality reference instrument for an extended period and then applying machine learning or other model fitting technique such as multiple linear regression to develop a calibration model for converting raw sensor signals to pollutant concentrations. Although this method helps to correct for the effects of ambient conditions (e.g., temperature) and cross sensitivities with nontarget pollutants, there is a growing body of evidence that calibration models can overfit to a given location or set of environmental conditions on account of the incidental correlation between pollutant levels and environmental conditions, including diurnalmore »cycles. As a result, a sensor package trained at a field site may provide less reliable data when moved, or transferred, to a different location. This is a potential concern for applications seeking to perform monitoring away from regulatory monitoring sites, such as personal mobile monitoring or high-resolution monitoring of a neighborhood. We performed experiments confirming that transferability is indeed a problem and show that it can be improved by collecting data from multiple regulatory sites and building a calibration model that leverages data from a more diverse data set. We deployed three sensor packages to each of three sites with reference monitors (nine packages total) and then rotated the sensor packages through the sites over time. Two sites were in San Diego, CA, with a third outside of Bakersfield, CA, offering varying environmental conditions, general air quality composition, and pollutant concentrations. When compared to prior single-site calibration, the multisite approach exhibits better model transferability for a range of modeling approaches. Our experiments also reveal that random forest is especially prone to overfitting and confirm prior results that transfer is a significant source of both bias and standard error. Linear regression, on the other hand, although it exhibits relatively high error, does not degrade much in transfer. Bias dominated in our experiments, suggesting that transferability might be easily increased by detecting and correcting for bias. Also, given that many monitoring applications involve the deployment of many sensor packages based on the same sensing technology, there is an opportunity to leverage the availability of multiple sensors at multiple sites during calibration to lower the cost of training and better tolerate transfer. We contribute a new neural network architecture model termed split-NN that splits the model into two stages, in which the first stage corrects for sensor-to-sensor variation and the second stage uses the combined data of all the sensors to build a model for a single sensor package. The split-NN modeling approach outperforms multiple linear regression, traditional two- and four-layer neural networks, and random forest models. Depending on the training configuration, compared to random forest the split-NN method reduced error 0 %–11 % for NO2 and 6 %–13 % for O3.« less