## A 41.5 pJ/b, 2.4GHz Digital-Friendly Orthogonally Tunable Transceiver SoC with 3-decades of Energy-Performance Scalability

Baibhab Chatterjee, *Student Member, IEEE* and Shreyas Sen, *Senior Member, IEEE* School of ECE, Purdue University, West Lafayette, IN 47907, USA *E-mail: bchatte@purdue.edu, shreyas@purdue.edu* 

Abstract— Adaptive communication for Internet of Things (IoT) and Wireless Body Area Network (WBAN) technologies is becoming increasingly popular due to the large power-performance trade-offs and highly dynamic channel conditions. Path loss, low signal to noise ratio (SNR) in the channel and network congestion adversely affect the data communication. Each of these can be addressed with a different strategy: reducing the data rate (for congestion), increasing the output power (for increased path loss) and applying error correction coding (ECC, for low SNR). In this paper, we present a digital-friendly Transceiver SoC consisting of an RF-DAC based transmitter with orthogonally tunable output power, data rate and ECC that enables optimum system level bit error rate (BER) and energy for over 3 orders of energy-performance scalability, along with an ultra-low-power OOK receiver that receives the transmitter's control bits from a nearby base station for closed-loop control. The data rate and ECC control is achieved through a digital baseband, while a tapped capacitor matching network controls the output power. The energy efficiency of the transmitter is 27.6pJ/b at 10MSps and at 0.8V supply (~9X improvement over state-of-the-art), while the entire SoC (Transmitter+OOK receiver for controller feedback) consumes only 41.5pJ/b.

Keywords— adaptive, orthogonal tunability, reconfigurable, digital-friendly, system optimization, transceiver, system-on-a-chip (SoC), low-power, CMOS, sub-50pJ/b, Pout control, ECC.

#### I. INTRODUCTION

### A. Background and Motivation

Energy-performance scalability has been widely adopted in digital systems (e.g. DVFS [1]-[3]). However, while communication systems suffer from much wider dynamic variations in channel conditions and data-rate needs, wide energy-performance scalability has been relatively less common, primarily due to the challenges of reconfiguring radio-frequency (RF) circuits on-the-fly. Traditional techniques of such adaptation involve adaptive modulation and coding (AMC) [4], which increases the order of modulation (for example, QPSK to 16-QAM to 64-QAM) as the channel quality improves, and vice versa. This increases the spectral efficiency of the overall transmission by packing more bits in each symbol. However, the overall power consumption of such systems effectively remains constant, posing serious limitations on the battery lifetime. Along with the modulation technique, data rates (DR), output power (Pout) and possible error correction coding (ECC) mechanisms need to adapt in a way that scales down the system power when requirements are lower. This is shown in Fig. 1, for a system level WBAN example wherein the data from a multi-channel neural recorder is sent to a nearby base station, which in turn communicates with a remote server for further storage and processing. As shown in [5], high-speed applications such as neural recording often require 10s of Mbps data rates, which may create network congestion during the communication with either the



Optimization Need: Minimize BER with Minimum Energy

Energy-performance Scalability achieved in this paper:


|           | Max                     | Min                        | (Max/Min) Ratio |
|-----------|-------------------------|----------------------------|-----------------|
| Pout      | -0.3dBm (0.93mW)        | -13.5dBm (0.045mW)         | ~20X            |
|           | @ 29.3% Tx efficiency   | @ 7.9% Tx efficiency       |                 |
| Data-Rate | 10 MSps                 | 40 kSps                    | ~250X           |
| ECC       | On                      | Off                        | -               |
| Energy/b  | 79.872nJ/b              | 27.6pJ/b                   | ~3000X          |
|           | @40kSps,-0.3dBm, ECC on | @10MSps, -13.5dBm, ECC off |                 |

Fig. 1. System-level challenges and reconfigurable solutions in IoT/WBAN communication: 1) For communication to nearby base station, output power ( $P_{out}$ ) is dynamically increased or data rate (DR) is lowered for high channel loss; 2) Error Control Coding (ECC) is employed when signal to noise ratio (SNR) is low; 3) For Network congestion, DR is lowered through feedback from the server and the base station; 4) An energy-optimized, reconfigurable RF-DAC based transmitter reduces system power. This constitutes an optimization space, containing 3 given quantities at system-level (path loss, SNR, network congestion), 3 orthogonally tunable/controllable quantities ( $P_{out}$ , ECC, DR) and 2 quantities to be optimized (system BER and energy).

base station or the server, and may require reducing the data rates from time to time. Reduction of data rates may also be a viable solution to increasing path loss and low SNR/BER (bit error rate) at the cost of longer communication time and hence higher energy. Controlling the output power (P<sub>out</sub>) with good back-off efficiency and utilizing ECC are other promising alternatives to deal with variable channel loss and noise, without changing the data rates. At the system level, this



Fig. 2. System-level block diagram of (a) reconfigurable RF-DAC based Tx with digital baseband, 2.4GHz LO generator and on-chip tapped capacitor based matching network, (b) ultra-low-power (ULP) OOK Rx with 4-stage gate-biased envelope detector.



Fig. 3. Circuit diagrams of (a) Reconfigurable RF-DAC based PA with programmable on-chip tapped-capacitor matching network for Tx (b) 2.4GHz, low-power LO generator for Tx and (c) 4-stage gate-biased envelope detector for ULP Rx.

constitutes an optimization space, containing 1) three known quantities - path loss, SNR, network congestion, 2) three orthogonally tunable quantities - P<sub>out</sub>, ECC, DR, 3) two quantities to be optimized - system BER and energy. Recent advances in digital-friendly transceivers [6]-[10] have made such energy-performance scalability possible.
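The energy-per-bit range listed in Fig. 1 can be cross-checked with a short calculation. The sketch below rests on an assumption that is one reading of the reported numbers rather than a statement from the measurements: at a fixed P<sub>out</sub> setting the Tx power does not change with symbol rate, so the energy per information bit scales inversely with the effective bit rate. Under that assumption, the slowest of the 256 DR steps combined with the rate-1/2 [8,4] Hamming code multiplies the 156pJ/b figure reported in Table I by 512. The function and variable names are illustrative.

```python
# Cross-check of the energy-per-bit range in Fig. 1 (illustrative sketch).
# Assumption: Tx power at a fixed P_out setting is independent of symbol
# rate, so energy per information bit is inversely proportional to bit rate.

def energy_per_bit(e_ref_j, rate_divider=1, code_rate=1.0):
    """Scale a reference energy/bit (J/b) taken at the maximum symbol rate."""
    return e_ref_j * rate_divider / code_rate

E_MAX_POUT = 156e-12   # J/b at 10 MSps, -0.3 dBm, ECC off (Table I)
E_MIN_POUT = 27.6e-12  # J/b at 10 MSps, -13.5 dBm, ECC off (Table I)

# Slowest setting of the 8-bit DR control (256 steps), [8,4] Hamming ECC on.
e_worst = energy_per_bit(E_MAX_POUT, rate_divider=256, code_rate=4 / 8)
scalability = e_worst / E_MIN_POUT

print(f"highest energy/bit: {e_worst * 1e9:.3f} nJ/b")
print(f"energy scalability: {scalability:.0f}X")
```

With these inputs the highest energy per bit evaluates to 79.872nJ/b, the value listed in Fig. 1, and the ratio to 27.6pJ/b is about 2900X, consistent with the quoted ~3000X.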

### B. Our Contribution

In this paper, we embrace the philosophy of multi-dimensional orthogonal tuning knobs to achieve 3 decades of energy scalability (compared to the previous best of ~30X [6]) through a very digital-like radio transceiver (TRx) that gracefully trades performance for energy by adaptively controlling DR, Pout and ECC according to the channel conditions, so that the system operates at the best achievable energy and BER. Along with the scalability, we demonstrate a $\sim 9X$ improvement in energy efficiency compared to [7] at 10MSps and a >15X improvement compared to [6] at 1MSps. An ultra-low-power (ULP) receiver (Rx) is also implemented on the same SoC; it receives the control signals from a nearby base station to configure the transmitter (Tx) for the optimum DR, Pout and ECC modes. Quantitatively, this is the most energy-efficient TRx reported at 2.4GHz (27.6pJ/b QPSK transmitter at 10MSps, 13.9pJ/b OOK receiver at 10Mbps), using a proposed RF digital to analog converter (RF-DAC) architecture and a digital baseband that supports variable data rates and ECC options. The RF-DAC can be optionally reconfigured for different modulations. A fully on-chip programmable tapped capacitor matching network controls the Pout by changing the load impedance as seen by the RF-DAC.

### C. Related Work

Supply modulation [11] and switched-capacitor based power amplifiers (SCPA) [12] are traditionally used in literature to obtain variable $P_{out}$. However, the range of $P_{out}$ achievable through supply modulation is small (2X reduction in supply reduces the $P_{out}$ by ~6dB, which is why [11] reports an efficiency of 36% at $P_{out}$=4.5dBm and 14% at $P_{out}$=-4dBm). On the other hand, SCPAs consume large power while turning on/off the multiple switches in the design. [6] reported a class-D voltage mode power amplifier (PA) with reconfigurable on-chip tapped capacitor matching network that programs the capacitors to control $P_{out}$. However, the architecture presented in [6] does not support higher order modulation schemes such as 16-QAM or 64-QAM. In this work, we present an RF-DAC based transmitter (Tx) that utilizes the tapped-capacitor matching network for changing $P_{out}$ and can be extended to higher order modulations.
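The ~6dB figure for supply modulation follows from output power scaling with the square of the supply voltage into a fixed load. The short sketch below works through that arithmetic and, as an illustration only, estimates how far the supply would have to drop to cover the 13.2dB P<sub>out</sub> range reported in this paper (-0.3dBm to -13.5dBm). The square-law model and the function names are assumptions made for the example.

```python
import math

def pout_change_db(supply_ratio):
    """P_out change in dB when V_DD is scaled by supply_ratio,
    assuming P_out is proportional to V_DD squared (fixed load)."""
    return 20 * math.log10(supply_ratio)

def supply_ratio_for(delta_db):
    """Supply scaling needed for a P_out change of delta_db (same model)."""
    return 10 ** (delta_db / 20)

print(f"halving V_DD: {pout_change_db(0.5):.2f} dB")
print(f"V_DD ratio for -13.2 dB: {supply_ratio_for(-13.2):.2f}")
```

Under this simple model, a 13.2dB reduction would need the supply scaled to roughly 0.22 of its nominal value, which illustrates why supply modulation alone gives a limited P<sub>out</sub> range.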

#### II. OPERATING PRINCIPLES AND CIRCUIT DESIGN

### A. Energy-Efficient RF-DAC based Tx

The system-level block diagram for the adaptive RF subsystem is shown in Fig. 2. It contains a reconfigurable RF-DAC based inverse class-D PA with adaptive closed-loop control of DR (8 bits, for 256-step control of the symbol rate from 40kSps to 10MSps), ECC (1-bit control for enabling the [8,4] Hamming code) and output power (3 bits). The controls on data rate and ECC are applied in the digital baseband of the Tx, while the output power control (along with digital amplitude pulse-shaping) is performed using an on-chip tapped capacitor matching network (MN) with tunable capacitor banks for low MN loss and high back-off efficiency. Fig. 3(a) shows the circuit diagram of the RF-DAC based PA. The 4X switches (transistors M<sub>1</sub> and M<sub>2</sub>) are turned on for the default QPSK mode, while only the 4X switch for the in-phase path (I-path transistor M<sub>1</sub>) is turned on for OOK. For 16-QAM and 64-QAM, the 2X and 1X switches in both in-phase and quadrature paths are turned on. This architecture also eliminates the need for separate I-path and Q-path DACs and mixers, which lowers the overall system power. The 4 LO phases (which are 90° apart from each other) are generated using an injection locked inverter based quadrature LO generator as shown in Fig. 3(b). The 4 phases (Clk0, Clk90, Clk180 and Clk270) are multiplexed according to the LO Ctrl signals coming from the I-Q signal mapper, and are fed to the I and Q-paths as shown in Fig. 3(a). As shown in [13], the inverse class-D configuration achieves a higher maximum output power $\left(P_{out,D^{-1}} = \frac{V_{DD}^2}{4} \times \frac{20Z_L}{(5R_{on}+Z_L)^2}\right)$ as compared to the class-D configuration $\left(P_{out,D} = \frac{4V_{DD}^2}{5} \times \frac{Z_L}{(2R_{on}+Z_L)^2}\right)$ for the same supply, which is why we adopted the inverse class-D topology.
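The two output-power expressions above can be compared numerically. The sketch below evaluates both for a few switch on-resistances, assuming the same on-resistance R<sub>on</sub> appears in both expressions. The supply, load and R<sub>on</sub> values are hypothetical, chosen only to illustrate the trend, and are not taken from the design.

```python
def pout_inverse_class_d(vdd, r_on, z_l):
    """Maximum output power (W) of the inverse class-D configuration."""
    return (vdd ** 2 / 4) * (20 * z_l) / (5 * r_on + z_l) ** 2

def pout_class_d(vdd, r_on, z_l):
    """Maximum output power (W) of the class-D configuration."""
    return (4 * vdd ** 2 / 5) * z_l / (2 * r_on + z_l) ** 2

VDD = 0.8    # V, supply used for the lowest-energy mode in the paper
Z_L = 50.0   # ohm, hypothetical load seen by the PA

for r_on in (1.0, 5.0, 20.0):   # ohm, hypothetical switch on-resistances
    p_inv = pout_inverse_class_d(VDD, r_on, Z_L)
    p_d = pout_class_d(VDD, r_on, Z_L)
    print(f"R_on={r_on:5.1f} ohm  inverse class-D={p_inv * 1e3:7.2f} mW  "
          f"class-D={p_d * 1e3:6.2f} mW  ratio={p_inv / p_d:.2f}")
```

Dividing the two expressions gives a ratio of 6.25 × ((2R<sub>on</sub> + Z<sub>L</sub>) / (5R<sub>on</sub> + Z<sub>L</sub>))², which approaches 6.25 for small R<sub>on</sub> and stays above 1 for any positive R<sub>on</sub> and Z<sub>L</sub>, in line with the stated advantage of the inverse class-D topology.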

### B. ULP OOK Rx using 4-stage Gate-Biased Envelope Detector

The information about the channel and network conditions, which forms part of the closed-loop system-level feedback, is received through an OOK Rx with 2 stages of RF gain (~20 dB), an envelope detector (ED) and 2 stages of BB VGA (~20-40 dB) as shown in Fig. 2(b). The ED is



Fig. 4. Design of the tapped-capacitor matching network that offers variable load  $R_{MN}$  to the RF-DAC through choice of  $C_1$  and  $C_2$ . The off-chip interface with the antenna (pad-bondwire-PCB) is taken care of as a part of the design.

implemented in a 4-stage gate biased topology [14] that enhances the output impedance of the ED by  $\sim$ 16X, which results in  $\sim$ 4X higher signal voltages at the output of the ED, thereby compensating for the loss from traditional single-stage EDs.

### C. Tapped-Capacitor Matching Network for power delivery

The circuit schematic for the tapped capacitor MN is shown in Fig. 4. For passive components with infinite quality factor (Q), the load resistance seen by the RF-DAC would be $R_{MN,ideal} = \left(1 + \frac{C_1}{C_2}\right)^2 \times 50\Omega$, while the power delivered to the 50$\Omega$ antenna (Pout) would be $I_{avg,RF-DAC}^2 \times R_{MN,ideal}$, where $I_{avg,RF-DAC}$ is the average current through the RF-DAC. By changing the ratio $\left(\frac{C_1}{C_2}\right)$, Pout can be modified. The effect of the finite Q of the on-chip passives is considered during the design, and the design equations for finite Q are provided in Fig. 4, along with the expected values of $R_{MN,real}$. It is to be noted that the off-chip interface with the antenna at the tapping point consists of a pad capacitance (C<sub>pad</sub>), a bond-wire inductance (L<sub>bond</sub>) and a PCB capacitance (C<sub>PCB</sub>), all of which should be considered during the design process.
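The ideal (infinite-Q) relations above can be sketched in a few lines to show how the capacitor ratio sets the load and hence P<sub>out</sub>. The capacitor ratios and the RF-DAC current used below are hypothetical, and holding the current fixed across settings is a simplifying assumption; the finite-Q design equations of Fig. 4 are not reproduced here.

```python
import math

R_ANT = 50.0  # ohm, antenna resistance

def r_mn_ideal(c1, c2, r_ant=R_ANT):
    """Ideal load resistance seen by the RF-DAC for lossless passives."""
    return (1 + c1 / c2) ** 2 * r_ant

def pout_ideal(i_avg, c1, c2, r_ant=R_ANT):
    """Ideal output power (W) for an average RF-DAC current i_avg (A)."""
    return i_avg ** 2 * r_mn_ideal(c1, c2, r_ant)

I_AVG = 1e-3  # A, hypothetical average RF-DAC current
C2 = 1e-12    # F, hypothetical fixed capacitor

for ratio in (0.25, 0.5, 1.0, 2.0):        # hypothetical C1/C2 settings
    r_mn = r_mn_ideal(ratio * C2, C2)
    p_dbm = 10 * math.log10(pout_ideal(I_AVG, ratio * C2, C2) / 1e-3)
    print(f"C1/C2={ratio:4.2f}  R_MN={r_mn:6.1f} ohm  Pout={p_dbm:6.2f} dBm")
```

Because only the ratio C<sub>1</sub>/C<sub>2</sub> enters the ideal expression, switching capacitor banks changes the load without touching the inductor, which is the property the design relies on.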

#### III. MEASUREMENT RESULTS

### A. Tx Efficiency, Output Power, pJ/b and Data/Symbol Rates:

Fig. 5(a)-(d) present the measurement results for the TRx SoC in the default (QPSK) mode. Fig. 5(a) shows the Tx efficiency ($\eta$) vs. output power (P<sub>out</sub>) for the RF-DAC, and exhibits a range of -13.5dBm to -0.3dBm in terms of P<sub>out</sub>, with $\eta$ ranging correspondingly from 7.9% to 29.5%. Fig. 5(b) presents the Tx energy efficiency (pJ/b) w.r.t. output power (P<sub>out</sub>). For -0.3dBm P<sub>out</sub>, the energy efficiency is 158pJ/b at 10MSps symbol rate (i.e. 20Mbps data rate). For a P<sub>out</sub> of -13.5dBm, the energy efficiency improves to 27.6pJ/b at 10MSps symbol rate. Fig. 5(c) plots the Tx energy efficiency (pJ/b) vs. data rate, showing a linear trend over 3 orders of energy and symbol rates (which translates to better BER at lower symbol rates at system level), while Fig. 5(d) shows the Rx energy efficiency (pJ/b) vs. data rate. Fig. 5(e) shows the chip micrograph for the SoC (total silicon area = 0.75mm<sup>2</sup>).

### B. Effect of ECC and Improvement in Rx Sensitivity

Fig. 6 presents the effect of ECC on the sensitivity of the Rx. At 10Mbps data rate, the sensitivity of the Rx for a bit error rate (BER) of $10^{-3}$ is -62dBm without ECC. When [8,4]-Hamming code based ECC is enabled, the sensitivity for the same conditions improves to -67dBm. However, the number of information-carrying bits with the [8,4]-Hamming code reduces to half of that without ECC, resulting in an effective data rate of 5Mbps. It is also important to note that ECC results in a significant (~10X) improvement in BER when the received signal strength at the Rx is in the range of -40 to -55dBm. For lower amplitudes (for example, -75dBm), burst errors are observed in the received datastream, for which Hamming codes show an insignificant improvement over the case without ECC.
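To make the rate and error-correction trade-off concrete, the following is a minimal sketch of an extended Hamming (8,4) encoder and decoder. It assumes the textbook construction (a Hamming(7,4) code plus one overall parity bit); the bit ordering and decoder used on the chip are not described in the paper, so the layout and function names here are illustrative only.

```python
def hamming84_encode(data):
    """Encode 4 data bits into 8 bits: [p1, p2, d1, p3, d2, d3, d4, p0]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    word = [p1, p2, d1, p3, d2, d3, d4]
    p0 = 0
    for bit in word:           # overall parity over the 7-bit word
        p0 ^= bit
    return word + [p0]

def hamming84_decode(received):
    """Return (data_bits, status); corrects one error, flags two."""
    w = list(received)
    syndrome = 0
    for pos in range(1, 8):    # XOR of the 1-based positions holding a 1
        if w[pos - 1]:
            syndrome ^= pos
    overall = 0
    for bit in w:
        overall ^= bit
    if syndrome and overall:
        w[syndrome - 1] ^= 1   # single error inside the 7-bit word
        status = "corrected"
    elif syndrome:
        status = "double_error"  # detectable but not correctable
    elif overall:
        w[7] ^= 1              # error in the overall parity bit only
        status = "corrected"
    else:
        status = "ok"
    return [w[2], w[4], w[5], w[6]], status

codeword = hamming84_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1              # flip one bit, as channel noise might
print(hamming84_decode(corrupted))
print(f"effective rate at 10 Mbps raw: {10e6 * 4 / 8 / 1e6:.0f} Mbps")
```

Each 8-bit codeword carries 4 information bits, which is the halving of the effective data rate noted above. The code corrects a single bit error per codeword, which is also why burst errors spanning several bits see little benefit.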

#### IV. COMPARISON WITH PREVIOUS WORKS

Table I presents a comparison with relevant TRx SoCs in the 2.4GHz frequency band. The energy efficiency of the Tx is 27.6pJ/b at 0.8V supply, 10MSps symbol rate and -13.5dBm P<sub>out</sub>, which is a >9X improvement over [7], the reported state-of-the-art for transmitters with a similar P<sub>out</sub>. The Rx energy efficiency of 13.9pJ/b at 0.8V supply and 10Mbps data rate offers a >7.5X improvement over [7]. The transmit efficiency of the Tx also reaches ~29.5% due to the low LO generation power and the power-optimized RF-DAC, which is close to



Fig. 5. Measurement results for the TRx SoC in the default QPSK mode: (a) Tx efficiency ($\eta$) vs. output power ($P_{out}$), showing a range of -13.5dBm to -0.3dBm in $P_{out}$ and 7.9% to 29.5% in $\eta$, (b) Tx energy efficiency (pJ/b) vs. output power ($P_{out}$), showing energy numbers of 158pJ/b (for -0.3dBm $P_{out}$) and as low as 27.6pJ/b (for -13.5dBm $P_{out}$), (c) Tx energy efficiency (pJ/b) vs. symbol rate (1 symbol = 2 bits for the default QPSK mode), (d) Rx energy efficiency (pJ/b) vs. data rate, (e) chip micrograph (silicon area = 0.75mm<sup>2</sup>).

| Parameter | This Work | JSSC'13<br>A. Paidimarri [6] | ISSCC'11<br>M. Vidojkovic [7] | JSSC'19<br>X. Chen [8] | ISSCC'15<br>A. Klinefelter [9] | ISSCC'18<br>H. Liu [10] |
|-----------|-----------|------------------------------|-------------------------------|------------------------|--------------------------------|-------------------------|
| Technology | 65 nm CMOS | 65 nm CMOS | 65 nm CMOS | 65 nm CMOS | 130 nm CMOS | 65 nm CMOS |
| Supply Voltage | 0.8-1V (Tx), 0.8-1.2V (Rx) | 0.5-1V (Tx only) | 1V | 0.6-0.9V | 0.5V(D), 1.2V(A) | 1V |
| Operating Freq. | 2.48GHz | 2.48GHz | 2.4GHz | 2.4GHz | 2.4-2.48GHz | 2.4GHz |
| Architecture | RF-DAC Tx,<br>ED-based Rx | FBAR Oscillator<br>+ class-D PA | Direct Modulation<br>(Tx) | RO+ADPLL+EC | XO+ADPLL | ADPLL based<br>Polar Tx and Rx |
| Modulation | QPSK (default), supports<br>OOK, 16-QAM, 64-QAM | OOK, supports BPSK and<br>GMSK | OOK | GFSK (BLE) | OOK | GFSK (BLE) |
| Data Rate | 0.04-10MSps | 1Mbps | 1-10Mbps | 1Mbps | 187.5kbps | 1Mbps |
| Tx Power | Min: 0.54mW (0.8V, -13.5dBm),<br>Max: 3.12mW (1V, -0.3dBm) | 0.44mW (OOK, 1Mbps,<br>-12.5dBm) | 2.53mW | 0.486mW | 4.18µW | 3.1mW |
| Tx Energy<br>Efficiency | Min: 27.6pJ/b (0.8V, -13.5dBm),<br>Max: 156pJ/b (1V, -0.3dBm)<br>Data Rate: 10MSps | 440pJ/b (OOK,<br>1Mbps,<br>-12.5dBm) | 253pJ/b (OOK,<br>10Mbps,<br>-12.5dBm) | 486pJ/b (GFSK,<br>1Mbps,<br>-3.3dBm) | 22.3pJ/b (OOK,<br>187.5kbps,<br>-28.9dBm) | 3.1nJ/b (GFSK,<br>1Mbps,<br>-3dBm) |
| Tx Output Power | Min: -13.5dBm, Max: -0.3dBm | -17dBm to -2.5dBm | -2.2dBm | -9.4dBm to -3.3dBm | -28.9dBm | -3dBm |
| Tx Efficiency | Max: 29.5% (1V, -0.3dBm),<br>Min: 7.9% (0.8V, -13.5dBm) | Max: 33% (0.5V, -9.5dBm),<br>Min: 8.2% (1V, -10.9dBm) | 24% | 32% @ -3dBm | - | - |
| Rx Power | Min: 0.139mW (0.8V),<br>Max: 0.621mW (1.2V) | NA | 0.534mW | NA | 112nW | 2.6mW |
| Rx Energy<br>Efficiency | Min: 13.9pJ/b (0.8V),<br>Max: 62.1pJ/b (1.2V)<br>Data Rate: 10Mbps | NA | 107pJ/b (OOK,<br>5Mbps) | NA | 161pJ/b (15-bit OOK<br>wake-up sequence) | 2.6nJ/b (GFSK,<br>1Mbps) |
| Adaptability | Data Rate, Output Power, ECC | Output Power, Modulation | Data Rate | NA | NA | NA |
| Energy-Scalability | ~3000X | ~30X | ~10X | ~4X | NA | NA |

Table I. Performance summary of the implemented TRx SoC with relevant SoCs in the 2.4GHz frequency band



Fig. 6. Measured BER of the Rx for variable received signal strength, with and without ECC. The sensitivity for a BER of  $10^{-3}$  at a data rate of 10Mbps is increased from -62dBm (without ECC) to -67dBm (with ECC).

the 32-33% efficiency numbers reported by [6] and [8]. However, it should be noted that [6] and [8] implement simpler modulation schemes such as OOK and GFSK. This work, on the other hand, implements an RF-DAC topology which can be extended to higher order modulation schemes such as 16-QAM or 64-QAM. The current work also presents adaptability in terms of data rate, P<sub>out</sub> and ECC as opposed to the other reported works, resulting in an overall energy scalability of ~3000X (100X more than [6]).
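The improvement factors quoted in this section can be reproduced directly from the energy-efficiency entries of Table I. The sketch below only divides the reported numbers; the operating points differ between works (for example, the Rx figure for [7] is reported at 5Mbps), so the ratios are indicative rather than like-for-like. Variable names are illustrative.

```python
# Energy-efficiency entries from Table I, in pJ/b.
TX_THIS_WORK = 27.6    # 0.8V, -13.5 dBm, 10 MSps
RX_THIS_WORK = 13.9    # 0.8V, 10 Mbps

tx_prior = {"[6]": 440.0, "[7]": 253.0}   # OOK, -12.5 dBm
rx_prior = {"[7]": 107.0}                 # OOK, 5 Mbps

def improvement(prior_pj_per_bit, this_pj_per_bit):
    """Ratio of prior-work energy/bit to this work's energy/bit."""
    return prior_pj_per_bit / this_pj_per_bit

for ref, value in tx_prior.items():
    print(f"Tx vs {ref}: {improvement(value, TX_THIS_WORK):.1f}X")
for ref, value in rx_prior.items():
    print(f"Rx vs {ref}: {improvement(value, RX_THIS_WORK):.1f}X")
```

The resulting ratios are slightly above 9X and 15X for the Tx and above 7.5X for the Rx, matching the figures stated in the text.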

#### V. CONCLUSIONS AND FUTURE WORK

In conclusion, the RF-DAC based architecture eliminates the need for I-path and Q-path DACs and mixers, which lowers the overall system power, resulting in 156pJ/b energy efficiency at -0.3dBm P<sub>out</sub> at 10MSps (27.6pJ/b at -13.5dBm P<sub>out</sub> at 10MSps). The NMOS-only current-mode RF-DAC allows the I and Q-paths to be combined in the current domain, which supports a higher output power than its voltage-mode counterpart, while the on-chip tapped-capacitor matching network allows reconfigurable output power with low matching network loss and high back-off efficiency (29.5% at -0.3dBm, 19.2% at -6.5dBm, 7.9% at -13.7dBm), without the need for modifying the inductor value (unlike a $\pi$-matching network). At the system level, this TRx enables BER and energy optimization with 3 orders of scalability in DR, P<sub>out</sub> and ECC (all orthogonally tunable quantities). As part of the future work, the spectral efficiency and energy efficiency of different modulation schemes will be analysed theoretically and verified with measurements, for further adaptability and better spectral/energy efficiency with higher order modulations.

#### ACKNOWLEDGMENT

This work was supported in part by the Semiconductor Research Corporation under Grant 2720.001 and in part by the National Science Foundation CRII Award under Grant CNS 1657455.

#### REFERENCES

- [1] M. Horowitz *et al.*, "Low-power Digital Design," *IEEE Symp. on Low Power Electronics*, 1994.
- [2] J. Howard *et al.*, "A 48-Core IA-32 Processor in 45 nm CMOS Using On-Die Message-Passing and DVFS for Performance and Power Scaling," *JSSC*, 2011.
- [3] D. Ernst *et al.*, "Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation," *IEEE/ACM MICRO*, 2003.
- [4] A. J. Goldsmith and S. Chua, "Adaptive coded modulation for fading channels," *IEEE Transactions on Communications*, 1998.
- [5] H. Ando *et al.*, "Wireless Multichannel Neural Recording With a 128-Mbps UWB Transmitter for an Implantable Brain-Machine Interfaces," *IEEE TBioCAS*, 2016.
- [6] A. Paidimarri *et al.*, "A 2.4 GHz Multi-Channel FBAR-based Transmitter With an Integrated Pulse-Shaping Power Amplifier," *JSSC*, 2013.
- [7] M. Vidojkovic *et al.*, "A 2.4GHz ULP OOK single-chip transceiver for healthcare applications," *ISSCC*, 2011.
- [8] X. Chen *et al.*, "Analysis and Design of an Ultra-Low-Power Bluetooth Low-Energy Transmitter With Ring Oscillator-Based ADPLL and 4X Frequency Edge Combiner," *JSSC*, 2019.
- [9] A. Klinefelter *et al.*, "A 6.45µW self-powered IoT SoC with integrated energy-harvesting power management and ULP asymmetric radios," *ISSCC*, 2015.
- [10] H. Liu *et al.*, "An ADPLL-centric bluetooth low-energy transceiver with 2.3mW interference-tolerant hybrid-loop receiver and 2.9mW single-point polar transmitter in 65nm CMOS," *ISSCC*, 2018.
- [11] M. Lee *et al.*, "A CMOS MedRadio Transceiver With Supply-Modulated Power Saving Technique for Implantable Brain–Machine Interface System," *JSSC*, 2019.
- [12] L. G. Salem *et al.*, "A Recursive Switched-Capacitor House-of-Cards Power Amplifier," *JSSC*, 2017.
- [13] N-C Kuo *et al.*, "A Wideband All-Digital CMOS RF Transmitter on HDI Interposers With High Power and Efficiency," *TMTT*, 2017.
- [14] V. Mangal and P. R. Kinget, "A 0.42nW 434MHz -79.1dBm Wake-Up Receiver with a Time-Domain Integrator," *ISSCC*, 2019.
