

# Full-Duplex Receiver With Wideband, High-Power RF Self-Interference Cancellation Based on Capacitor Stacking in Switched-Capacitor Delay Lines

Sasank Garikapati<sup>ID</sup>, *Student Member, IEEE*, Aravind Nagulu<sup>ID</sup>, *Member, IEEE*, Igor Kadota<sup>ID</sup>, *Member, IEEE*, Mostafa Essawy<sup>ID</sup>, *Graduate Student Member, IEEE*, Tingjun Chen<sup>ID</sup>, *Member, IEEE*, Shibo Wang, Tanvi Pande, Arun S. Natarajan<sup>ID</sup>, *Senior Member, IEEE*, Gil Zussman<sup>ID</sup>, *Fellow, IEEE*, and Harish Krishnaswamy<sup>ID</sup>, *Senior Member, IEEE*

**Abstract**: The self-interference (SI) channels in full-duplex (FD) radios have large nano-second-scale delay spreads, which pose a significant challenge in designing SI cancelers that can emulate the SI channel over wide bandwidths. Passive implementations of high delay lines have a prohibitively large form factor and loss when implemented on silicon, whereas active implementations suffer from noise and linearity penalties. In this work, we leverage time-interleaved multi-path switched-capacitor (SC) circuits to provide large wideband delays with a small form factor and low power (LP) consumption to implement RF and baseband (BB) cancelers in an FD receiver (RX). We utilize capacitor stacking to obtain passive voltage gain to compensate for the loss of these delay elements, thus permitting an increased number of interleaved paths and, hence, a higher delay. Furthermore, to reduce the RX noise figure (NF) penalty due to injecting the cancellation signal into the receiver, we introduce a novel low-noise transconductance amplifier (LNTA) architecture, which injects the cancellation signal into the RX and also accomplishes finite impulse response (FIR) filter weighting and summation. The FD receiver is implemented in a standard 65-nm CMOS process and operates from 0.1 to 1 GHz. The RF/BB canceler delay cells have real-/complex-valued weighting with delays ranging from 0 to 7.75 ns/0 to 85 ns while consuming 7.4- and 1.9-mW dc power per tap, respectively. These large tunable delays enable 41/38-dB integrated SI cancellation for 40-/80-MHz bandwidth over 29-dB isolation provided by a CMOS circulator operating at 0.95 GHz. The canceler handles a transmitter (TX) power of up to +10/+15 dBm in LP/high-power (HP) modes with 0.8-/2.8-dB RX NF degradation.

**Index Terms**: Capacitor stacking, delay lines, full duplex (FD), multipath switched capacitors (SCs), self-interference (SI), time-domain equalization (TDE).

Manuscript received 30 May 2023; revised 25 September 2023, 10 December 2023, and 17 December 2023; accepted 17 December 2023.

This article was approved by Associate Editor Yunzhi Dong. This work was supported in part by Defense Advanced Research Projects Agency (DARPA) Signal Processing at RF (SPAR); in part by NSF under Grant EEC-2133516, Grant AST-2232455, Grant CNS-2148128, Grant CNS-2128638, and Grant AST-2232458; and in part by the Federal Agency and Industry Partners through the NSF Resilient and Intelligent NextG Systems (RINGS) Program. (*Sasank Garikapati and Aravind Nagulu contributed equally to this work.*) (*Corresponding author: Harish Krishnaswamy.*)

Sasank Garikapati, Shibo Wang, Tanvi Pande, Gil Zussman, and Harish Krishnaswamy are with the Department of Electrical Engineering, Columbia University, New York, NY 10027 USA (e-mail: harish@ee.columbia.edu).

Aravind Nagulu is with the Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, MO 63130 USA.

Igor Kadota is with the Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208 USA.

Mostafa Essawy and Arun S. Natarajan are with the Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331 USA.

Tingjun Chen is with the Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708 USA.

Color versions of one or more figures in this article are available at <https://doi.org/10.1109/JSSC.2024.3351615>.

Digital Object Identifier 10.1109/JSSC.2024.3351615

## I. INTRODUCTION

Full-duplex (FD) wireless, where the transmitter (TX) and the receiver (RX) operate at the same time and at the same frequency, has the potential to double network capacity at the physical layer while offering several advantages at the higher layers [1], [2], [3], [4], [5], [6]. Over the last few years, several integrated circuit (IC) implementations of FD transceivers [7], [8], [9], [10], [11], [12], [13], [14], [15], [16] as well as antenna interfaces supporting FD operation [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27] have been demonstrated.

Realizing FD transceivers remains a significant challenge as they require large levels of suppression (>100 dB) of the self-interference (SI) signal leaking from the TX to the receiver. To achieve this across a wide bandwidth, the SI channel must be re-created across the entire bandwidth to generate a copy of the SI from the TX signal. The large SI channel delay spreads and the need for real-time canceler adaptation further exacerbate this challenge. SI cancelers based on frequency-domain equalization (FDE) demand multiple widely tunable high-*Q* filters, which are power hungry when implemented using *N*-path filters [7], while those based on finite impulse response (FIR)-based time-domain equalization (TDE) [12], [13], [14] require large delays with fine resolution. In addition, supporting realistic antenna interface isolations of $\approx 20$ dB requires a low-loss canceler or a canceler with embedded gain, which stresses canceler noise and linearity.

Fig. 1. Block diagram and design tradeoffs of a typical RF SI canceler.

Obtaining compact nano-second-scale true time delays over wide bandwidths to emulate the group delay of the SI channel remains a significant challenge due to limitations on the delay-bandwidth (DBW) product. Passive circuits, such as transmission lines or *LC*-based delays, are impractical to realize on ICs due to the large area associated with the implementation and the associated loss [28]. Active on-chip techniques for obtaining true time delays include *RC*-based all-pass filters with active buffering [12], [13] and Gm-C-based delay cells [12], [29], [30]. The delay offered by a single *RC*-*CR* first-order all-pass filter section is $\tau_{Tap} < 1/(2\pi f)$, where $f$ is the operating frequency. Cascading multiple such elements is shown to result in a total delay of up to 250–350 ps at 1.5–2-GHz frequency [12]. Using active Gm-C-based delays in the RF canceler consumes high dc power of the order of 90 mW for a delay of 550 ps up to 2.5-GHz frequency when implemented using a 140-nm CMOS process with 1.8-V power supply [30]. Recently, *N*-path switched-capacitor (SC)-based delays have been shown to provide large delays of the order of nano-seconds over gigahertz-range frequencies [14]. While the DBW product ideally increases with the number of parallel paths [31], this leads to added losses due to higher switch parasitics. Therefore, techniques to incorporate passive gain to compensate for these losses without noise or distortion penalty are of interest.
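The per-section bound $1/(2\pi f)$ can be checked numerically. The sketch below assumes the textbook first-order all-pass response $H(s) = (1 - sRC)/(1 + sRC)$, whose group delay is $2RC/(1 + (\omega RC)^2)$; the function names and the 1-ps sweep grid are illustrative choices, not taken from the cited designs.

```python
import math

def allpass_group_delay(f_hz, rc_s):
    """Group delay of H(s) = (1 - sRC)/(1 + sRC) at frequency f_hz (assumed textbook form)."""
    w = 2 * math.pi * f_hz
    return 2 * rc_s / (1 + (w * rc_s) ** 2)

def max_tap_delay(f_hz):
    """Upper bound 1/(2*pi*f), approached when RC is close to 1/(2*pi*f)."""
    return 1 / (2 * math.pi * f_hz)

f = 2e9  # operating frequency used for the illustration
bound = max_tap_delay(f)
# Sweep RC from 1 ps to 1000 ps in 1-ps steps and keep the largest delay seen at f.
best = max(allpass_group_delay(f, k * 1e-12) for k in range(1, 1001))
print(f"bound: {bound * 1e12:.1f} ps, best over RC sweep: {best * 1e12:.1f} ps")
```

At 2 GHz the bound is roughly 80 ps per section, which is why several cascaded sections are needed to reach delays of a few hundred picoseconds.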

In [32], we introduced: 1) an *N*-path SC delay line with stacked-capacitor voltage gain enabling nearly ten nano-seconds of RF true time delay across a large BW (dc-to-1 GHz); 2) a new low-noise transconductance amplifier (LNTA)-canceler where the FIR weighting, summation, and cancellation signal injection functions of the canceler are absorbed into the LNTA; and 3) a closed-loop adaptation algorithm leveraging analytical modeling of tap non-idealities that reduces the computational complexity and data storage. In this article, we expand on [32] by: 1) elaborating on the design methodology and operation of capacitor stacking in SC circuits; 2) explaining the detailed operation of the LNTA canceler; 3) explaining the algorithms used to model the delay taps and optimize the filter to provide maximum SIC; 4) adding measurements using a CMOS-based circulator [33]; and 5) adding cancellation measurements using modulated signals. The rest of this article is organized as follows. In Section II, we describe the system-level analysis of different tradeoffs experienced in time-domain cancelers. In Section III, we describe the implementation of the designed chip, introducing the concepts of capacitor stacking in SC-based delays, the LNTA canceler operation, and the optimization technique used to program the delays. Section IV describes the measurement results. Finally, Section V concludes this article.

## II. SYSTEM-LEVEL ANALYSIS

Time-domain cancelers use multiple programmable true-time-delay elements in parallel with tunable gains to realize an adaptive analog FIR filter. To achieve SI cancellation (SIC), the delays and gains in these cancelers are chosen to realize a filter whose frequency response matches that of the SI channel over the desired bandwidth. Tapping a portion of the TX signal and passing it through this filter results in a copy of the SI signal, which can be subtracted from the received signal to selectively suppress the SI portion of the signal.

Fig. 1 shows the block diagram of a typical RF canceler. The gain through the canceler path must equal the amount of isolation, ISO, of the antenna interface to achieve SIC. The gain through the canceler is the product of four components: 1) the coupling loss from the TX signal to the canceler $C_{TX}$; 2) the insertion loss from the delay cells $IL_{Delay}$; 3) the gain of the injection circuit that couples the SI copy into the RX $G$; and 4) an inherent loss in FIR filter combining due to the shape of the filter $L_{FIR}$. In an FIR filter, when the various taps are summed together, since the phases of the delay elements at a certain frequency need not be equal, there is an additional loss associated with the canceler. For instance, if we have two delay taps with distinct delays $\tau_1$ and $\tau_2$, each with an equal gain of $A/2$, the magnitude of the sum of the delay taps is $|(A/2) \exp(-j\omega\tau_1) + (A/2) \exp(-j\omega\tau_2)|$. Assuming that the phase difference between the two taps follows a uniform distribution between 0 and $\pi$, the root-mean-square (rms) value of this magnitude across frequency is $A/\sqrt{2}$ (which equals the magnitude for a $90^\circ$ phase difference). Ideally, if the two taps are added with the same phase, the magnitude of the summation would be $A$. Hence, the factor of $1/\sqrt{2}$ corresponds to an additional 3-dB loss, which is the FIR loss. Extending this to $K$ delay taps with uncorrelated phases, each with an equal gain of $A/K$ to ensure that the sum of gains remains $A$, the rms value of the sum of delay taps is $A/\sqrt{K}$, which corresponds to a loss of $10 \log K$ dB. Thus, we see that enabling a higher number of taps results in a higher loss. We denote this filtering loss as $L_{FIR}$, which is dependent on the gains and delays of individual taps and the frequency range of operation. In general, the filtering loss increases with the number of required taps, which depends on the shape of the channel response to be canceled. To illustrate this, we run simulations of canceler responses with 2–6 enabled taps with a fixed delay. We assign arbitrary gains to the taps by generating random numbers from a uniform distribution.



Fig. 2. (a) Frequency responses of a canceler for  $N = 2\text{--}6$  taps enabled with arbitrary gains with the dotted line representing the integrated FIR loss of the respective solid line across the desired frequency range, (b) FIR loss versus number of enabled taps averaged across multiple random trials, and (c) maximum variation in the filter response, defined as the difference between the maximum gain and the minimum gain in the given frequency range, versus number of enabled taps averaged across multiple random trials.

The individual tap gains are then normalized to the total sum of the tap gains to ensure that the sum of the gains in different iterations, which determines the noise figure (NF) penalty, remains constant. Fig. 2(a) shows one iteration of the canceler response simulation for different numbers of taps, which is obtained by summing the individual tap responses generated by the aforementioned method. As shown in Fig. 2(a), both the FIR loss and the variation in the channel response increase with the number of enabled taps. Fig. 2(b) and (c) shows the FIR loss and the maximum variation in the channel response, defined as the difference between the maximum and minimum gain across the desired frequency range, for different numbers of enabled taps averaged across multiple random trials. From these figures, we observe an increasing trend for the FIR loss and for the maximum filter response variation as the number of taps increases, which is consistent with the preceding analysis. Typically, wireless environments with multiple reflections and higher delay spreads tend to have higher variations in the magnitude response due to the effects of multipath fading. Thus, the maximum filter response variation is indicative of the ability to synthesize more complex and rapidly varying channel responses as the number of taps increases.
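A Monte Carlo experiment of this kind can be sketched in a few lines. The version below is an illustrative reconstruction, not the authors' simulation code: the 0.25-ns tap spacing, the 0.1–1-GHz grid, the number of trials, and the definition of the integrated loss as the band-averaged power relative to the sum of the gains are all assumptions made here.

```python
import numpy as np

def fir_loss_db(gains, delays_s, freqs_hz):
    """Band-averaged (rms) FIR loss in dB relative to the sum of the tap gains."""
    gains = np.asarray(gains, dtype=float)
    gains = gains / gains.sum()                      # normalize so the gains sum to 1
    phases = np.exp(-2j * np.pi * np.outer(freqs_hz, delays_s))
    response = phases @ gains                        # H(f) = sum_k g_k exp(-j*2*pi*f*tau_k)
    return float(-10 * np.log10(np.mean(np.abs(response) ** 2)))

def average_fir_loss_db(num_taps, trials=200, tap_spacing_s=0.25e-9, seed=0):
    """Average loss over random uniform tap gains (assumed spacing and band)."""
    rng = np.random.default_rng(seed)
    freqs = np.linspace(0.1e9, 1.0e9, 181)
    delays = tap_spacing_s * np.arange(num_taps)
    losses = [fir_loss_db(rng.uniform(0.0, 1.0, num_taps), delays, freqs)
              for _ in range(trials)]
    return float(np.mean(losses))

for k in range(2, 7):
    print(k, "taps:", round(average_fir_loss_db(k), 2), "dB")
```

Two equal taps that are exactly $90^\circ$ apart give `fir_loss_db([1, 1], [0, 0.25e-9], [1e9])` of about 3 dB, matching the two-tap example in the text.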

The total gain of the canceler path, in decibels, can be represented as $G - \text{IL}_{\text{Delay}} - C_{\text{TX}} - L_{\text{FIR}}$, which must equal $-\text{ISO}$. Hence, to ensure cancellation, we require the injection circuit to have a gain of $G = C_{\text{TX}} + \text{IL}_{\text{Delay}} - \text{ISO} + L_{\text{FIR}}$. The noise added by the canceler into the receiver is directly proportional to the gain of the output buffer, and hence, we obtain

$$\text{Canceler Noise} \propto G \propto C_{\text{TX}} \propto \text{IL}_{\text{Delay}} \propto -\text{ISO} \propto L_{\text{FIR}}. \quad (1)$$

To reduce the NF degradation of the receiver due to the canceler, the gain of the injection circuit must be reduced. Increasing the coupling from the TX to compensate for the reduction in the gain would worsen the linearity requirements on the canceler and also compromise TX efficiency. Coupling $N$ dB more power into the canceler would increase the 3rd order intermodulation products generated by the canceler by $3N$ dB while relaxing the gain requirement by only $N$ dB. For the same linearity performance, reducing the loss $\text{IL}_{\text{Delay}}$ of the delay elements reduces the NF degradation of the receiver due to the canceler. Compensating the loss by using active transistors introduces additional noise and distortion and increases the dc power consumption. Hence, passive techniques to improve the gain of the delay elements without added penalty in terms of noise and linearity are desirable for reducing the NF degradation of the RX.
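The tradeoff can be made concrete with a small dB budget. The sketch below evaluates $G = C_{\text{TX}} + \text{IL}_{\text{Delay}} - \text{ISO} + L_{\text{FIR}}$ for a hypothetical set of numbers; the function names and all numeric values are assumptions chosen for illustration, not values from the implemented chip.

```python
def injection_gain_db(c_tx_db, il_delay_db, l_fir_db, iso_db):
    """Required injection gain G in dB; the three losses and ISO are positive dB numbers."""
    return c_tx_db + il_delay_db + l_fir_db - iso_db

def im3_growth_db(extra_coupling_db):
    """Growth of third-order intermodulation products for extra coupled power (3 dB per dB)."""
    return 3.0 * extra_coupling_db

# Hypothetical budget, for illustration only.
baseline = injection_gain_db(c_tx_db=10.0, il_delay_db=8.0, l_fir_db=6.0, iso_db=20.0)
# Option 1: recover 5 dB passively in the delay line.
lower_il = injection_gain_db(c_tx_db=10.0, il_delay_db=3.0, l_fir_db=6.0, iso_db=20.0)
# Option 2: couple 5 dB more TX power into the canceler instead.
more_coupling = injection_gain_db(c_tx_db=5.0, il_delay_db=8.0, l_fir_db=6.0, iso_db=20.0)

print("baseline G:", baseline, "dB")
print("G with 5 dB less delay-line loss:", lower_il, "dB")
print("G with 5 dB more coupling:", more_coupling, "dB, IM3 up by", im3_growth_db(5.0), "dB")
```

Both options relax the required injection gain by the same 5 dB, but only the second one raises the third-order products, which is the argument for recovering gain passively inside the delay element.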

The amount of cancellation obtained from time-domain cancelers is dependent on the delay resolution provided by the taps as well as the total number of delay configurations available (which determines the maximum delay obtained). Fig. 3(a) shows the frequency response of a typical commercial off-the-shelf (COTS) circulator terminated with a COTS antenna, while Fig. 3(b) shows the impulse response of the antenna interface, showing delay spreads of the order of 10 ns. Fig. 3(c) and (d) shows the simulated amount of cancellation obtained using time-domain cancellation with ideal delay taps for different delay resolutions and total number of available delay configurations for two different bandwidths of 40 and 80 MHz, respectively. These simulations were performed by obtaining the individual tap responses based on the delay resolution and the number of available delay configurations, followed by optimizing the tap gains to minimize the residue by using an in-built optimization algorithm in MATLAB. Using 16 available delay configurations with a 250-ps resolution, simulation results indicate that 60-/51-dB cancellation can be obtained for a 40-/80-MHz bandwidth, respectively. This cancellation will be obtained for any signal that occupies the entire bandwidth with a frequency flat magnitude. In general, we see that the SIC obtained is directly proportional to the maximum delay that is provided by the canceler for a given delay resolution. Since these results are obtained using simulation results consisting of ideal taps with frequency flat magnitude response and group delay, these might not reflect exactly in measurement results as these assumptions do not hold true in actual cancelers, as we see in Section IV. Furthermore, the amount of SIC obtained strongly depends on the profile of the SI channel. These simulations serve as tools to see trends in SIC and to predict bounds for cancellation for different delay resolution settings, which assists with design choices. 
Since a high total delay and a fine delay resolution are both needed to obtain cancellation across wide bandwidths, delay generation techniques capable of meeting such requirements are favored.
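The structure of such a simulation can be sketched as follows. This is not the MATLAB flow described above: a linear least-squares fit with real-valued tap weights stands in for the optimizer, and the three-path SI channel is a made-up toy model rather than the measured circulator response, so the resulting numbers only illustrate the trend.

```python
import numpy as np

def tap_matrix(freqs_hz, num_delays, resolution_s):
    """Ideal tap responses exp(-j*2*pi*f*k*resolution) for k = 0 .. num_delays-1."""
    delays = resolution_s * np.arange(num_delays)
    return np.exp(-2j * np.pi * np.outer(freqs_hz, delays))

def fit_real_weights(taps, channel):
    """Least-squares tap gains constrained to be real (real and imaginary parts stacked)."""
    a = np.vstack([taps.real, taps.imag])
    b = np.concatenate([channel.real, channel.imag])
    weights, *_ = np.linalg.lstsq(a, b, rcond=None)
    return weights

def cancellation_db(freqs_hz, channel, num_delays, resolution_s):
    """Integrated cancellation: SI power over residual power across the band."""
    taps = tap_matrix(freqs_hz, num_delays, resolution_s)
    residual = channel - taps @ fit_real_weights(taps, channel)
    return float(10 * np.log10(np.sum(np.abs(channel) ** 2) / np.sum(np.abs(residual) ** 2)))

def toy_si_channel(freqs_hz):
    """Hypothetical three-path SI channel as (gain, delay) pairs; not measured data."""
    paths = [(0.10, 0.6e-9), (0.04, 2.3e-9), (0.02, 6.1e-9)]
    return sum(g * np.exp(-2j * np.pi * freqs_hz * d) for g, d in paths)

freqs = np.linspace(0.93e9, 0.97e9, 201)   # 40-MHz band around 0.95 GHz
h_si = toy_si_channel(freqs)
for n in (4, 8, 16):
    print(n, "delay settings:", round(cancellation_db(freqs, h_si, n, 250e-12), 1), "dB")
```

Because each larger tap set contains the smaller one, the fitted residual cannot grow as delay settings are added, which mirrors the trend that more available delay at a fixed resolution yields more cancellation.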

## III. IMPLEMENTATION OF THE FD RECEIVER

Fig. 3. (a) Measured magnitude response of the SI channel across a commercial 700 MHz–1 GHz COTS circulator and antenna and (b) its impulse response. Simulated SI cancellation as a function of number of delay taps and delay resolution for (c) 40- and (d) 80-MHz bandwidth, respectively.

Fig. 4. Block diagram and chip microphotograph of the FD receiver featuring time-interleaved SC delay-based RF and BB cancelers implemented in a standard 65-nm CMOS process.

Fig. 4 shows the circuit diagram and the microphotograph of the implemented chip consisting of a receiver with an RF canceler and a baseband (BB) canceler. The receiver consists of a dc-to-1 GHz partial-noise-canceling LNTA, followed by a four-phase I/Q direct down-conversion mixer and a baseband trans-impedance amplifier (TIA). The TX power is coupled capacitively into the RF and BB cancelers using digitally programmable off-chip capacitors, which can also be incorporated on chip [34]. The RF canceler consists of source-follower-based buffers at the input followed by one zero-delay tap where the input is connected directly to the output, five programmable eight-path SC low delay taps providing delays up to 2 ns over 1-GHz bandwidth when clocked at 500-MHz frequency, and ten programmable 32-path high delay taps providing delays up to 8 ns over 1-GHz bandwidth when clocked at 125-MHz frequency. Source followers are chosen as input buffers because they have high linearity (P1dB of +10 dBm) and hence do not limit the canceler's linearity. The RF canceler also utilizes the receiver's LNTA to perform FIR weighting, summation of the delay taps, as well as signal injection into the RX required for cancellation. The baseband canceler consists of an I/Q direct down-conversion mixer and a baseband TIA similar to the receiver, which brings the capacitively coupled TX signal to baseband frequencies and provides quadrature baseband outputs. The BB canceler consists of similar eight-path SC delays, which provide up to 80 ns of delay over >30-MHz bandwidth when clocked at 10-MHz clocking frequency. The clock frequency of the BB canceler can be changed to obtain a higher bandwidth at the cost of lower delay. For higher bandwidths, we use a 20-MHz clocking frequency, which provides up to 40-ns delay for >60-MHz bandwidth. The quadrature baseband signals can be used to obtain complex weighting for these taps by passing both components through their respective delay elements and combining their currents using CMOS-inverter-based vector modulators at the output, which also inject the cancellation current at the RX mixer's output.

Fig. 5. Circuit diagram demonstrating the operation of capacitor-stacking-based delay taps during (a) charging phase, (b) holding phase, and (c) discharging phase. (d) Time-interleaved $N$-phase SC-based delay element using capacitor stacking.

Fig. 6. (a) Transistor-level implementation of an eight-path delay tap. (b) Eight-path MUX operation with circles at the intersection of input and output clock signals indicating switches that are turned on to obtain the desired clock phases for the charging and discharging switches of the delay tap.

#### A. RF Canceler

The RF canceler realizes an adaptive FIR filter using multiple capacitor-stacking-based delay elements in parallel with tunable gains. These delay elements directly interface with the receiver's LNTA to inject the desired current required for cancellation at the RX. The working principles of the capacitor-stacking-based delay element and the LNTA canceler are described as follows.

1) *Capacitor-Stacking-Based Delay Element:* Time-interleaved multi-path SC networks have been shown to exhibit different characteristics depending on the ratio of the on-period of the switches to the  $RC$  time constant of charging the capacitor. When the on-period is much larger than the time constant, these circuits behave like a simple time-interleaved sample and hold circuit. When the on-period is much smaller than the time constant, a high-quality-factor comb filter (the  $N$ -path filter) is seen at the switching frequency [35]. It has been shown that a broadband delay beyond the DBW product limit of LTI circuits can be obtained when the on-period and time constant are of the same order [31]. By staggering the clocks controlling the charging and discharging switches in these networks by  $\Delta\tau$ , a true time delay of  $\Delta\tau$  is imparted to the input signal. Delays based on this sample-hold-and-release principle of time-interleaved multipath SC circuits can enable large delays of the order of  $1/f_s$  over a bandwidth of  $Nf_s/2$ , where  $f_s$  is the clocking frequency and  $N$  is the number of interleaved paths.
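The scaling in the last sentence can be tabulated for the clocking schemes used in this design. The sketch below assumes ideal behavior: the delay step is taken as $1/(Nf_s)$, the largest setting as $(N-1)/(Nf_s)$, and the bandwidth as $Nf_s/2$; the function name and dictionary keys are illustrative.

```python
def sc_delay_line(num_paths, clock_hz):
    """Ideal delay step, maximum delay, and bandwidth of an N-path SC delay element."""
    period_s = 1.0 / clock_hz
    step_s = period_s / num_paths
    return {
        "delay_step_s": step_s,
        "max_delay_s": (num_paths - 1) * step_s,
        "bandwidth_hz": num_paths * clock_hz / 2,
    }

# (paths, clock) pairs taken from the text: RF low-delay, RF high-delay, and BB taps.
for name, n, fs in [("RF 8-path", 8, 500e6), ("RF 32-path", 32, 125e6), ("BB 8-path", 8, 10e6)]:
    r = sc_delay_line(n, fs)
    print(f"{name}: step {r['delay_step_s'] * 1e9:.2f} ns, "
          f"max {r['max_delay_s'] * 1e9:.2f} ns, "
          f"ideal BW {r['bandwidth_hz'] / 1e6:.0f} MHz")
```

Both RF tap types share a 0.25-ns step, and the 32-path tap reaches 7.75 ns, in line with the delay range quoted in the abstract. The ideal bandwidth figures are upper bounds from the $Nf_s/2$ expression; the bandwidths quoted in the text for the implemented taps are lower.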

However, switch parasitics and rise/fall time of the clocks result in an increased loss, which compromises the power handling of the delay elements and introduces additional noise. We propose a capacitor-stacking approach to achieve passive voltage gain in the canceler delay line, which preserves the linearity and noise performance of the SC circuits. In this scheme, multiple capacitors $C_1, \dots, C_M$ are charged in parallel from the input source during the charging (or sampling) phase [Fig. 5(a)] where the switch is ON for a period of $T_{S,RF}/N$, storing the input voltage at time $t_0$ on the capacitors. This charge is held for time $\Delta\tau - T_{S,RF}/N$ during the hold phase [Fig. 5(b)]. During the discharging (or release) phase, the capacitors $C_1, \dots, C_M$ are stacked over each other and connected to the output node such that the output voltage equals the sum of the voltages on the individual capacitors [Fig. 5(c)]. Hence, the output voltage during the discharge phase is $V_{out}(t) = MV_{in}(t_0)$, thus obtaining a voltage gain of $M$. It should be noted that capacitive stacking only provides a voltage gain and not a power gain. Hence, there is an impedance transformation, leading to behavior that is similar to that of a transformer. Thus, having more stacking stages reduces the amount of load that the delay element can drive. For the same load capacitance, the loss due to charge sharing between the path capacitance and load capacitance increases with the number of stacking stages. This limits the amount of stacking possible while still maintaining an improvement in gain. This effect exacerbates loss due to parasitic capacitance, especially at higher frequency of operation. Stacking in SC circuits to obtain passive voltage gain has also been shown in the context of $N$-path filters [36], [37].

Fig. 7. Tap gain of the SC-based delay taps with two-stage capacitor stacking and without capacitor stacking for (a) eight-path structure and (b) 32-path structure. (c) Normalized canceler gain with two-stage capacitor stacking and without capacitor stacking along with the input and output buffers. (d) Receiver NF degradation assuming identical canceler gains with two-stage capacitor stacking and without capacitor stacking.

$N$  such SC structures are connected in parallel with non-overlapping  $1/N$  duty cycle clock signals to increase the sampling rate of the input signal and, hence, the DBW product by a factor of  $N$  [Fig. 5(d)]. Using a single set of  $N$ -phase clock signals for switches in both the charging phase and discharging phase realizes discretized delays of  $0, T_{S,RF}/N, \dots, (N-1)T_{S,RF}/N$ . An analog multiplexer is used to provide clock signals with programmable phases to the switches. This enables all the five eight-path delay taps to provide any of the eight possible delays and all ten 32-path delay taps to provide any of the 32 possible delays. As shown in Fig. 6(b), the analog multiplexer connects the input clock signals to the clock signals controlling the charging and discharging switches using RF switches, which can be programmed to select the desired clock phase. For the eight-path delay tap, the charging switch is controlled by a 2-bit input providing the phases  $\phi_0, \phi_6, \phi_4$ , and  $\phi_2$  corresponding to delays of  $\Delta\tau_{SW1} = \{0, -2, -4, -6\} \times T_{S,RF8}/8$  and the discharging switch is controlled by a 2-bit input providing the phases  $\phi_0, \phi_1, \phi_2$ , and  $\phi_3$  corresponding to delays of  $\Delta\tau_{SW2} = \{0, 1, 2, 3\} \times T_{S,RF8}/8$ . The delay of the tap is the difference in the delays of the discharging and charging switch  $\Delta\tau_{SW2} - \Delta\tau_{SW1} = \{0, 1, 2, 3, 4, 5, 6, 7\} \times T_{S,RF8}/8$ . A similar MUX has been implemented for the 32-path delay taps using 3-bit controls.
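The phase-selection scheme for the eight-path tap can be enumerated directly. The sketch below assumes that the tap delay, in units of $T_{S,RF8}/8$, is the discharging phase index minus the charging phase index taken modulo 8, which is one way to read the delay sets listed above; the constant and function names are illustrative.

```python
N_PATHS = 8
CHARGE_PHASES = (0, 6, 4, 2)      # phi_0, phi_6, phi_4, phi_2 -> delays 0, -2, -4, -6
DISCHARGE_PHASES = (0, 1, 2, 3)   # phi_0 .. phi_3            -> delays 0, 1, 2, 3

def tap_delay_steps(charge_phase, discharge_phase, num_paths=N_PATHS):
    """Tap delay in units of T/N, assuming phases wrap around once per clock period."""
    return (discharge_phase - charge_phase) % num_paths

# Map every reachable delay to the (charging, discharging) phase pairs that produce it.
settings = {}
for c in CHARGE_PHASES:
    for d in DISCHARGE_PHASES:
        settings.setdefault(tap_delay_steps(c, d), []).append((c, d))

for delay in sorted(settings):
    print(f"delay {delay} x T/8 <- (charge, discharge) phases {settings[delay]}")
```

Under this assumption, the two 2-bit controls give 16 combinations that cover all eight delay settings, each reachable in two ways.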

The two types of RF delay taps based on time-interleaved stacked SC-based delays that have been implemented explore the tradeoffs between delay and insertion loss: 1) eight-path delay taps provide low delays and have lower insertion loss due to fewer switch parasitics and 2) 32-path delay taps provide higher delays for the same bandwidth but have higher insertion loss due to larger switch parasitics. Fig. 7(a) and (b) shows the simulation results of standalone delay taps without any loading, which indicate a +5- and +4-dB improvement in the eight-path tap gain, and a +5- and +3.6-dB improvement in the 32-path tap gain for low (0.1 GHz) and high (1 GHz) frequencies, respectively, due to capacitor stacking. The small ripple in the gain response of the delay elements is due to the delay regime of operation of the $N$-path circuit [31]. Fig. 7(c) shows the simulation results indicating improvement in the overall canceler gain in the presence of load capacitance with and without capacitor stacking. We use a stacking ratio of 2$\times$ to avoid being overly sensitive to load capacitance. Due to the gain obtained by capacitor stacking, we can afford to reduce the transconductance of the injecting circuit, which reduces the NF. A $K \times$ increase in the gain of the delay element can support a $K \times$ reduction in the transconductance of the injecting circuit, thereby reducing the additional noise due to the canceler by a factor of $K$. As predicted by theory, for equal overall canceler gain (set equal to the antenna interface isolation), equal area, and equal power consumption, simulation results indicate a 1.5-dB improvement in the NF degradation of the receiver due to the RF canceler with two-stage capacitor stacking at lower operation frequencies, which degrades as the frequency increases, as shown in Fig. 7(d).

The implemented eight-path RF canceler taps are shown in Fig. 6(a) (the 32-path implementation follows a similar approach, but with 32 interleaved paths instead). The path capacitance value depends on the load it drives [which is the gate capacitance of $M2$ in Fig. 8(a)] and should be high enough to ensure that there is no significant loss due to sharing its charge with the load capacitance. We have a load capacitance of around 100 fF, and hence, we chose $C = 1$ pF, which results in an effective discharging capacitance of 500 fF for two stacked capacitors. While designing the RF delay taps, there is a tradeoff between the ON-resistance of the switch and the OFF-capacitance of the switch. The additional loss due to the charging switch resistances is equal to $R_{SW}/R_{OUT,SF}$, where $R_{SW}$ is the equivalent ON-resistance of $M11$, $M12$, and $M13$, and $R_{OUT,SF}$ is the output resistance of the source follower. The input switch sizes have an optimal point for the least loss for a given $R_{ON}C_{OFF}$ of the switch (which is usually a technology constant), which can be found using a parametric sweep of the switch sizes. It must be noted that the optimal switch size need not be the same for the 32-path and eight-path implementations, because the former is loaded by 32 switch OFF-capacitances and the latter by only eight. As shown in Fig. 6, the charging switch sizes are smaller for 32-path implementations to reduce the capacitance. The output switches are sized to ensure that there is enough charge transfer between the path capacitance and the output buffer while not increasing the load capacitance of the delay element significantly.
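The capacitor sizing can be sanity-checked with an idealized charge-sharing model. The sketch below assumes that $M$ equal capacitors $C$ are stacked into a series capacitance $C/M$ charged to $MV_{in}$ and then share charge with an initially discharged load; switch resistance, parasitics, and the residual charge left on the output node by the previous path are ignored, so the numbers are indicative only.

```python
import math

def series_capacitance(m, c_path_f):
    """Effective capacitance of M equal capacitors stacked in series."""
    return c_path_f / m

def stacked_gain(m, c_path_f, c_load_f):
    """Ideal voltage gain after charge sharing with a discharged load (simplifying assumption)."""
    c_series = series_capacitance(m, c_path_f)
    return m * c_series / (c_series + c_load_f)

def to_db(ratio):
    return 20 * math.log10(ratio)

C_PATH = 1e-12     # 1 pF per capacitor, from the text
C_LOAD = 100e-15   # about 100 fF load, from the text

for m in (1, 2, 4):
    g = stacked_gain(m, C_PATH, C_LOAD)
    print(f"M = {m}: series C = {series_capacitance(m, C_PATH) * 1e15:.0f} fF, "
          f"gain = {g:.2f} ({to_db(g):.1f} dB)")
```

In this model the gain equals $MC/(C + MC_L)$, which saturates at $C/C_L$: each added stacking stage buys less gain while making the tap more sensitive to the load, in line with the choice of a modest stacking ratio.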

2) *LNTA Canceler*: The receiver front end uses a partial-noise-canceling LNTA, which provides wideband matching with low NF similar to [38] and [39]. As shown in Fig. 8(a), the LNTA canceler consists of three sets of complementary transistors: $M1$ with $M1N$ and $M1P$ and combined transconductance $g_{m1}$, $M2$ with $M2N$ and $M2P$ and combined transconductance $g_{m2}$, and $M3$ with $M3N$ and $M3P$ and combined transconductance $g_{m3}$. We use $K$ as the ratio between $g_{m3}$ and $g_{m2}$ in our notation, and hence, $g_{m3} = K g_{m2}$. This topology results in partial cancellation of the noise of $M1$ and was adapted to include the canceler functionality. The transconductances of transistors $M1N$, $M1P$, $M3N$, and $M3P$ can be programmed using a 7-bit control with each unit cell of NMOS having a size of $10 \mu\text{m}/120 \text{ nm}$ and of PMOS having a size of $30 \mu\text{m}/120 \text{ nm}$.

Fig. 8. (a) LNTA canceler performing RX matching, FIR weighting, summation, and injection of the SI cancellation signal into the receiver. (b) Small-signal equivalent circuit of the LNTA canceler which is used in analysis and (c) its equivalent block diagram depicting the partial cancellation at the input followed by additional gain from the tap responses to provide maximum cancellation.

A TDE-based FIR canceler requires FIR weighting, summation, and injection into the receiver to cancel the SI signal. Our LNTA canceler absorbs these functions into the LNTA by splitting the CMOS inverter $M2$ into 16 inverter pairs $M2_1, \dots, M2_{16}$ with 6-bit programmable transconductances $g_{m2,1}, \dots, g_{m2,16}$, with the output of each delay tap connected to the gates of these CMOS inverter pairs. Each unit cell of NMOS has a size of $0.5 \mu\text{m}/280 \text{ nm}$ and of PMOS has a size of $1 \mu\text{m}/280 \text{ nm}$. A large number of unit cells is required to obtain a fine resolution, whereas too many cells would degrade the input matching due to the parasitic capacitance of these transistors. When deciding the number of unit cells to be implemented, we ensured that the input return loss of the LNTA stays above 10 dB up to 1 GHz.

Though the addition of individual tap voltages is done in the current domain within the LNTA canceler, we have collapsed this summation into an effective cancellation voltage  $V_C$ , which drives an M2 (with transconductance  $g_{m2}$ ) for ease of understanding the LNTA operation.

The small-signal equivalent circuit of the NMOS half of the LNTA canceler is shown in Fig. 8(b). The PMOS transconductances are added to their respective NMOS counterparts to perform the full circuit analysis. The impedance looking into the input port can be computed as

$$Z_{IN} = \frac{K + 1}{g_{m1}} = Z_0. \quad (2)$$

To obtain input matching, the value of  $K$  and  $g_{m1}$  must be set such that the aforementioned condition holds.
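As a quick numerical check of (2), the following worked equation uses the nominal values quoted later in this section ($K = 10$, $g_{m1} = 220$ mS) and assumes a reference impedance of $Z_0 = 50\ \Omega$, which is not stated explicitly here:

$$Z_{IN} = \frac{K + 1}{g_{m1}} = \frac{10 + 1}{0.22\ \text{S}} = 50\ \Omega.$$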

The output voltage  $V_{OUT}$  is dependent on both the SI voltage  $V_{SI}$  and  $V_C$  and can be computed as

$$V_{OUT} = -\frac{K}{2} V_{SI} + \frac{K(K + 2)}{2} \frac{g_{m2}}{g_{m1}} V_C. \quad (3)$$

The output voltage in Fig. 8(b) can be computed as

$$V_{OUT} = -\frac{K}{2} V_{SI} + \frac{K(K + 2)}{2} \sum_{i=1}^{16} \frac{g_{m2,i}}{g_{m1}} V_{Tap,i} \quad (4)$$

where  $V_{Tap,i}$  corresponds to the voltage output of  $i$ th delay tap. We see that the output voltage of each tap is weighted, summed, and added to  $V_{SI}$  at the output of the LNTA. Similarly, the input voltage  $V_{IN}$  can be computed as

$$V_{IN} = \frac{1}{2} V_{SI} - \frac{K}{2} \sum_{i=1}^{16} \frac{g_{m2,i}}{g_{m1}} V_{Tap,i}. \quad (5)$$

SIC at the output can be achieved by satisfying the condition

$$\frac{K(K + 2)}{2} \sum_{i=1}^{16} \frac{g_{m2,i}}{g_{m1}} V_{Tap,i} = \frac{K}{2} V_{SI}. \quad (6)$$

Under this condition, the residual SI at the input is

$$V_{IN} = \frac{1}{2} \frac{2}{K + 2} V_{SI}. \quad (7)$$

For  $K = 10$ , we see 16-dB partial cancellation at the input.
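The relations (4)–(7) can be checked numerically. The sketch below is illustrative only: the function name, the normalized SI voltage, and the single-tap setting are assumptions chosen so that condition (6) holds; only $K = 10$ and $g_{m1} = 220$ mS come from the text.

```python
import math

def lnta_voltages(v_si, v_taps, gm2_taps, gm1, K):
    """Evaluate (4) and (5): LNTA output and input voltages."""
    weighted = sum(g / gm1 * v for g, v in zip(gm2_taps, v_taps))
    v_out = -K / 2 * v_si + K * (K + 2) / 2 * weighted
    v_in = 0.5 * v_si - K / 2 * weighted
    return v_out, v_in

K = 10
gm1 = 0.220   # S, nominal value from the text
v_si = 1.0    # normalized SI voltage (illustrative)

# Hypothetical single-tap setting chosen to satisfy (6):
# sum_i (gm2_i / gm1) * V_tap_i = V_SI / (K + 2)
gm2_taps = [0.020]  # S
v_taps = [v_si / (K + 2) * gm1 / gm2_taps[0]]

v_out, v_in = lnta_voltages(v_si, v_taps, gm2_taps, gm1, K)

# Cancellation at the input relative to the uncancelled value V_SI / 2.
input_cancellation_db = 20 * math.log10(0.5 * v_si / abs(v_in))
print(f"V_OUT = {v_out:.2e}, V_IN = {v_in:.4f}")
print(f"Partial cancellation at the input: {input_cancellation_db:.1f} dB")
```

With these values the output term vanishes and the input residual is $V_{SI}/12$, i.e., about 15.6 dB below $V_{SI}/2$, which the text rounds to 16 dB.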

This can also be thought of as obtaining partial cancellation at the input of the LNTA and additional cancellation at the output of the LNTA. This arises because the gains from the cancellation signal node ($V_C$) to the input and to the output of the LNTA differ, and we tune the tap gains such that maximum cancellation is obtained at the output. This interpretation is shown in Fig. 8(c), where the additional gain to the output of the LNTA is indicated by $G_{SI}$. Here, $G_{RX} = -K$, $K_i = -K g_{m2,i}/(2g_{m1})$, and $G_{SI} = -2$, so that $V_{OUT} = G_{RX} V_{IN} + G_{SI} \sum_{i} K_i V_{Tap,i}$ reproduces (4).

Assuming that the MOSFET noise current has a power spectral density of  $4kT\gamma g_m$  for a transconductance of  $g_m$ , the effective noise factor  $F$  of the LNTA is

$$F = 1 + \gamma \frac{1}{K + 1} + \gamma \frac{(K + 2)^2}{K + 1} \frac{g_{m2}}{g_{m1}} + \gamma \frac{(K + 2)^2}{K(K + 1)} \frac{g_{m2}}{g_{m1}} \quad (8)$$

where $g_{m2} = \sum_{i=1}^{16} g_{m2,i}$. To couple more power from the canceler into the RX, $g_{m2}$ must be increased, which increases the amount of cancellation current injected into the RX. However, this results in an increase in the NF of the receiver. Using (8), for nominal settings of $g_{m1} = 220 \text{ mS}$, $g_{m2} = 20 \text{ mS}$, and $K = 10$ and assuming that $\gamma = 1$, we can find the baseline NF of the RX as 3.83 dB, which is in close agreement with the measurements (3.7 dB) described in Section IV. The added noise power due to the RF canceler can be computed by taking the output noise PSD of the RF canceler taps, which is introduced by the delay elements and the source follower buffer, and multiplying it by the gain of the LNTA canceler from the output of the tap to the RX input. Since the gain from the tap output depends on the value of $g_{m2}$, the added noise increases with $g_{m2}$. Furthermore, the value of $g_{m2}$ imposes a sum constraint on the weights associated with the RF canceler taps. Hence, an increase in the weights of the canceler taps results in an increase in the NF of the RX itself. The value of $g_{m2}$ can be increased to enable coupling less power into the RF canceler from the TX without compromising the overall gain. Doing this enhances the power handling of the RF canceler since the amount of power at the delay tap input is reduced. However, increasing $g_{m2}$ carries an inherent baseline NF penalty, and the noise of the delay taps is amplified by a larger factor. We operate our FD RX in two modes with different $g_{m2}$ values: 1) low power (LP) mode with a nominal $g_{m2}$ of 20 mS having a baseline measured NF of 3.7 dB with a 0.3-dB NF degradation due to the RF canceler handling $-12$-dBm SI power referenced to RX input and 2) high power (HP) mode with a nominal $g_{m2}$ of 40 mS having a baseline measured NF of 5.7 dB with a 0.3-dB NF degradation due to the RF canceler handling $-7$-dBm SI power referenced to RX input. We treat the baseline NF of the RX as that of the LP mode and make our comparisons with this number (3.7 dB).
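Expression (8) is straightforward to evaluate numerically. The sketch below is an illustration, not the authors' code; the function names are hypothetical, and reusing the LP-mode $g_{m1}$ and $K$ for the HP mode (only $g_{m2}$ changed to 40 mS) is an assumption.

```python
import math

def lnta_noise_factor(gm1, gm2, K, gamma=1.0):
    """Noise factor F of the LNTA according to (8)."""
    ratio = gm2 / gm1
    return (1
            + gamma / (K + 1)
            + gamma * (K + 2) ** 2 / (K + 1) * ratio
            + gamma * (K + 2) ** 2 / (K * (K + 1)) * ratio)

def to_db(x):
    """Convert a power ratio to dB."""
    return 10 * math.log10(x)

# Nominal LP-mode values from the text; HP mode assumed to differ only in gm2.
nf_lp = to_db(lnta_noise_factor(gm1=0.220, gm2=0.020, K=10))
nf_hp = to_db(lnta_noise_factor(gm1=0.220, gm2=0.040, K=10))
print(f"LP mode: NF = {nf_lp:.2f} dB")
print(f"HP mode: NF = {nf_hp:.2f} dB")
```

Under these assumptions, (8) as written evaluates to roughly 3.8 dB for the LP setting and roughly 5.7 dB for the HP setting, in line with the baseline figures quoted above.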

### B. BB Canceler

The baseband canceler performs the second stage of cancellation, which cancels the residual SI after the RF cancellation and RX down-conversion. The baseband canceler uses similar delay taps as the RF canceler based on SC circuits and is identical to the one implemented in [14]. The required bandwidth of the delay taps is equal to the cancellation bandwidth, which permits the delay taps to be clocked at much lower frequencies leading to much larger delays compared to the RF canceler.

The TX signal is capacitively coupled into the baseband canceler, which is followed by a four-phase I/Q direct down-conversion mixer using the same clock as that of the RX mixer. The I and Q paths are separately amplified by baseband differential TIAs similar to those used in the receiver. The four single-ended outputs from the TIAs, which correspond to $I+$ ($0^\circ$), $I-$ ($180^\circ$), $Q+$ ($90^\circ$), and $Q-$ ($270^\circ$), are passed through four sets of seven time-interleaved SC delay cells following the same structure as the eight-path RF canceler taps, but without capacitor stacking. Capacitor stacking is not essential in the baseband cancelers because the residual SI after the RF canceler is already low, which removes the need for additional gain in the baseband taps. Operating at baseband frequencies allows us to clock the delay cells at a much lower frequency $f_{s, \text{BB}} = 1/T_{s, \text{BB}}$. A single set of eight-phase clocks is used for the switches and is offset by $k$ phases for the $k$th delay cell, thus realizing discretized fixed delays of $T_{s, \text{BB}}/8, 2T_{s, \text{BB}}/8, \dots, 7T_{s, \text{BB}}/8$ for the seven delay cells. A clock frequency of 10 MHz is used for lower bandwidths up to 30 MHz providing up to 80-ns delay. The clock frequency can be increased for higher bandwidths at the cost of lower delays. All four phases of each delay cell, along with the four phases of the output of the TIA (which corresponds to an equivalent zero-delay tap), are passed through vector modulators, which realize complex weights for each of the baseband canceler delay taps. The outputs of the vector modulators are added in the current domain and are injected into the RX chain at the corresponding four output nodes of the RX mixer to cancel the residual SI post-RF cancellation. An additional zero-delay tap is implemented in the baseband canceler as well. The baseband outputs from the I/Q direct down-conversion mixer and baseband TIAs are directly fed to the vector modulators without imparting any delay to them to realize this zero-delay tap.

### C. Optimization to Program the Delay Taps

Each RF canceler tap has two programmable parameters: 1) gain and 2) delay, which have to be set such that the frequency response of the canceler matches the frequency response of the SI channel with minimal error. Similarly, the complex gains of the baseband canceler must be chosen to minimize the residual SI. Denoting the frequency response of the  $K$ th RF canceler tap for the delay setting  $\tau_K$  as  $H_{\text{RFCan}, K}(f, \tau_K)$ , we obtain the frequency response of the RF canceler  $H_{\text{RFCan}}$  as

$$H_{\text{RFCan}} = \sum_{K=1}^{16} A_K H_{\text{RFCan}, K}(f, \tau_K) \quad (9)$$

where  $A_K$  and  $\tau_K$  are the gain and delay assigned to the  $K$ th tap. Similarly, denoting the frequency response of the  $K$ th baseband canceler tap as  $H_{\text{BBCan}, K}(f)$ , we obtain the frequency response of the BB canceler  $H_{\text{BBCan}}$  as

$$H_{\text{BBCan}} = \sum_{K=1}^8 \hat{B}_K H_{\text{BBCan}, K}(f) \quad (10)$$

where $\hat{B}_K$ is the complex weight associated with the $K$th baseband tap. Denoting $H_{\text{SI}}(f)$ as the frequency response of the SI channel, we obtain the frequency response of the residual SI after RF and baseband cancellation $H_{\text{Res}}(f)$ as

$$H_{\text{Res}}(f) = (H_{\text{SI}}(f) + H_{\text{RFCan}}(f) + H_{\text{BBCan}}(f)). \quad (11)$$

The residual SI power over a bandwidth BW must be minimized to maximize the cancellation, and thus, the cost function to be minimized in the optimization problem is

$$C = \int_{f \in \text{BW}} \left| H_{\text{SI}}(f) + \underbrace{\sum_{K=1}^{16} A_K H_{\text{RFCan}, K}(f, \tau_K)}_{H_{\text{RFCan}}(f)} + \underbrace{\sum_{K=1}^8 \hat{B}_K H_{\text{BBCan}, K}(f)}_{H_{\text{BBCan}}(f)} \right|^2 df. \quad (12)$$

The cost function is quadratic in the optimization variables $A_K$ and $\hat{B}_K$ but is a non-polynomial function of $\tau_K$. If $\tau_K$ is held fixed for each tap, the resulting problem can be solved by a constrained linear least-squares solver as in [14]. The key difference in the optimization algorithm required for this work is the need to find the optimal $\tau_K$. To accomplish this, we split the optimization task into two stages: 1) finding the optimal delay for each tap and 2) assigning optimal gains to the taps. To accomplish 1), we assume that 32 virtual RF taps are available, corresponding to all of the 32 delays denoted by $\tilde{\tau}_1, \dots, \tilde{\tau}_{32}$, and consider the cost function

$$\tilde{C} = \int_{f \in \text{BW}} \left| H_{\text{SI}}(f) + \sum_{j=1}^{32} \tilde{A}_j H_{\text{RFCan}}(f, \tilde{\tau}_j) + \sum_{K=1}^8 \tilde{B}_K H_{\text{BBCan}, K}(f) \right|^2 df. \quad (13)$$

We obtain the optimal gains corresponding to the 32 RF tap delays and rank the delays in descending order of their respective gains $\tilde{A}_j$. Note that $\tilde{\tau}_j$ is fixed in this case. We consider the top 16 RF tap delays and assign these sequentially to the available RF tap with the highest gain for each delay. Here, only $\tilde{\tau}_j$ values for $j < 8$ can be assigned to eight-path taps, whereas $\tilde{\tau}_j$ values for $j \geq 8$ can only be assigned to 32-path taps as these delays cannot be realized by the eight-path taps. This fixes $\tau_K$ for each tap. The optimization problem now reduces to (12), but with $\tau_K$ fixed and the only variables being the real-valued $A_K$ and the complex-valued $\hat{B}_K$, which leads us to the second part of the optimization. We impose a sum constraint on these variables because the sum determines the total transconductance of the output buffer, which has a direct impact on the NF degradation due to the canceler. With the sum constraints in place to limit the NF degradation, the problem reduces to a constrained linear least-squares problem and can be solved using a convex solver.
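The two-stage procedure can be sketched as follows. This is a simplified illustration under several stated assumptions, not the authors' implementation: the taps use the ideal response $\exp(-i2\pi f\tau)$ rather than measured data, the baseband taps, the sum constraint, and the eight-path/32-path assignment rule are all omitted, an unconstrained `numpy` least-squares solve replaces the constrained convex solver, and the SI channel is synthetic with hypothetical path gains and delays.

```python
import numpy as np

def tap_response(f_hz, tau_s):
    """Ideal true-time-delay tap response (assumed; real taps are dispersive)."""
    return np.exp(-2j * np.pi * np.outer(f_hz, tau_s))

def solve_real_gains(H_taps, h_si):
    """Least-squares real gains A minimizing |h_si + H_taps @ A|^2."""
    # Stack real and imaginary parts so the unknown gains stay real-valued.
    M = np.vstack([H_taps.real, H_taps.imag])
    y = -np.concatenate([h_si.real, h_si.imag])
    gains, *_ = np.linalg.lstsq(M, y, rcond=None)
    return gains

def cost(H_taps, gains, h_si):
    """Discretized residual-power cost over the frequency grid."""
    return np.mean(np.abs(h_si + H_taps @ gains) ** 2)

# 40-MHz band around 800 MHz and 32 candidate delays on a 250-ps grid.
f = np.linspace(780e6, 820e6, 81)
tau_grid = np.arange(32) * 250e-12

# Synthetic SI channel built from three delayed paths (hypothetical values).
path_delays = np.array([0.5e-9, 3.0e-9, 6.25e-9])
path_gains = np.array([0.08, -0.05, 0.03])
h_si = tap_response(f, path_delays) @ path_gains

# Stage 1: solve for all 32 virtual taps, then keep the 16 largest gains.
H_all = tap_response(f, tau_grid)
a_virtual = solve_real_gains(H_all, h_si)
selected = np.sort(np.argsort(-np.abs(a_virtual))[:16])

# Stage 2: delays are now fixed; re-solve for the 16 physical tap gains.
H_sel = H_all[:, selected]
a_final = solve_real_gains(H_sel, h_si)

sic_db = 10 * np.log10(cost(H_sel, np.zeros(16), h_si) / cost(H_sel, a_final, h_si))
print("selected delays (ns):", tau_grid[selected] * 1e9)
print(f"modeled cancellation: {sic_db:.1f} dB")
```

Stacking the real and imaginary parts keeps the RF gains real-valued, mirroring the real-valued weighting of the RF taps. The modeled cancellation here is optimistic because the taps are ideal and the gains are neither constrained nor quantized.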

## IV. MEASUREMENTS OF THE FD RECEIVER WITH FIR FILTERING-BASED RF AND BB CANCELERS

The basic receiver measurements were performed with the RF and baseband cancelers inactive and in the absence of any SI signal. The wideband receiver exhibits a conversion gain of 15–40 dB with a nominal gain of 24 dB across a wide range of center frequencies from 0.1 to 1 GHz [Fig. 9(a)]. The measured NF of the receiver is $\sim 3.7$ dB from 0- to 15-MHz offset baseband frequency [Fig. 9(b)]. In-band RX Output IP3 (OIP3) is measured to be +10 dBm [Fig. 9(c)] and Output P1dB (OP1dB) is measured to be $-1$ dBm across various gain settings [Input P1dB (IP1dB) increases/decreases by 6 dB as the gain decreases/increases by 6 dB, thus making the OP1dB constant, as shown in Fig. 9(d)].

### A. RF and Baseband Canceler Measurements

Fig. 10(a) and (b) shows the magnitude response and group delay profiles of the eight-path RF taps present in the RF canceler across seven different delay settings when clocked at 500-MHz frequency. The eight-path RF delay taps exhibit delays up to 2 ns with a low loss of $-20$ dB at low frequencies ($\sim 0.2$ GHz), which degrades to $-25$ dB at high frequencies ($\sim 1$ GHz). This is 3–4 dB more loss than in the simulation results shown in Fig. 7(c) and is attributed to the lower coupling factor in our implemented circuit board and the added routing losses connecting multiple taps to the LNTA, which were not included in Fig. 7. This loss includes the coupling loss and the loss introduced by the LNTA canceler on top of the loss of the delay element itself. Fig. 10(c) and (d) shows the tap-to-tap variation of the magnitude response and group delay profiles of the eight-path RF taps for the highest delay setting of $\Delta\tau = 7T_{S,\text{RF8}}/8$, showing a magnitude variation of $< 4$ dB across the entire frequency range and $< 0.2$ ns worst case group delay variation. Since the optimization is done using actual tap measurements instead of a dispersion-free true-time-delay model for the taps, these variations are taken into account while finding the tap parameters. The worst case gain flatness for any given 80-MHz range is $< 2$ dB. This translates to a deviation of $-14$ dB from a true-time-delay element, which is the difference between the linear gains of two taps that are 2 dB apart in a log scale. The flatness over any $< 80$-MHz band is sufficient for true FIR behavior, while the flatness over dc-to-1 GHz is sufficient for reconfigurable operating frequency.
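The $-14$-dB figure can be reproduced with a short calculation. The sketch below takes one reading of the statement, namely that the deviation is the linear gain difference between two taps 2 dB apart, normalized to the larger gain; this normalization is an assumption, as the text does not state it.

```python
import math

def ttd_deviation_db(flatness_db):
    """Linear gain gap between two taps `flatness_db` apart, in dB.

    The gap is normalized to the larger of the two gains (assumed).
    """
    linear_gap = 1 - 10 ** (-flatness_db / 20)
    return 20 * math.log10(linear_gap)

deviation = ttd_deviation_db(2.0)
print(f"Deviation for 2-dB flatness: {deviation:.1f} dB")
```

For a 2-dB gain difference this gives approximately $-13.7$ dB, consistent with the rounded $-14$ dB quoted above.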

Fig. 11(a) and (b) shows the magnitude response and group delay profiles of the 32-path RF taps present in the RF canceler across 11 different delay settings when clocked at 125-MHz frequency, which is derived from the 500-MHz clocks of the eight-path RF taps. The 32-path RF delay taps exhibit delays up to 8 ns with a loss of  $-20$  dB at low frequencies ( $\sim 0.2$  GHz), which degrades to  $-28$  dB at high frequencies ( $\sim 1$  GHz). As with the eight-path RF tap, this loss includes the coupling loss and the loss introduced by the LNTA canceler on top of the loss of the delay element itself. We see  $\sim 4$ -dB worst case tap-to-tap gain variation across frequency in 32-path taps, which is accounted for in the model used for optimizing the tap gains and delays [Fig. 11(c)]. The tap-to-tap group delay variation is negligible [Fig. 11(d)]. The worst case gain flatness for any given 80-MHz range is  $< 2$  dB, which translates to a deviation of  $-14$  dB from a true-time-delay element.
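The nominal delay grids of the two RF tap types follow directly from the clock frequency and the number of paths. The sketch below computes the ideal settings $kT_s/N$ for $k = 1, \dots, N-1$; the function name is hypothetical, and measured delays deviate from these ideal values.

```python
def sc_delay_grid_ps(f_clk_hz, n_paths):
    """Ideal discretized delays k*T_s/N in ps, for k = 1..N-1."""
    step_ps = 1e12 / (f_clk_hz * n_paths)
    return [k * step_ps for k in range(1, n_paths)]

eight_path = sc_delay_grid_ps(500e6, 8)        # eight-path taps at 500 MHz
thirty_two_path = sc_delay_grid_ps(125e6, 32)  # 32-path taps at 125 MHz

print(f"eight-path: step {eight_path[0]:.0f} ps, max {eight_path[-1] / 1e3:.2f} ns")
print(f"32-path:    step {thirty_two_path[0]:.0f} ps, max {thirty_two_path[-1] / 1e3:.2f} ns")
```

Both tap types share a 250-ps step because the 125-MHz clock is one quarter of the 500-MHz clock while the number of paths is four times larger. The ideal maximum delays are 1.75 ns and 7.75 ns, respectively.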

Fig. 9. (a) RX down-conversion gain for different gain settings across the down-converted baseband frequency. (b) RX NF across down-converted baseband frequency. (c) RX conversion gain for different gain settings with varying input powers demonstrating P1dB. (d) RX large-signal performance showing the OIP3.

Fig. 10. Measured performance of the eight-path RF canceler taps: (a) magnitude response and (b) group delay of a single eight-path RF delay tap across seven different delay settings when clocked at $f_{s,RF} = 500$ MHz with only one tap enabled, where each delay response is shown in a different color. Tap-to-tap variation of (c) magnitude response and (d) group delay of the eight-path RF delay taps when programmed to the maximum delay $\Delta\tau = 7T_{s,RF8}/8$, where each of the eight-path delay taps is shown in a different color.

Fig. 11. Measured performance of the 32-path RF canceler taps: (a) magnitude response and (b) group delay of a single 32-path RF delay tap across 11 different delay settings when clocked at $f_{s,RF} = 125$ MHz with only one tap enabled, where each delay response is shown in a different color. Tap-to-tap variation of (c) magnitude response and (d) group delay of the 32-path RF delay taps when programmed to the maximum delay $\Delta\tau = 31T_{s,RF32}/32$, where each of the 32-path delay taps is shown in a different color.

Fig. 12. Measured performance of the baseband canceler taps: (a) magnitude response and (b) group delay of seven baseband canceler taps when clocked at $f_{s,BB} = 10$ MHz with only one tap enabled, and (c) magnitude response and (d) group delay of seven baseband canceler taps when clocked at $f_{s,BB} = 20$ MHz with only one tap enabled.

Fig. 12(a) and (b) shows the magnitude response and group delay profiles, respectively, of seven baseband canceler taps when clocked at 10 MHz. These delay taps exhibit delays up to 85 ns with a $-3$-dB double-sided bandwidth of 34 MHz. A higher canceler bandwidth can be obtained at the cost of lower group delays by increasing the clocking frequency to 20 MHz as shown in Fig. 12(c) and (d), which provides a delay of 45 ns with a $-3$-dB bandwidth of 61 MHz. The large ripple in the group delay of the baseband canceler with 20-MHz clocking frequency occurs because increasing the clock frequency pushes the $N$-path circuit closer to the filtering regime, and hence, it starts resembling the response of an $N$-path filter (with multiple periodic peaks). This can be fixed by increasing the driving TIA's transconductance, which pushes the $N$-path circuit to the delay regime by reducing the $RC$ time constant of charging. Nevertheless, these delay taps can be viewed as a different basis set for cancellation. The tap-to-tap gain variation is negligible for both modulation frequencies [Fig. 12(a) and (c)].

Fig. 13. (a) Details of the joint RF canceler modeling where a cubic model is selected to fit every path on each tap for reduced memory space and improved efficiency. Here, the cubic model is the one described in (14) and the quadratic model is its equivalent but with degree 2 in $f\tau$. (b) Closed-loop SIC calibration algorithm leveraging analytical modeling of tap nonidealities greatly reduces the computational complexity and data storage requirements of the SIC optimization. (c) Comparison of cancellation performance obtained when using complete measured tap data versus using the analytical model, which reduces memory space by up to 100$\times$. (d) Performance of the iterative loop in obtaining RF cancellation.

Fig. 14. Measured SIC across antenna interface, RF, and baseband domains with (a) 0.4–0.5-GHz ferrite circulator, (b) 0.7–1-GHz ferrite circulator, and (c) on-chip non-magnetic circulator tuned to operate at 0.95 GHz with a 50-$\Omega$ termination for 40-MHz bandwidth.

The RF canceler consumes 7.4-mW dc power for each tap when clocked at 500/125 MHz. For the antenna interfaces that we tested, typically 4–5 RF taps are enabled. For the 460-MHz interface, five taps were enabled with delays [3 × 250 ps; 4 × 250 ps; 16 × 250 ps; 29 × 250 ps; 30 × 250 ps] providing 24-dB cancellation across 40-MHz bandwidth. For the 800-MHz interface, four taps were enabled with delays [1 × 250 ps; 2 × 250 ps; 29 × 250 ps; 30 × 250 ps] providing 22-dB cancellation across 40-MHz bandwidth. This corresponds to a dc power consumption of 29.6–37 mW. The rest of the cancellation is provided by the baseband canceler, which consumes 1.9 mW per tap when clocked at 10 MHz.
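The quoted RF canceler power range and the tap delays in nanoseconds follow from simple arithmetic on the numbers above. The sketch below is illustrative; the variable and function names are hypothetical.

```python
RF_TAP_POWER_MW = 7.4   # dc power per enabled RF tap (from the text)
DELAY_STEP_PS = 250     # RF delay step (from the text)

# Enabled tap delay indices reported for the two ferrite-circulator interfaces.
tap_settings = {
    "460 MHz": [3, 4, 16, 29, 30],
    "800 MHz": [1, 2, 29, 30],
}

def rf_power_mw(n_taps):
    """Total RF canceler dc power for n_taps enabled taps."""
    return n_taps * RF_TAP_POWER_MW

for interface, indices in tap_settings.items():
    delays_ns = [k * DELAY_STEP_PS / 1e3 for k in indices]
    print(f"{interface}: delays {delays_ns} ns, "
          f"RF canceler power {rf_power_mw(len(indices)):.1f} mW")
```

Four enabled taps correspond to 29.6 mW and five to 37 mW, matching the range stated in the text.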

### B. Tap Modeling and Iterative Loop

The SIC performance relies heavily on the configuration of the multistage SI canceler. There are several challenges associated with achieving high SIC: 1) the large number ($>10^{19}$) of possible gain and delay configurations associated with the 16 RF taps and eight BB taps; 2) the fact that the frequency responses of the taps may deviate from the ideal model in a practical IC; and 3) the error incurred from the quantization of the configuration parameters into 6 bits. To address these challenges, we propose and implement a joint canceler modeling and optimization process [see Fig. 13(b)] that efficiently solves for the configuration parameters.

To achieve wideband SIC, we obtain the configuration parameters (i.e., gains and delays) such that the canceler's frequency response matches that of the SI channel according to the optimization problem in Section III-C.

In our experiments,  $H_{\text{SI}}(f)$  is measured using the vector network analyzer (VNA) and fed to an in-built solver running on a laptop to obtain the optimal parameters based on our derived model fit for  $H_{\text{RFCan},K}(f, \tau_K)$  and  $H_{\text{BBCan},K}(f)$ , as discussed next.

Fig. 15. Measured integrated SIC across antenna interface, RF, and baseband domains with (a) 0.4–0.5-GHz ferrite circulator, (b) 0.7–1-GHz ferrite circulator, and (c) on-chip non-magnetic circulator tuned to operate at 0.95 GHz with a 50-$\Omega$ termination for different bandwidths.

Fig. 16. (a) HP cancellation using HP mode demonstrating SIC at +10 and +15 dBm. (b) NF degradation of the receiver due to the cancelers in LP and HP modes for different cancellation bandwidths. (c) TX power handling (P1dB) of the receiver with and without cancellation. (d) 47-dB SI suppression across 80 MHz is achieved when the 800-MHz circulator is terminated with a COTS antenna.

Fig. 17. (a) Measurement setup of the canceler, circulator, and sub-20 board used to control the ICs. (b) Wideband cancellation using modulated TX signal.

In the ideal case, the frequency response of the RF and BB taps in (12) is given by $H^{\text{ideal}}(f) = \exp(-i2\pi f\tau)$. However, the practical IC implementation yields a non-flat magnitude response and a non-linear phase response across frequencies, as shown in Fig. 13(a). To solve the canceler optimization problem (12) accurately, we account for the non-idealities by modeling $H_{\text{RFCan},K}(f, \tau_K)$ and $H_{\text{BBCan},K}(f)$ using the following cubic model:

$$H^{\text{cubic}}(f) = \exp(-i2\pi f\tau) \exp(\hat{a} + \hat{b}f\tau + \hat{c}f^2\tau^2 + \hat{d}f^3\tau^3). \quad (14)$$

To fit each cubic model to the corresponding measured frequency response, i.e., to obtain the complex-valued coefficients $\hat{a}, \hat{b}, \hat{c}$, and $\hat{d}$ for each RF and BB tap, we use a nonlinear least-squares solver. Compared with storing the comprehensive measurements of individual taps, storing only the coefficients of the cubic model can reduce the memory space by up to 100$\times$. Fig. 13(c) shows that this does not impact the cancellation performance. Furthermore, the time-limiting factor for the tap modeling is obtaining the tap measurements themselves for all the possible delay configurations of each tap. We have one configuration for the zero-delay tap, eight configurations for each of the five eight-path taps, and 32 configurations for each of the ten 32-path taps, which accounts for a total of 361 configurations that have to be programmed iteratively.
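The fit of the cubic model (14) can be illustrated on synthetic data. The sketch below is not the authors' procedure: it swaps the nonlinear least-squares solver mentioned in the text for a linear least-squares fit on the logarithm of the response, which works when the residual phase (after removing the ideal delay) varies smoothly. The tap delay, frequency grid, and "true" coefficients are hypothetical.

```python
import numpy as np

def cubic_model(f_hz, tau_s, coeffs):
    """Tap model (14): ideal delay times exp of a cubic in x = f*tau."""
    x = f_hz * tau_s
    a, b, c, d = coeffs
    return np.exp(-2j * np.pi * x) * np.exp(a + b * x + c * x**2 + d * x**3)

def fit_cubic_model(f_hz, tau_s, h_meas):
    """Fit complex (a, b, c, d) by linear least squares on the log response."""
    x = f_hz * tau_s
    # Remove the ideal delay term, then take the complex logarithm.
    ratio = h_meas * np.exp(2j * np.pi * x)
    log_ratio = np.log(np.abs(ratio)) + 1j * np.unwrap(np.angle(ratio))
    # Columns are x^0, x^1, x^2, x^3.
    V = np.vander(x, 4, increasing=True).astype(complex)
    coeffs, *_ = np.linalg.lstsq(V, log_ratio, rcond=None)
    return coeffs

# Synthetic "measurement" of one tap (all values hypothetical).
f = np.linspace(0.1e9, 1.0e9, 181)
tau = 1.75e-9
true_coeffs = np.array([-2.3 + 0.1j, -0.2 - 0.3j, 0.05 + 0.02j, -0.01j])
h_meas = cubic_model(f, tau, true_coeffs)

fitted = fit_cubic_model(f, tau, h_meas)
max_err = np.max(np.abs(cubic_model(f, tau, fitted) - h_meas))
print("fitted coefficients:", np.round(fitted, 4))
print(f"max model error: {max_err:.2e}")
```

Only four complex coefficients per delay configuration are stored instead of the full frequency sweep (181 points in this example), which is the source of the memory saving described above.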

The in-built solver uses as input the measured $H_{SI}(f)$ and the cubic models of each RF and BB tap to find the optimal vector of gains and delays. Specifically, the solver uses a gradient descent-based algorithm to iteratively update the parameters $g^*$ [$g_{n,k}\ \forall (n, k)$ in Fig. 13(b)], achieving the SIC performance shown in Fig. 13(d).

### C. Small-Signal Cancellation Measurements

The aforementioned cancellation loop is leveraged to perform measurements for three different antenna interfaces, two of which are COTS ferrite circulators operating at 0.7–1 and 0.4–0.5 GHz, while the third is an on-chip non-magnetic circulator tuned to operate at 0.95 GHz [33]. Measurements are performed with the antenna port terminated by a 50-$\Omega$ termination to obtain nominal performance and with a commercially available 650–950-MHz antenna (SPDA 24 700/2700) as a more realistic SI channel.

Fig. 14(a)–(c) shows the residual SI profile measured with different circulators operating at different center frequencies when terminated with a 50-$\Omega$ termination for a bandwidth of 40 MHz using continuous-wave signals. We obtain a total of 57-dB suppression at a center frequency of 460 MHz with an initial isolation of 22 dB, 65-dB suppression at a center frequency of 800 MHz with an initial isolation of 23 dB, and 70-dB suppression at a center frequency of 950 MHz with an initial isolation of 29 dB. Fig. 15(a)–(c) shows the cancellation measurements

TABLE I  
PERFORMANCE COMPARISON WITH PRIOR STATE-OF-THE-ART

|            |                                                  | JSSC 2015 [7]                                                                           | JSSC 2018 [12]                                                 | ISSCC 2018 [13]                                                               | JSSC 2021 [14]                                                                  | JSSC 2022 [15]                                                                        | This Work                                                                                                             |
|------------|--------------------------------------------------|-----------------------------------------------------------------------------------------|----------------------------------------------------------------|-------------------------------------------------------------------------------|---------------------------------------------------------------------------------|---------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------|
| RX metrics | Architecture                                     | RX with wideband SIC based on RF frequency-domain equalization                          | Adaptive Filter + Noise Canceling PA + LO sideband suppression | Electrical Balance Duplexer + Double-RF Adaptive FIR Filter                   | RX with SIC in RF and BB domains with switched capacitor FIR filters            | RF Canceler with Active Constant Gm Nested Vector Modulator with Baseband Group Delay | RX with SIC in RF and BB domains using stacked-capacitor switched-capacitor FIR filters and an embedded LNTA-canceler |
|            | RX Frequency Range                               | 0.8GHz - 1.4GHz                                                                         | 1.7GHz - 2.2GHz                                                | 1.6GHz - 1.9GHz                                                               | 0.1GHz - 1GHz                                                                   | 900MHz                                                                                | 0.1GHz - 1GHz                                                                                                         |
|            | Gain                                             | 27dB - 42dB                                                                             | 20dB - 36dB                                                    | 42dB                                                                          | 15 - 38dB                                                                       | N/A                                                                                   | 15 - 40dB                                                                                                             |
|            | Noise Figure                                     | 4.8dB                                                                                   | 4dB                                                            | 8.1dB                                                                         | 5.3dB                                                                           | 4.4dB <sup>a</sup>                                                                    | 3.7dB                                                                                                                 |
| FD Metrics | In band IIP3/IP1dB                               | -20dBm/-30dBm <sup>a</sup>                                                              | -5dBm/-15dBm <sup>a</sup>                                      | N/R                                                                           | -18.7dBm/-27dBm (for Gain of 27dB)                                              | N/A                                                                                   | -14dBm/-25dBm (for a Gain of 24dB)                                                                                    |
|            | Integrated SIC Domains                           | RF                                                                                      | RF + BB                                                        | RF                                                                            | RF + BB                                                                         | RF                                                                                    | RF + BB                                                                                                               |
|            | Canceler delay                                   | RF - 1ns to 28ns <sup>b</sup>                                                           | RF - 0 to 0.26ns<br>BB - 0 to 130ns                            | RF - 0 to 0.2ns,<br>BB - N/A                                                  | RF - 0.2ns to 1.1ns<br>BB - 10ns to 75ns                                        | RF (Upconverted from BB) - 80ns                                                       | RF - 0 to 7.75ns<br>BB - 0 to 85ns                                                                                    |
|            | Complex Canceler Gains                           | Yes                                                                                     | RF - No<br>BB - Yes                                            | No                                                                            | RF - No<br>BB - Yes                                                             | Yes                                                                                   | RF - No<br>BB - Yes                                                                                                   |
|            | Delay Programmability                            | Yes                                                                                     | No                                                             | No                                                                            | No                                                                              | Yes                                                                                   | RF - Yes<br>BB - No                                                                                                   |
|            | Number of taps in Canceler                       | RF - 2 bandpass filters                                                                 | RF - 5<br>BB - 14                                              | RF - 5<br>BB - N/A                                                            | RF - 7<br>BB - 7                                                                | RF - 3                                                                                | RF - 16<br>BB - 8                                                                                                     |
|            | Initial TX-RX isolation                          | >34dB (antenna pair)                                                                    | 30dB - 35dB (off-chip circ., 50Ω)                              | 39dB (on chip EBD, 50Ω)                                                       | 22dB / 21 dB (off-chip circ. 50Ω / antenna)                                     | 44 - 50dB (antenna pair)                                                              | 22 - 23dB (off-chip circ. 50Ω / antenna)                                                                              |
|            | SIC from the Cancelers (Not including Ant. ISO.) | 18MHz (1.3%) for 20dB (1 filter)<br>26MHz (1.9%) for 20dB (2 filters)                   | 42MHz (2.1%) for 50dB                                          | 20MHz (1.1%) for 33.8dB<br>40MHz (2.2%) for 31.1dB<br>80MHz (4.4%) for 26.2dB | 20MHz (2.7%) for 29dB                                                           | 0.8MHz (0.08%) for 70dB<br>20MHz - wireline (2.2%) for 52dB                           | 40MHz (5%) for 42 dB @ 800MHz<br>40MHz (8.7%) for 35 dB @ 460MHz                                                      |
|            | NF degradation                                   | 0.9dB - 1.2dB (one filter)<br>1.1dB - 1.5dB (two filters) (for >34dB initial isolation) | 1.55dB (for 30-35 dB initial isolation)                        | 1.6dB (for 39dB initial isolation)                                            | 1.1 dB from RF Canceler<br>0.8 dB from BB Canceler (for 22dB initial isolation) | 1.75dB                                                                                | RF Can. LP (HP) modes: 0.3dB (2.3dB)<br>BB Can. LP(HP) modes: 0.5dB (0.5dB) (23dB Ant. ISO, and 40MHz BW)             |
|            | SI Power Handling Referenced to RX Input         | -8dBm                                                                                   | N/R                                                            | N/R                                                                           | -13dBm                                                                          | -25dBm                                                                                | -12dBm (LP mode)<br>-7dBm (HP mode)                                                                                   |
| Others     | RX Power                                         | 63mW - 69mW                                                                             | 22mW                                                           | 82.3mW                                                                        | 31mW                                                                            | N/A                                                                                   | 34mW                                                                                                                  |
|            | Canceler Power                                   | 88mW - 182mW                                                                            | 3.5mW (RF) + 8mW (BB)                                          | 14.3mW                                                                        | 25.5mW (RF)<br>6.5mW (BB)                                                       | 14.8mW per tap                                                                        | 7.4mW per tap (RF)<br>1.9mW per tap (BB)                                                                              |
|            | Technology                                       | 65nm CMOS                                                                               | 40nm CMOS                                                      | 40nm CMOS                                                                     | 65nm CMOS                                                                       | 65nm CMOS                                                                             | 65nm CMOS                                                                                                             |
| Others     | Active Area                                      | 4.8mm <sup>2</sup>                                                                      | 3.5mm <sup>2</sup>                                             | 4mm <sup>2</sup>                                                              | 5.15mm <sup>2</sup>                                                             | 1.2mm <sup>2</sup>                                                                    | 7.2mm <sup>2</sup>                                                                                                    |

<sup>a</sup> - Found in the journal version of their work. <sup>b</sup> - Narrowband delay due to bandpass filter(s). <sup>c</sup> - Fractional BW is calculated from the frequency of measured isolation. <sup>d</sup> - Measured with external RX. N/A - Not Applicable. N/R - Not reported.

across different bandwidths using the aforementioned circulators. For a center frequency of 460 MHz, the total suppression is 67, 57, and 43 dB across 20, 40, and 80 MHz, respectively. For a center frequency of 800 MHz, the total suppression is 65, 65, and 54 dB across 20, 40, and 80 MHz, respectively. For a center frequency of 950 MHz, the total suppression is 71, 70, and 67 dB across 20, 40, and 80 MHz, respectively. We obtain >55-dB suppression for various antenna interfaces for a wide range of bandwidths. Measurements performed with the 800-MHz circulator terminated with a COTS antenna indicate that we can obtain up to 47-dB SI suppression across 80-MHz bandwidth, as shown in Fig. 16(d). Digital cancellation, which has not been explored in this work, adds a further layer of cancellation and can provide up to 50 dB [40], [41]. Combined with the suppression reported here, a total suppression of 100–120 dB is attainable, which meets the requirement of suppressing the SI signal well below the noise floor of the receiver.
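The 100–120 dB target can be sanity-checked with a back-of-the-envelope link budget. The sketch below is illustrative only: it assumes a thermal noise density of −174 dBm/Hz and a hypothetical RX noise figure of 5 dB (a placeholder, not a value measured in this work), and the function names are ours.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0  # kT at roughly 290 K
ASSUMED_NF_DB = 5.0                # hypothetical placeholder, not a measured value


def noise_floor_dbm(bw_hz, nf_db):
    """Integrated RX noise floor in dBm for a given bandwidth and noise figure."""
    return THERMAL_NOISE_DBM_PER_HZ + 10.0 * math.log10(bw_hz) + nf_db


def required_suppression_db(tx_dbm, bw_hz, nf_db):
    """Total suppression needed to bring the SI down to the RX noise floor."""
    return tx_dbm - noise_floor_dbm(bw_hz, nf_db)


# TX powers and bandwidths taken from the measurements discussed in the text.
for tx_dbm in (10.0, 15.0):
    for bw_mhz in (20, 40, 80):
        need_db = required_suppression_db(tx_dbm, bw_mhz * 1e6, ASSUMED_NF_DB)
        print(f"TX {tx_dbm:+.0f} dBm, {bw_mhz} MHz: {need_db:.1f} dB to reach the noise floor")
```

Under these assumptions, the requirement lands at roughly 100 to 111 dB just to reach the noise floor, so additional margin is needed to push the SI well below it.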

#### D. Large-Signal and Noise Measurements

The power of third and other higher order intermodulation products generated by the RF and baseband cancelers depends on the amount of power coupled from the TX. Coupling less power from the TX requires a higher gain while injecting the cancellation signal into the RX. We evaluate the large-signal and noise performance of the canceler in two separate modes: 1) LP mode and 2) HP mode. In the LP mode, we couple higher power from the TX, which results in higher intermodulation products, and in the HP mode, we couple lower power from the TX, which results in lower intermodulation products. However, in the HP mode, we require higher gain at the output and hence further degrade the NF of the RX. The SIC measurements shown in Section IV-C were performed using the LP mode and remain consistent across TX power levels of up to +10 dBm (−12 dBm at RX input). For TX power exceeding +10 dBm, we switch to the HP mode with an additional 2-dB NF degradation. In the HP mode, a TX power of up to +15 dBm (−7 dBm at RX input) can be handled without compressing the RX. Fig. 16(a) shows that 58- and 60-dB SI suppression is obtained using the HP mode at 460-MHz frequency for a TX power of +10 and +15 dBm, respectively. Fig. 16(b) shows the NF degradation of the RX in the presence of cancellation for both LP and HP modes. We see that the NF degradation is <1 and <3 dB for a 40-MHz BW for LP and HP modes, respectively, and <2 and <4 dB for an 80-MHz BW for LP and HP modes, respectively. Furthermore, using the HP mode, Fig. 16(c) shows that the RX's P1dB for a TX signal excitation improves by 22 dB upon cancellation. We obtain an overall canceler measured IIP3 of +20 dBm. The expected amount of RX P1dB improvement should be close to the amount of RF cancellation obtained. We obtain slightly less than that due to the partial cancellation at the LNTA input, which worsens the TX-induced RX P1dB.
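The mode choice described above can be summarized as a small lookup. This is a minimal sketch, not the chip's actual control logic: it assumes the 22-dB TX-to-RX isolation implied by the +10 dBm / −12 dBm pair, uses the total NF degradation figures quoted for a 40-MHz BW (0.8 dB in LP mode, 2.8 dB in HP mode), and the names `MODES` and `select_mode` are hypothetical.

```python
ISOLATION_DB = 22.0  # implied by +10 dBm at the TX <-> -12 dBm at the RX input

# (mode name, maximum TX power in dBm, total NF degradation in dB at 40-MHz BW)
MODES = (
    ("LP", 10.0, 0.8),
    ("HP", 15.0, 2.8),
)


def select_mode(tx_dbm):
    """Return (mode, SI power at the RX input in dBm, NF degradation in dB)."""
    for name, max_tx_dbm, nf_degradation_db in MODES:
        if tx_dbm <= max_tx_dbm:
            return name, tx_dbm - ISOLATION_DB, nf_degradation_db
    raise ValueError("TX power exceeds the +15 dBm handled in the HP mode")


for tx_dbm in (0.0, 10.0, 12.0, 15.0):
    print(tx_dbm, select_mode(tx_dbm))
```

The first matching entry wins, so the canceler stays in the lower-NF LP mode until the TX power exceeds +10 dBm.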

To demonstrate realistic wideband cancellation, we performed SIC using an orthogonal frequency division multiplexing (OFDM) TX signal with a BW of 20 MHz and a peak-to-average power ratio (PAPR) of 10–12 dB using an on-chip non-magnetic circulator. We obtain a total of 61-dB SI suppression with 28 dB obtained from the circulator and 33 dB obtained from the RF and baseband cancelers [Fig. 17(b)]. The measurement setup of this experiment, showing both the canceler and the circulator implemented on ICs, is shown in Fig. 17.

Table I compares this work with the prior state of the art. Higher delays in the RF domain, passive gain in the delay elements, and an output buffer integrated into the LNTA result in higher cancellation with better power handling and noise performance. Compared to the prior TDE (FDE) cancelers, we achieve 15-dB (25 dB) higher SIC for similar fractional BW and +6-dB (+1 dB) higher TX power handling with 1.1-dB (0.7 dB) better overall NF. Our chip area is considerably higher than most other works. This is mainly due to the large number of taps (16) and the large amount of delay within the taps (ten taps with up to 7.75-ns delay and five taps with up to 1.75-ns delay) that have been implemented on the chip. The increase in our area is not as significant as the increase in the total amount of delay that can be produced using this chip.
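The power-handling and NF deltas quoted above can be reproduced from the Table I entries. The sketch below is a worked check under a stated assumption: we take the TDE comparison point to be the prior design with −13 dBm SI power handling and 1.1 dB + 0.8 dB NF degradation, and the FDE comparison point to be the design with −8 dBm and up to 1.5 dB (two filters). That mapping is our reading of the table, not something stated explicitly, and the helper `delta_db` is hypothetical.

```python
# SI power handling referenced to the RX input (dBm), as listed in Table I.
THIS_WORK_HP_DBM = -7.0
PRIOR_TDE_DBM = -13.0  # assumed TDE comparison point
PRIOR_FDE_DBM = -8.0   # assumed FDE comparison point

# NF degradation (dB), as listed in Table I.
THIS_WORK_LP_NF_DB = 0.3 + 0.5  # RF + BB cancelers, LP mode
PRIOR_TDE_NF_DB = 1.1 + 0.8     # RF + BB cancelers
PRIOR_FDE_NF_DB = 1.5           # two filters, upper end of the reported range


def delta_db(a_db, b_db):
    """Difference between two dB quantities, rounded to 0.1 dB."""
    return round(a_db - b_db, 1)


print("Power handling vs. TDE:", delta_db(THIS_WORK_HP_DBM, PRIOR_TDE_DBM), "dB")
print("Power handling vs. FDE:", delta_db(THIS_WORK_HP_DBM, PRIOR_FDE_DBM), "dB")
print("NF improvement vs. TDE:", delta_db(PRIOR_TDE_NF_DB, THIS_WORK_LP_NF_DB), "dB")
print("NF improvement vs. FDE:", delta_db(PRIOR_FDE_NF_DB, THIS_WORK_LP_NF_DB), "dB")
```

Note that the power-handling comparison uses the HP mode while the NF comparison uses the LP mode, so the two advantages are not obtained simultaneously.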

## V. CONCLUSION

This work utilizes SC circuits to obtain large nano-second-scale on-chip delays over wide bandwidths with a small form factor, realizing the delay elements used for time-domain SI cancellation in FD wireless systems. Capacitor stacking is used to provide passive voltage gain, which reduces the delay-line insertion loss and improves the canceler power handling and NF degradation. The concepts described in this article were demonstrated using a prototype FD receiver with integrated RF and BB cancelers in a standard 65-nm CMOS process. The 16-tap RF canceler with time-interleaved capacitor-stacked eight-path and 32-path SC networks achieves programmable delays ranging from 0 to 7.5 ns. The eight-tap BB canceler achieves delays ranging from 0 to 85 ns and has gain and phase control for each tap. Through these cancelers, we obtained 41/38 dB of cancellation across 40-/80-MHz bandwidth on top of 29-dB isolation provided by a CMOS circulator operating at 0.95 GHz. The canceler can be operated in an LP mode with a low NF degradation of 0.8 dB while handling +10-dBm TX power or in an HP mode with a higher NF degradation of 2.8 dB while handling +15-dBm TX power.

## REFERENCES

- [1] D. Bharadia, E. Mcmillin, and S. Katti, “Full duplex radios,” in *Proc. ACM SIGCOMM Conf. SIGCOMM*, Aug. 2013, pp. 375–386.
- [2] A. Sabharwal, P. Schniter, D. Guo, D. W. Bliss, S. Rangarajan, and R. Wichman, “In-band full-duplex wireless: Challenges and opportunities,” *IEEE J. Sel. Areas Commun.*, vol. 32, no. 9, pp. 1637–1652, Sep. 2014.
- [3] B. Debaillie et al., “Analog/RF solutions enabling compact full-duplex radios,” *IEEE J. Sel. Areas Commun.*, vol. 32, no. 9, pp. 1662–1673, Sep. 2014.
- [4] J. Zhou et al., “Integrated full duplex radios,” *IEEE Commun. Mag.*, vol. 55, no. 4, pp. 142–151, Apr. 2017.
- [5] M. Katanbaf, K.-D. Chu, T. Zhang, C. Su, and J. C. Rudell, “Two-way traffic ahead: RF/analog self-interference cancellation techniques and the challenges for future integrated full-duplex transceivers,” *IEEE Microw. Mag.*, vol. 20, no. 2, pp. 22–35, Feb. 2019.
- [6] T. Chen et al., “A survey and quantitative evaluation of integrated circuit-based antenna interfaces and self-interference cancellers for full-duplex,” *IEEE Open J. Commun. Soc.*, vol. 2, pp. 1753–1776, 2021.
- [7] J. Zhou, T.-H. Chuang, T. Dinc, and H. Krishnaswamy, “Integrated wideband self-interference cancellation in the RF domain for FDD and full-duplex wireless,” *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 3015–3031, Dec. 2015.
- [8] D.-J. van den Broek, E. A. M. Klumperink, and B. Nauta, “An in-band full-duplex radio receiver with a passive vector modulator downmixer for self-interference cancellation,” *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 3003–3014, Dec. 2015.
- [9] D. Yang, H. Yüksel, and A. Molnar, “A wideband highly integrated and widely tunable transceiver for in-band full-duplex communication,” *IEEE J. Solid-State Circuits*, vol. 50, no. 5, pp. 1189–1202, May 2015.
- [10] S. Ramakrishnan, L. Calderin, A. Niknejad, and B. Nikolic, “An FD/FDD transceiver with RX band thermal, quantization, and phase noise rejection and >64 dB TX signal cancellation,” in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2017, pp. 352–355.
- [11] E. Kargaran, S. Tijani, G. Pini, D. Manstretta, and R. Castello, “Low power wideband receiver with RF self-interference cancellation for full-duplex and FDD wireless diversity,” in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2017, pp. 348–351.
- [12] T. Zhang, C. Su, A. Najafi, and J. C. Rudell, “Wideband dual-injection path self-interference cancellation architecture for full-duplex transceivers,” *IEEE J. Solid-State Circuits*, vol. 53, no. 6, pp. 1563–1576, Jun. 2018.
- [13] K.-D. Chu, M. Katanbaf, T. Zhang, C. Su, and J. C. Rudell, “A broadband and deep-TX self-interference cancellation technique for full-duplex and frequency-domain-duplex transceiver applications,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 170–172.
- [14] A. Nagulu et al., “A full-duplex receiver with true-time-delay cancelers based on switched-capacitor-networks operating beyond the delay-bandwidth limit,” *IEEE J. Solid-State Circuits*, vol. 56, no. 5, pp. 1398–1411, May 2021.
- [15] H. Abolmagd et al., “A hierarchical self-interference canceller for full-duplex LPWAN applications achieving 52–70-dB RF cancellation,” *IEEE J. Solid-State Circuits*, vol. 58, no. 5, pp. 1323–1336, May 2023.
- [16] X. Li, Y.-H. Huang, F. Yin, and J. C. Rudell, “A 2.4 GHz full-duplex transceiver with broadband (+120 MHz), linearity-calibrated and long-delayed self-interference cancellation,” in *Proc. IEEE 48th Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2022, pp. 417–420.
- [17] J.-F. Chang, J.-C. Kao, Y.-H. Lin, and H. Wang, “Design and analysis of 24-GHz active isolator and quasi-circulator,” *IEEE Trans. Microw. Theory Techn.*, vol. 63, no. 8, pp. 2638–2649, Aug. 2015.
- [18] T. Dinc, A. Chakrabarti, and H. Krishnaswamy, “A 60 GHz CMOS full-duplex transceiver and link with polarization-based antenna and RF cancellation,” *IEEE J. Solid-State Circuits*, vol. 51, no. 5, pp. 1125–1140, May 2016.
- [19] B. van Liempd et al., “A +70-dBm IIP3 electrical-balance duplexer for highly integrated tunable front-ends,” *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 12, pp. 4274–4286, Dec. 2016.
- [20] M. Elkholy, M. Mikhemar, H. Darabi, and K. Entesari, “Low-loss integrated passive CMOS electrical balance duplexers with single-ended LNA,” *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 5, pp. 1544–1559, May 2016.
- [21] T. Dinc, A. Nagulu, and H. Krishnaswamy, “A millimeter-wave non-magnetic passive SOI CMOS circulator based on spatio-temporal conductivity modulation,” *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3276–3292, Dec. 2017.
- [22] N. Reiskarimian, J. Zhou, and H. Krishnaswamy, “A CMOS passive LPTV nonmagnetic circulator and its application in a full-duplex receiver,” *IEEE J. Solid-State Circuits*, vol. 52, no. 5, pp. 1358–1372, May 2017.
- [23] N. Reiskarimian, M. B. Dastjerdi, J. Zhou, and H. Krishnaswamy, “Analysis and design of commutation-based circulator-receivers for integrated full-duplex wireless,” *IEEE J. Solid-State Circuits*, vol. 53, no. 8, pp. 2190–2201, Aug. 2018.
- [24] A. Nagulu et al., “Nonreciprocal components based on switched transmission lines,” *IEEE Trans. Microw. Theory Techn.*, vol. 66, no. 11, pp. 4706–4725, Nov. 2018.
- [25] T. Chi, J. S. Park, S. Li, and H. Wang, “A 64 GHz full-duplex transceiver front-end with an on-chip multifeed self-interference-canceling antenna and an all-passive canceler supporting 4 Gb/s modulation in one antenna footprint,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 76–78.
- [26] G. Qi, B. van Liempd, P.-I. Mak, R. P. Martins, and J. Craninckx, “A SAW-less tunable RF front end for FDD and IBFD combining an electrical-balance duplexer and a switched-LC N-path LNA,” *IEEE J. Solid-State Circuits*, vol. 53, no. 5, pp. 1431–1442, May 2018.

- [27] M. B. Dastjerdi, N. Reiskarimian, T. Chen, G. Zussman, and H. Krishnaswamy, "Full duplex circulator-receiver phased array employing self-interference cancellation via beamforming," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2018, pp. 108–111.
- [28] H. Hashemi, T.-S. Chu, and J. Roderick, "Integrated true-time-delay-based ultra-wideband array processing," *IEEE Commun. Mag.*, vol. 46, no. 9, pp. 162–172, Sep. 2008.
- [29] I. Mondal and N. Krishnapura, "A 2-GHz bandwidth, 0.25–1.7 ns true-time-delay element using a variable-order all-pass filter architecture in 0.13-$\mu$m CMOS," *IEEE J. Solid-State Circuits*, vol. 52, no. 8, pp. 2180–2193, Aug. 2017.
- [30] S. K. Garakoui, E. A. M. Klumperink, B. Nauta, and F. E. van Vliet, "Compact cascadable $g_m$-C all-pass true time delay cell with reduced delay variation over frequency," *IEEE J. Solid-State Circuits*, vol. 50, no. 3, pp. 693–703, Mar. 2015.
- [31] M. Tymchenko, D. Sounas, A. Nagulu, H. Krishnaswamy, and A. Alù, "Quasielectrostatic wave propagation beyond the delay-bandwidth limit in switched networks," *Phys. Rev. X*, vol. 9, no. 3, Jul. 2019, Art. no. 031015.
- [32] A. Nagulu et al., "6.6 full-duplex receiver with wideband multi-domain FIR cancellation based on stacked-capacitor, N-path switched-capacitor delay lines achieving >54 dB SIC across 80 MHz BW and >15 dBm TX power-handling," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 64, Feb. 2021, pp. 100–102.
- [33] A. Nagulu, T. Chen, G. Zussman, and H. Krishnaswamy, "Multi-watt, 1-GHz CMOS circulator based on switched-capacitor clock boosting," *IEEE J. Solid-State Circuits*, vol. 55, no. 12, pp. 3308–3321, Dec. 2020.
- [34] S. Garimella et al., "Frequency-domain-equalization-based full-duplex receiver with passive-frequency-shifting N-path filters achieving >53 dB SI suppression across 160 MHz BW," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2023, pp. 225–228.
- [35] A. Ghaffari, E. A. M. Klumperink, M. C. M. Soer, and B. Nauta, "Tunable high-Q N-path band-pass filters: Modeling and verification," *IEEE J. Solid-State Circuits*, vol. 46, no. 5, pp. 998–1010, May 2011.
- [36] M. Khorshidian and H. Krishnaswamy, "26.7 an impedance-transforming N-path filter offering passive voltage gain," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 64, Feb. 2021, pp. 365–367.
- [37] V. K. Purushothaman, E. A. M. Klumperink, B. T. Clavera, and B. Nauta, "A fully passive RF front end with 13-dB gain exploiting implicit capacitive stacking in a bottom-plate N-path filter/mixer," *IEEE J. Solid-State Circuits*, vol. 55, no. 5, pp. 1139–1150, May 2020.
- [38] F. Beffa et al., "A receiver for WCDMA/EDGE mobile phones with inductorless front-end in 65 nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf.*, Feb. 2011, pp. 370–372.
- [39] C. G. Tan et al., "A universal GNSS (GPS/Galileo/Glonass/Beidou) SoC with a 0.25 mm<sup>2</sup> radio in 40 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2013, pp. 334–335.
- [40] K. E. Kolodziej, B. T. Perry, and J. S. Herd, "In-band full-duplex technology: Techniques and systems survey," *IEEE Trans. Microw. Theory Techn.*, vol. 67, no. 7, pp. 3025–3041, Jul. 2019.
- [41] T. Chen, J. Zhou, M. B. Dastjerdi, J. Diakonikolas, H. Krishnaswamy, and G. Zussman, "Demo abstract: Full-duplex with a compact frequency domain equalization-based RF canceller," in *Proc. IEEE Conf. Comput. Commun. Workshops (INFOCOM WKSHPS)*, May 2017, pp. 972–973.



**Sasank Garikapati** (Student Member, IEEE) received the B.Tech. degree in electrical engineering from IIT Madras, Chennai, India, in 2019. He is currently pursuing the Ph.D. degree in electrical engineering with Columbia University, New York, NY, USA.

His research interests broadly lie in the domain of analog and RF integrated circuit design with applications in next-generation wireless communication systems.

Mr. Garikapati was a recipient of the Prof. Achim Bopp Endowment Prize from IIT Madras in 2019 and the Master of Science Award of Excellence from Columbia University in 2021.



**Aravind Nagulu** (Member, IEEE) received the B.Tech. and M.Tech. degrees in electrical engineering from IIT Madras, Chennai, India, in 2016, and the Ph.D. degree in electrical engineering from Columbia University, New York, NY, USA, in 2022.

In 2022, he joined the Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, MO, USA, where he is currently an Assistant Professor. His research interests focus on RF and millimeter-wave circuits, analog computing, cryogenic CMOS, and superconducting circuit design for large-scale quantum systems.

Dr. Nagulu was a recipient of the IEEE RFIC Symposium Best Student Paper Award (First Place) in 2018, the IEEE Solid-State Circuits Society Predoctoral Achievement Award, the ISSCC Analog Devices Outstanding Student Designer Award, the IEEE MTT-S Graduate Fellowship in 2019, the IEEE Microwave Magazine Best Paper Award, and the Columbia University Electrical Engineering Collaborative Research Award in 2021, and the Eli Jury Award from Columbia University in 2022.



**Igor Kadota** (Member, IEEE) received the B.S. degree in electronic engineering and the S.M. degree in telecommunications from the Aeronautics Institute of Technology (ITA), São José dos Campos, Brazil, in 2010 and 2013, respectively, and the S.M. and Ph.D. degrees in communication networks from the Massachusetts Institute of Technology (MIT), Cambridge, MA, USA, in 2016 and 2020, respectively.

He was a Post-Doctoral Research Scientist with Columbia University, New York, NY, USA, from 2020 to 2023. He is currently an Assistant Professor with the Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA. His research is on modeling, analysis, optimization, and implementation of emerging communication networks, with an emphasis on wireless networks and time-sensitive traffic.

Dr. Kadota was a recipient of several research, teaching, and mentoring awards, including the Best Paper Award at IEEE INFOCOM 2018, the Best Paper Award Finalist at ACM MobiHoc 2019, the MIT School of Engineering Graduate Student Extraordinary Teaching and Mentoring Award of 2020, and the 2019–2020 Thomas G. Stockham Jr. Fellowship.



**Mostafa Essawy** (Graduate Student Member, IEEE) received the B.Sc. (Hons.) and M.Sc. degrees in electrical engineering from Ain Shams University, Cairo, Egypt, in 2014 and 2019, respectively. He is currently pursuing the Ph.D. degree in electrical engineering with Oregon State University, Corvallis, OR, USA.

From 2014 to 2019, he was an Analog/Mixed-Signal IC Design Engineer with Si-Ware Systems, Cairo. In 2022, he was an RFIC Design Intern with Apple Inc., San Diego, CA, USA. His current research interests include RF and millimeter-wave (mm-wave) circuits and systems in next-generation wireless communications and radar.

Mr. Essawy was a recipient of the 2019 EECS Outstanding Scholarship at Oregon State University, the 2022 Analog Devices Outstanding Student Designer Award, and the 2023–2024 IEEE Solid-State Circuits Society (SSCS) Predoctoral Achievement Award. He has served as a reviewer for *IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS* and *Circuits, Systems, and Signal Processing* (Springer).



**Tingjun Chen** (Member, IEEE) received the B.Eng. degree in electronic engineering from Tsinghua University, Beijing, China, in 2014, and the Ph.D. degree in electrical engineering from Columbia University, New York, NY, USA, in 2020.

He was a Post-Doctoral Associate at Yale University, New Haven, CT, USA, from 2020 to 2021. He is currently an Assistant Professor of electrical and computer engineering and computer science at Duke University, Durham, NC, USA. His research interests are in the area of networking and communications with a specific focus on next-generation wireless and optical networks, and Internet-of-Things systems.

Dr. Chen received the Google Research Scholars Award, the IBM Academic Award, the Facebook Fellowship, the Wei Family Private Foundation Fellowship, the Columbia Engineering Morton B. Friedman Memorial Prize for Excellence, the Columbia University Eli Jury Award and Armstrong Memorial Award, the ACM SIGMOBILE Doctoral Dissertation Award Runner-Up, and the ACM CoNEXT'16 Best Paper Award.



**Shibo Wang** received the B.S.E.E. degree in electrical engineering from the University of California at Santa Barbara, Santa Barbara, CA, USA, and the M.S.E.E. degree from Columbia University, New York, NY, USA, with a focus on high-speed circuit design.

Currently, he is engaged in the high-performance computing server industry. His expertise is centered around developing high-performance circuits for cloud-based servers.



**Tanvi Pande** received the bachelor's degree in physics from the Juniata College, Huntingdon, PA, USA, the bachelor's degree in electrical engineering from Columbia University, New York, NY, USA, and the master's degree in electrical engineering from Columbia University, with a focus on data-driven analysis and computation.

She is currently a Project Manager at a tech firm. She is interested in cloud computing, advanced data analytics, machine learning, artificial intelligence (AI), and other data-centric advancements.



**Arun S. Natarajan** (Senior Member, IEEE) received the B.Tech. degree from IIT Madras, Chennai, India, in 2001, and the M.S. and Ph.D. degrees from the California Institute of Technology, Pasadena, CA, USA, in 2003 and 2007, respectively, all in electrical engineering.

From 2007 to 2012, he was a Research Staff Member with the IBM T. J. Watson Research Center, Yorktown Heights, NY, USA. He worked on millimeter-wave (mm-wave) phased arrays for multi-Gb/s data links/airborne radar and self-healing circuits for increased yield in sub-micrometer process technologies. In 2012, he joined Oregon State University, Corvallis, OR, USA, where he is currently a Professor with the School of Electrical Engineering and Computer Science. He was part of the leadership team of MixComm Inc., Chatham, NJ, USA, a start-up focused on mm-wave beamformers for 5G and Satcom, which Sivers Semiconductors acquired in 2022. His research focuses on RF and mm-wave integrated circuits/systems for high-speed wireless communication/imaging, and low-power circuits for pervasive sensing and communication.

Dr. Natarajan was a recipient of the National Talent Search Scholarship from the Government of India (1995–2000), the Caltech Atwood Fellowship in 2001, the IBM Research Fellowship in 2005, the 2011 Pat Goldberg Memorial Award for the best paper in computer science, electrical engineering, and mathematics published by IBM Research, the CDADIC Best Faculty Project Award in 2014 and 2016, the NSF CAREER Award in 2016, the Oregon State University Engelbrecht Young Faculty Award in 2016, and the DARPA Young Faculty Award in 2017. He has served on the Technical Program Committee of the IEEE International Solid-State Circuits Conference (ISSCC), the IEEE Radio-Frequency Integrated Circuits Conference (RFIC), and the IEEE Compound Semiconductors IC Symposium (CSICS) and as an Associate Editor for IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, and IEEE JOURNAL OF SOLID-STATE CIRCUITS.



**Gil Zussman** (Fellow, IEEE) received the Ph.D. degree in electrical engineering from Technion, Haifa, Israel, in 2004.

He was a Post-Doctoral Associate at the Massachusetts Institute of Technology from 2004 to 2007. He has been with Columbia University, New York, NY, USA, since 2007, where he is currently a Professor of electrical engineering and computer science (affiliated faculty). His research interests are in the area of networking, in particular in the areas of wireless, mobile, and resilient networks.

Dr. Zussman received the Fulbright Fellowship, two Marie Curie Fellowships, the DTRA Young Investigator Award, and the NSF CAREER Award. He was a co-recipient of seven paper awards, including the ACM SIGMETRICS'06 Best Paper Award, the 2011 IEEE Communications Society Award for Advances in Communication, and the ACM CoNEXT'16 Best Paper Award. He was an Associate Editor of IEEE/ACM TRANSACTIONS ON NETWORKING, IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, and IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, and the TPC Chair of IEEE INFOCOM 2023, ACM MobiHoc 2015, and IFIP Performance 2011. He is the Columbia PI of the NSF PAWR COSMOS testbed.



**Harish Krishnaswamy** (Senior Member, IEEE) received the B.Tech. degree in electrical engineering from IIT Madras, Chennai, India, in 2001, and the M.S. and Ph.D. degrees in electrical engineering from the University of Southern California (USC), Los Angeles, CA, USA, in 2003 and 2009, respectively.

In 2009, he joined the Department of Electrical Engineering, Columbia University, New York, NY, USA, where he is currently a Professor and the Director of the Columbia High-Speed and Millimeter-Wave IC Laboratory (CoSMIC). In 2017, he co-founded MixComm Inc., Chatham, NJ, USA, a venture-backed start-up, to commercialize CoSMIC Laboratory's advanced wireless research. MixComm was acquired in February 2022 by Sivers Semiconductors for U.S. \$155M, where he is currently the Managing Director of the Wireless Business Unit. His research interests include integrated devices, circuits, and systems for a variety of RF, millimeter-wave (mmWave), and sub-mmWave applications.

Prof. Krishnaswamy is a member of the DARPA Microelectronics Exploratory Council. He was a recipient of the IEEE International Solid-State Circuits Conference Lewis Winner Award for Outstanding Paper in 2007, the Best Thesis in Experimental Research Award from the USC Viterbi School of Engineering in 2009, the Defense Advanced Research Projects Agency Young Faculty Award in 2011, the 2014 IBM Faculty Award, the Best Demo Award at the 2017 IEEE ISSCC, the Best Student Paper Awards at the 2015, 2018, and 2020 IEEE Radio Frequency Integrated Circuits Symposium and the 2020 IEEE International Microwave Symposium, the 2021 IEEE MTT-S Microwave Magazine Best Paper Award, and the 2019 IEEE MTT-S Outstanding Young Engineer Award. He has been a member of the Technical Program Committee of several conferences, including the IEEE International Solid-State Circuits Conference and the IEEE Radio Frequency Integrated Circuits Symposium. He has also served as a Distinguished Lecturer for the IEEE Solid-State Circuits Society.