# Multiphase Digital Low-Dropout Regulators

Ragh Kuttappa, *Student Member, IEEE*, Longfei Wang, Selçuk Köse, *Member, IEEE*, and Baris Taskin, *Senior Member, IEEE* 

Abstract-In this work, multiphase digital low-dropout (MP-DLDO) regulators are designed with resonant rotary clocks in order to improve on the trade-off of conventional DLDOs between current efficiency and transient response speed. The proposed MP-DLDOs are designed with a clock-gated control technique to provide high current efficiencies along with transient response improvements at GHz frequency levels. The multiple phases within the MP-DLDO are served with resonant rotary clocks that 1) provide a robust, high-speed, low-power resonant clock distribution solution for the synchronous elements in the multiphase DLDO architecture and 2) improve the transient response characteristics (DVS speed and voltage ripple) while saving power in the controller circuitry. The proposed MP-DLDOs are distributed across the chip to achieve low voltage ripple. SPICE simulations are performed on post-layout, parasitic-extracted models to evaluate the MP-DLDO architecture on open-source digital cores, with performance metrics that include the voltage ripple reduction, transient response speed improvement, and power savings in the control logic. The proposed MP-DLDO architecture, evaluated on a RISC-V design, demonstrates a DVS speed of $6.5\,\mathrm{V}/\mu\mathrm{s}$ and output voltage ripple of 21.1 mV (38% reduction when compared to a conventional DLDO) with a sampling frequency of 2 GHz.

Index Terms—Digital low-dropout (DLDO), resonant rotary clock, voltage regulators, low power, VLSI.

#### I. INTRODUCTION

Fully integrated voltage regulators (FIVR) enable on-chip granular power management, critical in the efficient implementation of systems with dynamic voltage and frequency scaling (DVFS). With aggressively increasing performance demands in digital systems, and a rapid increase in the number of power domains to support such performance with efficiency, energy efficient power generation and delivery is critical. Distributing the on-chip voltage regulators in fine spatial and temporal granularity enables fast switching between different voltage domains. Multiphase switching regulators are known to reduce the output voltage ripple, but they require robust multiphase clock sources. This work introduces such a multiphase digital low-dropout (MP-DLDO) regulator architecture to mitigate voltage ripple due to limit cycle oscillations (LCO). The challenge of providing multiple clock sources to distributed switching regulators in an energy efficient manner is addressed through a control circuit implementation leveraging resonant rotary clocks (ReRoC).

R. Kuttappa and B. Taskin are with the Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA 19104 USA (email: fr67@drexel.edu; taskin@coe.drexel.edu). L. Wang is currently with Qualcomm Technologies Inc., San Diego, CA, USA. S. Köse is with the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627 USA (e-mail: selcuk.kose@rochester.edu). Corresponding author: Ragh Kuttappa.

This material is based upon work supported by the National Science Foundation under Grant 1816857 and Grant 2008629.



Fig. 1. Digital low-dropout (DLDO) regulators topology. (a) Conventional DLDO. (b) Multiphase DLDO (MP-DLDO).

As a result, multiphase DLDOs are designed with ReRoC-synchronized components, enabling i) low voltage ripple operation, ii) an energy efficient power generation solution, and iii) fast dynamic voltage scaling (DVS).

Conventional DLDOs are designed with a control logic (digital controller), feedback loop (comparator), and an array of PMOS power transistors, illustrated in Figure 1(a) [1]. The control logic is implemented with bi-directional shift registers enabled by the comparator in the feedback loop to minimize the steady-state error. Digital regulators provide high efficiency, sufficient bandwidth and ease of integration, however they do suffer from lower power supply rejection (PSR). Most recently, MP-DLDOs are shown to reduce the output voltage ripple across varying load currents [2]. At high frequencies, however, the steady-state power consumption of the conventional DLDO increases [3]. The MP-DLDOs designed in this work are illustrated in Figure 1(b). The clock sources for the individual stages of the MP-DLDOs are designed with custom clock trees to resonantly switch the synchronous elements in each stage of the MP-DLDO architecture. The multiphase properties of ReRoCs are leveraged to provide robust clock sources to the MP-DLDOs.

Conventional DLDOs suffer from slower transient responses to large load current steps due to the time taken for the sequential switching of the synchronous components. The transient response time of the regulators can be improved by increasing the sampling frequency. However, the limit cycle oscillations (LCO) increase with an increase in the sampling frequency [4]. Despite the advantages of DLDOs, LCOs due to inherent quantization error can occur, which lead to unwanted output voltage ripple at steady state. In this work, the improvement in the output voltage ripple is achieved by phase interleaving with a resonant clock source. The MP-DLDO circuit topology with N regulators, designed with N clock phases, is illustrated in Figure 1(b). The N phases for the MP-DLDO circuit topology are obtained from the ReRoC implemented on-chip. To address the current inefficiencies at high frequencies, the clock gating control for the DLDOs is adopted from [5]. In addition to designing MP-DLDOs in this work, the clock design resources are opportunistically shared to provide clock distribution networks to the digital cores [6].

The contributions of this work are as follows:

- 1) Multiphase DLDO design, achieving low voltage ripple,
- 2) Fast transient response, leveraging high frequency resonant clock sources,
- 3) Clock gating architecture in the control loop to improve the current efficiency [5] of the multiphase DLDO,
- 4) Opportunistic sharing of clock resources: ReRoCs are shared between the MP-DLDOs and digital cores, optionally providing low-power, high-frequency resonant clocking to the digital cores as well,
- 5) Resonant rotary clock generation and distribution power reduction as compared to an AD-PLL based implementation, and
- 6) PVT, static IR, and Ldi/dt results for the proposed architecture.

The results presented in this work are from post-layout simulation models, i.e. inclusive of extracted parasitics, for sign off accuracy.

The rest of this paper is organized as follows. Preliminary information encapsulating the circuit level aspects for DLDOs and ReRoCs is introduced in Section II. The proposed multiphase DLDO (MP-DLDO) architecture is presented in Section III. Performance evaluation on open-source industrial designs is presented in Section IV. Conclusions are provided in Section V.

#### II. PRELIMINARIES

The following sections discuss the conventional DLDO design, prior works, and resonant rotary clock background.

#### A. Conventional DLDO Regulator

Digital low-dropout (DLDO) regulators are composed of N parallel PMOS transistors $P_i$ (i=1,...,N) connected between the input voltage $V_{in}$ and output voltage $V_{out}$, and a feedback control loop implemented with a clocked comparator and digital controller, as illustrated in Figure 1(a). The parallel power transistors $P_1$ to $P_N$ provide the required load current $I_{LOAD}$. A bidirectional shift register (bDSR), illustrated in Figure 2(a), digitally turns ON (OFF) power transistors $P_1$ to $P_m$ ($P_{m+1}$ to $P_N$), with the value of m determined by the load current $I_{LOAD}$. At a certain step k+1, $P_{m+1}$ ($P_m$) is turned ON (OFF) if $V_{cmp} = H$ ($V_{cmp} = L$) and the bDSR shifts right (left), as illustrated in Figure 2(b). DLDOs provide high current efficiencies with load steps $<100\,\mathrm{mA}$ sampled at MHz frequency ranges. However, at higher sampling frequencies the current efficiency decreases.

Fig. 2. Digital controller for conventional DLDO. (a) Bi-directional shift register (bDSR). (b) Operation of bDSR.

Fig. 3. Resonant rotary clock (ReRoC).
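The shift behaviour described above can be sketched in a few lines of Python. This is a behavioural illustration only, not the authors' circuit: the register is modelled as a thermometer code holding the count m of ON transistors, and the names `bdsr_step` and `gate_pattern` are hypothetical. The pattern shows the logical ON/OFF state; an actual PMOS gate would be driven active-low.

```python
def bdsr_step(m: int, v_cmp_high: bool, n: int) -> int:
    """One clock step of a bidirectional shift register (behavioural model).

    m is the number of power transistors currently ON (P_1..P_m).
    V_cmp = H shifts right and turns P_{m+1} ON; V_cmp = L shifts left
    and turns P_m OFF. The count saturates at 0 and n.
    """
    if v_cmp_high:
        return min(m + 1, n)
    return max(m - 1, 0)


def gate_pattern(m: int, n: int) -> str:
    """Logical state of the array: '1' = transistor ON, '0' = OFF."""
    return "1" * m + "0" * (n - m)


if __name__ == "__main__":
    n, m = 8, 3  # hypothetical 8-transistor array with 3 devices ON
    for v_cmp_high in (True, True, False):
        m = bdsr_step(m, v_cmp_high, n)
        print(gate_pattern(m, n))
```

Because only one transistor changes state per clock edge, the time to cover a large load step grows with the number of transistors that must switch, which is the sequential-switching limitation noted for conventional DLDOs.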

#### B. Prior Works on DLDOs and MPVRs

The existing works for DLDOs have focused on improving the transient response by primarily enhancing the control loop and architecture [3, 5, 7-9]. To address the negative implications of LCOs, an LCO reduction technique that adds auxiliary power transistors is proposed in [4]. In the case of LDOs, there is detailed analysis on the stability of the system and the challenges associated with the parasitics of distributed VRs on the power grid [10–12]. Multiphase voltage regulators (MPVR) have been primarily designed as switched capacitor voltage regulators (SCVR), with prototypes targeting low output voltage ripple and high conversion efficiency along with fast transient response [13–15]. Most recently, ring oscillators have been leveraged to generate multiphase clock sources for MPVRs [16]. The generation and synchronization of the input clock signal to each MPVR becomes quite costly and even infeasible when the number of phases is very high. To address multiphase clock distribution for MPVRs, ReRoCs have been shown to be an ideal fit [6], explored in greater detail in this manuscript with innovations listed in Section I.

Fig. 4. Proposed on-chip multiphase DLDO architecture. (a) Multiphase DLDO regulator. (b) Proposed architecture. (c) Resonant rotary clock (ReRoC).

#### C. Resonant Rotary Clock Design

Resonant rotary clocks (ReRoCs) are a type of resonant clocking with constant magnitude, low power, low jitter, and multiple phases [17]. ReRoCs are designed using IC interconnects for the transmission lines and inverter pairs that are uniformly distributed along the transmission lines in anti-parallel fashion, as illustrated in Figure 3. The ReRoC generates a resonating signal along the Möbius shape, where the frequency depends on the geometry of the distribution network and load, due to resonance [17]. The ReRoC is modeled as an LC oscillator, where the frequency is estimated by

$$f_{osc} = \frac{1}{2\sqrt{L_T C_T}}. \tag{1}$$

In Eq. (1), $f_{osc}$ is the operating frequency of the ReRoC. The total inductance $L_T$ depends on the geometry of the ring, and $C_T$ is the total capacitance of the ring, interconnects, and devices connected to the ReRoC ring. The most efficient ReRoC designs are sparse rotary oscillator arrays (SROA) [18], similar to a non-uniform clock mesh topology. The capacitive load and inductance affect the frequency of oscillations. The ReRoCs are implemented as SROAs through algorithmic novelties proposed in the local distribution for use in the proposed architecture in this work.
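As a quick numerical illustration of Eq. (1), the sketch below evaluates the oscillation frequency for given total inductance and capacitance. The function name and the example values (1 nH, 62.5 pF) are assumptions chosen only so that the result lands near 2 GHz; they are not taken from the paper's design.

```python
import math


def reroc_frequency(l_total: float, c_total: float) -> float:
    """Eq. (1): f_osc = 1 / (2 * sqrt(L_T * C_T)), returned in Hz.

    l_total is the total ring inductance in henries and c_total is the
    total capacitance (ring, interconnects, attached devices) in farads.
    """
    if l_total <= 0 or c_total <= 0:
        raise ValueError("L_T and C_T must be positive")
    return 1.0 / (2.0 * math.sqrt(l_total * c_total))


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    f = reroc_frequency(1e-9, 62.5e-12)
    print(f"f_osc = {f / 1e9:.3f} GHz")
    # Doubling the capacitive load lowers the frequency by sqrt(2).
    print(f"ratio = {reroc_frequency(1e-9, 125e-12) / f:.3f}")
```

The square-root dependence is why added clock load on a ring shifts its frequency and has to be accounted for when tapping points are attached.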

ReRoCs are multiphase clock sources as illustrated in Figure 3. CLK1 through CLK8 are the multiple phases readily available on each local ReRoC building block. In particular, CLK1 is shifted by  $0^{\circ}$  from a reference, whereas CLKN is shifted by (N-1)×360°/N. Distributing ReRoCs in SROA topology provides multiple phases of the clock source across the entire die area.
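The phase assignment stated above, with CLK1 at $0^{\circ}$ and CLKN at (N-1)×360°/N, can be enumerated directly. The helper below is a small sketch with a hypothetical name, assuming the phases are equally spaced around the ring.

```python
def reroc_phases(n: int) -> list[float]:
    """Phase in degrees of CLK1..CLKn, where CLKk is shifted by (k-1)*360/n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [(k - 1) * 360.0 / n for k in range(1, n + 1)]


if __name__ == "__main__":
    for k, phase in enumerate(reroc_phases(8), start=1):
        print(f"CLK{k}: {phase:.1f} deg")
```

For eight phases the spacing is 45 degrees, so CLK8 sits at 315 degrees from the reference.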

Prototype implementations of ReRoCs on silicon have been presented to explore the energy efficiency and the multiple phases [17, 19]. Rotary clock based distribution networks have been well studied in recent years focusing on optimization of the local and global clock networks [18, 20–23]. In [24], resonant rotary clocks save 29% power on average in comparison to a phase-locked loop based clocking solution implemented on open-source designs available for academic research. The ReRoCs provide dedicated and robust clock phases for the MP-DLDOs in this work.

Fig. 5. Clocking circuit topology for adaptive control of the MP-DLDOs [5].

#### III. PROPOSED MULTIPHASE DLDO ARCHITECTURE

The proposed on-chip multiphase DLDOs (MP-DLDO) are designed as illustrated in Figure 4. The proposed architecture is implemented with a clock-gated digital controller for the DLDO to provide high current efficiency at GHz frequency levels and resonant rotary clocks to provide the multiple phases for the MP-DLDOs. The MP-DLDOs are distributed throughout the power grid and ReRoCs are distributed locally, as illustrated in Figure 4(b). The proposed architecture is detailed in the following sections: multiphase DLDOs are presented in Section III-A and the ReRoC design for the MP-DLDOs is presented in Section III-B.

#### A. Multiphase DLDO

The circuit topology for the proposed MP-DLDO is illustrated in Figure 4(a). Each DLDO in the MP-DLDO circuit topology is designed with a high-gain comparator that is driven by a high-frequency ReRoC clock. The comparator architecture is chosen as a double-tail latch type for speed and low kickback noise. To minimize the steady-state error, the comparator output is fed to the bi-directional shift register (bDSR). The bDSR output controls the number of PMOS power transistors that are turned on. The comparator and the bDSR are sampled by the same clock signal generated from the ReRoC. Increasing the sampling frequency ($F_s$) of the synchronous components improves the transient response speed, but the mode of oscillation also increases, causing an increase in the output voltage ripple. Furthermore, at high $F_s$, the current efficiency of conventional DLDOs decreases [25]. The proposed MP-DLDO circuit topology is designed to mitigate the output voltage ripple due to limit cycle oscillations (LCO) with phase interleaving.

To provide high current efficiencies at high frequencies, the clock gating technique from [5] is adopted in this work. The clocking circuit topology for gated control is illustrated in Figure 5. The N-bit bDSR is divided into N/m-bit sections, which are clock-gated in a sliding manner. The bDSR outputs are tapped from the boundary flip-flops of the N/m-bit sections to generate the enable signal for each section. This fine-grained clock gating turns off switching in idle sections to enhance the power efficiency of the overall system. For example, a 256-bit bDSR is divided into eight 32-bit sections. This control logic clocks the required columns of the DLDO while keeping the remaining columns clock gated, improving the overall conversion efficiency of the DLDO [5].
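To make the sliding clock-gating idea concrete, the following Python sketch decides which sections of a sectioned bDSR need a clock on the next edge. It is an illustrative assumption rather than the enable logic of [5]: the register is treated as a thermometer code with m bits set, and a section is enabled only if one of its flip-flops could toggle on the next shift. The function name and default sizes (256 bits, 32-bit sections) follow the example in the text but the rule itself is hypothetical.

```python
def enabled_sections(m: int, n_bits: int = 256, section_bits: int = 32) -> set[int]:
    """Sections of a thermometer-coded bDSR that may change on the next edge.

    m is the number of bits currently set (transistors ON). A right shift
    would set bit index m; a left shift would clear bit index m - 1.
    Only the sections containing those bits are clocked (assumed rule).
    """
    if n_bits % section_bits != 0:
        raise ValueError("n_bits must be a multiple of section_bits")
    if not 0 <= m <= n_bits:
        raise ValueError("m must lie in [0, n_bits]")
    candidate_bits = []
    if m > 0:
        candidate_bits.append(m - 1)  # bit cleared by a left shift
    if m < n_bits:
        candidate_bits.append(m)      # bit set by a right shift
    return {bit // section_bits for bit in candidate_bits}


if __name__ == "__main__":
    for m in (0, 64, 100, 256):
        active = enabled_sections(m)
        share = len(active) * 32 / 256
        print(f"m={m:3d}: sections {sorted(active)}, clocked share {share:.3f}")
```

Under this assumed rule at most two of the eight sections are clocked at any time, which is the intuition behind the current-efficiency gain at high sampling frequencies.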

In the following sections, the output voltage ripple due to limit cycle oscillations (LCOs) and transient response of the MP-DLDOs are presented.

1) Output Voltage Ripple due to LCOs: Distributing on-chip voltage regulators reduces the resistive IR noise by reducing the distances between voltage regulators (VRs) and load circuits. An increase in $F_s$ leads to an increase in the mode of oscillation, resulting in increased output voltage ripple. The amplitude and mode of LCO can be investigated utilizing a nonlinear sampled feedback model for DLDO steady-state operation, illustrated in Figure 6 [4]. $N(A,\varphi)$, $S(z)$, and $P(z)$ represent, respectively, the describing function of the clocked comparator, the transfer function of the shift register with the delay between the shift register and the clocked comparator, and the transfer function of the power transistor array with the load circuit and the zero-order hold. $A$ and $\varphi$ denote the amplitude of the limit cycle oscillation and the phase shift of $x(t)$. $N(A, \varphi)$, $S(z)$, and $P(z)$ are expressed in [4, 25] as:

$$N(A,\varphi) = \frac{2D}{MTA} \sum_{m=0}^{M-1} \sin\left(\frac{\pi}{2M} + \frac{m\pi}{M}\right) \angle \left(\frac{\pi}{2M} - \varphi\right), \tag{2}$$

$$S(z) = z^{-1} \cdot \frac{z}{z-1} = \frac{1}{z-1}, \tag{3}$$

$$P(z) = K_{OUT} \frac{1 - e^{-F_l T}}{F_l(z - e^{-F_l T})}, \tag{4}$$

where $T$, $D$, $M$, $F_l$, and $K_{OUT}$ are, respectively, the clock period, amplitude of the clocked comparator output, mode of limit cycle oscillation, load pole, and gain of $P(z)$. $K_{OUT} = K_{DC}I_{pMOS}$ and $F_l = 1/[(R_L \parallel R_p)C]$. $K_{DC}$, $I_{pMOS}$, $R_L$, $R_p$, and $C$ are the DC constant, current provided by a single PMOS transistor, load resistance, power transistor array resistance, and output capacitance, respectively.

Fig. 6. DLDO nonlinear sampled feedback model.

The following Nyquist condition needs to be satisfied for a certain mode of limit cycle oscillation to exist:

$$N(A,\varphi)P(e^{j\omega T})S(e^{j\omega T}) = 1\angle(-\pi) \tag{5}$$

where the angular limit cycle oscillation frequency is $\omega = \pi/(TM)$. From Eq. (5),

$$\varphi = \frac{\pi}{2} - \frac{\pi}{2M} - \tan^{-1} \left(\frac{\pi}{MTF_l}\right), \quad \varphi \in (0, \pi/M). \tag{6}$$

For distributed on-chip voltage regulation with N DLDOs, under a certain load current  $I_{LOAD}$ , if all N DLDOs operate with the same clock, the equivalent  $I_{pMOS}$  becomes  $NI_{pMOS}$  as N power transistors are turned on/off at the rising edge of each clock cycle.  $F_l$  and T remain unchanged as compared with the case of a single DLDO. For this case, Eq. (6) becomes

$$\varphi_1 = \frac{\pi}{2} - \frac{\pi}{2M_1} - \tan^{-1} \left(\frac{\pi}{M_1 T F_l}\right), \quad \varphi_1 \in (0, \pi/M_1). \tag{7}$$

When N distributed DLDOs operate with multiphase clock signals,  $F_l$  and  $I_{pMOS}$  remain unchanged while the equivalent T becomes T/N. Therefore, Eq. (6) becomes

$$\varphi_2 = \frac{\pi}{2} - \frac{\pi}{2M_2} - \tan^{-1} \left(\frac{N\pi}{M_2 T F_l}\right), \quad \varphi_2 \in (0, \pi/M_2). \tag{8}$$

Comparing  $\varphi_1$  with  $\varphi_2$ , the condition  $\varphi \in (0, \pi/M)$  is more likely to be satisfied by  $\varphi_2$  with a larger value of M due to the added factor of N, which means a larger limit cycle oscillation mode. However, the amplitude of the output voltage ripple demonstrates an approximately linear relationship with the equivalent size of a single power transistor [25]. Furthermore, the effect of increased equivalent switching frequency on the voltage ripple shows nonlinear behavior and is less significant than the effect of the equivalent size of a single power transistor [4]. Compared to the case of N DLDOs operating with the same clock phase, the equivalent switching frequency of a multiphase DLDO is increased by N times while the equivalent size of a single power transistor is decreased by N times. A smaller output voltage ripple is therefore typically introduced with phase interleaved clocks.
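The comparison between $\varphi_1$ and $\varphi_2$ can be explored numerically by scanning candidate modes and keeping those that satisfy the interval condition. The sketch below does this for Eq. (7) (N = 1) and Eq. (8) (N phases). The function names are hypothetical, and the clock period and load pole used in the demo (T = 10 ns, $F_l$ = $10^7$ s$^{-1}$) are assumed values for illustration, not parameters from the paper.

```python
import math


def lco_phase(mode: int, period: float, load_pole: float, n_phases: int = 1) -> float:
    """Phase shift from Eq. (7) for n_phases = 1 and Eq. (8) otherwise.

    Phase interleaving with N phases replaces the clock period T by T/N,
    which multiplies the arctangent argument by N.
    """
    return (math.pi / 2
            - math.pi / (2 * mode)
            - math.atan(n_phases * math.pi / (mode * period * load_pole)))


def feasible_modes(period: float, load_pole: float,
                   n_phases: int = 1, max_mode: int = 200) -> list[int]:
    """Modes M whose phase shift lies in the open interval (0, pi/M)."""
    return [m for m in range(1, max_mode + 1)
            if 0.0 < lco_phase(m, period, load_pole, n_phases) < math.pi / m]


if __name__ == "__main__":
    T = 10e-9    # 100 MHz sampling clock (assumed)
    F_L = 1e7    # hypothetical load pole in 1/s
    print("same clock :", feasible_modes(T, F_L, n_phases=1))
    print("interleaved:", feasible_modes(T, F_L, n_phases=6))
```

With these assumed values the interleaved case admits only larger modes than the same-clock case, in line with the statement that the added factor of N pushes the limit cycle toward a larger M.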

To investigate the MP-DLDO circuit topology, two sets of results are presented. The first set of results is the voltage ripple with and without phase interleaving. A resistive mesh with 6 DLDOs is simulated in SPICE with a current profile of $100\,\mathrm{mA}$. The current provided by a single power transistor, $I_{pMOS}$, is approximately $1\,\mathrm{mA}$, and the DLDOs are sampled at a frequency of $100\,\mathrm{MHz}$. In Figure 7, the output voltage ripple with and without phase interleaving of the DLDOs is presented. Uniform distribution of the DLDOs is considered. Under a load current of $100\,\mathrm{mA}$, the phase interleaved topology provides a 28% decrease in the voltage ripple. Increasing the number of VRs helps reduce the voltage ripple, while phase interleaving further reduces the voltage ripple.

Fig. 7. Output voltage ripple waveforms for 6 distributed DLDOs under 100 mA load current with and without phase interleaving, sampling frequency is 100 MHz.

In Figure 7, for DLDOs without phase interleaving, ripple frequency is determined by

$$F_1 = 1/T_1 = F_{s1}/(2M_1), \tag{9}$$

and for DLDOs with phase interleaving, ripple frequency is determined by

$$F_2 = 1/T_2 = F_{s2}/(2M_2). \tag{10}$$

In Eq. (9) and Eq. (10), $F_{s1}$, $F_{s2}$, $M_1$, and $M_2$ are, respectively, the effective sampling frequency of DLDOs without phase interleaving, the effective sampling frequency of DLDOs with phase interleaving, the mode of limit cycle oscillation for DLDOs without phase interleaving, and the mode of limit cycle oscillation for DLDOs with phase interleaving.

Assume a single DLDO operates with a clock frequency of  $F_s$ . Then,  $F_{s1} = F_s$  and  $F_{s2} = NF_s$ , where N is the number of phases (single DLDOs). Therefore,  $F_1 = F_s/(2M_1)$  and  $F_2 = NF_s/(2M_2)$ .  $F_2/F_1 = NM_1/M_2$ . Higher equivalent sampling frequency typically leads to larger limit cycle oscillation mode  $(M_2 > M_1)$ . Whether  $F_2$  is larger than  $F_1$  is determined by the values of N,  $M_1$ , and  $M_2$ .  $F_2$  is not necessarily larger than  $F_1$ . That is, the output ripple of DLDOs with phase interleaving does not necessarily show a higher ripple frequency than the design without phase interleaving.



Fig. 8. Voltage droop with varying phases and sampling frequencies for a load current=100 mA.

For the example shown in Figure 7, N = 6, $M_1 = 6$, and $M_2 = 57$, which leads to $F_1 > F_2$.
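The ripple-frequency comparison for this example can be checked with Eq. (9) and Eq. (10). The short sketch below uses the values given in the text (N = 6, $M_1 = 6$, $M_2 = 57$, $F_s$ = 100 MHz); the function name is a hypothetical helper.

```python
def ripple_frequency(f_s_effective: float, mode: int) -> float:
    """Eq. (9)/(10): ripple frequency F = F_s,eff / (2 * M)."""
    return f_s_effective / (2 * mode)


if __name__ == "__main__":
    f_s, n, m1, m2 = 100e6, 6, 6, 57
    f1 = ripple_frequency(f_s, m1)        # all DLDOs on the same clock
    f2 = ripple_frequency(n * f_s, m2)    # phase interleaved, F_s2 = N * F_s
    print(f"F1 = {f1 / 1e6:.2f} MHz, F2 = {f2 / 1e6:.2f} MHz")
    print(f"F2 / F1 = {f2 / f1:.3f}")     # equals N * M1 / M2
```

Since $NM_1/M_2 = 36/57$ is below one, the interleaved design has the lower ripple frequency here even though its effective sampling frequency is six times higher.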

In Figure 8, the voltage droop for the MP-DLDOs designed with different numbers of phases and sampling frequencies is presented. The voltage droop decreases with an increase in the number of VRs and sampling frequency. It can be observed from Figure 8 that at a sampling frequency of $2\,\mathrm{GHz}$ for the MP-DLDOs designed with 8 phases, the voltage droop decreases by 91% when compared to the same design with a sampling frequency of $10\,\mathrm{MHz}$. The voltage droop in the 1 phase DLDO is affected by both the sampling frequency and the LSB of the power transistor, as detailed below.

The magnitude of the transient voltage droop $\Delta V$ can be estimated as [26]

$$\Delta V = R\Delta i_{load} - I_{pMOS}F_sR^2C\ln\left(1 + \frac{\Delta i_{load}}{I_{pMOS}F_sRC}\right) \tag{11}$$

where R, C, and  $\Delta i_{load}$  are, respectively, the average DLDO output resistance before and after  $\Delta i_{load}$ , load capacitance, and amplitude of the load change.

 $I_{pMOS}$  is the current provided by a single power transistor and is determined by the LSB of the power transistor.  $F_s$  is the sampling frequency. The major focus of this work is on the benefits of high frequency and multiphase operation of DLDOs and thus voltage droop variation under varying frequency and a constant LSB is analyzed in Figure 8.
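To see how Eq. (11) behaves as the sampling frequency changes with a fixed LSB, the droop can be evaluated over a frequency sweep. The sketch below is an illustration under assumed component values (R = 1 Ω, C = 5 nF, $\Delta i_{load}$ = 50 mA, $I_{pMOS}$ = 1 mA); these are hypothetical and are not the values behind Figure 8.

```python
import math


def voltage_droop(r: float, c: float, delta_i: float,
                  i_pmos: float, f_s: float) -> float:
    """Eq. (11): transient voltage droop in volts.

    r: average DLDO output resistance (ohm), c: load capacitance (F),
    delta_i: load step (A), i_pmos: current of one power transistor (A),
    f_s: sampling frequency (Hz).
    """
    x = delta_i / (i_pmos * f_s * r * c)
    # log1p(x) computes ln(1 + x) accurately, including for small x.
    return r * delta_i - i_pmos * f_s * r * r * c * math.log1p(x)


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only.
    R, C, DI, I_LSB = 1.0, 5e-9, 50e-3, 1e-3
    for f_s in (10e6, 100e6, 1e9, 2e9):
        dv = voltage_droop(R, C, DI, I_LSB, f_s)
        print(f"F_s = {f_s / 1e6:7.0f} MHz -> droop = {dv * 1e3:.2f} mV")
```

The droop is bounded above by $R\Delta i_{load}$ and shrinks monotonically as $F_s$ (or the LSB current) grows, which matches the trend described for Figure 8.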

The second set of results presented is for the current efficiency of the MP-DLDO with a maximum output current of 100 mA, presented in Figure 9. In Figure 9, the current efficiency of the DLDOs designed with a clock-gated control loop with varying phases, along with a conventional DLDO operating under different sampling frequencies, is plotted. Conventional DLDOs (i.e., without clock gating) designed with more than 2 phases have a significant reduction in the current efficiency (approximately 90% reduction) and are hence omitted from the plots. The current efficiency of the DLDOs decreases with an increase in the sampling frequency, as observed in Figure 9. A conventional single phase DLDO with a sampling frequency of 2 GHz has a current efficiency of 42%, whereas the clock gated DLDO implementation has a current efficiency of 86%. The current efficiency of the clock gate controlled DLDOs employed in this work deteriorates at a slower rate than that of a conventional DLDO. The deterioration in the current efficiency for the DLDOs designed with 2 phases can be observed in Figure 9.

Fig. 9. Current efficiency for MP-DLDOs with varying phases and load current= $100\,\mathrm{mA}$ .

Fig. 10. Transient response time for a  $50\,\mathrm{mA} \to 100\,\mathrm{mA}$  load current under a load step time of  $1\,\mu s.$ 

2) Transient Response: Increasing the sampling frequency improves the transient response speed for varying load profiles of the DLDOs. However, the conventional DLDO has larger steady-state power consumption with an increase in the sampling frequency and lower current efficiency. Adopting the clock gating topology for the DLDOs permits the increase in the sampling frequency while achieving higher current efficiencies. Leveraging the multiphase benefits along with improved transient characteristics of the MP-DLDO provides an attractive solution to address DVS requirements. The transient response time  $(T_R)$  is a measure of the speed with which the feedback loop responds to a change in the load.  $T_R$  is estimated as [26]:

$$T_R = RC \ln \left( 1 + \frac{\Delta i_{load}}{I_{pMOS} F_s RC} \right) \tag{12}$$

where R, C, and $\Delta i_{load}$ are the average DLDO output resistance before and after $\Delta i_{load}$, load capacitance, and amplitude of the load change. Under transient load conditions, the bDSR activation/deactivation follows the pattern illustrated in Figure 2(b). When $V_{out} < V_{ref}$ ($V_{out} > V_{ref}$) due to increased (decreased) load current, one additional power transistor is activated (deactivated) during each clock cycle. In Figure 10, the transient response time is presented with different phases and sampling frequencies for the MP-DLDOs.

Fig. 11. DVS transient response for step-up and step-down for a 8 phase MP-DLDO,  $F_s=2{\rm GHz}.$ 
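The estimate for $T_R$ in Eq. (12) can be evaluated in the same way as the droop. The following sketch uses assumed values (R = 1 Ω, C = 5 nF, a 50 mA load step, $I_{pMOS}$ = 1 mA) that are hypothetical and only meant to show the trend with sampling frequency; the function name is likewise an illustration.

```python
import math


def transient_response_time(r: float, c: float, delta_i: float,
                            i_pmos: float, f_s: float) -> float:
    """Eq. (12): T_R = R*C*ln(1 + delta_i / (I_pMOS * F_s * R * C)), in seconds."""
    return r * c * math.log1p(delta_i / (i_pmos * f_s * r * c))


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only.
    R, C, DI, I_LSB = 1.0, 5e-9, 50e-3, 1e-3
    for f_s in (100e6, 1e9, 2e9):
        t_r = transient_response_time(R, C, DI, I_LSB, f_s)
        print(f"F_s = {f_s / 1e9:.1f} GHz -> T_R = {t_r * 1e9:.2f} ns")
```

Comparing with Eq. (11), the droop can be written as $\Delta V = R\Delta i_{load} - I_{pMOS}F_sR\,T_R$, so the same logarithmic term governs both the response time and the droop.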

In Table I, the dynamic voltage scaling (DVS) speed of the proposed MP-DLDO with different numbers of phases is compared to the conventional DLDO at the same sampling frequency. For a sampling frequency of $1\,\mathrm{GHz}$, an 8 phase MP-DLDO has a tracking speed of $3.2\,\mathrm{V}/\mu\mathrm{s}$. Increasing the number of phases provides a faster tracking speed for DVS along with lower voltage ripple in comparison to a conventional DLDO. In Figure 11, the DVS performance of the MP-DLDO is shown. An 8 phase MP-DLDO designed with a sampling clock frequency of $2\,\mathrm{GHz}$ takes $16\,\mathrm{ns}$ for a step-up from 1 V to 1.1 V, which corresponds to a tracking speed of $6.3\,\mathrm{V}/\mu\mathrm{s}$.
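The tracking speed quoted above follows from dividing the voltage step by the settling time. The small helper below (a hypothetical name) reproduces the arithmetic for the 1 V to 1.1 V step in 16 ns.

```python
def dvs_speed_v_per_us(v_start: float, v_end: float, settle_time_s: float) -> float:
    """DVS tracking speed in V/us for a voltage step completed in settle_time_s."""
    if settle_time_s <= 0:
        raise ValueError("settle_time_s must be positive")
    return abs(v_end - v_start) / settle_time_s / 1e6


if __name__ == "__main__":
    speed = dvs_speed_v_per_us(1.0, 1.1, 16e-9)
    print(f"{speed:.2f} V/us")  # 0.1 V / 16 ns
```

The exact quotient is 6.25 V/µs, which the text reports rounded to 6.3 V/µs.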

Figure 12 shows the load transient response with $I_{load}$ switching between 10 mA and 100 mA with a period of 1 $\mu$s, with the MP-DLDO operating at a sampling frequency of 2 GHz. For $I_{load}$ switching from 10 mA to 100 mA, the transient response time is 17 ns, and for $I_{load}$ switching from 100 mA to 10 mA, the transient response time is 15 ns.

3) Stability concern: The stability of distributed DLDOs is affected by both the stability of each single phase DLDO and the stability concerning the interactions among DLDOs and the power delivery network (PDN). A single phase DLDO tends to be more stable under higher load current conditions and higher sampling frequency [27]. The system consisting of distributed DLDOs and PDNs is more stable with lower number of DLDOs, lower off-die impedance, smaller controller gain values, larger resistance among distributed DLDOs, and higher load current conditions [27, 28].

#### B. Multiphase Resonant Rotary Clock

Designing distributed VRs requires clock sources that provide a low-power and robust solution over large die sizes.

TABLE I. Dynamic voltage scaling speed of the proposed MP-DLDO designed with sampling frequencies of $1\,\mathrm{GHz}$ and $2\,\mathrm{GHz}$.

|                | Conventional DLDO | 2 phase MP-DLDO | 4 phase MP-DLDO | 8 phase MP-DLDO |
|----------------|-------------------|-----------------|-----------------|-----------------|
| $V_{in}$       | 0.8-1.2 V | 0.8-1.2 V | 0.8-1.2 V | 0.8-1.2 V |
| $V_{out}$      | 0.7-1.1 V | 0.7-1.1 V | 0.7-1.1 V | 0.7-1.1 V |
| $F_s$          | $1\,\mathrm{GHz}$ | $1\,\mathrm{GHz}$ | $1\,\mathrm{GHz}$ | $1\,\mathrm{GHz}$ |
| $I_{load}$ max | $100\,\mathrm{mA}$ | $100\,\mathrm{mA}$ | $100\,\mathrm{mA}$ | $100\,\mathrm{mA}$ |
| DVS speed      | $2.2\,\mathrm{V}/\mu\mathrm{s}$ | $2.4\,\mathrm{V}/\mu\mathrm{s}$ | $2.7\,\mathrm{V}/\mu\mathrm{s}$ | $3.2\,\mathrm{V}/\mu\mathrm{s}$ |
| $V_{in}$       | 0.8-1.2 V | 0.8-1.2 V | 0.8-1.2 V | 0.8-1.2 V |
| $V_{out}$      | 0.7-1.1 V | 0.7-1.1 V | 0.7-1.1 V | 0.7-1.1 V |
| $F_s$          | $2\,\mathrm{GHz}$ | $2\,\mathrm{GHz}$ | $2\,\mathrm{GHz}$ | $2\,\mathrm{GHz}$ |
| DVS speed      |  |  |  | $6.3\,\mathrm{V}/\mu\mathrm{s}$ |



Fig. 12. Output voltage showing load transient response with  $I_{load}$  switching between  $10\,\mathrm{mA}$  and  $100\,\mathrm{mA}$  with a period of  $1\,\mathrm{\mu s},$  and the MP-DLDO sampling frequency of  $2\,\mathrm{GHz}.$ 

An increase in the number of phases for the MPVRs results in significant overhead in clock sources and routing interconnects. The phase delay of the ReRoC is evenly distributed in the direction of wave propagation. The relationship between the time delay $t$ and phase delay $\theta$ is expressed as:

$$\frac{\theta}{2\pi} = \frac{t}{T}.\tag{13}$$

In Eq. (13), T is the clock period.

Distributing ReRoCs in an ROA topology provides multiple phases of the clock source across the entire die area. The synchronous elements in each DLDO of the MP-DLDO architecture require a certain clock phase. The primary objective is to distribute the multiphase clocks to the clock pins of the DLDOs such that the skew is matched. An illustration of the comparison between a conventional clock distribution from a ring oscillator and a ReRoC is shown in Figure 13.

Conventional clock distribution requires a clock source, such as a PLL with a crystal oscillator, and clock routing resources that include interconnects and buffers to ensure reliable and full swing clock delivery for the multiphase VRs, irrespective of the regulator type. The clock routing wirelength and the number of clock buffers increase as the die size scales up with conventional clock distribution. Equally important, variations cause increased slew and skew over the conventional clock distribution networks. Each ReRoC ring in the SROA topology has evenly distributed phases from $0^{\circ}$ to $360^{\circ}$ that can be tapped to provide the multiple phases for the MP-DLDOs. The proposed ReRoC implementation utilizes wires and interconnects not only to route the clock phases to the MP-DLDOs but also to generate the clock signal, eliminating the need for a separate clock source. Only local routing is necessary, as shown in Figure 13(b). The impact of variations on a ReRoC is mitigated through the resonance of the multiple connected rings [29].

The phase requirement at the clock pins of the DLDO architecture must be satisfied, which requires computing the additional phase incurred by the clock routing wire from the ReRoC tapping point to each clock pin. Consider a clock pin located at $(x_k, y_k)$ which has a phase requirement of $\psi_k$. The total phase $\Theta_i(x_k, y_k)$ at the clock pin location $(x_k, y_k)$ from each tapping point on the ReRoC is computed as [23]:

$$\Theta_i(x_k, y_k) = \theta_i + \phi[l_i(x_k, y_k)] \tag{14}$$

where  $\theta_i$  is the phase at the tapping point  $TP_i$  on the ReRoC ring and  $\phi[l_i(x_k,y_k)]$  is the phase of the tapping wire of length  $l_i(x_k,y_k)$  from the tapping point  $TP_i$  to the destination  $(x_k,y_k)$ . The phase incurred by the tapping wire is given by:

$$\phi[l_i(x_k, y_k)] = \left\{ \frac{t[l_i(x_k, y_k)]}{T} \right\} 360^{\circ} \tag{15}$$

where $t[l_i(x_k, y_k)]$ is the delay of an interconnect of length $l_i$ and T is the clock period.
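Eqs. (14) and (15) can be sketched as a small tapping point selection routine. This is a minimal illustration and not the algorithm of [23]. The linear delay-per-length wire model, the Manhattan wire length, the wrap of the total phase to one period, and all coordinates and names below are hypothetical assumptions.

```python
from dataclasses import dataclass

# Hypothetical linear wire-delay model (ps per um of tapping wire). A real flow
# would use an extracted interconnect delay for t[l_i(x_k, y_k)].
DELAY_PS_PER_UM = 0.05


@dataclass(frozen=True)
class TappingPoint:
    name: str
    theta_deg: float  # theta_i: phase available on the ring at TP_i
    x_um: float
    y_um: float


def wire_length_um(tp: TappingPoint, xk: float, yk: float) -> float:
    """Manhattan length l_i(x_k, y_k) of the tapping wire (assumed routing style)."""
    return abs(tp.x_um - xk) + abs(tp.y_um - yk)


def wire_phase_deg(length_um: float, period_ps: float) -> float:
    """Eq. (15): phi = (t[l] / T) * 360 degrees."""
    delay_ps = DELAY_PS_PER_UM * length_um
    return (delay_ps / period_ps) * 360.0


def total_phase_deg(tp: TappingPoint, xk: float, yk: float, period_ps: float) -> float:
    """Eq. (14): Theta_i = theta_i + phi[l_i], wrapped to [0, 360) for comparison."""
    phi = wire_phase_deg(wire_length_um(tp, xk, yk), period_ps)
    return (tp.theta_deg + phi) % 360.0


def phase_error_deg(a: float, b: float) -> float:
    """Smallest angular distance between two phases, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def best_tapping_point(taps, xk, yk, psi_k_deg, period_ps):
    """Pick the tapping point whose total phase is closest to the requirement psi_k."""
    return min(
        taps,
        key=lambda tp: phase_error_deg(total_phase_deg(tp, xk, yk, period_ps), psi_k_deg),
    )


# Hypothetical example: 2 GHz clock (T = 500 ps), three candidate tapping points.
PERIOD_PS = 500.0
TAPS = [
    TappingPoint("TP0", 0.0, 0.0, 0.0),
    TappingPoint("TP1", 36.0, 100.0, 0.0),
    TappingPoint("TP2", 72.0, 200.0, 0.0),
]
best = best_tapping_point(TAPS, xk=100.0, yk=100.0, psi_k_deg=40.0, period_ps=PERIOD_PS)
```

In this example the clock pin at (100, 100) with a 40 degree requirement is best served by TP1, because the 100 um tapping wire adds 3.6 degrees to the 36 degrees available on the ring.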

#### C. Opportunistic Sharing of Clock Resources

Fig. 13. Clock routing for MPVRs. (a) Conventional. (b) Proposed.

Opportunistically sharing the clock resources provides spatial flexibility and enables fast adaptability to local operating conditions and requirements. Distributing the ReRoC network as a sparse rotary oscillatory array (SROA) [18] structure along with the dynamic resonant frequency divider permits the use of the ReRoCs as the clock source for the digital core [6]. The dynamic resonant frequency divider performs division from 3 to 9, permitting dynamic frequency scaling (DFS) for the cores. The opportunistic sharing of the clock resources eliminates the need for individual clock sources for the voltage regulators and the core. The performance of the proposed MP-DLDOs is not affected by the pervasive use of the ReRoCs. The readers are directed to [6] for further elaboration on the algorithms and methodology for resonant rotary clock distribution on large-scale digital cores.
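To make the DFS range concrete, the sketch below lists the core clock frequencies that integer division ratios of 3 to 9 would produce. It assumes ideal integer division and a 2 GHz ReRoC frequency (one of the sampling frequencies evaluated in Section IV). The names are illustrative only.

```python
# Assumed ReRoC frequency: 2 GHz, one of the sampling frequencies used in Section IV.
F_REROC_HZ = 2e9

# Division ratios stated in the text: 3 to 9, inclusive.
DIV_RATIOS = range(3, 10)


def core_frequencies(f_src_hz: float, ratios) -> dict:
    """Map each integer division ratio N to the divided clock frequency f_src / N."""
    return {n: f_src_hz / n for n in ratios}


core_freqs = core_frequencies(F_REROC_HZ, DIV_RATIOS)

if __name__ == "__main__":
    for n, f in core_freqs.items():
        print(f"divide by {n}: {f / 1e6:7.1f} MHz")
```

Under these assumptions the cores can be clocked between roughly 222 MHz (divide by 9) and 667 MHz (divide by 3) while the regulators keep sampling at the full ReRoC frequency.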

# IV. PERFORMANCE EVALUATION OF THE PROPOSED MP-DLDO ARCHITECTURE

In the following sections, the evaluation setup and post-layout simulation results for the MP-DLDO architecture are presented. In Section IV-A, the evaluation setup, encompassing the MP-DLDO and ReRoC design specifications, is presented. In Section IV-B, post-layout simulation results are presented for 1) the proposed MP-DLDO controller power designed with ReRoCs (coined RC-MP-DLDO) in comparison to an MP-DLDO implementation with an all-digital phase-locked loop (AD-PLL), 2) voltage ripple reduction in comparison to a conventional DLDO, 3) DVS speed of the MP-DLDO architecture, and 4) frequency variation of the ReRoCs, and static IR and Ldi/dt analysis of the power grid.

## A. Evaluation Setup

The proposed MP-DLDO topology is evaluated on three different industrial designs that are publicly available: 1) AES encryption core, 2) Arm core - Cortex M0, and 3) VSCALE RISC-V. The designs are placed and routed with Cadence Innovus and subjected to timing analysis with Synopsys PrimeTime to ensure timing closure. Mentor Graphics Calibre is used to extract the designs to employ post layout models for the SPICE simulations. The design specifications for the MP-DLDOs and ReRoCs are presented in the following sections.

1) MP-DLDO Design Specification: The voltage regulators are distributed over the industrial designs evaluated in this work. The number of power transistors in each DLDO of the MP-DLDO topology is identical for each industrial design. Each DLDO provides equal current assuming a current balancing scheme [30]. The MP-DLDO designs are implemented in the standard CMOS $65\,\mathrm{nm}$ technology node with $V_{in} = 0.8\,\mathrm{V} - 1.2\,\mathrm{V}$ and $V_{out} = 0.7\,\mathrm{V} - 1.1\,\mathrm{V}$. In this work, $2\,\mathrm{GHz}$ and $2.4\,\mathrm{GHz}$ sampling frequencies are considered for the MP-DLDOs, to show that the MP-DLDO architecture can operate at GHz speeds while providing substantial savings in controller power and fast DVS when compared to a conventional DLDO design.

2) ReRoC Design Specification: The on-chip resonant rotary clocks are designed using the ASIC flow in [6, 24], considering the target frequency for the regulators along with the resonant frequency dividers for the digital cores. The ReRoCs are sized considering the total capacitance and the perimeter of the ReRoC ring. The transmission line interconnects for the ReRoCs are designed with the High Frequency Structure Simulator (HFSS) [24, 31]. The top two metal layers of the 65 nm technology node are used to design the ReRoCs. The electric and magnetic coupling due to the multiple metal layers in the crossover segments and the substrate parasitics are included to ensure accurate modeling. The S-parameters are extracted from HFSS to generate the SPICE-compatible models. Resonant frequency dividers are employed to provide the clocking source for the digital cores with DFS capability [6, 32].

## B. Post-Layout Simulation Results

In the following sections the SPICE results for the power consumption of the controller, output voltage ripple, DVS speed, frequency variation of the ReRoCs, and static IR and Ldi/dt analysis of the power grid are presented for the MP-DLDO based architecture.

1) Power consumption: The AD-PLL is implemented with a standard cell library in the $65\,\mathrm{nm}$ technology node. The MP-DLDOs are distributed assuming a balanced current sharing scheme [30] with each individual DLDO having the same number of power transistors. For the RISC-V core, the RC-MP-DLDOs are implemented with 10 clock phases distributed from the ReRoCs. The AD-PLL based MP-DLDOs are designed in a custom flow, with standard cell clock buffers and a traditional clock distribution network. The RC-MP-DLDOs are implemented with resonantly switched clock trees (unbuffered Steiner clock trees) and a resonant clock source that enable the power savings. The power consumption results of the RC-MP-DLDO architecture and of the MP-DLDOs implemented with AD-PLLs operating at a sampling frequency of 2 GHz are presented in Table II. The RC-MP-DLDOs designed with a sampling frequency of 2 GHz consume 16% to 33% lower controller power than the AD-PLL based MP-DLDOs, depending on the design. The resonant switching of the synchronous components helps reduce the overall power consumption of the control circuitry of the proposed MP-DLDO. The current efficiency of the MP-DLDOs implemented on the RISC-V core is 77% at a sampling frequency of 2 GHz. On average across the three designs, the controller power at 2 GHz is reduced by 25% when compared to an AD-PLL based MP-DLDO design.
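The percentage reductions reported in Table II can be reproduced from the listed controller powers. The sketch below assumes the reduction is defined relative to the AD-PLL based baseline, which matches the tabulated values after rounding. The variable and function names are illustrative.

```python
# Controller power in mW at 2 GHz, copied from Table II:
# design -> (RC-MP-DLDO, AD-PLL based MP-DLDO)
TABLE_II_MW = {
    "AES cipher": (6.82, 8.12),
    "Cortex M0": (12.43, 16.62),
    "VSCALE RISC-V": (17.82, 26.53),
}


def power_reduction_pct(p_rc_mw: float, p_pll_mw: float) -> float:
    """Percentage reduction of the RC-MP-DLDO relative to the AD-PLL baseline."""
    return 100.0 * (p_pll_mw - p_rc_mw) / p_pll_mw


reductions = {
    name: power_reduction_pct(p_rc, p_pll)
    for name, (p_rc, p_pll) in TABLE_II_MW.items()
}
average_reduction = sum(reductions.values()) / len(reductions)

if __name__ == "__main__":
    for name, pct in reductions.items():
        print(f"{name}: {pct:.1f}%")
    print(f"Average: {average_reduction:.1f}%")
```

The per-design values round to 16%, 25%, and 33%, and their mean rounds to the 25% average quoted for the 2 GHz sampling frequency.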

The three designs are implemented with the proposed MP-DLDO topology at an additional sampling frequency of 2.4 GHz. The power consumption results are presented in Table III. Increasing the operating frequency of the ReRoCs results in lower power consumption [23, 24], as can be observed in Table III. At a sampling frequency of 2.4 GHz, the controller power is reduced by 35% on average when compared to an AD-PLL based MP-DLDO design while providing an average current efficiency of 75%. The area overhead for the ReRoCs designed in this work is listed in Table II and Table III. The area overhead for the ReRoCs comes only from the inverter pairs. To enable multiple frequencies, dynamic resonant frequency dividers are employed. The resonant frequency divider can perform integer frequency division from 3 to n [6]. The resonant frequency dividers are adiabatic blocks, and their power consumption increases sub-linearly with the frequency division ratio [6]. On average, the ReRoCs increase the active area of the designs by 3.1%.

TABLE II PROPOSED MP-DLDO CONTROLLER POWER (RC-MP-DLDO) CONSUMPTION VERSUS AD-PLL BASED MP-DLDO CONTROLLER POWER CONSUMPTION WITH A SAMPLING FREQUENCY = $2\,\mathrm{GHz}$, $V_{in} = 1.2\,\mathrm{V}$ AND $V_{out} = 1.1\,\mathrm{V}$.

| Design        | RC-MP-DLDO controller power | AD-PLL based MP-DLDO controller power | % Power reduction | Number of ReRoC | Current efficiency | ReRoC area overhead |
|---------------|-----------------------------|---------------------------------------|-------------------|-----------------|--------------------|---------------------|
| AES cipher    | $6.82\,\mathrm{mW}$         | $8.12\,\mathrm{mW}$                   | 16%               | 4               | 76%                | 1.5%                |
| Cortex M0     | $12.43\,\mathrm{mW}$        | $16.62\,\mathrm{mW}$                  | 25%               | 6               | 76%                | 2.0%                |
| VSCALE RISC-V | $17.82\,\mathrm{mW}$        | $26.53\,\mathrm{mW}$                  | 33%               | 10              | 77%                | 2.7%                |
| Average       | -                           | -                                     | 25%               | -               | 76%                | 3.1%                |

TABLE III PROPOSED MP-DLDO CONTROLLER POWER (RC-MP-DLDO) CONSUMPTION VERSUS AD-PLL BASED MP-DLDO CONTROLLER POWER CONSUMPTION WITH A SAMPLING FREQUENCY = $2.4\,\mathrm{GHz}$, $V_{in} = 1.2\,\mathrm{V}$ AND $V_{out} = 1.1\,\mathrm{V}$.

| Design        | RC-MP-DLDO controller power | AD-PLL based MP-DLDO controller power | % Power reduction | Number of ReRoC | Current efficiency | ReRoC area overhead |
|---------------|-----------------------------|---------------------------------------|-------------------|-----------------|--------------------|---------------------|
| AES cipher    | $6.30\,\mathrm{mW}$         | $9.32\,\mathrm{mW}$                   | 32%               | 4               | 74%                | 1.5%                |
| Cortex M0     | $11.68\,\mathrm{mW}$        | $17.80\,\mathrm{mW}$                  | 34%               | 6               | 76%                | 2.0%                |
| VSCALE RISC-V | $16.52\,\mathrm{mW}$        | $27.73\,\mathrm{mW}$                  | 40%               | 10              | 74%                | 2.7%                |
| Average       | -                           | -                                     | 35%               | -               | 75%                | 3.1%                |

TABLE IV VOLTAGE RIPPLE OF MP-DLDOS VERSUS CONVENTIONAL DLDOS IMPLEMENTED ON INDUSTRIAL DESIGNS

| Design        | Num. DLDOs | Proposed MP-DLDO ($F_s = 2\,\mathrm{GHz}$) | Conventional DLDO ($F_s = 2\,\mathrm{GHz}$) | Reduction ($F_s = 2\,\mathrm{GHz}$) | Proposed MP-DLDO ($F_s = 2.4\,\mathrm{GHz}$) | Conventional DLDO ($F_s = 2.4\,\mathrm{GHz}$) | Reduction ($F_s = 2.4\,\mathrm{GHz}$) |
|---------------|------------|--------------------------------------------|---------------------------------------------|-------------------------------------|----------------------------------------------|-----------------------------------------------|---------------------------------------|
| AES cipher    | 6          | $58.0\,\mathrm{mV}$                        | $67.9\,\mathrm{mV}$                         | 15%                                 | $57.3\,\mathrm{mV}$                          | $68.1\,\mathrm{mV}$                           | 16%                                   |
| Cortex M0     | 8          | $46.1\,\mathrm{mV}$                        | $68.2\,\mathrm{mV}$                         | 32%                                 | $47.2\,\mathrm{mV}$                          | $70.2\,\mathrm{mV}$                           | 33%                                   |
| VSCALE RISC-V | 10         | $21.1\,\mathrm{mV}$                        | $34.1\,\mathrm{mV}$                         | 38%                                 | $20.4\,\mathrm{mV}$                          | $33.5\,\mathrm{mV}$                           | 39%                                   |
| Average       | -          | -                                          | -                                           | 28%                                 | -                                            | -                                             | 29%                                   |

2) Output voltage ripple: The output voltage ripple for the proposed MP-DLDOs and the conventional DLDOs implemented on the industrial designs at the two sampling frequencies of 2 GHz and 2.4 GHz is presented in Table IV. The MP-DLDO architecture implemented on the RISC-V core has a voltage ripple of 21.1 mV at a sampling frequency of 2 GHz, a 38% reduction relative to a conventional DLDO with the same number of regulators and the same sampling frequency. On average, the MP-DLDO architecture reduces the voltage ripple by 28% at 2 GHz and by 29% at 2.4 GHz for the industrial designs.

TABLE V DVS SPEED ON INDUSTRIAL DESIGNS WITH INPUT VOLTAGE $V_{in}=0.8\,\mathrm{V}-1.2\,\mathrm{V}$ AND OUTPUT VOLTAGE $V_{out}=0.7\,\mathrm{V}-1.1\,\mathrm{V}$.

| Design        | Num. of ReRoC | $F_s = 2 \mathrm{GHz}$        | $F_s = 2.4 \mathrm{GHz}$      |
|---------------|---------------|-------------------------------|-------------------------------|
| AES cipher    | 4             | $4.1\mathrm{V}/\mu\mathrm{s}$ | $4.4\mathrm{V}/\mu\mathrm{s}$ |
| Cortex M0     | 6             | $5.2\mathrm{V}/\mu\mathrm{s}$ | $5.4\mathrm{V}/\mu\mathrm{s}$ |
| VSCALE RISC-V | 10            | $6.5\mathrm{V}/\mu\mathrm{s}$ | $6.9\mathrm{V}/\mu\mathrm{s}$ |

3) DVS speed: In Table V, the DVS speed of the industrial designs for the two sampling frequencies of $2\,\mathrm{GHz}$ and $2.4\,\mathrm{GHz}$ is presented. For the RISC-V core designed with a sampling frequency of $2\,\mathrm{GHz}$, the 10-phase MP-DLDO takes $15.4\,\mathrm{ns}$ for DVS, corresponding to a tracking speed of $6.5\,\mathrm{V}/\mu\mathrm{s}$. Similarly, for the RISC-V core designed with a sampling frequency of $2.4\,\mathrm{GHz}$, the 10-phase MP-DLDO takes $14.4\,\mathrm{ns}$ for DVS, corresponding to a tracking speed of $6.9\,\mathrm{V}/\mu\mathrm{s}$. The DVS speed of the MP-DLDO circuit topology increases with the number of phases. Furthermore, an increase in the sampling frequency yields a higher DVS speed across the three designs.
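The relation between DVS time and tracking speed is a simple ratio of voltage step to settling time. The voltage step is not stated in this section, so the sketch below back-computes it from the reported numbers. The resulting step of roughly 100 mV is an inference from the reported figures and not a value given by the authors. The function names are illustrative.

```python
def dvs_speed_v_per_us(delta_v: float, settle_time_ns: float) -> float:
    """Tracking speed in V/us for a voltage step delta_v (V) completed in settle_time_ns."""
    return delta_v / (settle_time_ns * 1e-3)  # ns -> us


def implied_step_mv(speed_v_per_us: float, settle_time_ns: float) -> float:
    """Voltage step in mV implied by a tracking speed and a DVS time (V/us * ns = mV)."""
    return speed_v_per_us * settle_time_ns


# Reported RISC-V figures: (sampling frequency in GHz, DVS time in ns, speed in V/us)
REPORTED = [(2.0, 15.4, 6.5), (2.4, 14.4, 6.9)]

implied_steps_mv = [implied_step_mv(speed, t_ns) for _, t_ns, speed in REPORTED]

if __name__ == "__main__":
    for (fs, t_ns, speed), step in zip(REPORTED, implied_steps_mv):
        print(f"Fs = {fs} GHz: {speed} V/us over {t_ns} ns implies a step of {step:.1f} mV")
```

Both sampling frequencies imply a step close to 100 mV, which suggests the two reported DVS times and speeds are mutually consistent.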

TABLE VI PVT, IR AND Ldi/dt ANALYSIS RESULTS. ReRoC FREQUENCY IS $2\,\mathrm{GHz}$.

| Design        | PVT (500 Monte Carlo runs), max. intra-die | PVT (500 Monte Carlo runs), inter-die | Static IR & Ldi/dt (% of $V_{dd}$) |
|---------------|--------------------------------------------|---------------------------------------|------------------------------------|
| AES cipher    | $21\,\mathrm{MHz}$                         | $9\,\mathrm{MHz}$                     | 2.0%                               |
| Cortex M0     | $22\,\mathrm{MHz}$                         | $10\,\mathrm{MHz}$                    | 1.9%                               |
| VSCALE RISC-V | $21\,\mathrm{MHz}$                         | $11\,\mathrm{MHz}$                    | 2.4%                               |
| Average       | $21\,\mathrm{MHz}$                         | $10\,\mathrm{MHz}$                    | 2.1%                               |

4) PVT, static IR, and Ldi/dt: The robustness of the ReRoCs providing the multiphase clock sources for the MP-DLDOs is presented in Table VI. The geometries of the ReRoC rings are varied by $\pm 10\%$, the threshold voltage by $40\,\mathrm{mV}$, and the temperature over the range $-40\,^\circ\mathrm{C}$ to $125\,^\circ\mathrm{C}$ in order to represent the worst case variation scenarios. In Table VI, the deviation in frequency from the target frequency of $2\,\mathrm{GHz}$ over 500 Monte Carlo runs is presented. Average frequency variations of <1% and <0.5% are observed under intra- and inter-die variations, respectively. Worst case static IR and Ldi/dt analyses are performed on the designs. The power grid is modeled with RLC models and the average current drawn across the design is modeled as a current source. The average worst case voltage drop across the three industrial designs is 2.1% of $V_{dd}$.

#### V. CONCLUSIONS

In this paper, multiphase digital low-dropout (MP-DLDO) regulators are designed with resonant rotary clocks, enabling a low power and robust solution to mitigate the output voltage ripple and improve the transient response speed. The proposed MP-DLDOs are implemented on three different industrial designs to validate the architecture. The MP-DLDO architecture implemented with a sampling frequency of 2.4 GHz reduces the voltage ripple by 29% on average while providing an average current efficiency of 75% for the three industrial designs. Furthermore, the ReRoCs designed to provide the multiphase clock signals are robust to intra- and inter-die variations, as verified via simulations on the post-layout models of the industrial designs.

# REFERENCES

- [1] Y. Okuma *et al.*, "0.5-V input digital LDO with 98.7% current efficiency and 2.7uA quiescent current in 65nm CMOS," in *IEEE Custom Integrated Circuits Conference (CICC)*, Sep. 2010, pp. 1–4.
- [2] L. Wang, R. Kuttappa, B. Taskin, and S. Kose, "Distributed digital low-dropout regulators with phase interleaving for on-chip voltage noise mitigation," in *Proc. Int. Workshop on System Level Interconnect Prediction (SLIP)*, June 2019, pp. 1–5.
- [3] M. Huang, Y. Lu, S. Sin, S. U, and R. P. Martins, "A fully integrated digital LDO with coarse-fine-tuning and burst-mode operation," *IEEE Trans. on Circuits and Systems II: Express Briefs*, vol. 63, no. 7, pp. 683–687, July 2016.
- [4] M. Huang, Y. Lu, S. Sin, S. U, R. P. Martins, and W. Ki, "Limit cycle oscillation reduction for digital low dropout regulators," *IEEE Trans. on Circuits and Systems II: Express Briefs*, vol. 63, no. 9, pp. 903–907, Sep. 2016.
- [5] S. B. Nasir, S. Gangopadhyay, and A. Raychowdhury, "5.6 A 0.13um fully digital low-dropout regulator with adaptive control and reduced dynamic stability for ultra-wide dynamic range," in *IEEE Int. Solid-State Circuits Conference (ISSCC)*, Feb 2015, pp. 1–3.
- [6] R. Kuttappa, S. Kose, and B. Taskin, "FOPAC: Flexible on-chip power and clock," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 66, no. 12, pp. 4628–4636, Dec 2019.
- [7] M. Onouchi *et al.*, "A 1.39-V input fast-transient-response digital LDO composed of low-voltage MOS transistors in 40-nm CMOS process," in *IEEE Asian Solid-State Circuits Conference (ASSCC)*, Nov 2011, pp. 37–40.
- [8] G. Cai, C. Zhan, and Y. Lu, "A fast-transient-response fully-integrated digital LDO with adaptive current step size control," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 66, no. 9, pp. 3610–3619, Sep. 2019.

- [9] M. A. Akram, W. Hong, and I. Hwang, "Fast transient fully standard-cell-based all digital low-dropout regulator with 99.97% current efficiency," *IEEE Trans. on Power Electronics*, vol. 33, no. 9, pp. 8011–8019, Sep. 2018.
- [10] I. Vaisband and E. G. Friedman, "Stability of distributed power delivery systems with multiple parallel on-chip LDO regulators," *IEEE Trans. on Power Electronics*, vol. 31, no. 8, pp. 5625–5634, Aug 2016.
- [11] X. Zhan, P. Li, and E. Sánchez-Sinencio, "Taming the stability-constrained performance optimization challenge of distributed on-chip voltage regulation," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 38, no. 8, pp. 1571–1584, Aug 2019.
- [12] A. Ciprut and E. G. Friedman, "Stability of on-chip power delivery systems with multiple low-dropout regulators," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, vol. 27, no. 8, pp. 1779–1789, Aug 2019.
- [13] H. P. Le, J. Crossley, S. R. Sanders, and E. Alon, "A sub-ns response fully integrated battery-connected switched-capacitor voltage regulator delivering 0.19W/mm<sup>2</sup> at 73% efficiency," in *Int. Solid-State Circuits Conference (ISSCC)*, Feb 2013, pp. 372–373.
- [14] H. P. Le, S. R. Sanders, and E. Alon, "Design techniques for fully integrated switched-capacitor dc-dc converters," *IEEE Journ. of Solid-State Circuits (JSSC)*, vol. 46, no. 9, pp. 2120–2131, Sept 2011.
- [15] G. Villar-Pique, H. J. Bergveld, and E. Alarcon, "Survey and benchmark of fully integrated switching power converters: Switched-capacitor versus inductive approach," *IEEE Trans. on Power Electronics (TPE)*, vol. 28, no. 9, pp. 4156–4167, September 2013.
- [16] Y. Lu, J. Jiang, W.-H. Ki, C. P. Yue, S.-W. Sin, S.-P. U, and R. P. Martins, "A 123-phase DC-DC converter-ring with fast-DVS for micro-processors," in *Int. Solid-State Circuits Conference (ISSCC)*, Feb 2015, pp. 1–3.
- [17] J. Wood, T. C. Edwards, and S. Lipa, "Rotary traveling-wave oscillator arrays: a new clock technology," *IEEE Journ. of Solid-State Circuits (JSSC)*, vol. 36, no. 11, pp. 1654–1665, Nov 2001.
- [18] Y. Teng and B. Taskin, "Sparse-rotary oscillator array (SROA) design for power and skew reduction," in *Proc. Design, Automation and Test in Europe (DATE)*, March 2013, pp. 1229–1234.
- [19] A. Martchovsky and K. D. Pedrotti, "Amplifier innovations for improvement of rotary traveling wave oscillators," *IEEE Trans. on Circuits and Systems I: Regular Papers (TCAS-I)*, vol. PP, no. 99, pp. 1–9, 2017.
- [20] G. Venkataraman, J. Hu, F. Liu, and C. N. Sze, "Integrated placement and skew optimization for rotary clocking," in *Proc. Design Automation and Test in Europe Conference (DATE)*, vol. 1, March 2006, pp. 1–6.
- [21] J. Lu, V. Honkote, X. Chen, and B. Taskin, "Steiner tree based rotary clock routing with bounded skew and capacitive load balancing," in *Proc. Design, Automation and Test in Europe (DATE)*, March 2011, pp. 1–6.
- [22] Z. Yu and X. Liu, "Design of rotary clock based circuits," in *Proc. Design Automation Conference (DAC)*, June 2007, pp. 43–48.
- [23] V. Honkote and B. Taskin, "CROA: Design and analysis of the custom rotary oscillatory array," *IEEE Trans. on Very Large Scale Integration Systems (TVLSI)*, vol. 19, no. 10, pp. 1837–1847, Oct 2011.
- [24] R. Kuttappa, A. Balaji, V. Pano, B. Taskin, and H. Mahmoodi, "Rota-SYN: Rotary Traveling Wave Oscillator SYNthesizer," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 66, no. 7, pp. 2685–2698, July 2019.
- [25] S. B. Nasir and A. Raychowdhury, "On limit cycle oscillations in discrete-time digital linear regulators," in *Proc. IEEE Applied Power Electronics Conference and Exposition (APEC)*, March 2015, pp. 371–376.
- [26] S. Leitner, P. West, C. Lu, and H. Wang, "Digital LDO modeling for early design space exploration," in *Proc. Int. System-on-Chip Conference (SOCC)*, Sep. 2016, pp. 7–12.
- [27] S. B. Nasir, Y. Lee, and A. Raychowdhury, "Modeling and analysis of system stability in a distributed power delivery network with embedded digital linear regulators," in *International Symposium on Quality Electronic Design*, 2014, pp. 68–75.
- [28] S. J. Kim, D. Kim, Y. Pu, C. Shi, S. B. Chang, and M. Seok, "0.5-1-V, 90-400-mA, modular, distributed, 3×3 digital LDOs based on event-driven control and domino sampling and regulation," *IEEE Journal of Solid-State Circuits*, pp. 1–1, 2021.
- [29] R. Kuttappa, B. Taskin, S. Lerner, V. Pano, and I. Savidis, "Robust low power clock synchronization for multi-die systems," in *Proc. International Symposium on Low Power Electronics and Design (ISLPED)*, July 2019, pp. 1–6.
- [30] J. F. Bulzacchelli, Z. Toprak-Deniz, T. M. Rasmus, J. A. Iadanza, W. L. Bucossi, S. Kim, R. Blanco, C. E. Cox, M. Chhabra, C. D. LeBlanc, C. L. Trudeau, and D. J. Friedman, "Dual-loop system of distributed microregulators with high DC accuracy, load response time below 500 ps, and 85-mV dropout voltage," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 4, pp. 863–874, 2012.
- [31] V. Honkote, A. More, and B. Taskin, "3-d parasitic modeling for rotary interconnects," in *Proc. Int. Conference on VLSI Design (VLSID)*, Jan 2012, pp. 137–142.
- [32] Y. Teng and B. Taskin, "Resonant frequency divider design methodology for dynamic frequency scaling," in *Proc. Int. Conference on Computer Design (ICCD)*, Oct. 2013, pp. 479–482.



Ragh Kuttappa (S'15) received the Bachelor's degree from Visvesvaraya Technological University, India, in 2012 and the Master's degree from San Francisco State University, San Francisco, CA, in 2015. He is currently pursuing the Ph.D. degree from the Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA.

His current research interests include electronic design automation for VLSI, low-power circuits and resonant clocking. He has held intern positions at Samsung Austin Research Center (SARC) and Intel Labs in 2017 and 2021, respectively.



Longfei Wang (S'17-M'19) received the B.S. degree in electronic information engineering from Southwest Jiaotong University, Chengdu, China, in 2010, the M.S. degree in electrical engineering from Texas Tech University, Lubbock, TX, USA, in 2013, and the Ph.D. degree in electrical engineering from the University of South Florida, Tampa, FL, USA in 2018. He was a Postdoctoral Associate at the University of Rochester, Rochester, NY, USA in 2019. He is currently with Qualcomm Technologies Inc., San Diego, CA, USA. His current research interests include on-chip voltage regulation and power management.



**Selçuk Köse** (S'10-M'12) received the B.S. degree in electrical and electronics engineering from Bilkent University, Ankara, Turkey, in 2006 and the M.S. and Ph.D. degrees in electrical engineering from the University of Rochester, Rochester, NY, USA, in 2008 and 2012, respectively.

He was with TUBITAK, Ankara, Intel Corporation, Santa Clara, CA, USA, and Freescale Semiconductor, Tempe, AZ, USA. He is currently an Associate Professor at the Department of Electrical Engineering, University of Rochester, Rochester, NY, USA. Prior to joining University of Rochester, he was an Associate Professor at the University of South Florida, Tampa, FL, USA. His current research interests include integrated voltage regulation, 3-D integration, hardware security, and green computing.

Dr. Kose was a recipient of the NSF CAREER Award, the Cisco Research Award, the USF College of Engineering Outstanding Junior Researcher Award, and the USF Outstanding Faculty Award. He is an Associate Editor of the Springer Nature Computer Science and Elsevier Microelectronics Journal. He has served on the Technical Program and Organization Committees of various conferences.



**Baris Taskin** (S'01-M'05-SM'12) received the B.S. degree in electrical and electronics engineering from Middle East Technical University, Ankara, Turkey, in 2000, and the M.S. and Ph.D. degrees in electrical engineering from the University of Pittsburgh, Pittsburgh, PA, in 2003 and 2005, respectively.

Dr. Taskin is currently a Professor of Electrical and Computer Engineering at Drexel University, Philadelphia, PA. His current research interests include electronic design automation for VLSI, low-power circuits, resonant clocking, clock network synthesis, wireless IC interconnects, and networks-on-chip for chip multi-processors.

Dr. Taskin is a recipient of a number of awards for his research and professional contributions, including the National Science Foundation Faculty Early Career Development (NSF CAREER) Award in 2009, an ACM SIGDA Distinguished Service Award in 2012 and the Delaware Valley Young Electrical Engineer of the Year Award from the IEEE Philadelphia Section in 2013. Dr. Taskin was the Chair of the IEEE Circuits and Systems Society's Technical Committee on VLSI Systems and Applications (IEEE CAS VSA-TC) 2018-2020.