

# High-Speed Phase-Based Computing

Nicholas Sica\*, Ragh Kuttappa†, Vinayak Honkote†, Baris Taskin\*

\*Drexel University, USA †Intel Labs, USA

Email: {njs82, bt62}@drexel.edu

**Abstract**—This work presents the utilization of rotary traveling wave oscillators (RTWOS) to implement an Ising machine. Ising machines utilizing ring oscillators have recently been demonstrated on silicon, for instance, for the solution of a max-cut problem. Rotary traveling wave oscillators scale better in frequency compared to ring oscillators, but have increased power consumption. Phase-based computing principles, implemented with the proposed RTWO-based Ising machines, are prime for high speed phase-based computation. The experiments reveal the proposed RTWO-based Ising machines provide significant reduction (5x) in runtime in the solution of the max-cut problem. The power dissipation is two orders of magnitude higher than the minuscule, low power ring-oscillators but RTWO-based Ising machines sub-linear increase with the demonstrated frequency increase from 2GHz to 32GHz for high speed phase-based computing. The accuracy of the solution is significantly improved as well, as demonstrated with respect to two of the D-Wave solvers (tabu and simulated annealing) acting as the baseline for ring oscillator and RTWO based Ising machines.

## I. INTRODUCTION

Solving combinatorial optimization problems (e.g. NP-complete problems) on a classical computer suffers from an exponential increase in complexity [1]. A major goal of researching alternative computing paradigms is to reduce this complexity, with direct implications on critical fields such as health sciences and security. Quantum computing is one such alternative. The implementation of quantum computers is a costly process partially due to the operating conditions requiring an operating temperature near absolute zero which necessitates complex cooling requirements [2]. Another drawback is the spin states of quantum computers being implemented with technologies that are not standard manufacturing processes.

Computing paradigms that behave like quantum computing, without qubits, have been shown to be possible with phase-based computing devices implemented as Ising machines [3]. The phase of a device, such as that of an oscillator, is used to model the qubit operation, and systems are integrated in the form of an Ising machine. A number of NP-complete problems have been mapped to Ising machines [3], [4]. A silicon implementation of a ring oscillator (ROSC) based Ising machine is presented in [5]. The results in [5] are encouraging, providing a solution for a 1968 node max cut problem that is within 89 - 100% of the (golden, software-based) D-Wave tabu solver in 50 ns. The power dissipation of this Ising machine implementation is also low at a measured total dissipation of 42 mW [5].

The discussion in this paper is limited to the max cut problem (and associated topology/interconnects) to focus on the proposed RTWO implementation of phase-based computing,



Fig. 1: Illustration of an Ising machine with 12 nodes connected in a king's graph topology

i.e. the RTWO-based Ising machine. The major impacts of RTWOS on traditional application of clocking, i.e. scalability in size and performance, are leveraged on the proposed RTWO-based Ising machine, enabling high-speed phase-based computing.

## II. TECHNICAL BACKGROUND

The proposed RTWO-based phase based computing is presented on a technical background encompassing three primary areas: 1) Phase-based computing, 2) A demonstrative NP-complete problem solved with this computing principle and 3) The ROSC phased-based computing system in [5].

### A. Phase-Based Computing

While a classical computer bit is assigned with a boolean value, a qubit of a quantum computer can be a zero, one, or a superposition of both [2]. Phase-based computing emulates the operating principles of a quantum computer, but using CMOS-compliant components and phases. The spins of qubits are represented by the phase of an oscillator. For instance, the spins of -1 and 1 correspond to phases of 0 and 180 degrees, respectively. Phase-based computer is realized by adjacently placed or interconnected oscillator phases settling in relation to each other as in Fig. 1. These coupled oscillators naturally settle towards the lowest energy state modeled by the Ising Hamiltonian equation [3]. A simplified version of the Ising Hamiltonian equation is shown in (1).

$$H(s) = - \sum_{1 < i < j < n} J_{ij} s_i s_j \quad (1)$$

In (1), the total energy of the system is represented as a multiplication of the node states,  $s_i$  and  $s_j$  being -1 or 1 based on settling in one phase group or another, respectively, and the coupling,  $J_{ij}$ , between the two node states  $s_i$  and  $s_j$ . In the illustration of an Ising machine in Fig. 1, each node represents an  $s_i$  and  $s_j$  node, with -1 and 1 represented as  $0^\circ$  and  $180^\circ$  of phase, respectively. The weight of the edge between the nodes represent the coupling value  $J_{ij}$ . The Ising machine operation is to minimize the magnitude of the energy as much as possible, which corresponds to minimizing the Ising Hamiltonian equation by solving for all node states  $s_i$ .

There are multiple types of topologies that can be used when designing the Ising machine. At least three different topologies have previously been used in literature: 1) all-to-all [6], 2) chimera [7], and 3) king's graph [5] topologies. An all-to-all topology connects all nodes to each other and is the most flexible, but also uses the most resources. A chimera topology uses a hexagonal layout to connect hexagonally adjacent nodes together in the shape of a honeycomb. A king's topology is a set of nodes that is connected in all cardinal and ordinal directions, similar to how a king piece would move on the chess board. The example in Fig. 1 is a King's graph topology connected Ising machine of 12 nodes.

Software as well as hardware implementation of phase-based computing exhibit sub-optimal behavior. A software implementation, D-Wave [8], for instance, have multiple solvers that lead to non-identical results, due to heuristics that are developed for scalability. Hardware implementations shown sub optimality due to hardware characteristics such as the presence of parasitics and system variations. In particular, instead of settling down at the theoretically optimal solution of the minimized Ising Hamiltonian equation in (1), a hardware implementation can settle down in a local minima state. In these cases, different techniques such as sub-harmonic injection locking (SHIL) or noise injection are utilized to (ideally) guide the system towards the global minima. Sub-harmonic injection locking [9] is the first of two reviewed faculties to help the system into binarized final phases. Sub-harmonic injection locking takes advantage of the harmonics in a system to lock the phases at a certain offset, usually 0 or  $180^\circ$  degrees. A simplified version of the generalized Adler's equation at equilibrium derived in [3] from the full generalized Adler's equation allows us to quantify the effects of certain frequencies on the system:

$$\frac{\omega_1 - \omega_0}{\omega_0} = c(\phi^* - \phi_{in}) \quad (2)$$

The  $\omega_0$  and  $\omega_1$  stand for the natural frequency of the system and the frequency of the perturbation, respectively. On the right-hand side,  $c(\phi^* - \phi_{in})$  is a function,  $c(t)$ , a cross-correlation of the perturbation vector and perturbation projection vector, both  $2\pi$ -periodic [3]. The parts of the input to the function  $\phi^*$  and  $\phi_{in}$  are the phase in equilibrium, or the final phase, and the perturbation phase, respectively. This information is used to quantify whether the output of a perturbation is periodic and the base harmonic of the perturbation. A frequency of double the system frequency is typically used to injection lock oscillators

due to the equation simplifying to a  $\pi$ -periodic system [3]. As a differentiator, RTWOs also benefit from fractional frequencies for injection locking [10]. The second of the two reviewed faculties to help escape local minima is the use an external source of noise to push oscillators towards binarized states as used in [5].

### B. Max Cut

The max cut problem is an NP-complete problem [1] that has applications in areas that include VLSI routing [11]. The max cut of a set of nodes is any cut that is at least the size of any other cut in a graph [1]. An edge that is cut adds the edge weight to the total cut size. The utilization of an Ising machine to solve a max-cut problem has been shown in literature [3]. The mapping of the max cut problem is a one-to-one mapping between the nodes of the input graph to the nodes of the Ising machine. The coupling of the nodes in the graph are captured by the weights of the interconnects between the nodes of the Ising machine, for instance, through sizing of the interconnects.

An illustrative example is presented in Fig. 1. The phase value (i.e. defining  $s_i$  as -1 or 1 for  $0^\circ$  or  $180^\circ$ , respectively) of each node is presented in the figure. Through interaction of these nodes through the weighted edges, and subject to the minimization of the energy of the system as defined by (2), the Ising machine converges to a final solution. In the final solution, each node  $s_i$  settles as a phase value of  $0^\circ$  or  $180^\circ$ . In example Fig. 1, all existing edges have a unity weight for simplicity. When the size of the graph grows, the optimal solutions are harder to obtain, and sub-optimal solutions can be obtained through the hardware (and software) solutions.

### C. Ring Oscillator (ROSC) Phase-Based Computing

Due to its simplicity and size, ring oscillators find use in a number of application areas, such as testing [12] and security [13], where speed is not as critical and the demand for oscillation quality is not high. In [5], the use of ring oscillators as the nodes of an Ising machine for phase-based computing is presented.

The proposed design in [5] is 1968 nodes implemented in a king's graph topology. The design in [5] models the coupling with weights -2, -1, 1, 2, implemented with the number of transmission gates and the locations on the ring oscillator. The problem being solved is the max cut problem with noise injected to avoid local minima. The results are compared against D-Wave tabu solver as the golden model. The reported accuracy is 89% to 100% with a total power of the hardware at 42 mW. The operating frequency is 1 GHz, converging to a solution around 50 ns. Further implementation details and results are in [5], some of which are revisited in experimental comparison with the proposed RTWO-based Ising machine in Section V.

## III. RTWO PHASE-BASED COMPUTING

RTWOs (rotary traveling wave oscillators) are self-resonating oscillators that utilize the parasitics of mobius-shaped metal lines and the back-to-back inverters between the differential lines to sustain the traveling wave [15] as shown in Fig. 2.



Fig. 2: RTWO Circuit Diagram [14]

RWTOS boast improved frequency and performance scalability [16]. Unlike traditional (e.g. LC-based, ROSC, etc.) oscillators, RTWOS are unique in benefiting from a sub-linear increase in power dissipation as the frequency increases [17]. Increasing the frequency with the proposed RTWO-based Ising machine allows the system to settle to a steady state faster. In addition to higher operating frequencies being feasible at high GHz ranges where ring oscillators struggle with signal integrity, the proposed RTWO-based Ising machines benefit from the sub-linear power increase with frequency scaling. The tapping scheme takes advantage of the innate differential properties of the transmission lines on the rings. Overall, the scalability of the proposed RTWO-based Ising machines are superior, the design style is inherently suitable for phase-based computing, and the (transmission gate based for weights) interconnects between nodes, as well as the topologies, are identical to ROSC-based Ising machines, for simplicity and comparability.

#### IV. EXPERIMENTAL SETUP

The operating characteristics of the proposed RTWO-based phase computing implementation are investigated with circuit simulations. The simulation models of RTWO are computed with Cadence EMX, following the example set in previous work [17], where FEM models had been used to accurately capture the RTWO operating characteristics. A 65 nm GP PDK is selected for the cells (i.e. inverters in RTWO, transmission gates in interconnects) and to model the interconnects.

Two arbitrary max cut problem sizes of 1968 and 3900 nodes are selected in experimentation. The max cut problem with 2750 nodes is composed of  $55 \times 50$  nodes in a king's graph topology, identical to that in [5]. The max cut problem with the total of 3900 nodes demonstrated the size-wise scalability of the proposed RTWO phase-based system.

In addition to the exploration of the number of nodes, presented experiments also explore the affects of varying the frequency of operation from 3 GHz to an arbitrarily large frequency of 24 GHz. Note that the highest frequency of operation reported in the ROSC-based hardware implementation is 1 GHz in [5]. Frequencies in that range (i.e. low GHz to MHz) are feasible with RTWO frequency dividers [14], but not of interest due to the high performance of ROSC-based Ising

TABLE I: Different D-Wave Solvers and RTWO-based Ising Machine Max Cut Sizes (and Normalized Comparisons)

|            | D-Wave Tabu  | D-Wave SA | This Work    |
|------------|--------------|-----------|--------------|
| 1968 Nodes | 10038 (0.88) | 11374 (1) | 10459 (0.92) |
| 2750 Nodes | 14178 (0.89) | 15984 (1) | 14842 (0.93) |
| 3900 Nodes | 15380 (0.68) | 22784 (1) | 21213 (0.93) |

machines at these ranges. The higher frequencies of operation, selected in this work, are instrumental to explore the benefits of frequency scaling, made possible with the integration of RTWOS into phase-based computing.

In experimentation, the RTWOS have four close tapping points using the innate differential lines for each pair. Four tapping points on the RTWO rings are selected for coupling in or out of phase with multiple weights, from -2 to 2. Sub-harmonic injection locking is performed from an auxiliary RTWO source that is not part of the Ising machine nodes. There is no noise injection performed, unlike those in [5] on silicon, in the experiments in order to preserve the reproducibility of the simulation results.

Following the example set previously [5], the software solution by D-Wave is used as a golden model for the result of the max cut problems. A major difference with [5] is the selection of the solver within D-Wave, the tabu solver is the D-Wave solver of choice in [5]. In this paper, both tabu and simulated annealing (SA) D-Wave solvers are used. The use of the simulated annealing solver was necessary due to the observation that tabu solver produces results that are about 32% worse in the largest circuit, compared to the simulated annealing solver. The discrepancy in the solutions of the tabu and SA solvers, and the inferiority of tabu solver to SA solver when the designs were scaled up are shown in TABLE I. Proposed RTWO-based Ising machines produce max cut sizes that are consistently, and up to 32% higher (21213 vs 15380 for the largest max cut problem circuit with 3900 nodes) when compared to the tabu solver with increasing size of the graph. Consequently in this work, the simulated annealing solver is used as the golden model to compare the results of the RTWO-based Ising machine design. The RTWO-based Ising machines proposed in this work demonstrate accuracies that are 92 - 93% of the D-Wave simulated annealing solutions. The D-Wave tabu and proposed RTWO-based Ising solutions are normalized to the highest cut sizes computed by D-Wave SA method in TABLE I, demonstrating the relative accuracy of the techniques.

#### V. SIMULATION RESULTS AND ANALYSIS

The RTWO-based Ising machine designs were simulated using Cadence's spectre simulator at different frequencies and topology sizes. Transient simulation data was gathered for 25 ns and graphed on a plot with the x-axis as time and the y-axis as cut size in Fig. 3. From left to right the waveforms are for 24 GHz, 12 GHz, 6 GHz, and 3 GHz. All four waveforms show the Ising machine's cut size largely increasing with respect to time until the cut size settles around a peak value.



Fig. 3: RTWO-based Ising machines at 3 GHz, 6 GHz, 12 GHz, 24 GHz solving the 2750 node max cut problem

TABLE II: Impact of Frequency on RTWO-based Ising Machines

|                   | 3 GHz   | 6 GHz   | 12 GHz  | 24 GHz  |
|-------------------|---------|---------|---------|---------|
| Average Cut Size  | 14701   | 14657   | 15006   | 14842   |
| Measured Accuracy | 91.97%  | 91.70%  | 93.88%  | 92.85%  |
| Time to Solution  | 24 ns   | 20 ns   | 14 ns   | 10 ns   |
| Average Power     | 21.34 W | 20.38 W | 18.44 W | 17.48 W |

TABLE III: Comparison with Previous Literature

|                | This Work | [5]      | [4]          | [7]     |
|----------------|-----------|----------|--------------|---------|
| Technology     | 65 nm GP  | 65 nm LP | 40 nm        | 65 nm   |
| Spins          | 2750      | 1968     | 60000        | 560     |
| Solution Time  | 10 ns     | 50 ns    | 22 $\mu$ s   | 200 ns  |
| Average power  | 17.48 W   | 42 mW    | Not Reported | 23 mW   |
| Accuracy       | 91-94%    | 89-100%  | 98.8%        | 82-100% |
| Golden Results | SA        | Tabu     | Not Reported | Tabu    |

The max cut value, i.e. the accuracy of the Ising machine, does not significantly vary for different frequencies of the RTWO-based Ising machine. The average cut sizes are 14701, 14657, 15006, 14842 for 3 GHz, 6 GHz, 12 GHz, and the golden solution (i.e. D-Wave SA solver), respectively. This is computed without any post-processing; which can be applied, if desired, as was performed in [5].

Increasing the frequency of the RTWO-based Ising machines significantly changes the time to solution as shown in TABLE III. In particular, the fastest implementation at 24GHz reaches at the max cut solution faster (10ns) than the other solutions (14 ns to 24 ns). The time to solution is shorter as the clock speed increases due to the dependence on the number of cycles to settle. For every 2x frequency increase in this design, there is a consistent decrease in solution time, observed as a 4 ns to 6 ns decrease in the simulations. This monotonic decrease in the time to solution with increasing frequency of operation of the RTWO is one of the major benefits of the proposed RTWO-based Ising machines and is very promising

for further scaling. TABLE II also shows that each increase in frequency keeps average power steadily decreasing, with a reduction of about 1 W to 2 W per 2x frequency increase.

A comparison of the proposed RTWO-based Ising machine results to some other Ising machine solutions is presented in TABLE III. Compared to the ring oscillator designs [4], [5], [7], the power dissipation is significantly higher. The difference is two magnitudes of order, from 42 mW [5] and 23 mW [7] as compared to 17.48 W power consumed by 2750 RTWOS in RTWO-based Ising machine. This discrepancy is due to the minuscule form factor and associated operation of the ring oscillators as compared to RTWO rings, as well as the low operating frequency of the ROSCs. The limitation of ROSCs are in frequency scaling: RTWO-based Ising machines have a frequency that is up to 24x higher. Partially thanks to this increase in operating frequency, the time to solution of the RTWO-based Ising machine benefits from an 80% and 95% decrease compared to [5] and [7], respectively. The solution time is only 10 ns for the largest 2750 node max cut problem.

## VI. CONCLUSIONS

This implementation is the first rotary traveling wave oscillator based Ising machine and is the fastest implementation in literature in terms of operating frequency and time to solution. In particular, a 24 GHz RTWO phase-based computer was simulated using a 65 nm technology to solve a max-cut problem on a king's graph of size 2750 in 10 ns. The two major drawbacks of rotary traveling wave oscillators with respect to ring oscillator-based Ising machines in literature are speed and power as RTWOS operate at higher power. Conversely, RTWOS provide unmatched high frequency operation while benefiting from a sub-linear increase in power/area.

## ACKNOWLEDGMENTS

This material is based upon work supported by the National Science Foundation under Grant No. 1409014 and the Graduate Research Fellowships Program.

## REFERENCES

[1] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, *Introduction to Algorithms*, fourth edition ed. The MIT Press, 2022.

[2] F. Jazaeri, A. Beckers, A. Tajalli, and J.-M. Sallese. (August 2019) A Review on Quantum Computing: Qubits, Cryogenic Electronics and Cryogenic MOSFET Physics. [Online]. Available: <http://arxiv.org/abs/1908.02656>

[3] T. Wang and J. Roychowdhury, "OIM: Oscillator-Based Ising Machines for Solving Combinatorial Optimisation Problems," in *Unconventional Computation and Natural Computation*, I. McQuillan and S. Seki, Eds. Springer International Publishing, 2019, vol. 11493, pp. 232–256. [Online]. Available: [http://link.springer.com/10.1007/978-3-030-19311-9\\_19](http://link.springer.com/10.1007/978-3-030-19311-9_19)

[4] T. Takemoto, M. Hayashi, C. Yoshimura, and M. Yamaoka, "A  $2 \times 30$ k-Spin Multi-Chip Scalable CMOS Annealing Processor Based on a Processing-in-Memory Approach for Solving Large-Scale Combinatorial Optimization Problems," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 1, pp. 145–156, January 2020. [Online]. Available: <https://ieeexplore.ieee.org/document/8906059/>

[5] W. Moy, I. Ahmed, P.-w. Chiu, J. Moy, S. S. Sapatnekar, and C. H. Kim, "A 1,968-node coupled ring oscillator circuit for combinatorial optimization problem solving," *Nature Electronics*, vol. 5, no. 5, pp. 310–317, May 2022. [Online]. Available: <https://www.nature.com/articles/s41928-022-00749-3>

[6] H. Lo, W. Moy, H. Yu, S. Sapatnekar, and C. H. Kim, "An Ising solver chip based on coupled ring oscillators with a 48-node all-to-all connected array architecture," *Nature Electronics*, vol. 6, no. 10, pp. 771–778, August 2023. [Online]. Available: <https://www.nature.com/articles/s41928-023-01021-y>

[7] I. Ahmed, P.-W. Chiu, W. Moy, and C. H. Kim, "A Probabilistic Compute Fabric Based on Coupled Ring Oscillators for Solving Combinatorial Optimization Problems," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 9, pp. 2870–2880, September 2021. [Online]. Available: <https://ieeexplore.ieee.org/document/9377463/>

[8] F. Glover, G. Kochenberger, and Y. Du, "A Tutorial on Formulating and Using QUBO Models," November 2018. [Online]. Available: <https://arxiv.org/abs/1811.11538>

[9] A. Neogy and J. Roychowdhury, "Analysis and design of sub-harmonically injection locked oscillators," in *Design, Automation & Test in Europe Conference & Exhibition (DATE)*. IEEE, March 2012, pp. 1209–1214. [Online]. Available: <http://ieeexplore.ieee.org/document/6176677/>

[10] Z. Bai, X. Zhou, and R. Mason, "A novel Injection Locked Rotary Traveling Wave Oscillator," in *IEEE International Symposium on Circuits and Systems (ISCAS)*, June 2014, pp. 1768–1771. [Online]. Available: <http://ieeexplore.ieee.org/document/6865498/>

[11] Y. Zhang, Y. Deng, Y. Lin, Y. Jiang, Y. Dong, X. Chen, G. Wang, D. Shang, Q. Wang, H. Yu, and Z. Wang, "Oscillator-Network-Based Ising Machine," *Micromachines*, vol. 13, no. 7, p. 1016, June 2022. [Online]. Available: <https://www.mdpi.com/2072-666X/13/7/1016>

[12] K. S.-M. Li, "Oscillation and Transition Tests for Synchronous Sequential Circuits," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 21, no. 12, pp. 2338–2343, December 2013. [Online]. Available: <http://ieeexplore.ieee.org/document/6409488/>

[13] Y. Cao, X. Zhao, W. Zheng, Y. Zheng, and C.-H. Chang, "A New Energy-Efficient and High Throughput Two-Phase Multi-Bit per Cycle Ring Oscillator-Based True Random Number Generator," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 69, no. 1, pp. 272–283, January 2022. [Online]. Available: <https://ieeexplore.ieee.org/document/9455912/>

[14] R. Kuttappa, S. Lerner, L. Filippini, and B. Taskin, "Low Swing — Low Frequency Rotary Traveling Wave Oscillators," in *IEEE International Symposium on Circuits and Systems (ISCAS)*. IEEE, May 2019, pp. 1–5. [Online]. Available: <https://ieeexplore.ieee.org/document/8702782/>

[15] J. Wood, T. Edwards, and S. Lipa, "Rotary traveling-wave oscillator arrays: A new clock technology," *IEEE Journal of Solid-State Circuits*, vol. 36, no. 11, pp. 1654–1665, November 2001. [Online]. Available: <http://ieeexplore.ieee.org/document/962285/>

[16] R. Kuttappa, B. Taskin, S. Lerner, and V. Pano, "Resonant Clock Synchronization With Active Silicon Interposer for Multi-Die Systems," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 68, no. 4, pp. 1636–1645, April 2021. [Online]. Available: <https://ieeexplore.ieee.org/document/9360308/>

[17] V. Honkote, A. More, and B. Taskin, "3-D Parasitic Modeling for Rotary Interconnects," in *International Conference on VLSI Design*, January 2012, pp. 137–142. [Online]. Available: <http://ieeexplore.ieee.org/document/6167742/>