# **Energy-Efficient Adiabatic MTJ/CMOS-based CLB for Non-Volatile FPGA**

Milad Tanavardi Nasab, Wu Yang, and Himanshu Thapliyal

Abstract— High flexibility, infinite reconfigurability, and fast design-to-market make FPGAs a promising platform for modern applications, such as IoT, medical, and automotive applications. Energy and area limitations are challenging in these applications since many of them have limited power and hardware resources. Accordingly, the energy- and area-efficient design of FPGAs is of great importance. In this paper, an adiabatic non-volatile hybrid CMOS/MTJ logic-in-memory-based configurable logic block (CLB) is proposed and compared to its state-of-the-art counterparts. The simulation results show that the proposed design has 98%, 98%, 97%, 97%, 96%, and 92% lower energy consumption compared to its CMOS counterpart for frequencies of 1, 2.5, 5, 10, 20, and 40 MHz, respectively. Also, compared to its adiabatic counterparts, the proposed design has at least 74%, 70%, 69%, 69%, and 46% lower energy consumption for frequencies of 1, 2.5, 5, 10, and 20 MHz, respectively. In addition, the proposed design has at least 74% fewer transistors than its counterparts. Furthermore, the energy saving of the proposed design is almost constant across different tunnel magnetoresistance (TMR) values, and the proposed design keeps its superiority in energy saving over its counterparts for different power supply voltages.

#### I. INTRODUCTION

Infinite reconfigurability and fast design-to-market make FPGAs a promising option to be utilized in modern applications such as IoT, AI, and medical applications [1-8]. Although these features of FPGAs provide higher computing power and flexibility, the aforementioned applications are facing limited power resources [9-11]. Accordingly, designing energy-efficient FPGAs is of great importance [12]. Utilizing novel architectures like logic-in-memory (LiM), emerging technologies like magnetic tunnel junctions (MTJ), and adopting adiabatic-based circuits as an energy-efficient approach are among the levers that can be used for designing energy-efficient FPGAs, providing a platform for hardware implementation of the target applications [12-14].

Data transfer between the memory and processing unit in conventional architectures imposes high power consumption on the system. Accordingly, novel architectures like LiM are designed based on migrating the processing unit into the memory [15]. The configurable logic block (CLB) is the main component of FPGAs and, in SRAM-based FPGAs, consists of CMOS-based look-up tables (LUT) alongside SRAM cells. Although the SRAM cells in FPGAs are distributed throughout the chip, no stored data is processed in these memories. By combining memory and logic, the energy and area overhead will be reduced.

On the other hand, the configuration of the FPGA is stored in these memories. Due to the volatility of SRAM cells in SRAM-based FPGAs, the configuration of the FPGA will be lost in case of a power supply disconnection. Consequently, an external memory is needed to save the configuration permanently and reconfigure the FPGA after each power supply disconnection which increases the area overhead and energy consumption. Accordingly, nonvolatile devices such as MTJs are a promising alternative for volatile memory, to be utilized in LiM architectures [16]. Using MTJs in LiM architecture can significantly decrease energy consumption and area overhead which is vital in resource-constrained applications.

Adiabatic circuits use charge recovery to effectively reduce energy dissipation [10, 16]. The charge recovery is done by recycling the stored charge to the power supply. Adiabatic circuits show energy savings in low frequencies. Since target applications are usually low-frequency, adiabatic logic is a promising candidate for designing low-frequency FPGAs for these applications [17].

Previously, the authors have proposed an adiabatic-CMOS-based LUT and three different adiabatic memory cells for designing an adiabatic-based CLB in [12]. The results show significant energy saving compared to the CMOS counterpart. Although the results are promising, the volatility of the memory cells and separate circuits for memory cells and LUT impose higher energy consumption.

Since the CLB is the main component of FPGAs and it consists of memory cells and a LUT, in this paper an adiabatic non-volatile MTJ/CMOS LiM-based CLB has been proposed. The proposed design uses non-volatile MTJs to eliminate the need for external memory. In addition, by using the novel LiM architecture and combining the memory cell with the LUT, the area overhead and energy dissipation are reduced. Furthermore, by using adiabatic-based logic, the dynamic energy dissipation is reduced significantly.

The rest of the paper is organized as follows: a brief background is presented in Section II. In Section III the proposed design is described. Section IV investigates and compares the functionality of the proposed design and its simulation results to its state-of-the-art counterparts. Finally, Section V concludes the paper.

This work was supported in part by the National Science Foundation CAREER Award under Grant 2232235.

Milad Tanavardi Nasab and Wu Yang are with the Department of Electrical Engineering and Computer Science, University of Tennessee Knoxville, Knoxville, TN, USA.



Figure 1. The proposed design: a) Block diagram, b) Core circuit, c) MTJ-based function tree

## II. BACKGROUND

## A. Magnetic Tunnel Junction

An MTJ device consists of two ferromagnetic layers, one with a fixed magnetic orientation (the fixed layer) and one with a changeable magnetic orientation (the free layer), separated by a thin oxide layer. Depending on the relative magnetic orientation of the two layers, the MTJ is in one of two states: parallel or antiparallel. In the antiparallel (parallel) state, the magnetic direction of the free layer is opposite to (the same as) that of the fixed layer, and the MTJ shows a relatively high (low) resistance, denoted $R_{AP}$ ($R_P$). The difference between the high and low resistance of the MTJ is parameterized by the tunnel magnetoresistance ratio (TMR), which is calculated using Eq. 1. It is noteworthy that higher TMR values increase the reliability of MTJ-based circuits [18, 19].

$$TMR = \frac{R_{AP} - R_P}{R_P} \tag{1}$$
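As a short worked instance of Eq. 1, rearranging for the antiparallel resistance shows how far apart the two states are. The value TMR = 2 is the setting used for most of the simulations reported in this paper; the rearrangement itself is only algebra on Eq. 1.

$$R_{AP} = (1 + TMR)\,R_P \quad\Rightarrow\quad R_{AP} = 3R_P \ \text{ for } \ TMR = 2$$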

The other parameter that needs to be considered during the design process of MTJ-based circuits is retention time. Retention time is dependent on the physical parameters of the MTJ [3, 20]. The designer should use the proper size for MTJs to guarantee the retention time needed for the target applications. The retention time of MTJs is calculated using Eq. 2.

$$\tau = \tau_0 \exp(\Delta), \quad \tau_0 = 1\ \text{ns}, \quad \Delta = \frac{H_k M_s A_r t}{2k_B T} \tag{2}$$

In Eq. 2, $\Delta$ is the thermal stability, $H_k$ is the uniaxial anisotropy, $M_s$ is the saturation magnetization, $A_r$ is the area of the MTJ, $t$ is the thickness of the free layer, $k_B$ is the Boltzmann constant, and $T$ is the system temperature.

TABLE I. CRITICAL PARAMETERS OF CMOS TRANSISTORS AND MTJS

| Description             | Value             |
|-------------------------|-------------------|
| **NMOS**                |                   |
| Gate length             | 60 nm             |
| Gate width              | 200 nm            |
| Number of fingers       | 1                 |
| **PMOS**                |                   |
| Gate length             | 60 nm             |
| Gate width              | 400 nm            |
| Number of fingers       | 1                 |
| **MTJ**                 |                   |
| MTJ surface             | 60 nm × 60 nm     |
| Oxide barrier thickness | 0.85 nm           |
| Free layer thickness    | 2 nm              |
| Resistance area product | 10<sup>-11</sup>  |

TABLE II. REFERRING NAMES OF THE COUNTERPARTS PROPOSED IN [12]

| Referring name | Design                                  |
|----------------|-----------------------------------------|
| CMOS design    | CMOS LUT with CMOS SRAM                 |
| CP1            | Adiabatic LUT with 14T adiabatic memory |
| CP2            | Adiabatic LUT with 16T adiabatic memory |
| CP3            | Adiabatic LUT with 12T adiabatic memory |

For applications like FPGA design, in which the values of the memory (the configuration of the FPGA) need to be stored for a long time, a storage-class retention time for the MTJs is needed. In order to have a storage-class MTJ, the thermal stability of the MTJ needs to be greater than 75 [21-23].
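To make the storage-class threshold concrete, the following minimal sketch evaluates Eq. 2 for a given thermal stability. It assumes only what Eq. 2 states ($\tau_0 = 1$ ns); the function name and the year conversion are illustrative choices, and the two sample values of $\Delta$ are the threshold of 75 and the value of 77.4 reported for the MTJs used in this paper.

```python
import math

TAU_0_S = 1e-9                         # attempt time tau_0 = 1 ns, as given with Eq. 2
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # illustrative conversion factor


def retention_time_s(delta: float, tau_0_s: float = TAU_0_S) -> float:
    """Retention time tau = tau_0 * exp(delta), in seconds (Eq. 2)."""
    return tau_0_s * math.exp(delta)


for delta in (75.0, 77.4):
    tau_s = retention_time_s(delta)
    print(f"delta = {delta}: tau = {tau_s:.2e} s ({tau_s / SECONDS_PER_YEAR:.2e} years)")
```

Because the dependence is exponential, rounding $\Delta$ in the first decimal place already changes $\tau$ by several percent, so the printed figures should be read as orders of magnitude.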

## B. Adiabatic Logic

Adiabatic-based circuits are designed to minimize energy consumption. By using a gradually rising and falling power clock signal, non-adiabatic energy dissipation will be minimized, and electrical charges can be recovered from the capacitive load into the power supply [12, 16]. A constant current source is needed for ideal adiabatic switching. In this case, the energy dissipation can be calculated using Eq. 3.

$$E_{diss} = \frac{RC}{T}CV_{DD}^2 \tag{3}$$

where $T$ is the charging/discharging time of the capacitive load, $R$ is the electrical resistance of the charging path, $C$ is the capacitance of the load driven by the power clock, and $V_{DD}$ is the full-swing voltage [24-26]. A conventional CMOS circuit dissipates $\frac{1}{2}CV_{DD}^2$ per transition regardless of $T$, so according to Eq. 3 the adiabatic circuit dissipates less energy than its CMOS counterpart whenever $T$ is greater than $2RC$. This energy saving makes adiabatic-based circuits a promising candidate for low-frequency applications such as IoT edge devices.
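The crossover at $T = 2RC$ can be checked numerically. The sketch below compares Eq. 3 with the conventional switching loss $\frac{1}{2}CV_{DD}^2$; the resistance and capacitance values are hypothetical and chosen only for illustration, and the function names are not from the paper.

```python
def adiabatic_energy_j(r_ohm: float, c_f: float, t_s: float, vdd_v: float) -> float:
    """Energy dissipated by ideal adiabatic charging, Eq. 3: (RC/T) * C * V_DD^2."""
    return (r_ohm * c_f / t_s) * c_f * vdd_v ** 2


def conventional_energy_j(c_f: float, vdd_v: float) -> float:
    """Energy dissipated by abrupt (conventional CMOS) charging: C * V_DD^2 / 2."""
    return 0.5 * c_f * vdd_v ** 2


# Hypothetical illustrative values, not taken from the paper.
R_OHM = 10e3   # resistance of the charging path
C_F = 10e-15   # load capacitance
VDD_V = 1.2    # full-swing voltage

rc_s = R_OHM * C_F
for multiple in (1, 2, 10, 100):
    t_s = multiple * rc_s
    ratio = adiabatic_energy_j(R_OHM, C_F, t_s, VDD_V) / conventional_energy_j(C_F, VDD_V)
    print(f"T = {multiple:>3} RC: E_adiabatic / E_conventional = {ratio:.2f}")
```

The ratio reduces to $2RC/T$, independent of $V_{DD}$, which is why slower power clocks (larger $T$) favor the adiabatic approach.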

## III. PROPOSED DESIGN

The proposed LiM non-volatile adiabatic MTJ-based CLB is shown in Fig. 1. In order to reduce the routing complexity and area overhead, the proposed design utilizes a 2-phase sinusoidal power clock (VPC). The proposed design consists of two MTJ-based function trees and a core circuit. The function trees are identical except for the states of the MTJs: the state of each MTJ in function tree 2 is complementary to that of its corresponding MTJ in function tree 1. Therefore, to store each bit of the FPGA's configuration, two MTJs with complementary states are needed.

The proposed design has two operation phases: evaluation and recovery. In the evaluation phase, VPC rises from GND toward $V_{DD}$, and the inputs of the function trees select the corresponding paths and MTJs. At the beginning of the evaluation phase, the SENSE signal is set to logical '1'. The nodes Out and $\overline{Out}$ start to rise toward $V_{DD}$ as VPC rises, but the path between VPC and one of these nodes has a relatively higher resistance. Consequently, while the SENSE signal is '1', the node on the higher-resistance path has the lower voltage. As VPC keeps rising toward $V_{DD}$ and the voltage difference between Out and $\overline{Out}$ becomes large enough, the SENSE signal is reset to logical '0'; one of the two nodes then continues to rise toward $V_{DD}$ while the other is reset to '0'.

In the recovery phase, the SENSE signal is '0' and VPC falls from $V_{DD}$ toward GND. As VPC falls, the charge stored in the core circuit is recovered to the power supply through the PMOS transistors in the core circuit. Since the PMOS transistors cannot discharge a node completely, a residual charge remains on node Out or $\overline{Out}$. This residual charge is discharged through the corresponding function tree at the beginning of the next evaluation phase.
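The timing of the two phases can be sketched with an idealized waveform model. This is only an illustration under stated assumptions: a single raised-cosine power clock that starts each period at GND, an evaluation phase covering the rising half of the period, and a SENSE pulse that starts with the period and lasts 1/8 of it (the fraction used for the simulations in this paper). The function names are hypothetical, and the real circuit resets SENSE based on the voltage difference between Out and $\overline{Out}$ rather than on a fixed timer.

```python
import math


def power_clock_v(t_s: float, freq_hz: float, vdd_v: float = 1.2) -> float:
    """Idealized sinusoidal power clock swinging between GND and V_DD (starts at GND)."""
    return 0.5 * vdd_v * (1.0 - math.cos(2.0 * math.pi * freq_hz * t_s))


def phase(t_s: float, freq_hz: float) -> str:
    """Assumed mapping: rising half of the period is evaluation, falling half is recovery."""
    period_s = 1.0 / freq_hz
    return "evaluation" if (t_s % period_s) < 0.5 * period_s else "recovery"


def sense(t_s: float, freq_hz: float, width_fraction: float = 0.125) -> int:
    """SENSE is '1' for the first width_fraction of each period, '0' otherwise."""
    period_s = 1.0 / freq_hz
    return 1 if (t_s % period_s) < width_fraction * period_s else 0


FREQ_HZ = 20e6  # 20 MHz power clock, i.e. a 50 ns period
for t_ns in (0, 5, 10, 25, 40):
    t_s = t_ns * 1e-9
    print(f"t = {t_ns:>2} ns: VPC = {power_clock_v(t_s, FREQ_HZ):.3f} V, "
          f"phase = {phase(t_s, FREQ_HZ)}, SENSE = {sense(t_s, FREQ_HZ)}")
```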

#### IV. SIMULATION RESULTS

In this section, the simulation results of the proposed 16:1 adiabatic MTJ-based CLB are investigated and compared with those of its counterparts. The proposed design and its counterparts have been simulated in SPICE using Cadence Spectre, with the TSMC 65nm CMOS PDK and the MTJ model presented in [27-30]. Critical parameters of the transistors and MTJs are shown in Table I. The parameters used for the MTJs in this paper ensure storage-class thermal stability and retention time for the MTJ devices: the thermal stability and retention time of the MTJ devices in this paper are 77.4 and 4.3×10<sup>24</sup> seconds, respectively. Also, Table II shows the names used for referring to the different designs in [12] that serve as the counterparts of the proposed design. It is noteworthy that the letter "T" in Table II stands for transistor.

#### A. Frequency Sweep

Table III shows the energy consumption per cycle of the proposed design and its counterparts at different frequencies. In these simulations, the TMR of the MTJs and $V_{DD}$ have been set to 2 and 1.2 V, respectively. The results in Table III show that the proposed design has 98%, 98%, 97%, 97%, 96%, and 92% lower energy consumption compared to its CMOS counterpart for frequencies of 1, 2.5, 5, 10, 20, and 40 MHz, respectively. These results show that the proposed design is a promising candidate to be utilized in energy-efficient FPGAs.

TABLE III. ENERGY CONSUMPTION (FJ/CYCLE) OF THE PROPOSED DESIGN AND ITS COUNTERPARTS

| Frequency (MHz) | CMOS design | CP1  | CP2  | CP3  | Proposed design |
|-----------------|-------------|------|------|------|-----------------|
| 1               | 632         | 53.1 | 62.7 | 54.7 | 13.78           |
| 2.5             | 260         | 21.9 | 25.6 | 22.4 | 6.324           |
| 5               | 136         | 11.3 | 13.3 | 11.6 | 3.412           |
| 10              | 73.6        | 6.09 | 7.20 | 6.28 | 1.86            |
| 20              | 42.5        | 3.45 | 4.15 | 3.60 | 1.85            |
| 40              | 26.7        | 2.20 | 2.73 | 2.33 | 2.21            |

TABLE IV. ENERGY CONSUMPTION (FJ/CYCLE) OF THE PROPOSED DESIGN AT DIFFERENT FREQUENCIES AND USING DIFFERENT TMR VALUES

| Frequency (MHz) | TMR=1 | TMR=1.5 | TMR=2  | TMR=2.5 | TMR=3  |
|-----------------|-------|---------|--------|---------|--------|
| 1               | 15.35 | 13.43   | 13.78  | 14.38   | 16     |
| 2.5             | 6.372 | 6.332   | 6.324  | 6.188   | 4.02   |
| 5               | 3.648 | 3.664   | 3.412  | 3.752   | 2.384  |
| 10              | 2.602 | 2.351   | 1.863  | 1.851   | 1.856  |
| 20              | 1.817 | 1.7905  | 1.8545 | 1.853   | 1.8505 |
| 40              | 2.102 | 2.156   | 2.2085 | 2.2287  | 2.2552 |



Figure 2. Energy saving of the proposed design and its adiabatic counterparts compared to their CMOS counterpart for different frequencies



Figure 3. Energy saving (fJ/cycle) of the proposed 16:1 adiabatic MTJ-based CLB compared to its CMOS counterpart for different TMR values and at different frequencies

Also, compared to its adiabatic counterparts, the proposed design has at least 74%, 70%, 69%, 69%, and 46% lower energy consumption for frequencies of 1, 2.5, 5, 10, and 20 MHz, respectively. At 40 MHz, only CP1 consumes less energy than the proposed design, by 0.4%. The energy saving of the proposed design and its adiabatic counterparts compared to their CMOS counterpart is shown in Fig. 2; for the proposed design, the saving is roughly 92% or higher at all simulated frequencies. These results show that the proposed design is a more promising candidate to be utilized in energy-efficient FPGAs than its counterparts.
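The percentages quoted for the frequency sweep can be reproduced from the Table III values. The sketch below is a plain recomputation, with "at least" taken to mean the comparison against the lowest-energy adiabatic counterpart at each frequency; the dictionary layout and function name are illustrative.

```python
# Energy per cycle in fJ, copied from Table III (keys are frequencies in MHz).
ENERGY_FJ = {
    1:   {"CMOS": 632,  "CP1": 53.1, "CP2": 62.7, "CP3": 54.7, "Proposed": 13.78},
    2.5: {"CMOS": 260,  "CP1": 21.9, "CP2": 25.6, "CP3": 22.4, "Proposed": 6.324},
    5:   {"CMOS": 136,  "CP1": 11.3, "CP2": 13.3, "CP3": 11.6, "Proposed": 3.412},
    10:  {"CMOS": 73.6, "CP1": 6.09, "CP2": 7.20, "CP3": 6.28, "Proposed": 1.86},
    20:  {"CMOS": 42.5, "CP1": 3.45, "CP2": 4.15, "CP3": 3.60, "Proposed": 1.85},
    40:  {"CMOS": 26.7, "CP1": 2.20, "CP2": 2.73, "CP3": 2.33, "Proposed": 2.21},
}


def saving_percent(reference_fj: float, candidate_fj: float) -> float:
    """Energy saving of candidate relative to reference, in percent."""
    return 100.0 * (1.0 - candidate_fj / reference_fj)


for freq_mhz, row in ENERGY_FJ.items():
    vs_cmos = saving_percent(row["CMOS"], row["Proposed"])
    best_adiabatic_fj = min(row[name] for name in ("CP1", "CP2", "CP3"))
    vs_adiabatic = saving_percent(best_adiabatic_fj, row["Proposed"])
    print(f"{freq_mhz:>4} MHz: {vs_cmos:5.1f}% vs CMOS, "
          f"{vs_adiabatic:5.1f}% vs best adiabatic counterpart")
```

A negative value in the last column means the best adiabatic counterpart consumes less energy than the proposed design at that frequency, which is the case only at 40 MHz.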

#### B. TMR Sweep

TMR is an important parameter of the MTJs, and its effects on the performance of the MTJ-based circuits need to be investigated. In order to investigate the effect of TMR on the proposed design, simulations have been run with different TMR values for the MTJs and a $V_{DD}$ of 1.2 V. Table IV shows the energy consumption of the proposed design for different TMR values at different frequencies, and Fig. 3 shows the energy saving of the proposed design compared to its CMOS counterpart. The results show that, at every target frequency, the energy saving relative to the CMOS counterpart changes by less than 2 percentage points across the TMR values. The effect of TMR is therefore negligible, and the energy saving of the proposed design is almost independent of the TMR values of the MTJs. This feature helps prevent undesirable functionality and performance caused by process variation. Also, being able to use different TMRs with almost the same energy consumption can be used to increase the reliability of the circuit.
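One reading of the "less than 2%" statement is that the energy saving relative to the CMOS counterpart moves by less than 2 percentage points as TMR changes. The sketch below checks that reading using the Table IV energies and the CMOS energies from Table III; the interpretation and the function name are assumptions made for illustration.

```python
# CMOS energy per cycle in fJ from Table III (keys are frequencies in MHz).
CMOS_FJ = {1: 632, 2.5: 260, 5: 136, 10: 73.6, 20: 42.5, 40: 26.7}

# Proposed-design energy per cycle in fJ from Table IV, for TMR = 1, 1.5, 2, 2.5, 3.
PROPOSED_FJ = {
    1:   [15.35, 13.43, 13.78, 14.38, 16],
    2.5: [6.372, 6.332, 6.324, 6.188, 4.02],
    5:   [3.648, 3.664, 3.412, 3.752, 2.384],
    10:  [2.602, 2.351, 1.863, 1.851, 1.856],
    20:  [1.817, 1.7905, 1.8545, 1.853, 1.8505],
    40:  [2.102, 2.156, 2.2085, 2.2287, 2.2552],
}


def saving_spread_points(freq_mhz: float) -> float:
    """Max minus min energy saving vs CMOS across TMR values, in percentage points."""
    savings = [100.0 * (1.0 - e / CMOS_FJ[freq_mhz]) for e in PROPOSED_FJ[freq_mhz]]
    return max(savings) - min(savings)


for freq_mhz in CMOS_FJ:
    print(f"{freq_mhz:>4} MHz: spread = {saving_spread_points(freq_mhz):.2f} points")
```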

#### C. Power Supply Voltage Sweep

Another important parameter that impacts energy consumption is the voltage of the power supply. Accordingly, the energy consumption of the proposed design and its counterparts for different power supply voltages has been investigated. It is noteworthy that a TMR of 2 and a frequency of 20 MHz have been applied in these simulations. The energy consumption results of the proposed design and its counterparts are shown in Table V. Also, the energy saving of the proposed design and its adiabatic counterparts compared to their CMOS counterpart is shown in Fig. 4. The results show that although the energy saving of the proposed design fluctuates slightly, the proposed design keeps its superiority over its state-of-the-art adiabatic counterparts.

#### D. SENSE Signal Sweep

The SENSE signal affects the reliability and energy consumption of the proposed design. A wider SENSE pulse leads to higher reliability but also to higher energy consumption. Accordingly, the effects of the SENSE signal on the energy consumption and functionality of the proposed design need to be investigated. Therefore, the energy consumption of the proposed design for different pulse widths of the SENSE signal has been calculated, using a TMR of 2, a $V_{DD}$ of 1.2 V, and a frequency of 20 MHz. Figure 5 shows the energy consumption of the proposed design for different pulse widths of the SENSE signal.

TABLE V. ENERGY CONSUMPTION (FJ/CYCLE) OF THE PROPOSED 16:1 ADIABATIC MTJ-BASED CLB AND ITS COUNTERPARTS FOR DIFFERENT SUPPLY VOLTAGES

| Supply voltage (V) | CMOS design | CP1 | CP2 | CP3 | Proposed design |
|--------------------|-------------|-----|-----|-----|-----------------|
| 1.2                | 61.2        | 5.0 | 6.0 | 5.2 | 1.9             |
| 1.1                | 42.4        | 3.7 | 4.3 | 3.8 | 1.6             |
| 1                  | 29          | 2.7 | 3.1 | 2.7 | 1.5             |
| 0.9                | 38.9        | 3.4 | 3.7 | 3.3 | 1.3             |

TABLE VI. TRANSISTOR COUNT OF 16:1 CLB

| CLB design       | CMOS design | CP1 | CP2 | CP3 | Proposed design |
|------------------|-------------|-----|-----|-----|-----------------|
| Transistor count | 402         | 324 | 292 | 260 | 66              |



Figure 4. Energy saving of the proposed design and its adiabatic counterparts compared to their CMOS counterpart for different supply voltages



Figure 5. Energy consumption (fJ/cycle) of the proposed design for different pulse widths of the SENSE signal

According to Fig. 5, the energy consumption of the proposed design stays almost constant for pulse widths of 2.5 ns to 6.25 ns, and the difference in energy consumption is less than 3%. According to this observation, since a higher pulse width of the SENSE signal leads to higher reliability, 6.25 ns of pulse width for the SENSE signal seems to be the optimal pulse width.

It is noteworthy that 6.25 ns is 1/8 of the power clock period at 20 MHz (50 ns). Accordingly, for all the aforementioned simulations, the pulse width of the SENSE signal has been set to 1/8 of the period of the power clock.

#### E. Hardware Overhead Comparison

In the target applications, hardware resources are limited, and area-efficient design is desirable. Accordingly, the transistor counts of the proposed design and its counterparts are reported in Table VI. It is noteworthy that since the MTJs are fabricated on a separate layer above the transistor layer, they do not impose extra area overhead on the circuits, so the transistor count alone is used as the measure of hardware overhead.

The results in Table VI show that the proposed design has at least 74% fewer transistors compared to its counterparts. Also, the proposed design has 83% fewer transistors compared to the CMOS counterpart. This is due to the use of LiM architecture for designing the proposed CLB which eliminates the need for separate memory cells. Consequently, fewer transistors are needed to implement the proposed design.
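The transistor-count reductions follow directly from Table VI. The short recomputation below uses the counts from the table; the function name is illustrative.

```python
# Transistor counts of the 16:1 CLB designs, copied from Table VI.
TRANSISTORS = {"CMOS": 402, "CP1": 324, "CP2": 292, "CP3": 260, "Proposed": 66}


def reduction_percent(reference_count: int, candidate_count: int) -> float:
    """Reduction in transistor count of candidate relative to reference, in percent."""
    return 100.0 * (1.0 - candidate_count / reference_count)


for name in ("CMOS", "CP1", "CP2", "CP3"):
    pct = reduction_percent(TRANSISTORS[name], TRANSISTORS["Proposed"])
    print(f"Proposed vs {name}: {pct:.1f}% fewer transistors")
```

The smallest reduction is against CP3, the counterpart with the fewest transistors, which is where the "at least 74%" figure comes from.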

## V. CONCLUSION

In this paper, a LiM non-volatile adiabatic MTJ/CMOS-based CLB has been proposed to improve the energy efficiency of FPGAs. The simulation results show that the proposed design consumes significantly less energy than its CMOS-based counterpart across the target frequencies. Also, due to its LiM structure, the proposed design consumes less energy than the state-of-the-art adiabatic CLBs at target frequencies up to 20 MHz; at 40 MHz, only CP1 consumes less, by 0.4%. In addition, the energy saving of the proposed design and its adiabatic counterparts compared to their CMOS counterpart has been investigated for different power supply voltages, which shows that the proposed design keeps its superiority over its counterparts. Also, because the performance of MTJ-based circuits depends on the TMR value, the effect of using different TMR values has been investigated; the difference in energy saving across TMR values is less than 2% at all the target frequencies, which can be neglected. Furthermore, the effect of the pulse width of the SENSE signal on energy consumption and reliability has been investigated. The results show that a pulse width of 1/8 of the power clock period can be considered the optimized option, since a higher pulse width of the SENSE signal leads to higher reliability and the difference in energy consumption between the minimum pulse width and this pulse width is negligible. Finally, the transistor counts of the proposed design and its counterparts show that the proposed design is significantly more area-efficient than its counterparts, which is vital in hardware resource-limited applications.

#### REFERENCES

- [1] A. Amirany, K. Jafari, and M. H. Moaiyeri, "A Task-Schedulable Nonvolatile Spintronic Field-Programmable Gate Array," *IEEE Magnetics Letters*, vol. 12, pp. 1-4, 2021, doi: 10.1109/lmag.2021.3092995.
- [2] M. Khojastehpour, S. Sahebi, and A. Samimi, "Public acceptance of a crowdsourcing platform for traffic enforcement," *Case Studies on Transport Policy*, vol. 10, no. 4, pp. 2012-2024, 2022, doi: 10.1016/j.cstp.2022.08.013.
- [3] B. Jahannia, S. A. Ghasemi, and H. Farbeh, "An Energy Efficient Multi-Retention STT-MRAM Memory Architecture for IoT Applications," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 71, no. 3, pp. 1431-1435, 2024, doi: 10.1109/tcsii.2023.3326303.
- [4] A. Degada and H. Thapliyal, "2-Phase Adiabatic Logic for Low-Energy and CPA-Resistant Implantable Medical Devices," *IEEE Transactions on Consumer Electronics*, vol. 68, no. 1, pp. 47-56, 2022, doi: 10.1109/tce.2022.3141342.
- [5] A. Amirany, M. Meghdadi, M. H. Moaiyeri, and K. Jafari, "Stochastic Spintronic Neuron with Application to Image Binarization," presented at the 2021 26th International Computer Conference, Computer Society of Iran (CSICC), 2021.
- [6] M. Elnawawy, A. Farhan, A. A. Nabulsi, A. R. Al-Ali, and A. Sagahyroon, "Role of FPGA in Internet of Things Applications," presented at the 2019 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), 2019.
- [7] X. Zhang et al., "Machine learning on FPGAs to face the IoT revolution," presented at the 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2017.
- [8] A. Magyari and Y. Chen, "Review of state-of-the-art FPGA applications in IoT Networks," *Sensors*, vol. 22, no. 19, p. 7496, 2022.
- [9] M. Rezaei, A. Amirany, M. H. Moaiyeri, and K. Jafari, "A High-Accuracy and Low-Power Emerging Technology-Based Associative Memory," *IEEE Transactions on Nanotechnology*, pp. 1-6, 2024, doi: 10.1109/tnano.2024.3380368.
- [10] W. Yang, A. Degada, and H. Thapliyal, "Adiabatic Logic-based STT-MRAM Design for IoT," presented at the 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2022.
- [11] M. Natsui, T. Chiba, and T. Hanyu, "Design of an energy-efficient XNOR gate based on MTJ-based nonvolatile logic-in-memory architecture for binary neural network hardware," *Japanese Journal of Applied Physics*, vol. 58, no. SB, 2019, doi: 10.7567/1347-4065/aafb4d.
- [12] W. Yang, M. Tanavardi Nasab, and H. Thapliyal, "Energy Efficient CLB Design Based on Adiabatic Logic for IoT Applications," *Electronics*, vol. 13, no. 7, 2024, doi: 10.3390/electronics13071309.
- [13] S. Dinesh Kumar, H. Thapliyal, A. Mohammad, and K. S. Perumalla, "Design exploration of a Symmetric Pass Gate Adiabatic Logic for energy-efficient and secure hardware," *Integration*, vol. 58, pp. 369-377, 2017, doi: 10.1016/j.vlsi.2016.08.007.
- [14] R. Rajaei, "Radiation-Hardened Design of Nonvolatile MRAM-Based FPGA," *IEEE Transactions on Magnetics*, vol. 52, no. 10, pp. 1-10, 2016, doi: 10.1109/tmag.2016.2578278.
- [15] A. Coluccio, M. Vacca, and G. Turvani, "Logic-in-Memory Computation: Is It Worth It? A Binary Neural Network Case Study," *Journal of Low Power Electronics and Applications*, vol. 10, no. 1, 2020, doi: 10.3390/jlpea10010007.
- [16] M. T. Nasab and H. Thapliyal, "Low-Power Adiabatic/MTJ LIM-Based XNOR/XOR Synapse and Neuron for Binarized Neural Networks," presented at the 2023 IEEE 23rd International Conference on Nanotechnology (NANO), 2023.
- [17] S. D. Kumar, H. Thapliyal, and A. Mohammad, "EE-SPFAL: A Novel Energy-Efficient Secure Positive Feedback Adiabatic Logic for DPA Resistant RFID and Smart Card," *IEEE Transactions on Emerging Topics in Computing*, vol. 7, no. 2, pp. 281-293, 2019, doi: 10.1109/tetc.2016.2645128.

- [18] A. Amirany, K. Jafari, and M. H. Moaiyeri, "DDR-MRAM: Double Data Rate Magnetic RAM for Efficient Artificial Intelligence and Cache Applications," *IEEE Transactions on Magnetics*, vol. 58, no. 6, pp. 1-9, 2022, doi: 10.1109/tmag.2022.3162030.
- [19] E. Garzón, B. Zambrano, T. Moposita, R. Taco, L.-M. Prócel, and L. Trojman, "Reconfigurable CMOS/STT-MTJ Non-Volatile Circuit for Logic-in-Memory Applications," in 2020 IEEE 11th Latin American Symposium on Circuits & Systems (LASCAS), 2020: IEEE, pp. 1-4.
- [20] B. Jahannia, S. A. Ghasemi, and H. Farbeh, "Multi-Retention STT-MRAM Architectures for IoT: Evaluating the Impact of Retention Levels and Memory Mapping Schemes," *IEEE Access*, vol. 12, pp. 26562-26580, 2024, doi: 10.1109/access.2024.3366074.
- [21] M. T. Nasab, A. Amirany, M. H. Moaiyeri, and K. Jafari, "Radiation-Immune Spintronic Binary Synapse and Neuron for Process-in-Memory Architecture," *IEEE Magnetics Letters*, vol. 15, pp. 1-5, 2024, doi: 10.1109/lmag.2024.3356815.
- [22] A. Nigam, C. W. Smullen, V. Mohan, E. Chen, S. Gurumurthi, and M. R. Stan, "Delivering on the promise of universal memory for spin-transfer torque RAM (STT-RAM)," in *IEEE/ACM International Symposium on Low Power Electronics and Design*, 2011: IEEE, pp. 121-126.
- [23] M. T. Nasab, A. Amirany, M. H. Moaiyeri, and K. Jafari, "Hybrid MTJ/CNTFET-Based Binary Synapse and Neuron for Process-in-Memory Architecture," *IEEE Magnetics Letters*, vol. 14, pp. 1-5, 2023, doi: 10.1109/lmag.2023.3238271.
- [24] P. Teichmann, Adiabatic logic: future trend and system level perspective. Springer Science & Business Media, 2011.
- [25] A. Vetuli, S. Di Pascoli, and L. Reyneri, "Positive feedback in adiabatic logic," *Electronics Letters*, vol. 32, no. 20, pp. 1867-1868, 1996.
- [26] Y. Moon and D.-K. Jeong, "An efficient charge recovery logic circuit," *IEICE transactions on electronics*, vol. 79, no. 7, pp. 925-933, 1996.
- [27] Z. Yue *et al.*, "Compact Model of Subvolume MTJ and Its Design Application at Nanoscale Technology Nodes," *IEEE Transactions on Electron Devices*, vol. 62, no. 6, pp. 2048-2055, 2015, doi: 10.1109/ted.2015.2414721.
- [28] Y. Wang *et al.*, "Compact Model of Dielectric Breakdown in Spin-Transfer Torque Magnetic Tunnel Junction," *IEEE Transactions on Electron Devices*, vol. 63, no. 4, pp. 1762-1767, 2016, doi: 10.1109/ted.2016.2533438.
- [29] Y. Wang, Y. Zhang, E. Y. Deng, J. O. Klein, L. A. B. Naviner, and W. S. Zhao, "Compact model of magnetic tunnel junction with stochastic spin transfer torque switching for reliability analyses," *Microelectronics Reliability*, vol. 54, no. 9-10, pp. 1774-1778, 2014, doi: 10.1016/j.microrel.2014.07.019.
- [30] H. Cai, Y. Wang, L. A. De Barros Naviner, and W. Zhao, "Robust Ultra-Low Power Non-Volatile Logic-in-Memory Circuits in FD-SOI Technology," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 4, pp. 847-857, 2017, doi: 10.1109/tcsi.2016.2621344.