
Title: The Cost of Energy-Efficiency in Digital Hardware: The Trade-Off between Energy Dissipation, Energy–Delay Product and Reliability in Electronic, Magnetic and Optical Binary Switches
Binary switches, which are the primitive units of all digital computing and information processing hardware, are usually benchmarked on the basis of their ‘energy–delay product’, which is the product of the energy dissipated in completing the switching action and the time it takes to complete that action. The lower the energy–delay product, the better the switch (supposedly). This approach ignores the fact that lower energy dissipation and faster switching usually come at the cost of poorer reliability (i.e., a higher switching error rate) and hence the energy–delay product alone cannot be a good metric for benchmarking switches. Here, we show the trade-off between energy dissipation, energy–delay product and error probability for an electronic switch (a metal oxide semiconductor field effect transistor), a magnetic switch (a magnetic tunnel junction switched with spin transfer torque) and an optical switch (a bistable non-linear mirror). As expected, reducing energy dissipation and/or energy–delay product generally results in increased switching error probability and reduced reliability.
Journal Name:
Applied Sciences
Sponsoring Org:
National Science Foundation
More Like this
  1. The Landau-Lifshitz-Gilbert (LLG) equation, used to model magneto-dynamics in ferromagnets, tacitly assumes that the angular momentum associated with spin precession can relax instantaneously when the real or effective magnetic field causing the precession is turned off. This neglect of “spin inertia” is unphysical and would violate energy conservation. Recently, the LLG equation was modified to account for inertia effects. The consensus, however, seems to be that such effects would be unimportant in slow magneto-dynamics that take place over time scales much longer than the relaxation time of the angular momentum, which is typically a few fs to perhaps ~100 ps in ferromagnets. Here, we show that there is at least one very serious and observable effect of spin inertia even in slow magneto-dynamics. It involves the switching error probability associated with flipping the magnetization of a nanoscale ferromagnet with an external agent, such as a magnetic field. The switching may take ~ns to complete when the field strength is close to the threshold value for switching, which is much longer than the angular momentum relaxation time, and yet the effect of spin inertia is felt in the switching error probability. This is because the ultimate fate of a switching trajectory, i.e., whether it results in success or failure, is influenced by what happens in the first few ps of the switching action when nutational dynamics due to spin inertia holds sway. Spin inertia increases the error probability, which makes the switching more error-prone. This has vital technological significance because it relates to the reliability of magnetic logic and memory.
  2. There are now many examples of single molecule rotors, motors, and switches in the literature that, when driven by photons, electrons, or chemical reactions, exhibit well-defined motions. As a step toward using these single molecule devices to perform useful functions, one must understand how they interact with their environment and quantify their ability to perform work on it. Using a single molecule rotary switch, we examine the transfer of electrical energy, delivered via electron tunneling, to mechanical motion and measure the forces the switch experiences with a noncontact q-plus atomic force microscope. Action spectra reveal that the molecular switch has two stable states and can be excited resonantly between them at a bias of 100 mV via a one-electron inelastic tunneling process which corresponds to an energy input of 16 zJ. While the electrically induced switching events are stochastic and no net work is done on the cantilever, by measuring the forces between the molecular switch and the AFM cantilever, we can derive the maximum hypothetical work the switch could perform during a single switching event, which is ∼55 meV, equal to 8.9 zJ, which translates to a hypothetical efficiency of ∼55% per individual inelastic tunneling electron-induced switching event. When considering the total electrical energy input, this drops to 1 × 10⁻⁷% due to elastic tunneling events that dominate the tunneling current. However, this approach constitutes a general method for quantifying and comparing the energy input and output of molecular-mechanical devices.
  3. Streaming codes take a string of source symbols as input and output a string of coded symbols in real time, which effectively eliminate the queueing delay and are regarded as a promising scheme for low latency communications. Aiming at quantifying the fundamental latency performance of random linear streaming codes (RLSCs) over i.i.d. symbol erasure channels, this work derives the exact error probability under, simultaneously, the finite memory length and finite decoding deadline constraints. The result is then used to examine the tradeoff among memory length (complexity), decoding deadline (delay), and error probability (reliability) of RLSCs for the first time in the literature. Two critical observations are made: (i) Too much memory can adversely impact the performance under a finite decoding deadline constraint, a surprising finding not captured by the traditional wisdom that large memory length monotonically improves the performance in the asymptotic regime; (ii) The end-to-end delay of the RLSC is roughly 50% of that of the MDS block code when under identical code rate and error probability requirements. This implies that switching from block codes to RLSCs not only eliminates the queueing delay (thus 50%) but also has little negative impact on the error probability.
  4. Wide band gap (WBG) devices feature high switching frequency operation and low switching loss, and they have been widely adopted in a large number of applications. Nevertheless, the manufacturing cost of the SiC MOSFET is greater than that of the Si IGBT. To achieve a trade-off between cost and efficiency, the hybrid switch, which operates a Si IGBT and a SiC MOSFET in parallel, is proposed. In this article, an active gate driver is used for the hybrid switch to optimize both the switching and thermal performances. The turn-on and turn-off delays between the two individual switches are controlled to minimize the switching loss of the traditional Si IGBT. In this way, a higher switching frequency operation can be achieved for the hybrid switch to improve the converter power density. On the other hand, the gate-source voltages are adjusted to achieve an optimized thermal performance between the two individual switches, which can improve the reliability of the hybrid switch. The proposed active gate driver for the hybrid switch is validated with a 2 kW Boost converter.
  5. The research problem of how to use a high-speed circuit switch, typically an optical switch, to most effectively boost the switching capacity of a datacenter network, has been extensively studied. In this work, we focus on a different but related research problem that arises when multiple (say $s$) parallel circuit switches are used: How to best split a switching workload $D$ into sub-workloads $D_1, D_2, ..., D_s$, and give them to the $s$ switches as their respective workloads, so that the overall makespan of the parallel switching system is minimized? Computing such an optimal split is unfortunately NP-hard, since the circuit/optical switch incurs a nontrivial reconfiguration delay when the switch configuration has to change. In this work, we formulate a weaker form of this problem: How to minimize the total number of nonzero entries in $D_1, D_2, ..., D_s$ (so that the overall reconfiguration cost can be kept low), under the constraint that every row or column sum of $D$ (which corresponds to the workload imposed on a sending or receiving rack respectively) is evenly split? Although this weaker problem is still NP-hard, we are able to design LESS, an approximation algorithm that has a low approximation ratio of only $1+\epsilon$ in practice and a low computational complexity of only $O(m^2)$, where $m = \|D\|_0$ is the number of nonzero entries in $D$. Our simulation studies show that LESS results in excellent overall makespan performances under realistic datacenter traffic workloads and parameter settings.
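
The energy figures quoted in item 2 above can be checked with a short unit conversion. The following is a minimal sketch, not taken from the cited work: the helper name `ev_to_zj` and the variable names are illustrative assumptions, and the only physical input is the elementary charge. It assumes one tunneling electron at a 100 mV bias delivers 100 meV, and it takes the stated work of ∼55 meV at face value.

```python
# Elementary charge in coulombs (exact SI value); 1 eV equals this many joules.
E_CHARGE = 1.602176634e-19


def ev_to_zj(energy_ev: float) -> float:
    """Convert an energy in electronvolts to zeptojoules (1 zJ = 1e-21 J).

    Hypothetical helper introduced for this illustration only.
    """
    return energy_ev * E_CHARGE / 1e-21


# Assumption: one electron crossing a 100 mV bias delivers 100 meV.
energy_in_zj = ev_to_zj(0.100)

# Assumption: the maximum hypothetical work is taken as exactly 55 meV,
# although the abstract states it only approximately (~55 meV).
work_out_zj = ev_to_zj(0.055)

# Efficiency per inelastic tunneling electron-induced switching event.
efficiency = work_out_zj / energy_in_zj

print(f"energy input : {energy_in_zj:.2f} zJ")
print(f"work output  : {work_out_zj:.2f} zJ")
print(f"efficiency   : {efficiency:.1%}")
```

With exactly 55 meV the output comes to roughly 8.8 zJ, slightly below the quoted 8.9 zJ; the difference is consistent with the work being given only approximately (8.9 zJ corresponds to about 55.6 meV). Either value gives an efficiency of about 55% relative to the 16 zJ input.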