skip to main content


Title: Memory-Efficient Quantum Circuit Simulation by Using Lossy Data Compression,
In order to evaluate, validate, and refine the design of new quantum algorithms or quantum computers, researchers and developers need methods to assess their correctness and fidelity. This requires the capabilities of quantum circuit simulations. However, the number of quantum state amplitudes increases exponentially with the number of qubits, leading to the exponential growth of the memory requirement for the simulations. In this work, we present our memory-efficient quantum circuit simulation by using lossy data compression. Our empirical data shows that we reduce the memory requirement to 16.5% and 2.24E-06 of the original requirement for QFT and Grover’s search, respectively. This finding further suggests that we can simulate deep quantum circuits up to 63 qubits with 0.8 petabytes memory.  more » « less
Award ID(s):
1730449
NSF-PAR ID:
10084003
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
The 3rd International Workshop on Post-Moore Era Supercomputing (PMES) in conjunction with IEEE/ACM 29th The International Conference for High Performance computing, Networking, Storage and Analysis (SC2018)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Classical simulation of quantum circuits is crucial for evaluating and validating the design of new quantum algorithms. However, the number of quantum state amplitudes increases exponentially with the number of qubits, leading to the exponential growth of the memory requirement for the simulations. In this paper, we present a new data reduction technique to reduce the memory requirement of quantum circuit simulations. We apply our amplitude-aware lossy compression technique to the quantum state amplitude vector to trade the computation time and fidelity for memory space. The experimental results show that our simulator only needs 1/16 of the original memory requirement to simulate Quantum Fourier Transform circuits with 99.95% fidelity. The reduction amount of memory requirement suggests that we could increase 4 qubits in the quantum circuit simulation comparing to the simulation without our technique. Additionally, for some specific circuits, like Grover’s search, we could increase the simulation size by 18 qubits. 
    more » « less
  2. Quantum circuit simulations are critical for evaluating quantum algorithms and machines. However, the number of state amplitudes required for full simulation increases exponentially with the number of qubits. In this study, we leverage data compression to reduce memory requirements, trading computation time and fidelity for memory space. Specifically, we develop a hybrid solution by combining the lossless compression and our tailored lossy compression method with adaptive error bounds at each timestep of the simulation. Our approach optimizes for compression speed and makes sure that errors due to lossy compression are uncorrelated, an important property for comparing simulation output with physical machines. Experiments show that our approach reduces the memory requirement of simulating the 61-qubit Grover's search algorithm from 32 exabytes to 768 terabytes of memory on Argonne's Theta supercomputer using 4,096 nodes. The results suggest that our techniques can increase the simulation size by 2~16 qubits for general quantum circuits. 
    more » « less
  3. In order to evaluate, validate, and refine the design of a new quantum algorithm or a quantum computer, researchers and developers need methods to assess their correctness and fidelity. This requires the capabilities of simulation for full quantum state amplitudes. However, the number of quantum state amplitudes increases exponentially with the number of qubits, leading to the exponential growth of the memory requirement. In this work, we present our technique to simulate more qubits than previously reported by using lossy data compression. Our empirical data suggests that we can simulate full state quantum circuits up to 63 qubits with 0.8 petabytes memory. 
    more » « less
  4. Instruction scheduling is a key compiler optimization in quantum computing, just as it is for classical computing. Current schedulers optimize for data parallelism by allowing simultaneous execution of instructions, as long as their qubits do not overlap. However, on many quantum hardware platforms, instructions on overlapping qubits can be executed simultaneously through global interactions. For example, while fan-out in traditional quantum circuits can only be implemented sequentially when viewed at the logical level, global interactions at the physical level allow fan-out to be achieved in one step. We leverage this simultaneous fan-out primitive to optimize circuit synthesis for NISQ (Noisy Intermediate-Scale Quantum) workloads. In addition, we introduce novel quantum memory architectures based on fan-out.Our work also addresses hardware implementation of the fan-out primitive. We perform realistic simulations for trapped ion quantum computers. We also demonstrate experimental proof-of-concept of fan-out with superconducting qubits. We perform depth (runtime) and fidelity estimation for NISQ application circuits and quantum memory architectures under realistic noise models. Our simulations indicate promising results with an asymptotic advantage in runtime, as well as 7–24% reduction in error. 
    more » « less
  5. The current phase of quantum computing is in the Noisy Intermediate-Scale Quantum (NISQ) era. On NISQ devices, two-qubit gates such as CNOTs are much noisier than single-qubit gates, so it is essential to minimize their count. Quantum circuit synthesis is a process of decomposing an arbitrary unitary into a sequence of quantum gates, and can be used as an optimization tool to produce shorter circuits to improve overall circuit fidelity. However, the time-to-solution of synthesis grows exponentially with the number of qubits. As a result, synthesis is intractable for circuits on a large qubit scale. In this paper, we propose a hierarchical, block-by-block opti-mization framework, QGo, for quantum circuit optimization. Our approach allows an exponential cost optimization to scale to large circuits. QGo uses a combination of partitioning and synthesis: 1) partition the circuit into a sequence of independent circuit blocks; 2) re-generate and optimize each block using quantum synthesis; and 3) re-compose the final circuit by stitching all the blocks together. We perform our analysis and show the fidelity improvements in three different regimes: small-size circuits on real devices, medium-size circuits on noisy simulations, and large-size circuits on analytical models. Our technique can be applied after existing optimizations to achieve higher circuit fidelity. Using a set of NISQ benchmarks, we show that QGo can reduce the number of CNOT gates by 29.9% on average and up to 50% when compared with industrial compiler optimizations such as t|ket). When executed on the IBM Athens system, shorter depth leads to higher circuit fidelity. We also demonstrate the scalability of our QGo technique to optimize circuits of 60+ qubits, Our technique is the first demonstration of successfully employing and scaling synthesis in the compilation tool chain for large circuits. Overall, our approach is robust for direct incorporation in production compiler toolchains to further improve the circuit fidelity. 
    more » « less