<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>LSTM-QGAN: Scalable NISQ Generative Adversarial Network</title></titleStmt>
			<publicationStmt>
				<publisher>IEEE</publisher>
				<date>04/06/2025</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10583899</idno>
					<idno type="doi">10.1109/ICASSP49660.2025.10888847</idno>
					
					<author>Cheng Chu</author><author>Aishwarya Hastak</author><author>Fan Chen</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Current quantum generative adversarial networks (QGANs) still struggle with practical-sized data. First, many QGANs use principal component analysis (PCA) for dimension reduction, which, as our studies reveal, can diminish the QGAN's effectiveness. Second, methods that segment inputs into smaller patches processed by multiple generators face scalability issues. In this work, we propose LSTM-QGAN, a QGAN architecture that eliminates PCA preprocessing and integrates quantum long short-term memory (QLSTM) to ensure scalable performance. Our experiments show that LSTM-QGAN significantly enhances both performance and scalability over state-of-the-art QGAN models, with visual data improvements, reduced Fréchet Inception Distance scores, and reductions of 5× in qubit counts, 5× in single-qubit gates, and 12× in two-qubit gates.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Current QGANs. Recent advancements in Noisy Intermediate-Scale Quantum (NISQ) platforms <ref type="bibr">[1]</ref>-<ref type="bibr">[3]</ref> have catalyzed intense research on Quantum Generative Adversarial Networks (QGANs) <ref type="bibr">[4]</ref>-<ref type="bibr">[13]</ref>, which are well-suited to the constraints of NISQ systems, such as limited qubit counts and shallow circuit depths <ref type="bibr">[14]</ref>. Building on the foundational work <ref type="bibr">[4]</ref> that established the theoretical superiority of QGANs over classical counterparts, early QGAN implementations <ref type="bibr">[5]</ref>-<ref type="bibr">[7]</ref> focused only on low-dimensional inputs such as single-bit data. Subsequent research introduced innovations such as Wasserstein loss <ref type="bibr">[8]</ref> and novel architectures <ref type="bibr">[9]</ref> to improve training stability. More recent work <ref type="bibr">[10]</ref>, <ref type="bibr">[11]</ref> expanded QGANs to high-dimensional data, such as the 28&#215;28 MNIST dataset, by employing dimensionality reduction techniques such as Principal Component Analysis (PCA). The state-of-the-art (SOTA) PatchGAN <ref type="bibr">[12]</ref> further segments inputs into smaller patches, enabling efficient processing on practical NISQ devices.</p><p>Limitations. Despite recent developments, QGANs continue to face challenges in managing practical-sized data. First, while pre- and post-processing with PCA and inverse PCA <ref type="bibr">[10]</ref>, <ref type="bibr">[11]</ref> enable QGANs to handle large-dimensional data, PCA often dominates the process, diminishing the contributions of the QGANs themselves. 
Second, although PatchGAN <ref type="bibr">[12]</ref> facilitates the direct processing of practical-sized inputs through multiple small patches, its architecture demands more quantum resources as input size grows, leading to serious scalability challenges. For instance, generating a single MNIST image requires 56 quantum sub-generators and 280 qubits, a prohibitively high cost. Third, and more concerning, our preliminary study shows a significant decline in output quality as PatchGAN scales from its original 5-qubit design <ref type="bibr">[12]</ref> to 8 qubits, severely limiting its effectiveness at larger scales.</p><p>Contributions. We introduce LSTM-QGAN, a novel architecture that eliminates the need for PCA when processing large-dimensional data. The design uses a constant amount of NISQ computing resources as input size increases. When additional hardware resources become available, the architecture scales efficiently, ensuring consistent and reliable performance. Our contributions include:</p><p>&#8226; Preliminary Analysis. We conduct experiments on the SOTA QGANs <ref type="bibr">[10]</ref>-<ref type="bibr">[12]</ref>, revealing previously undisclosed limitations in PCA preprocessing and model scalability.</p><p>&#8226; Scalable Architecture. We present LSTM-QGAN, a scalable QGAN architecture inspired by recent advances in quantum long short-term memory (QLSTM) <ref type="bibr">[15]</ref>-<ref type="bibr">[17]</ref>. LSTM-QGAN eliminates the need for PCA, maintains constant NISQ resources as input size grows, and efficiently scales with increasing quantum computing resources.</p><p>&#8226; Enhanced Performance. We conduct evaluations on NISQ computers. Experimental results show that LSTM-QGAN significantly enhances generative performance and improves scalability compared to SOTA QGANs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. BACKGROUND</head><p>QGAN Basics. 
Figure <ref type="figure">1</ref> illustrates a standard QGAN with two parameterized models: the Generator, G(&#952;_g), which generates synthetic data, and the Discriminator, D(&#952;_d), which evaluates the generated data against real data. G is implemented using a quantum neural network (QNN), typically composed of a data encoder E(&#183;) and repeated layers of a variational quantum circuit (VQC) with one-qubit rotations (i.e., Rot.) and two-qubit entanglement (i.e., Ent.). D in SOTA QGANs <ref type="bibr">[10]</ref>-<ref type="bibr">[12]</ref> can be implemented with either classical or quantum models. The objective is to optimize the predefined minimax loss L, as outlined in Equation <ref type="formula">1</ref>, where z represents the latent variable. The specific loss function can be implemented using various specified functions <ref type="bibr">[6]</ref>-<ref type="bibr">[8]</ref>, <ref type="bibr">[12]</ref>, <ref type="bibr">[13]</ref>. The overall goal is to enable G to generate data indistinguishable from real data, while D improves its ability to differentiate between them.</p><p>SOTA QGANs. To manage larger-dimensional data with limited qubits on NISQ computers, SOTA QGANs <ref type="bibr">[10]</ref>-<ref type="bibr">[12]</ref> primarily utilize the following two techniques:</p><p>&#8226; Pre- and Post-Processing. Several recent QGANs <ref type="bibr">[10]</ref>, <ref type="bibr">[11]</ref> utilize principal component analysis to reduce input dimensions (e.g., from 784 to 4 in <ref type="bibr">[10]</ref>) to fit within the limitations of NISQ computers with constrained qubits.</p><p>The key steps in PCA involve: (1) standardizing the data to have zero mean and unit variance, and (2) calculating the covariance matrix C and the matrix V_k, which contains the top k eigenvectors (principal components). 
For any data matrix X with mean &#956;, the data can be reduced to the top k principal components by Z = XV_k. The reduced-dimensional data Z* can then be reconstructed to approximate the original data through inverse PCA: X &#8776; Z* V_k^T + &#956;.</p><p>&#8226; Input Patching. PatchGAN <ref type="bibr">[12]</ref> segments the input into small regional patches and trains a dedicated sub-generator for each, capable of generating synthesized data that follows the pattern of the corresponding patch. This approach makes it a resource-efficient QGAN framework.</p><p>The number of sub-generators scales with the input size; for instance, the 5-qubit design in the original work <ref type="bibr">[12]</ref> requires 56 sub-generators to process the 784-pixel MNIST dataset, and doubling the input size would proportionally increase the number of sub-generators needed.</p><p>QLSTM. Long short-term memory <ref type="bibr">[18]</ref> effectively captures spatiotemporal information, enabling task-specific regulation of data flow. Recent work <ref type="bibr">[15]</ref> introduced quantum LSTM (QLSTM), which has since been extended to various sequential learning tasks <ref type="bibr">[16]</ref>, <ref type="bibr">[17]</ref>. As shown in Figure <ref type="figure">2</ref>, QLSTM retains the classical LSTM gating mechanism, with the key distinction being the integration of QNNs. Due to the page limit, we refer readers to <ref type="bibr">[15]</ref>, <ref type="bibr">[18]</ref> for detailed insights into QLSTM. LSTM has already been applied in classical GANs <ref type="bibr">[19]</ref>, <ref type="bibr">[20]</ref>, demonstrating enhanced generative power and reduced computational cost. Building on this, we aim to leverage LSTM's ability to selectively retain relevant patterns within a QGAN by training a QLSTM-based generator using different patched inputs, rather than separate sub-generators for different patches as in <ref type="bibr">[12]</ref>.</p></div>
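<div xmlns="http://www.tei-c.org/ns/1.0"><p>The linear growth in PatchGAN sub-generators described above can be checked with a short calculation. The following is a minimal sketch, not the PatchGAN implementation: it assumes each sub-generator covers a fixed 14-pixel patch (784 pixels divided by 56 sub-generators) and uses 5 qubits, and the function names are hypothetical.</p>
<ab><![CDATA[
```python
import math


def sub_generators_needed(num_pixels, pixels_per_patch=14):
    """Number of patch-wise sub-generators needed to cover the whole input."""
    return math.ceil(num_pixels / pixels_per_patch)


def total_qubits(num_pixels, qubits_per_generator=5, pixels_per_patch=14):
    """Total qubits when every patch has its own dedicated sub-generator."""
    return sub_generators_needed(num_pixels, pixels_per_patch) * qubits_per_generator


mnist_pixels = 28 * 28
print(sub_generators_needed(mnist_pixels))      # one MNIST image
print(total_qubits(mnist_pixels))               # qubits for one MNIST image
print(sub_generators_needed(2 * mnist_pixels))  # input twice as large
```
]]></ab>
<p>Under these assumptions, the resource count grows in direct proportion to the input size, which is the scalability concern raised for patch-wise sub-generators.</p></div>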
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. PRELIMINARY STUDY AND MOTIVATION</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Preliminary Study</head><p>PCA Overshadows QGANs. QGANs <ref type="bibr">[10]</ref>, <ref type="bibr">[11]</ref> on MNIST use PCA and inverse PCA for dimensionality reduction and reconstruction. To assess PCA's impact, we reduced 28&#215;28 MNIST images to 1&#215;2 vectors using scikit-learn, generating the corresponding C, V_2, and &#956;. We then randomly generated 1&#215;2 vectors, applied inverse PCA, and present the reconstructed images in Figure <ref type="figure">3</ref>. The reconstructed images closely resemble the original MNIST data and are comparable to those generated by QGANs <ref type="bibr">[10]</ref>, <ref type="bibr">[11]</ref>, suggesting that PCA pre- and post-processing may dominate, potentially overshadowing QGAN effectiveness. These results raise concerns about the independent validity of QGANs when PCA is used, emphasizing the need for evaluation with unprocessed data.</p><p>Scalability for PatchGAN. PatchGAN <ref type="bibr">[12]</ref> claims effectiveness with patch-based processing of high-dimensional inputs but originally reports results with only 5 qubits. To evaluate scalability, we increased the qubit count from 5 to 8. Since PatchGAN employs amplitude encoding and processes one patch at a time, we adjusted the number of sub-generators to cover all 784 pixels in an MNIST image. As shown in Figure <ref type="figure">4</ref>, the generated images degrade rapidly with increasing qubits, with sub-figure titles indicating qubit counts and required sub-generators (sub-gens). These findings underscore PatchGAN's poor scalability, suggesting limited potential for handling larger-scale inputs even with additional qubits.</p></div>
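<div xmlns="http://www.tei-c.org/ns/1.0"><p>The PCA-only baseline described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code used in the study: it uses NumPy's SVD instead of scikit-learn, only centers the data (no variance scaling), substitutes random placeholder data for MNIST, and draws the random 1&#215;2 vectors from a normal distribution scaled to the projected data. The helper names fit_pca and inverse_pca are hypothetical.</p>
<ab><![CDATA[
```python
import numpy as np


def fit_pca(X, k):
    """Return the data mean and V_k, whose columns are the top-k principal directions."""
    mu = X.mean(axis=0)
    # Rows of vt are the right singular vectors of the centered data,
    # i.e. the eigenvectors of the covariance matrix, sorted by variance.
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k].T


def inverse_pca(Z, V_k, mu):
    """Map reduced vectors Z back to pixel space: Z V_k^T + mu."""
    return Z @ V_k.T + mu


rng = np.random.default_rng(0)
X = rng.random((200, 784))   # placeholder standing in for flattened MNIST images
mu, V2 = fit_pca(X, k=2)
Z = (X - mu) @ V2            # reduced 1x2 vector for every image

# Assumption: random latent vectors are drawn with the spread of the real projections.
z_rand = rng.normal(0.0, Z.std(axis=0), size=(5, 2))
images = inverse_pca(z_rand, V2, mu).reshape(-1, 28, 28)
print(images.shape)
```
]]></ab>
<p>No generative model is involved in this pipeline: every output image is the dataset mean plus a combination of two principal directions, which is why PCA post-processing alone can yield plausible-looking digits.</p></div>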
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Motivation</head><p>Our preliminary results highlight the critical need for a QGAN model capable of directly processing real-world data without PCA preprocessing, as well as a more scalable architecture to overcome the limitations of existing QGANs. Motivated by these findings, we are exploring the integration of patched inputs inspired by PatchGAN <ref type="bibr">[12]</ref> to enable direct input processing without PCA. Specifically, we are investigating a scalable QGAN framework that leverages QLSTM as the generator's backbone, utilizing QLSTM's ability to capture spatiotemporal information across patches with a single generator, rather than separate sub-generators as in <ref type="bibr">[12]</ref>. Additionally, we are reengineering the quantum circuit ansatz within the QLSTM structure to improve hardware efficiency, fully addressing the NISQ constraints overlooked in previous QLSTM studies like <ref type="bibr">[15]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. LSTM-QGAN</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Overall Architecture</head><p>As illustrated in Figure <ref type="figure">5</ref>(b), LSTM-QGAN utilizes QLSTM at the core of the generator to enhance scalability and resource efficiency. Like PatchGAN <ref type="bibr">[12]</ref>, the discriminator in LSTM-QGAN can be implemented using either a classical or quantum neural network, depending on the available quantum computing resources. The following outlines the key components and configurations in LSTM-QGAN.</p><p>&#8226; Patch Inputs without PCA. In line with <ref type="bibr">[12]</ref>, <ref type="bibr">[13]</ref>, LSTM-QGAN processes patched inputs to generate corresponding output patches, which are then recombined into a complete output. Unlike <ref type="bibr">[10]</ref>, <ref type="bibr">[11]</ref>, LSTM-QGAN eliminates the need for PCA and inverse PCA, processing the original data directly. This introduces a trade-off between resources (i.e., qubit number N) and processing latency (i.e., steps T). With an N-qubit implementation, LSTM-QGAN generates 2^N measured probabilities at each step as the output vector for each synthetic patched output. These vectors are then compared to the real patched input data in the discriminator. The total number of steps, T, is determined by D/2^N, where D represents the size of the real data.</p><p>&#8226; Scalable QGAN with LSTM. The generator in LSTM-QGAN consists of four QLSTM cells, as shown in Figure <ref type="figure">5</ref>(b). Normally distributed noise z is input to the generator, producing the initial sub-image G_{&#952;_g}(z). The discriminator evaluates both synthetic and real input patches, computing the loss L. Unlike PatchGAN <ref type="bibr">[12]</ref>, which requires a separate generator for each patch (drastically increasing NISQ resource overhead with input size), LSTM-QGAN utilizes QLSTM's ability to retain relevant patterns while discarding irrelevant information, independent of patch indices. 
To achieve this, gradients from all patches within a single input are averaged and applied to update the model parameters collectively, resulting in an image-adaptive generator that scales with increasing data dimensions while maintaining fixed resource usage.</p><p>&#8226; Training Optimization. Convergence in QGAN training is a critical challenge, significantly influenced by the choice of quantum loss function. Within the LSTM-QGAN framework, we evaluated both the conventional binary cross entropy loss <ref type="bibr">[6]</ref>, <ref type="bibr">[7]</ref>, <ref type="bibr">[12]</ref> and the Wasserstein loss <ref type="bibr">[8]</ref>, <ref type="bibr">[13]</ref>. The specific Wasserstein loss used for LSTM-QGAN is detailed in Equation <ref type="formula">2</ref>, <formula notation="tex">L = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2]</formula> where P_r and P_g represent the real data (i.e., x) and generated data (i.e., x&#771; = G_{&#952;_g}(z)) distributions, respectively. The distribution P_x&#770; is uniformly sampled between P_r and P_g, and &#955; is a constant. Experimental results on the impact of QGAN loss functions are discussed in Section V.</p>
<table><head>TABLE I. DESIGN COST COMPARISON.</head>
<row role="label"><cell></cell><cell>PatchGAN [12]</cell><cell>LSTM-QGAN (&#916;)</cell></row>
<row><cell>Qubits per QNN</cell><cell>5</cell><cell>7</cell></row>
<row><cell>Number of QNNs</cell><cell>56</cell><cell>8</cell></row>
<row><cell>Total Number of Qubits</cell><cell>280</cell><cell>56 (5&#215;&#8595;)</cell></row>
<row><cell>Total Number of 1QG</cell><cell>1680</cell><cell>336 (5&#215;&#8595;)</cell></row>
<row><cell>Total Number of 2QG</cell><cell>1344</cell><cell>112 (12&#215;&#8595;)</cell></row>
</table></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. NISQ Implementation</head><p>LSTM-QGAN offers flexibility in implementing G and D. For fair comparison, D is implemented as a classical neural network, as in PatchGAN <ref type="bibr">[12]</ref>. In the QLSTM cells for G, we employed a hardware-efficient ansatz inspired by recent QNNs <ref type="bibr">[21]</ref>-<ref type="bibr">[23]</ref>, instead of the generic circuit from <ref type="bibr">[15]</ref>. Figure <ref type="figure">5</ref>(a) shows the QNN circuit, which utilizes seven qubits. Each VQC block includes RX, RY, and RZ layers, followed by a two-qubit CX entanglement layer, with the VQC layers repeated twice. 
The measurement layer converts the quantum state into classical vectors. Although the gate count matches that in <ref type="bibr">[15]</ref>, our circuit uses native gates, while the R(&#945;, &#946;, &#947;) gate in <ref type="bibr">[15]</ref> requires synthesis into multiple native gates.</p><p>Design Overhead. Table <ref type="table">I</ref> compares the hardware resources required by PatchGAN and LSTM-QGAN for the MNIST dataset. Due to architectural differences, a QNN in PatchGAN refers to the quantum generator used for each input patch, while in LSTM-QGAN, it refers to the quantum module within the QLSTM. The last three rows of Table I highlight that LSTM-QGAN achieves a significant reduction: a 5&#215; decrease in qubit counts, a 5&#215; decrease in one-qubit gates (1QG), and a 12&#215; decrease in two-qubit gates (2QG).</p></div>
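<div xmlns="http://www.tei-c.org/ns/1.0"><p>The totals and reduction factors in Table I can be reproduced with a short calculation. The sketch below is illustrative only. For LSTM-QGAN it derives per-QNN gate counts from the circuit description (7 qubits, RX/RY/RZ on every qubit, 2 VQC layers) and assumes one CX per qubit in each entanglement layer, which matches the reported total of 112. For PatchGAN the per-generator gate counts are back-computed from the Table I totals (1680/56 and 1344/56), not taken from its circuit. The resources helper is hypothetical.</p>
<ab><![CDATA[
```python
def resources(qubits_per_qnn, num_qnns, one_qubit_gates_per_qnn, two_qubit_gates_per_qnn):
    """Aggregate hardware cost over all QNNs of a design."""
    return {
        "qubits": qubits_per_qnn * num_qnns,
        "1QG": one_qubit_gates_per_qnn * num_qnns,
        "2QG": two_qubit_gates_per_qnn * num_qnns,
    }


# LSTM-QGAN QNN: 7 qubits, 2 VQC layers, three rotations (RX, RY, RZ) per qubit per layer.
n_qubits, n_layers = 7, 2
rotations_per_qnn = 3 * n_qubits * n_layers
# Assumption: each entanglement layer applies one CX per qubit.
cx_per_qnn = n_qubits * n_layers
lstm_qgan = resources(n_qubits, 8, rotations_per_qnn, cx_per_qnn)

# PatchGAN: per-generator gate counts back-computed from the Table I totals.
patch_gan = resources(5, 56, 1680 // 56, 1344 // 56)

reduction = {key: patch_gan[key] / lstm_qgan[key] for key in patch_gan}
print(lstm_qgan)
print(patch_gan)
print(reduction)
```
]]></ab>
</div>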
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. EXPERIMENTS AND RESULTS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Experimental Setup</head><p>Schemes and Benchmarks. We compare LSTM-QGAN with PatchGAN <ref type="bibr">[12]</ref> using the MNIST dataset, which consists of 28&#215;28 grayscale images of handwritten digits 0 to 9. PatchGAN is implemented according to its original design <ref type="bibr">[12]</ref>, utilizing 5 qubits and 56 sub-generators. Each sub-generator produces a 14-pixel patch, and together, the 56 sub-generators generate the entire 784-pixel MNIST image.</p><p>For LSTM-QGAN, we implement the generator with two QLSTM layers, each containing 4 QNNs with 7 qubits. At each time step, LSTM-QGAN generates a 196-pixel patch, requiring 4 time steps to produce a complete MNIST image.</p><p>Simulation. All QGANs are implemented with the PennyLane and TorchQuantum libraries. PatchGAN and LSTM-QGAN are trained using the Adam optimizer with a learning rate of 2e-4, a batch size of 128, and 1000 epochs. Quantum circuits are run on the NISQ IBM_kyoto computer <ref type="bibr">[24]</ref>.</p><p>Evaluation metrics. We evaluate the generated images using both qualitative (e.g., visual inspection) and quantitative methods. For quantitative assessment, we employ the Fr&#233;chet Inception Distance (FID), a widely recognized metric for measuring image similarity in GANs <ref type="bibr">[25]</ref>. A lower FID score indicates a closer feature distance between real and generated images, signifying higher quality. In our experiments, we randomly select 500 real images and 500 generated images for comparison.</p></div>
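<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate why a lower Fr&#233;chet distance indicates closer feature statistics, the sketch below computes a simplified version of the distance between two sets of feature vectors. It is not the FID pipeline used in the experiments: real FID uses Inception features and full covariance matrices, whereas this sketch assumes diagonal covariances, so the covariance term reduces to a per-dimension sum. The synthetic Gaussian features and the function names are hypothetical.</p>
<ab><![CDATA[
```python
import math
import random
from statistics import fmean, pvariance


def feature_stats(features):
    """Per-dimension mean and variance of a list of feature vectors."""
    columns = list(zip(*features))
    return [fmean(c) for c in columns], [pvariance(c) for c in columns]


def frechet_distance_diag(real_features, generated_features):
    """Frechet distance between two Gaussians with diagonal covariances."""
    mu_r, var_r = feature_stats(real_features)
    mu_g, var_g = feature_stats(generated_features)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    # With diagonal covariances, Tr(S_r + S_g - 2 (S_r S_g)^(1/2)) is a plain sum.
    cov_term = sum(a + b - 2.0 * math.sqrt(a * b) for a, b in zip(var_r, var_g))
    return mean_term + cov_term


random.seed(0)
# 500 samples per set, mirroring the sample size used in the evaluation.
real = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(500)]
close = [[random.gauss(0.1, 1.0) for _ in range(8)] for _ in range(500)]
far = [[random.gauss(2.0, 1.0) for _ in range(8)] for _ in range(500)]

print(frechet_distance_diag(real, close))
print(frechet_distance_diag(real, far))
```
]]></ab>
<p>The set whose feature statistics sit nearer to the real set receives the smaller distance, which is the sense in which lower scores signify higher quality.</p></div>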
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Results and Analysis</head><p>Comparison of Image Visual Quality. Figure <ref type="figure">6</ref>(a) presents a visual comparison between the images generated by PatchGAN and LSTM-QGAN. PatchGAN demonstrates limited generation capabilities, as the outlines of the digits (0 to 9) are only vaguely identifiable, with noticeable white noise in the background. Additionally, the clarity of more complex digits, such as 4, 5, and 9, is particularly low, further highlighting its deficiencies. In contrast, LSTM-QGAN demonstrates superior image generation, producing sharper and more distinct digits with minimal noise, underscoring its enhanced capability in generating high-quality images.</p><p>Comparison of Image FID Scores. Impact of Loss Function. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. CONCLUSION</head><p>This work presents LSTM-QGAN, a quantum generative adversarial network (QGAN) architecture that overcomes key limitations in existing models. By eliminating reliance on principal component analysis (PCA) and integrating quantum long short-term memory (QLSTM), LSTM-QGAN achieves scalable performance with efficient resource use. As the first QGAN to incorporate QLSTM, this approach represents a significant advancement likely to inspire further research.</p></div>
		</body>
		</text>
</TEI>
