<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Constructions for measuring error syndromes in Calderbank-Shor-Steane codes between Shor and Steane methods</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>08/01/2021</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10316361</idno>
					<idno type="doi">10.1103/PhysRevA.104.022429</idno>
					<title level='j'>Physical Review A</title>
<idno>2469-9926</idno>
<biblScope unit="volume">104</biblScope>
<biblScope unit="issue">2</biblScope>					

					<author>Shilin Huang</author><author>Kenneth R. Brown</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[In another work [S. Huang and K. R. Brown, Phys. Rev. Lett. 127, 090505 (2021)] we introduced syndrome extraction methods for Calderbank-Shor-Steane quantum error-correction codes that interpolate between the well-known Shor and Steane syndrome extraction methods. Here we provide detailed proofs of the main theorems and show that up to gate ordering there is a one-to-one correspondence between extraction gadgets and partitions of the parity check matrix. Operationally, all the circuits in our framework can be obtained by merging ancilla qubits of Shor error correction with certain rules, which enables us to design fault-tolerant syndrome extraction circuits for given specific codes. We then apply our construction to the toric code and provide a detailed analysis of the time overhead of fault tolerance. In particular, we construct a syndrome extraction family whose time overhead smoothly varies from Shor to Steane error correction. To understand the potential advantage, we consider two simplified error models: no errors on the ancilla block and uncorrelated errors on the ancilla block. We study the threshold behavior of the family for both error models and show its potential advantage for quantum architectures with long coherence times.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>The theory of quantum error correction <ref type="bibr">[1]</ref><ref type="bibr">[2]</ref><ref type="bibr">[3]</ref><ref type="bibr">[4]</ref><ref type="bibr">[5]</ref><ref type="bibr">[6]</ref><ref type="bibr">[7]</ref><ref type="bibr">[8]</ref><ref type="bibr">[9]</ref><ref type="bibr">[10]</ref> opens the path towards large-scale quantum computation. Stabilizer codes are a conventional choice of quantum error-correcting codes <ref type="bibr">[7,</ref><ref type="bibr">8,</ref><ref type="bibr">11]</ref>, where error correction is performed conditional on the measurement outcomes of a set of code stabilizers, also known as the error syndrome bit string. The syndrome extraction circuits need to be fault tolerant <ref type="bibr">[9,</ref><ref type="bibr">[12]</ref><ref type="bibr">[13]</ref><ref type="bibr">[14]</ref><ref type="bibr">[15]</ref><ref type="bibr">[16]</ref><ref type="bibr">[17]</ref><ref type="bibr">[18]</ref> as the act of measuring syndromes also introduces extra errors in the quantum system.</p><p>The first fault-tolerant syndrome extraction scheme was proposed by Shor <ref type="bibr">[9,</ref><ref type="bibr">10]</ref>. In Shor's scheme, each syndrome bit is extracted from the data qubits to a verified ancilla cat state by transversal two-qubit gates. As transversal operations limit the error propagation, no high-weight correlated errors can occur on the data qubits if the cat states are verified by postselection. The value of the syndrome bit is the parity of the transversal measurement outcome of the cat state. As any measurement error will flip the syndrome bit, for a stabilizer code of distance d, one needs to repeat the syndrome extraction for O(d 2 ) rounds to guarantee fault tolerance. Utilizing the information of the code structure, the time overhead can be significantly reduced on particular codes <ref type="bibr">[19]</ref><ref type="bibr">[20]</ref><ref type="bibr">[21]</ref><ref type="bibr">[22]</ref><ref type="bibr">[23]</ref><ref type="bibr">[24]</ref>. Optimizing Shor's scheme is an active area of research building off substantial progress since its invention. For example, for low-weight stabilizers, ancilla postselection could be avoided by decoding the ancilla cat states <ref type="bibr">[16,</ref><ref type="bibr">25,</ref><ref type="bibr">26]</ref>. On specific stabilizer codes, the space overhead can be reduced by allowing nontransversal data-ancilla interactions that preserve the code distance <ref type="bibr">[19,</ref><ref type="bibr">[26]</ref><ref type="bibr">[27]</ref><ref type="bibr">[28]</ref><ref type="bibr">[29]</ref><ref type="bibr">[30]</ref><ref type="bibr">[31]</ref><ref type="bibr">[32]</ref>. The time overhead can also be reduced by careful choice of sequential extractions <ref type="bibr">[23,</ref><ref type="bibr">24]</ref>. Notably, as an alternative to cat-state measurements, flag error-correction gadgets <ref type="bibr">[16]</ref><ref type="bibr">[17]</ref><ref type="bibr">[18]</ref><ref type="bibr">30]</ref> circumvent the need of ancilla postselection for arbitrary stabilizer codes, while having a low-qubit overhead for low-distance codes or highdistance codes with low-weight stabilizer measurements.</p><p>The extraction gadgets for Shor's scheme are arguably the smallest. As a tradeoff, a large number of two-qubit gates are applied between data and ancilla qubits. For many quantum devices, two-qubit gates are usually the most challenging operations to implement with high fidelity <ref type="bibr">[33,</ref><ref type="bibr">34]</ref>. To minimize the data-ancilla interaction, Steane <ref type="bibr">[13]</ref> and Knill <ref type="bibr">[15]</ref> suggested the use of transversal two-qubit gates to transfer the errors from the data block to an ancilla code block and then measurement of the ancilla block to gain error information. Steane's scheme is specialized for Calderbank-Shor-Steane (CSS) codes <ref type="bibr">[5,</ref><ref type="bibr">6]</ref> and needs two separate rounds of transversal two-qubit gates for extracting the X -and Z-type stabilizers, respectively. Knill's scheme works for arbitrary stabilizer codes and only needs one round of the transversal gate to extract all the stabilizers. Using a constant number of Steane or Knill syndrome extractions, an arbitrary logical Clifford circuit can be implemented fault tolerantly in O <ref type="bibr">(1)</ref> steps <ref type="bibr">[35]</ref>. Both Steane's and Knill's schemes are single shot, i.e., no repetition of measurements is required. Indeed, each data qubit is touched by O(1) two-qubit gates. However, comparing to a cat state, the ancilla blocks for both Steane's and Knill's schemes are much harder to prepare, as they are as large as the data block <ref type="bibr">[14,</ref><ref type="bibr">[36]</ref><ref type="bibr">[37]</ref><ref type="bibr">[38]</ref><ref type="bibr">[39]</ref>.</p><p>A natural question to raise is whether it is possible to balance the complexities of ancilla preparation and data-ancilla interaction. In Ref. <ref type="bibr">[40]</ref> we gave a positive answer for CSS codes and in this companion paper we elaborate on the theory behind the method and present additional numerical details. We have generated a family of extraction circuits including Shor's and Steane's constructions as its two extremes. This family increases the complexity of ancilla construction in exchange for reducing the number of two-qubit gates between data and ancilla qubits required to fault-tolerantly measure the error. Applying our construction on the toric code, we are able to use a single ancilla block to measure the plaquette operators (Z-stabilizer elements) inside any connected sublattice. In particular, one can partition the L &#215; L toric lattice into patches, each of which contains m &#215; m plaquettes. Shor's and Steane's schemes correspond to the special cases m = 1 and m = L, respectively. Moreover, by offsetting the partition periodically, one can achieve fault tolerance within O(L/m) measurement rounds. Indeed, the parameter m simultaneously controls the ancilla-block size and the time overhead of fault tolerance. Our result is compatible with the fact that Shor's and Steane's schemes require O(L) and O(1) measurement rounds on the toric code, respectively.</p><p>Our paper is organized as follows. In Sec. II we review the essential background and present our notation. In Sec. III we discuss the construction of our family of extraction circuits. In particular, a subfamily called transversal gadgets is proposed. In Sec. IV we discuss the application of transversal gadgets on the toric code and analyze the time overhead of fault-tolerant syndrome measurements. We conclude with a summary and discussion in Sec. V.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. BACKGROUND</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Linear codes</head><p>We start by developing our notation for connecting sets and binary vector spaces. Given a set , we can define a vector space V where each vector corresponds to a finite subset A &#8838; , and vector addition is defined as the symmetric difference of two finite subsets of , i.e., for any two finite subsets A, B &#8838; we define A + B := (A &#8746; B) \ (A &#8745; B). The symmetric difference yields A + A = &#8709; and as a result V is a binary vector space. For convenience, each element w &#8712; is associated with two other meanings: a subset {w} &#8838; and indeed a vector {w} &#8712; V . We can then simplify the equation A = w&#8712;A {w} as A = w&#8712;A w for every A &#8712; V . As a result, we have</p><p>i.e., V is the binary vector space spanned by the set , denoted by F 2 [ ]. The cardinality of a finite subset A &#8838; , |A|, is also referred to as the weight of the vector</p><p>are said to be orthogonal if and only if &#966;, &#968; = 0. More generally, two vector subspaces V, W &#8838; F 2 [ ] are said to be orthogonal (denoted by V &#8869;W ) if we have v, w = 0 for every v &#8712; V and w &#8712; W .</p><p>Let and be two sets and M : F 2 [ ] &#8594; F 2 [ ] be a linear map. The kernel and image of M are denoted by ker M and im M, respectively. When and are finite, the transpose of M is defined to be a linear map</p><p>for every a &#8712; and b &#8712; . In other words, under the standard bases and , the matrix M T is the transpose of the matrix M. We also define the transpose of a vector &#966; &#8712; F 2 [ ], denoted by &#966; T , as the map</p><p>(</p><p>We now briefly discuss linear codes. Let be a set of n classical bits. A linear code on is a subspace</p><p>The vectors of C are called codewords. A parity-check matrix H of C is a linear map from F 2 [ ] to some other F 2 -vector space, F m 2 , such that C &#8838; ker H. In this work, we are not required to have C = ker H. The dual code of C is defined by</p><p>for every c &#8712; C.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. CSS stabilizer codes</head><p>Let be a set of n qubits. The state space of the quantum system is denoted by</p><p>The Pauli group on is denoted by P . For each qubit q &#8712; , the Pauli X and Z operators on H q are denoted by X q and Z q , respectively. For each subset &#968; &#8838; , the X -type and Z-type operators supporting on &#968; are defined by</p><p>and</p><p>respectively. Note that</p><p>A stabilizer code on is defined by an Abelian subgroup S &#8838; P such that every operator P &#8712; S has eigenvalues &#177;1 and -1 H / &#8712; S. The corresponding code space C Q is the common +1 eigenspace of all the operators in S. In particular, if dim C Q = 1, the unique state in C Q (up to a constant) is said to be a stabilizer state. If S = S X &#8853; S Z , where elements in S X (S Z ) are all X type (Z type), we say that S = S X &#8853; S Z is a Calderbank-Shor-Steane code <ref type="bibr">[5,</ref><ref type="bibr">6]</ref> and S X and S Z are the X and Z stabilizers, respectively. We can represent S X and S Z by two binary vector subspaces</p><p>and</p><p>respectively. Note that for any &#968;, &#966; &#8838; , X [&#968;] and Z[&#966;] commute with each other if and only if &#968; and &#966; are orthogonal. As S = S X &#8853; S Z is Abelian, we must have C X &#8869;C Z . In this paper, we prefer the binary vector space representation of CSS codes and use the notation C X &#8869;C Z to denote a CSS code. For a CSS code C X &#8869;C Z with a codespace</p><p>Thus we say that C X &#8869;C Z encodes k logical qubits. For any vector &#968; &#8712; C &#8869; Z , X [&#968;] commutes with any stabilizer and hence must be a logical operator. Similarly, for any &#966; &#8712; C &#8869; X , Z[&#966;] is a logical operator. In fact, we can always find k vectors &#968; 1 , . . . ,</p><p>Indeed, Xi and Zi can be regarded as the logical Pauli X and Z operators of the ith logical qubit, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. SYNDROME EXTRACTIONS FOR CSS CODES</head><p>On a stabilizer code, the errors are detected by measuring the stabilizer elements. For CSS codes, it is natural to measure the Z-and X -stabilizer elements so that the X errors (bit flips) and the Z errors (phase flips) can be handled separately. Here we are regarding a Y error as the combination of an X error and a Z error. In this paper, we focus on Z-stabilizer measurements, since an analysis of X -stabilizer measurements would be the same up to a Hadamard transform.</p><p>Let C X &#8869;C Z be a CSS code on a set of data qubits D with a codespace C Q &#8838; H D . Suppose we have an X error X [&#968;] (&#968; &#8838; D) and we are measuring the Z-stabilizer elements {Z[&#966; b ]} b&#8712;B , where &#966; b &#8712; C Z and B is a finite set of syndrome bits. For each syndrome bit b &#8712; B, the outcome of the measurement Z[&#966; b ] is determined by the value &#966; T b &#968; &#8712; F 2 . The binary vector</p><p>is called the syndrome. The map H = b&#8712;B b&#966; T b is called a Z-check matrix of the CSS code C X &#8869;C Z . Note that we do not require the condition im H T = C Z . However, we do require that im</p><p>If we are at the end of the computation, the syndrome of a Z-check matrix H can be obtained by measuring all qubits of D in the Z basis: If D is measured to be in the state |&#968; &#8712; H D , where &#968; &#8712; F 2 [D], then the syndrome is simply H&#968;. In the intermediate steps, however, we are not allowed to measure the data qubits directly, as the single-qubit Z measurements anticommute with the X -stabilizer elements. A general idea shared by both Shor's and Steane's syndrome extraction protocols is to transfer the X errors to a set of ancilla qubits A by CNOT gates: As the CNOT gate propagates the X errors on the control qubit to the target qubit, we can perform CNOT gates with controls in D and targets in A and then apply Z measurements on all ancilla qubits to detect these errors. Of course, this is far from being a valid construction. In the following, we explore the necessary and sufficient conditions for a valid extraction circuit. To simplify our problem, we do not initially consider the challenge of fault tolerance.</p><p>As a first step, we encode the information of the CNOT gates by a matrix :</p><p>, where for each data qubit d &#8712; D and ancilla qubit a &#8712; A, a T d = 1 if and only if a CNOT gate with control d and target a is applied. As these CNOT gates commute with each other and we do not consider the fault-tolerance properties, the order of these CNOT gates does not matter. The product of all CNOT gates is denoted by U . One can easily verify the identities</p><p>where</p><p>Suppose the ancilla block A is prepared in a state |anc . When there are no errors on the data block D, the action of U is required to be trivial, i.e., for any code state |&#969; &#8712; C Q , we should have</p><p>Note that the code space C Q is spanned by</p><p>where |+ k , the logical |+ k state, is the CSS stabilizer state of C &#8869; Z &#8869;C Z . This is true because we have enumerated all the logical Z-type operators. From <ref type="bibr">(13)</ref>, Z[&#966;] commutes with U for any &#966; &#8838; D. Thus ( <ref type="formula">14</ref>) can be simplified as</p><p>we can obtain H&#968; = H &#968;, the syndrome of H, by measuring all qubits of A in the Z basis. One could naturally ask the following question.</p><p>Question 1. Given two matrices H :</p><p>Suppose we already have a satisfying ancilla state |anc for a given H and . In <ref type="bibr">(17)</ref>, if we take &#968; &#8712; C &#8869; Z and combine ( <ref type="formula">16</ref>), we will have</p><p>On the other hand, let CZ &#8838; F 2 [A] represents the Z-stabilizer group of |anc . For every &#966; &#8712; CZ we have</p><p>Therefore, T &#966; &#8712; C Z and hence</p><p>The conditions ( <ref type="formula">18</ref>) and ( <ref type="formula">19</ref>) imply that U preserves the CSS stabilizer group</p><p>If CX = C&#8869; Z , i.e., |anc is a CSS stabilizer state and hence |+ k |anc is the stabilizer state of ( <ref type="formula">20</ref>), the condition ( <ref type="formula">16</ref>) must hold.</p><p>Our problem now becomes finding a CSS stabilizer state |anc with X and Z stabilizers CX and CZ ( CX = C&#8869; Z ), respectively, satisfying ( <ref type="formula">18</ref>), <ref type="bibr">(19)</ref>, and the condition im HT &#8838; CZ . A natural strategy is to minimize CZ and maximize CX = C&#8869; Z . The minimal choice of CZ is just im HT as required. The following lemma shows that CZ = im HT and CX = ker H always satisfy <ref type="bibr">(18)</ref> and <ref type="bibr">(19)</ref>.</p><p>Lemma 1. T (im HT ) &#8838; C Z and (C &#8869; Z ) &#8838; ker H . Proof. The first part follows from a direct calculation</p><p>The latter part is also straightforward: For any &#968; &#8712; C &#8869; Z , we have H &#968; = H&#968; = 0. Thus &#968; &#8712; ker H.</p><p>From the discussion above, we can conclude that given a Z-check matrix H of C X &#8869;C Z , any decomposition H = H corresponds a valid gadget extracting the syndrome of H. All the information of the gadget is determined by H and , which allows us to define a gadget in an abstract way.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 1 ( Z-extraction gadgets). A Z-extraction gadget of a CSS code C</head><p>where A is the set of ancilla qubits, B is the set of syndrome bits, and H :</p><p>Here H, H, and are referred to as the data check matrix, ancilla check matrix, and gate matrix, respectively. The ancilla block A is prepared in the CSS state of ker H&#8869;im H T , while the gate applied between D and A is U .</p><p>If we apply two gadgets</p><p>The gate matrix and the ancilla check matrix H are defined by = 1 2 <ref type="bibr">(21)</ref> and</p><p>respectively. The total parity check matrix</p><p>We say that G is a sum of G 1 and G 2 , denoted by</p><p>We now review Shor's and Steane's constructions in our notation.</p><p>Example 1 (Shor's scheme). In Shor's scheme <ref type="bibr">[9,</ref><ref type="bibr">10]</ref>, Example 2 (Steane's scheme). In Steane's scheme <ref type="bibr">[13]</ref>, all the Z-stabilizer elements are extracted by one ancilla block. The ancilla block A is a set identical to the data block D. The gate matrix is an identity matrix under the standard bases, and the ancilla check matrix H is identical to the data check matrix H. The ancilla state is the CSS stabilizer state of</p><p>where k is the number of logical qubits.</p><p>Shor's scheme benefits from having small transversal catstate gadgets with ancilla blocks whose sizes are defined by the weight of the measured stabilizers, |A b | = |&#966; b |. If the stabilizers are measured in a serial manner, one only needs a number of ancilla qubits equal to the highest-weight stabilizer max b (|&#966; b |). Parallel syndrome measurement is desirable and one will need F ancilla qubits where F = b&#8712;B |&#966; b |. If each one of the n data qubits interacts on average with f measured stabilizers, then we can also write F = f n. Measurements need to be repeated T times until errors due to measurement can be distinguished from errors on the data; T can scale as O(d 2 ) <ref type="bibr">[9,</ref><ref type="bibr">23,</ref><ref type="bibr">24]</ref> in the worst case, but scales as O(d ) for topological codes <ref type="bibr">[19]</ref>.</p><p>Steane's scheme requires large ancilla-block size equal to the data block. However, only one block is needed for Z extraction. The total number of ancilla qubits is just n, which is less than the f n needed for the parallel Shor scheme. In addition, Steane's scheme is a single-shot method, so T is constant. The cost of Steane's scheme is the preparation and verification of the ancilla block, which becomes challenging as the code distance increases. However, if we can prepare the ancilla, it greatly reduces the number of interactions between the data block and the ancilla block before correction is applied.</p><p>Our goal is to construct gadgets other than Shor-and Steane-style ones that result in simplified ancilla blocks compared to Steane's and fewer interactions between data and ancilla than Shor's. Conceptually, we can start with Shor ancilla blocks and merge ancilla blocks in a way that removes ancilla qubits. We now describe this procedure in detail.</p><p>We start from Shor's cat-state syndrome measurement circuit, denoted by G = (A, B, H , ). For every ancilla qubit a &#8712; A, it interacts with exactly one data qubit and transfers the error to exactly one syndrome bit, i.e., | T a| = | Ha| = 1. In matrix language, each row of and each column of H contains exactly one nontrivial entry. We construct gadgets by applying a series of steps, each of which is described by one of the following operations.</p><p>(i) Pick two ancillas It is easy to check that the updated H and satisfy the condition H = H . If we only apply (i), each row of will always have exactly 1 nontrivial entry. Indeed, the gadget will be transversal on the ancilla. Assuming the ancilla block can be fault-tolerantly prepared by postselection so that no correlated errors can occur, a transversal gadget will not introduce correlated errors to the data block. If we keep doing (i), we will eventually obtain Steane's scheme. On the other hand, if we only apply (ii), we make the cat states smaller and introduce nontransversal interactions. The extreme case where no ancilla can be further merged is the bare-ancilla scheme.</p><p>By allowing both types of ancilla merging, we can generate a large number of gadgets. In fact, arguably, all the gadgets can be generated in this way. To see this, for any gadget G = (A, B, H , ) we can split each ancilla qubit a &#8712; A into | T a| &#215; | Ha| ancilla qubits: With each data qubit q &#8712; T a and syndrome bit b &#8712; Ha &#8838; B, we associate an ancilla qubit to pass the X error on q to b. The obtained gadget is Shor's cat-state syndrome measurement scheme, and the reverse of the splitting provides us the merging steps from Shor's scheme to G.</p><p>However, such an argument has a few exceptions. First, if | T a| = 0 or | Ha| = 0, a will be split into 0 qubits so that the process is irreversible. Second, for each data qubit q and each syndrome bit b, the number of ancilla qubits connecting them after splitting, whose parity is b T Hq, can be more than 1. In contrast, for Shor's scheme, there must be exactly b T Hq &#8712; {0, 1} ancilla qubits. To handle these exceptions, we can introduce pairs of redundant ancilla qubits to Shor's scheme before merging: Each pair of ancilla qubits passes the error on the same data qubit to the same syndrome bit. After merging, we can add ancilla qubits that only connect syndrome bits. Of course, we can add ancilla qubits that only connect data qubits as well. These ancilla qubits, however, are not helpful for gaining error information.</p><p>The tradeoffs of two merging operations are different: Operation (i) reduces the number of CNOT gates between data and ancillas while making the ancilla block more entangled; (ii) simplifies the ancilla-block preparation while taking the risk of introducing correlated errors and breaking fault tolerance. If we wish to have postselection-free ancillas, the stabilizer generators of the ancilla block should have weight no more than 3 so that the residual error of each single error in the preparation circuit can be reduced to a weight-1 error. Bare ancillas, Bell states, and three-qubit Greenberger-Horne-Zeilinger states are example states that satisfy this requirement. More generally, any CSS state equivalent to a one-dimensional cluster state (up to Hadamard gates) can be directly prepared. To preserve the code distance, the correlated errors introduced by these simple gadgets need to be handled by either a well-designed decoder or modifications to the decoding circuits, such as flag qubits <ref type="bibr">[16,</ref><ref type="bibr">18,</ref><ref type="bibr">30]</ref> and DiVincenzo-Aliferis ancilla decoding <ref type="bibr">[25]</ref>. However, as the matrix decomposition H = H inherits the code structure from H, the detailed fault-tolerance design will be code specific.</p><p>In Fig. <ref type="figure">1</ref> we illustrate how to obtain syndrome extraction methods for Steane's seven-qubit code by applying merging operations on Shor's scheme. Instead of applying only one type of operation, we can apply both. As an example, we apply (i) on Shor's scheme twice to obtain a transversal gadget (referred to as scheme A in Fig. <ref type="figure">1</ref>) and then apply (ii) five times to obtain a non-fault-tolerant circuit (referred to as scheme B) whose ancilla state can be prepared without verification. As Steane's code has distance 3, one can apply a DiVincenzo-Aliferis decoding circuit <ref type="bibr">[25]</ref> to make scheme B fault tolerant. The details of the protocol are given in Appendix A.</p><p>When measurement errors are being considered, the use of different extraction gadgets will lead to different decoding problem details. In the next section, as an example, we will study the behavior of transversal gadgets on Kitaev's toric code <ref type="bibr">[4]</ref> thoroughly. In particular, we will show that by switching between different gadgets, the time overhead of fault tolerance can be reduced without increasing the ancilla complexity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. BLOCK EXTRACTIONS OF THE TORIC CODE</head><p>We briefly review the construction of the toric code <ref type="bibr">[4]</ref>. A toric code is a CSS code defined on an L &#215; L periodic lattice on the torus. The lattice has a set V of L 2 vertices, a set E of 2L 2 edges, and a set F of L 2 faces. Define the boundary map</p><p>and the coboundary map</p><p>For each</p><p>] code, we show how to construct syndrome extraction circuits by merging ancilla qubits in Shor's scheme. In the circuit diagram, the labeled sets of the data qubits, ancilla qubits, and syndrome bits are {1, . . . , 7}, {1 , . . . , 12 }, and {a, b, c}, respectively. The syndrome bits can be determined by a collective measurement of the labeled qubit outputs. Two ancilla qubits can be merged if they (i) talk to the same set of data qubits or (ii) contribute to the same set of syndrome bits. If we repeat application of rule (i), we will eventually obtain Steane's scheme. Bare-ancilla extraction is the limit for applying rule (ii) only. For example, we demonstrate scheme A, a transversal gadget obtained by applying (i) to Shor's scheme twice, and scheme B, obtained by applying (ii) on scheme A five times. We illustrate the matrix decomposition description H = H of these gadgets. Empty elements in H are zeros, which are not shown to emphasize the block structure. The Z-and X -stabilizer generators are generated from im HT and ker H , respectively. We note that the ancilla state for scheme B is (|000 + |111 ) &#8855; (|00 + |11 ), which can be directly prepared without verification. However, as the circuit is not transversal on the ancilla block, we will require a DiVincenzo-Aliferis decoding circuit <ref type="bibr">[25]</ref> to achieve fault tolerance. The details are given in Appendix A.</p><p>edge set E . A nontrivial logical Z-type (X -type) operator is represented by a noncontractible loop (cut) on the torus, which has minimum length L. In other words, the toric code has distance L. The Z-and X -check matrices of the toric code are &#8706; T and &#948; T , respectively. A Z-extraction gadget is therefore represented by a decomposition &#8706; T = &#8706;T &#947; T , or &#8706; = &#947; &#8706;. The matrices &#8706;T and &#947; T are the ancilla check and gate matrices, respectively.</p><p>The Shor-style extraction gadget, denoted by ( &#7868;0 , F, &#8706;T 0 , &#947; T 0 ), can be understood as cutting the torus into L 2 disjoint square faces. The set &#7868;0 contains the 4L 2 edges of these squares. For two edges &#7869;1 , &#7869;2 &#8712; &#7868;0 , if they were the same edge &#947; 0 &#7869;1 = &#947; 0 &#7869;2 &#8712; E before the torus was cut, we can glue the two edges back, which corresponds to a type-(i) merging of ancilla qubits &#7869;1 and &#7869;2 . Therefore, any transversal gadgets can be constructed by cutting the torus along some chosen edges. If we do not make any cut, the gadget will be Steane style; if we choose to cut along all edges, we will obtain the Shor-style gadget. In general, we will obtain a new topological space, with an edge set &#7868; and a boundary map &#8706; :</p><p>maps an edge &#7869; &#8712; &#7868; back to its corresponding edge e &#8712; E on the torus. The boundary of the space is &#8706;F &#8838; &#7868; . Here, as a reminder, F is viewed as an F 2 -vector. If we cut along an edge e &#8712; E so that |&#947; T e| = 2, we must have &#947; T e &#8838; &#8706;F . If we cut the torus into several connected components so that the face set F is decomposed as F = i F i , where each F i is the face set of a component, then the gadget ( &#7868; , F, &#8706;T , &#947; T ) can be decomposed as</p><p>where each &#7868;i := f &#8712;F i &#8706; f is the set of edges of the component F i , &#8706;| F i is the restriction of &#8706; on F i , and &#947; | &#7868;i is the restriction of &#947; on &#7868;i . That is, the ancilla block &#7868; is decomposed as subblocks &#7868;i , and two different blocks are not entangled.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Errors in space-time</head><p>We now describe our model of fault-tolerant error correction following the presentation in Ref. <ref type="bibr">[40]</ref>. As X and Z errors can be corrected separately, we can ignore the Z errors on the data qubits and the X -stabilizer measurements. Suppose our computation starts from time 0 and never ends. We extract the Z-check matrix &#8706; at every positive integer time. For every t &#8712; Z + , each data qubit could have an X error at time t -1 2 , and the measurement outcome of each ancilla qubit at time t could have a classical bit-flip error. For now, we ignore the data errors between two CNOT gates in the same extraction round.</p><p>Suppose the transversal gadget applied at time</p><p>, where &#947; t &#8706;t = &#8706;. An X error on the data qubit e &#8712; E at time t - 1  2 is denoted by the pair (e, t ). The measurement error on the ancilla qubit &#7869; &#8712; &#7868;t at time t &#8712; T is denoted by the pair (&#7869;, t ). The set of all single data-qubit errors, denoted by D, is</p><p>while the set of measurement errors, denoted by M, is</p><p>An error history is defined to be a finite subset &#968; &#8838; D &#8746; M, or equivalently a vector</p><p>For later convenience, the time coordinates of the errors are discarded. Similarly, the evaluation of measurement errors at time t is a map</p><p>The data error propagating to the ancilla block &#7868;t is the accumulation of all data errors before time t, which can be evaluated by the map</p><p>In particular, we define D := t&#8712;N D t . The syndrome at time t can therefore be evaluated by the map</p><p>We can take the difference of the syndrome sequence {&#963; t } t&#8712;N ,</p><p>In the above calculation, we set &#963; 0 = 0, D0 = 0, and M 0 = 0 for convenience. We can see that the data error D t only contributes to t , while the measurement error M t contributes to both t and t+1 . The syndrome history is defined as a map</p><p>We now analyze the behavior of each single error. For each data error (e, t ), one can verify that</p><p>where f 1 , f 2 &#8712; F are the two faces sharing e as their borders.</p><p>For each measurement error (&#7869;, t ) &#8712; M, where &#7869; &#8712; &#7868;t , one can verify that</p><p>If &#7869; &#8712; &#8706;t F , i.e., &#7869; is an split edge, &#8706;T t &#7869; has only one face and we say that (&#7869;, t ) is a type-I error. Otherwise, | &#8706;T t &#7869;| = 2 and (&#7869;, t ) is said to be a type-II error. The set of type-I errors, denoted by M 1 , is</p><p>while the set of type-II errors is denoted by</p><p>The syndrome bit flips can be viewed as defects in the (2 + 1)-dimensional lattice: Each data error creates two defects in the same time slice: Each type-I error creates two defects on the same location, but in two consecutive time slices, and each type-II error creates four defects. We give an example of the effect of different error types in Fig. <ref type="figure">2</ref>. If G t is a Shor-style gadget, all measurement errors at time t will be of type I, and the data and measurement errors are referred to as spacelike and timelike errors, respectively <ref type="bibr">[19]</ref>. If G t is a Steanestyle gadget, however, all measurement errors at time t will be of type II.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Minimum-weight perfect matching decoder</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Given an error history</head><p>will take &#968; as the input and output an estimation of error history &#968; = Dec( &#968; ) such that &#968; = &#968;. The optimal choice of the decoder is the minimum-weight-error (MWE) decoder Dec MWE defined by</p><p>If the gadgets are all Shor style so that M = M 1 , the decoding problem can be visualized by a decoder graph G with vertex set F &#215; N and edge set D &#8746; M 1 : Each ( f , t ) &#8712; F &#215; N is a vertex and each error &#968; &#8712; D &#8746; M 1 is an edge connecting the two vertices (defects) in &#968;. The error histories are also called error chains, as they can be visualized as sums of chains in G. Note that the map evaluates the boundary of a given error chain. Decoding a syndrome &#968; is essentially finding an error chain &#968; whose boundary coincides with &#968;. As an error chain &#968; always matches the defects in &#968; into pairs, the MWE decoder returns a minimum-weight perfect matching (MWPM) of the defects <ref type="bibr">[19,</ref><ref type="bibr">27,</ref><ref type="bibr">41]</ref>.</p><p>The existence of type-II errors complicates our problem, as they create four defects instead of two. We have to use hyperedges to represent these errors in the decoder (hyper)graph. Although we can still use a MWE decoder, the geometric meaning will be less clear. Notice that for any type-II error (&#7869;, t ) &#8712; M 2 is equivalent to two data errors (&#947; t &#7869;, t ) + (&#947; t &#7869;, t + 1) &#8712; F 2 <ref type="bibr">[D]</ref>. As an approximation, we can pretend that the type-II errors do not exist and use the MWPM decoder</p><p>on the decoder graph with an edge set D &#8746; M 1 . For convenience, we define a map</p><p>that projects all the errors onto the decoder graph. For each error history &#968; &#8712; F 2 [D &#8746; M], the MWPM decoder will regard it as an error chain &#968; on the decoder graph with total length The ancilla blocks when aligned lead to timelike correlations between direct repeated measurements. (d) By offsetting the ancilla blocks, the timelike correlations require spacelike errors in order to correlate defects from top to bottom. The images in (c) and (d) have been reproduced from Ref. <ref type="bibr">[40]</ref>.</p><p>Indeed, the MWPM decoder can only guarantee that</p><p>However, as shown below, Dec MWPM can preserve the code distance L. Theorem 1. If |&#968;| &lt; L/2, Dec MWPM will correct &#968; without introducing a logical error.</p><p>Proof. Let &#968; = Dec MWPM ( &#968; ). By applying the correction &#968; , the data error will become an undetectable error D(&#968; + &#968; ). Since the toric code has distance L, it suffices to show that | D(&#968; + &#968; )| &lt; L.</p><p>From ( <ref type="formula">41</ref>) and the fact that |&#968; &#8745; D| |&#968; |, we obtain</p><p>The theorem is proved by combining <ref type="bibr">(42)</ref> and the fact that</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Fault-tolerant error correction in finite time</head><p>In practice, the circuit will always end in some finite time T . All the errors and defects can only occur before T . Indeed, the MWPM decoder will have a finite decoder graph of size O(T L 2 ). As the MWPM algorithm runs in polynomial time to the input size, for the purpose of reliable quantum memory, we can delay the decoding until the end of the circuit execution. However, this will be impractical for quantum computation tasks with non-Clifford gates <ref type="bibr">[42]</ref>. Instead, we need to process the syndrome bits and correct the errors as soon as possible. As the information provided by the latest syndrome bits is always unreliable, they will be processed only when further syndrome bits are gathered. For example, we can divide the time axis by some chosen time</p><p>In the ith round of error correction, we decode the syndrome bits in the time interval [t i , t i+2 ) but only correct the errors in [t i -1 2 , t i+1 -1 2 ); then we discard the syndrome bits in [t i , t i+1 ) while keeping the syndrome bits in [t i+1 , t i+2 ) for the next round of correction. Note that the syndrome bits at time t i+1 need to be updated. This is known as the overlapping recovery method <ref type="bibr">[19,</ref><ref type="bibr">23]</ref>. The details are formally described in Procedure 1, for which we define</p><p>and</p><p>for convenience.</p><p>Procedure 1 Overlapping recovery.</p><p>1:i &#8592; 1.</p><p>2: Use MWPM to find an error history &#968; such that</p><p>. // Correct the error history. In practice, this is equivalent to applying error correction on the data qubits and updating the syndrome at time t i+1 .</p><p>In the ith iteration of Procedure 1, the decoder graph for MWPM only contains the vertices (syndrome bits) and edges (errors) in [t i -1 2 , t i+2 -1 2 ). In particular, the type-I errors at time t i+2 -1 will become open edges, i.e., edges connecting to the time boundary. The defects not only can be paired with each other, but also can be fused with the time boundary so that its lifetime is extended to the next round. Intuitively, if the distance from time slice t i to t i+1 on the decoder graph, denoted by d (t i , t i+1 ), is too small, it would be too easy for the decoder to fuse a defect at time t i to the time boundary. As a consequence, errors can hardly be corrected. If we use the Shor-style gadget all the time, we will have d (t i , t i+1 ) = |t i+1t i |, and it has been shown that |t i+1t i | = O(L) suffices for fault tolerance <ref type="bibr">[19]</ref>. Naturally, the result should be generalized as d (t i , t i+1 ) = O(L) for arbitrary choices of gadgets. To show this, we prove the following.</p><p>Theorem 2. In the ith round of error correction, if the correction &#968; creates a propagating error, i.e., &#968; + &#968; contains a path from the time slice t i to the time slice t i+1 , then |&#968;[t i , t i+1 ]| 1 2 [d (t i , t i+1 ) -L]. Proof. Suppose &#968; + &#968; contains a path P 1 from a vertex ( f 1 , t i ) to some vertex ( f 2 , t i+1 ), where f 1 , f 2 &#8712; F . Then &#968; + &#968; must contain another path P 2 from a vertex ( f 1 , t i ) to a vertex ( f 2 , t i+1 ), where f 1 , f 2 &#8712; F and P 1 &#8745; P 2 = &#8709;. We must have </p><p>Adding | &#968; &#8745; P 1 | + | &#968; &#8745; P 2 | to both side of (48) and combining <ref type="bibr">(46)</ref>, we obtain</p><p>Given the gadgets {G t } t&#8712;N , we can choose the time slices {t i } i&#8712;N such that d (t i , t i+1 ) &#945;L = O(L) for some constant &#945; 1. By Theorem 2, to make a propagating error happen, the system should have at least (&#945;-1)L 2 errors. As L 2 (&#945;-1)L 2 errors can already lead to a logical error, assuming the errors are independent, the probability of a propagating error is negligible compared to that of a logical error. Now we can move from these theorems to comparing different transversal gadgets by visualizing their decoder graph in time and space. The Shor gadget leads to a uniform three-dimensional lattice decoder graph where the vertices are syndrome bits as shown in Fig. <ref type="figure">3(a)</ref>. As proved above, we need the time dimension to be comparable to the space dimension. Steane-style syndrome extraction is capable of single-shot error correction <ref type="bibr">[14]</ref>. In our method, this is clear from the lack of timelike edges in the decoder graph in Fig. <ref type="figure">3(b)</ref>. What intermediate schemes with block decoding generate are decoder graphs that only have timelike edges on the edges of the block.</p><p>We create the blocks by partitioning the L &#215; L toric code into m &#215; m blocks, where m divides L. We then use a small m &#215; m surface-code ancilla block with 2m(m + 1) ancilla qubits. At the edges of the blocks, there are timelike boundaries as shown in Fig. <ref type="figure">3(c)</ref>.</p><p>In this picture, it is clear that we can reduce timelike edges by shifting the pattern of blocks. To simplify the shifting, we choose m = 3k for an integer k. At every time step, we shift the position of the blocks up and to the right by k as shown in Fig. <ref type="figure">3(d)</ref>. By construction, the error gadgets repeat every three steps, G t = G t+3 , and we can verify that d (t, t + 3) = (m). This enables us to achieve fault tolerance with |t i+1t i | = O(L/m).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Numerical results</head><p>We test our methods using numerical simulations as described in Ref. <ref type="bibr">[40]</ref>. In Appendix B we include the raw data of the simulation used to determine the thresholds in <ref type="bibr">[40]</ref>. We reproduce the details of the numerical method below for convenience. We also expand the simulations to consider the case with no ancilla-block error.</p><p>We study the circuit-level performance of our fault-tolerant error-correction schemes by Monte Carlo simulations. The X -and Z-syndrome extractions are applied alternatively. Our error model is parametrized by a single error parameter p and consists of three parts.</p><p>(i) Gate errors. With probability p, each two-qubit CNOT gate is followed by a Pauli error drawn uniformly at random from the set {I, X, Y, Z} &#8855;2 \ {I &#8855; I}.</p><p>(ii) Measurement errors. With probability 2p/3, a measurement outcome in either the Z or X basis is flipped. (iii) Preparation errors. Ancilla preparation can lead to correlated errors that need to be removed through verification or syndrome measurement decoding. Here we assume a simple error model where the complicated ancilla blocks are generated perfectly and then each qubit undergoes an independent depolarizing channel with probability p 1 , which we set to either p or 0.</p><p>We further simplify by ignoring idling errors, which enables us to avoid complications due to scheduling. Comparison of these syndrome extraction methods for practical application would require a detailed multiparameter error model, a procedure for ancilla generation and verification, and the connectivity constraints of the quantum processor. To accelerate our simulation, instead of the standard MWPM decoding algorithm, we use a weighted variant of the union-find decoder <ref type="bibr">[43,</ref><ref type="bibr">44]</ref>. For the surface code with bare ancilla, weighted union-find relative to MWPM decreases the threshold from 0.72% to 0.62% for a standard depolarizing error model with idling errors <ref type="bibr">[44]</ref>. Tables I and II compare our block extraction schemes and Shor's and Steane's schemes for the two cases p 1 = p and p 1 = 0, respectively. In our noise model, the thresholds of transversal-gadget extraction is lower (upper) bounded by that of Shor's cat-state extraction (Steane's method). For block extraction, we fix the ancilla-block size m when L &#8594; &#8734;. In this case, we will still need O(L) rounds of extractions even if we offset the blocks. However, we observe that offsetting the blocks yields different threshold values than aligning them. This makes sense as the two strategies provide different decoder graph symmetries. When m gets larger, the offset version starts to yield higher threshold values. We also calculate the threshold of the conventional bare-ancilla extraction scheme <ref type="bibr">[41]</ref> for a comparison.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. CONCLUSION</head><p>An ideal fault-tolerant syndrome extraction circuit would have minimal interaction with the data, easy to prepare ancilla blocks, and require a small number of measurement rounds to make a decision. In this work, we have shown a family of extraction circuits that produces methods between Shor's [9] and Steane's <ref type="bibr">[13]</ref> schemes. These circuits allow us to trade off the complexity of ancilla-block preparation for reduced interactions with the data. Furthermore, by shifting the choice of ancilla blocks in time, we can reduce the number of measurement rounds to achieve fault tolerance while maintaining constant ancilla-block complexity. Specifically for the toric code of distance L, we can use offset blocks of size O(m 2 ) to achieve fault tolerance in O(L/m) rounds, as presented in Ref. <ref type="bibr">[40]</ref>.</p><p>When ancilla postselections are allowed, we found that our error-correction schemes could yield higher thresholds for certain error models assuming negligible idling errors and independent ancilla errors. For a more realistic threshold estimation, we need to choose a specific ancilla preparation protocol. For the toric code example, the ancilla blocks inherit the toric code structure and one can use the bare-ancilla extraction circuit with postselections to prepare them <ref type="bibr">[41]</ref>. There also exist protocols for preparing a general CSS stabilizer state by state distillations without leaving any correlated errors <ref type="bibr">[39]</ref>. The detailed simulation of these preparation protocols is beyond the scope of this work and the utility strongly depends on the physical error model.</p><p>We have presented the toric code as an example because it is well studied and enables us to separate the advantages and disadvantages of our methods from advantages and disadvantages of new codes. The toric code allows for fault-tolerant syndrome extraction with bare ancilla on a nearest-neighbor two-dimensional lattice. Our methods are not a natural choice for the toric code, because the methods require the architecture to break out of two dimensions and also creates new challenges for generating ancilla blocks. On the other hand, we know that finite-rate quantum error-correction codes are not compatible with Euclidean two-dimensional architectures. We hope that our framework will enable the development of high-threshold fault-tolerant extraction circuits for these more qubit efficient codes.</p><p>There are a number of directions for further study. Codes that typically use Shor-style extraction, such as twodimensional color codes <ref type="bibr">[30,</ref><ref type="bibr">45]</ref>, can be decoded with ancilla blocks to improve the threshold. Concatenated codes that have high thresholds with postselected Knill or Steane schemes <ref type="bibr">[15,</ref><ref type="bibr">46]</ref> also have high ancilla rejection rates and block methods can examine trading a reduced threshold for less ancilla verification. The non-fault-tolerant schemes developed here can be made fault tolerant using flag methods <ref type="bibr">[17,</ref><ref type="bibr">18,</ref><ref type="bibr">30]</ref>. The time optimization and the choice of ancilla blocks can be analyzed using the framework recently applied to Shorstyle extraction <ref type="bibr">[23,</ref><ref type="bibr">24]</ref>. Finally, these methods need to be tested in the face of more realistic errors as experimental systems approach the complexity capable of generating and utilizing large ancilla blocks. be used to detect Z (X ) errors on the ancilla block, which could propagate to the data block as weight-2 Z (X ) errors. We repeatedly execute the scheme B circuit until we observe a nontrivial outcome. As we have detected at least one error and we cannot correct any two errors for a distance-3 code, we can use the bare-ancilla circuit to extract the syndrome bits a Z , b Z , and c Z (a X , b X , and c X ), as shown in Fig. <ref type="figure">4</ref>. We use Table <ref type="table">III</ref> to decode the X errors based on the syndromes measured in the unflagged circuit when no flags were raised. However, if f 1,X ( f 1,Z ) is flagged and the syndrome a Z b Z c Z (a X b X c X ) is 001, the most probable error should be X 2 X 3 (Z 2 Z 3 ) instead of X 7 (Z 7 ). Similarly, if f 2,X ( f 2,Z ) is flagged, we need to change the correction for the syndrome 010 to X 3 X 4 (Z 3 Z 4 ). The flags tell us to expect correlations and decode appropriately. </p></div></body>
		</text>
</TEI>
