<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Sumsets and entropy revisited</title></titleStmt>
			<publicationStmt>
				<publisher>Wiley Periodicals</publisher>
				<date>07/31/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10535327</idno>
					<idno type="doi">10.1002/rsa.21252</idno>
					<title level='j'>Random Structures &amp; Algorithms</title>
<idno>1042-9832</idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Ben Green</author><author>Freddie Manners</author><author>Terence Tao</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<title>Abstract</title> <p>The entropic doubling of a random variable taking values in an abelian group is a variant of the notion of the doubling constant of a finite subset of , but it enjoys somewhat better properties; for instance, it contracts upon applying a homomorphism. In this paper we develop further the theory of entropic doubling and give various applications, including: (1) A new proof of a result of Pálvölgyi and Zhelezov on the “skew dimension” of subsets of with small doubling; (2) A new proof, and an improvement, of a result of the second author on the dimension of subsets of with small doubling; (3) A proof that the Polynomial Freiman–Ruzsa conjecture over implies the (weak) Polynomial Freiman–Ruzsa conjecture over .</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>where X 1 , X 2 are independent copies of X. Here, H(X) denotes the Shannon entropy of X, the definition and basic properties of which we review in Appendix A.</p><p>If A &#8838; G is a finite non-empty set, by abuse of notation we write &#120590; ent [A] = &#120590; ent [U A ], where U A is a uniform random variable drawn from A. For instance, if H is a finite subgroup of G, one can check that &#120590; ent [H] = 1.</p><p>The entropic doubling constant is related to other standard measures of additive structure via the inequalities</p><p>Here, &#120590;[A] &#8758;= |A+A| |A| is the doubling constant of A and</p><p>is the additive energy of A.</p><p>The second inequality in (1.1) was noted in <ref type="bibr">[20, eq. 10]</ref>, and we recall the proof in Appendix B. The first inequality seems not to have appeared explicitly in the literature, but it follows easily either by a direct argument using the weighted AM-GM inequality, or by quoting the monotonicity of R&#233;nyi entropy. For completeness, we give this argument in Appendix B. Both inequalities can be far from tight: we give an example in Appendix B.</p><p>1.1.1 Entropic Ruzsa distance</p><p>If X, Y are G-valued random variables (not necessarily independent, or even defined on the same sample space), then we define the entropic Ruzsa distance &#119889; ent (X, Y) between these variables by the formula</p><p>where X &#8242; , Y &#8242; are independent copies of X, Y respectively. This concept, introduced by Ruzsa <ref type="bibr">[16]</ref> and studied in more detail by the third author <ref type="bibr">[20]</ref>, generalizes entropic doubling, since &#120590; ent [X] = e &#119889; ent (X,-X) . It is easy to see that &#119889; ent (X, Y) = &#119889; ent (Y, X) &#8805; 0, and also that &#119889; ent (U H , U H ) = 0 for any finite subgroup H of G. Note that &#119889; ent (X, Y) depends only on the distributions p X (x) &#8758;= P(X = x); p Y (y) &#8758;= P(Y = y) of X, Y. We have (see the final paragraph of <ref type="bibr">[16]</ref>, <ref type="bibr">[20, theorem 1.10]</ref>, or Lemma 1.1 below) the entropic Ruzsa triangle inequality &#119889; ent (X, Z) &#8804; &#119889; ent (X, Y) + &#119889; ent (Y, Z) <ref type="bibr">(1.3)</ref> for any three G-valued random variables X, Y, Z.</p><p>Remark. It is somewhat traditional to use the letter K for the combinatorial doubling constant &#120590; <ref type="bibr">[A]</ref>. We will generally use the letter k for distances &#119889; ent (X, Y). Where these arise from sets (for instance if X = Y = U A ) one should informally think of k being on the order of log K. It should be carefully noted that k may take values in [0, &#8734;) and is not constrained to be an integer.</p><p>It will be technically convenient to introduce a small modification of the entropic Ruzsa distance. Define the maximal entropic Ruzsa distance &#119889; * ent (X, Y) to be the quantity</p><p>where X &#8242; , Y &#8242; range over all pairs of random variables with marginal distributions p X , p Y respectively (i.e., all couplings of X and Y). In particular, X &#8242; , Y &#8242; are not required to be independent. We have the following observations. Lemma 1.1. Let X, Y, Z be G-valued random variables. Then:</p><p>ent (X, Y) &#8804; 3&#119889; ent (X, Y). For the proof, see Section 2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1.2">Small Ruzsa distance</head><p>We turn now to our first main result, which gives a closer connection than (1.1) between small entropy doubling (or more generally small Ruzsa distance) and small doubling. Here, for a real parameter p &#8712; (0, 1), we write h(p) &#8758;= p log 1 p + (1p) log 1  1-p for the entropy of the Bernouilli random variable with probability p. Proposition 1.2. Let C &#8805; 4 be a real parameter. For any G-valued random variables X, Y there is a non-empty finite subset S of G such that, if U S is a uniform random variable on S, then</p><p>) (1.5)</p><p>) .</p><p>(1.6)</p><p>We isolate two special cases of this proposition for future use:</p><p>(i) If we take C = 4 in the above proposition, then (using, in addition, Lemma 1.1 (ii)) we obtain the bounds &#119889; ent (U S , Y) &#8804; 6&#119889; ent (X, Y) + log 2 and |S -S| &#8804; 4e 12&#119889; ent (X,Y) |S|. (ii) If &#119889; ent (X, Y) = &#120576; for some 0 &lt; &#120576; &#8804; 1  16 , then on taking C &#8758;= &#120576; -1&#8725;2 we obtain the bounds &#119889; ent (U S , Y) &#8810; &#120576; 1&#8725;2 log 1  &#120576; and |S -S| &#8804;</p><p>We also remark that a qualitative version of this proposition (with unspecified dependence on &#119889; ent (X, Y) on the right-hand side) was previously established in <ref type="bibr">[20, proposition 5.2]</ref>.</p><p>In the regime where &#119889; ent (X, Y) is small, we can in fact obtain the following more precise result.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Proposition 1.3.</head><p>There is absolute constant &#120576; 0 &gt; 0 such that the following is true. Let X, Y be G-valued random variables, and suppose that &#119889; ent (X, Y) &#8804; &#120576; 0 . Then there is some</p><p>Remark. The constant &#120576; 0 could be specified explicitly if desired, but we have not carried out such a calculation. The constant 12 can be improved, but we will not attempt to optimise it here.</p><p>1.1.3 Behaviour under homomorphisms Given the close relation between notions of entropy doubling and Ruzsa distance and the usual combinatorial notions, one would be forgiven for wondering what the point of introducing the former is.</p><p>The answer is that the entropy notions are more flexible and behave better in various ways. Most obviously, they are defined for arbitrary random variables X, with no requirement that X be uniform on a set. Related to this is the fact that the entropy notions behave well under homomorphisms in a way that the combinatorial notions do not.</p><p>The following is our main result in this direction. Here (and below) we use (X|E) to denote a random variable X conditioned to a positive probability event E. We adopt the convention that expressions such as</p><p>In particular, we have</p><p>and thus</p><p>for any G-valued random variable X.</p><p>Remark. The main inequality here is a precise version of the intuition that the doubling constant of a subset of G in the presence of a homomorphism &#120587; &#8758; G &#8594; H should somehow be at least the doubling constant of the 'base' times some combination of the doubling constants of the 'fibres'. To make sense of this rigorously we need to pass from sets to general random variables, and replace combinatorial doubling by entropy doubling. An example of the failure of a similar result in the purely combinatorial setting (in fact of the analogue of (1.7)) is outlined in [22, exercise 2.2.10].</p><p>Remark. Many previous works have noted the advantageous properties of entropy in somewhat related settings. To give a few examples, in rough chronological order there is the work of Avez <ref type="bibr">[1]</ref> and of Ka&#523;manovich and Vershik <ref type="bibr">[11]</ref> on random walks on discrete groups, the work of Hochman <ref type="bibr">[10]</ref> on fractals, and the work of Breuillard-Varj&#250; <ref type="bibr">[3]</ref> on Bernouilli convolutions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2">Structure of sets with small doubling</head><p>We turn now to applications of the results of the previous subsection to inverse theorems for sets with small doubling.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2.1">Skew dimension</head><p>The first application is a new proof of a result of P&#225;lv&#246;lgyi and Zhelezov <ref type="bibr">[15]</ref>, which they used to give a new and much shorter proof of a celebrated result of Bourgain and Chang <ref type="bibr">[2]</ref>. For the purposes of this result, an affine coordinate space is a subset of Z D (for some D) obtained by fixing the values of some possibly empty set of coordinates. For instance, {(2, -1, x 3 ) &#8758; x 3 &#8712; Z} &#8838; Z 3 is an affine coordinate space. If A is a finite subset of some affine coordinate space V, its<ref type="foot">foot_1</ref> skew-dimension dim * (A) is defined inductively, as follows:</p><p>1. dim * (A) = 0 if and only if A is a singleton (or empty).</p><p>Theorem 1.5 <ref type="bibr">(P&#225;lv&#246;lgyi-Zhelezov)</ref>. Suppose A is a finite subset of some affine coordinate space with</p><p>This result is essentially contained in a paper <ref type="bibr">[15]</ref> of P&#225;lv&#246;lgyi and Zhelezov: whilst it is not actually stated in that paper, it is mentioned in a talk by Zhelezov <ref type="bibr">[23, min 27:30]</ref>, and can be established using the methods of <ref type="bibr">[15]</ref>. Lecture notes of the first author may be consulted for a detailed account with some simplifications <ref type="bibr">[8]</ref>, as well as details of the deduction of the result of Bourgain and Chang.</p><p>In fact we will establish the following slightly stronger result.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Theorem 1.6.</head><p>There is an absolute constant C with the following property. Suppose that A, B &#8838; Z D . Then there are A &#8242; &#8838; A and</p><p>Setting B = -A and using (1.1), we recover Theorem 1.5. The "bilinear" form of Theorem 1.6 will be convenient for induction purposes.</p><p>1.2.2 Dimension and PFR over Z Whilst the notion of skew-dimension is useful in the context of the work of Bourgain and Chang, the actual (affine) dimension dim A, defined as the dimension of the span of A -A over the reals, is a more intrinsically natural quantity. Note that dim * A &#8804; dim A for any A. It is conjectured that Theorem 1.5 remains true with dim * A &#8242; replaced by dim A &#8242; -this is sometimes (see, for instance, [4], [15, conjecture 1]) called the weak Polynomial Freiman-Ruzsa conjecture (PFR) over Z. (The term 'weak' comes from the fact that only the dimension is controlled, and no attempt is made to put A economically inside a box or homomorphic image of the lattice points in a convex set.</p><p>Such stronger statements are also speculated to be true, but care must be taken in their formulation, as discussed in <ref type="bibr">[13]</ref>.)</p><p>Prior to this paper, the best known bound in the direction of this conjecture was a result of the second author <ref type="bibr">[14]</ref>.</p><p>The proof of this result in <ref type="bibr">[14]</ref> was a little exotic, making use of projections modulo 2 and a kind of "U 3 -energy". We provide a new, shorter, proof of this result, retaining the first feature but using entropic notions in place of the exotic energy.</p><p>We will eventually go further in this paper by using results on sets with additive structure in F D 2 to improve the bounds, but before doing that we make a detour into the world of structure theorems for sets of small doubling in F D 2 .</p><p>1.2.3 Small doubling in F D 2 and PFR over F 2 In the following discussion, F D 2 denotes the vector space of dimension D over F 2 ; the value of D is typically somewhat unimportant. Essentially everything we have to say would work equally well over other finite fields F, but this introduces some further technicalities and implied constants would need to depend on F, and we do not discuss this aspect here. Denote by C PFR any constant for which the following statement is true: if A &#8838; F D 2 , and if the doubling constant &#120590;[A] is at most K, then A is covered by exp(O(log C PFR (2K))) cosets of some subspace H &#8804; F D 2 of size at most |A|. The implied constant in the O() notation is allowed to depend on C PFR . A celebrated result of Sanders [18, corollary A.2] (together with standard covering lemmas) is that one may take C PFR = 4. By an improved version of the argument due to Konyagin (see <ref type="bibr">[19, theorem 1.4]</ref>), one can in fact take any C PFR &gt; 3. Strictly speaking, the statement in <ref type="bibr">[19]</ref> applies to more general abelian groups than F D 2 , but replaces the subspace H by a convex coset progression. However, an inspection of the arguments in the characteristic 2 case shows that the convex coset progression in this case can be taken to be a subspace (basically because the convex coset progressions are constructed via Bohr sets, which are automatically subspaces in the characteristic 2 setting). Alternatively, one can invoke the discrete John theorem (see <ref type="bibr">[21, theorem 1.6]</ref>) to control the convex coset progression by a generalized arithmetic progression (up to acceptable losses, and increasing C PFR by an epsilon), and then observe that in F D 2 , all generalized arithmetic progressions are in fact subspaces. We leave the details of these arguments to the interested reader.</p><p>We have the following notorious conjecture, known as the Polynomial Freiman-Ruzsa conjecture over F 2 . Conjecture 1.9. We may take C PFR = 1.</p><p>There are a large number of equivalent formulations of Conjecture 1.9; see <ref type="bibr">[7,</ref><ref type="bibr">9,</ref><ref type="bibr">12,</ref><ref type="bibr">17]</ref>. We add a further equivalent form of Conjecture 1.9, formulated in terms of entropy. It says that, in the case G = F D 2 , Proposition 1.3 is valid with no smallness restriction on the entropic distance &#119889; ent (X, Y). Proposition 1.10. Conjecture 1.9 is equivalent to the claim that, for any</p><p>Small doubling and dimension, again</p><p>We now return to the main topic, and offer the following improvement of Theorem 1.8.</p><p>We remark that the constant C is allowed to depend on C PFR (and so by implication on the implied constant in the definition of C PFR ). As it turns out, the implied constant in the &#8810; is independent of C PFR , and in particular can be taken to be 40&#8725; log 2.</p><p>Thus, using the Konyagin/Sanders result we can obtain</p><p>which is the strongest unconditional result currently known. Perhaps more interestingly, we obtain the following conditional implication between the two forms of the Polynomial Freiman-Ruzsa conjecture discussed above.</p><p>Corollary 1.12. Conjecture 1.9 implies Conjecture 1.7.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2.5">Plan of the paper</head><p>We begin by developing the theory of entropy doubling and (entropy) Ruzsa distance, as discussed above. In Section 2 we establish Lemma 1.1. In Section 3, we look at the link between random variables and sets and establish Proposition 1.2. In Section 4 we look at homomorphisms and give the (short) proof of Proposition 1.4. Then, in Section 5, we combine these results to prove Proposition 1.3, which relates random variables with small entropy doubling to subgroups. This section is a little lengthy but, as we indicate at the appropriate points, not all of the analysis is needed in subsequent sections.</p><p>After this, we turn to the applications to small doubling in Z D . We begin, in Section 6, by proving Theorem 1.6 (and thus reproving Theorem 1.5). Then, we give a new proof of the result of the second author, Theorem 1.8.</p><p>Next, we take a brief detour into structural results over F 2 , establishing the equivalence of the Polynomial Freiman-Ruzsa conjecture in this setting with its entropic formulation (Proposition 1.10).</p><p>Finally, we return to small doubling in Z D , establishing Theorem 1.11 (and hence Corollary 1.12) in Section 9.</p><p>Finally, we remark that although we have written the paper in the context of an abelian group G, many of the arguments (e.g., the proof of Theorem 1.3) do not require this assumption.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">AN IMPROVED ENTROPIC RUZSA TRIANGLE INEQUALITY</head><p>In this section we prove Lemma 1.1.</p><p>Proof of Lemma 1.1. We begin with part (i). It suffices to show that</p><p>This is equivalent to establishing</p><p>whenever Y is independent of (X, Z) (but X and Z are not required to be independent of each other). We apply the submodularity inequality (A5) with A = X -Y, B = Z, C = X -Z. With these choices we have</p><p>In the second display we used (A3). Applying (A5) gives part (i) of Lemma 1.1.</p><p>For part (ii), the first inequality &#119889; ent (X, Y) &#8804; &#119889; * ent (X, Y) is trivial. For the second inequality, we apply (i) and (1.3) to conclude that</p><p>giving the claim. &#9642;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">FROM RANDOM VARIABLES TO SETS</head><p>The objective of this section is to prove Proposition 1.2. Let C, X, Y be as in that proposition. We may assume without loss of generality that X, Y are independent. For brevity we adopt the notation k &#8758;= &#119889; ent (X, Y). We need to locate a set S satisfying (1.5) and (1.6). The key lemma is the following.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Lemma 3.1. There exists a finite non-empty subset S of G such that</head><p>and such that</p><p>whenever Z is an S-valued random variable.</p><p>Proof. As X, Y are independent and k = &#119889; ent (X, Y), we have</p><p>and hence by (A14) it follows that</p><p>Applying the equality case of (A18), we conclude that</p><p>Inspired by this, we define S by the formula</p><p>Denote by A the random variable A &#8758;= 1 X&#8712;S , and write p &#8758;= P(A = 1) = P(X &#8712; S). By Markov's inequality and (3.4), it follows that</p><p>Now we make some observations. First,</p><p>Second, since Y is independent of X and A, it follows using (A13) that</p><p>.8) Combining (3.7), (3.8) with (3.3) we conclude after a short computation that k &#8805; p 2 H(Y) -p 2 H(X|A = 1) -1 2 h(p). (3.9) By (A1), we have H(X|A = 1) &#8804; log |S|. Substituting into (3.9) and rearranging yields log |S| &#8805; H(Y) -h(p) p -2k p . Using (3.6) (and the monotone decreasing nature of h(p) for p &#8805; 1&#8725;2), we obtain (3.1). Now we prove (3.2). From (A18) (replacing X by X -Y there) we have</p><p>Note here that the Kullback-Leibler divergence is well-defined and finite. Indeed, Z takes values z &#8712; S, and hence by the definition of S we have p X (z) &gt; 0 for such z. Proof of Proposition 1.2. We begin by establishing (1.5). Let S be as in Lemma 3.1.</p><p>Taking Z = U S in (3.2), we have</p><p>The required bound (1.5) then follows from (3.1). Now we prove (1.6). Let Z, Z &#8242; be any pair of S-valued random variables. From Lemma 1.1 (ii) and (3.2) we have</p><p>or equivalently</p><p>Now we observe that it is possible to choose Z, Z &#8242; supported on S so that Z -Z &#8242; has the uniform distribution on S -S. To do this, simply take (Z, Z &#8242; ) to have distribution function</p><p>Using (3.10) and (3.1), (1.6) follows. &#9642; Remark. The last part of this argument has considerable similarity with [16, sect. 5].</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">ENTROPY DISTANCE UNDER HOMOMORPHISMS</head><p>In this section we establish Proposition 1.4. Let notation be as in the statement of that proposition.</p><p>Proof of Proposition 1.4. We have</p><p>and similarly</p><p>Subtracting half of (4.2) and half of (4.3) from (4.1) gives</p><p>Now X i determines Y i , and so</p><p>Moreover, by (A7),</p><p>. Combining (4.4), (4.5), (4.6) gives the result.</p><p>In fact one sees that the difference between the LHS and the RHS in the proposition is</p><p>. &#9642;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">VERY SMALL ENTROPY DOUBLING</head><p>In this section we prove Proposition 1.3, which states that random variables X, Y for which &#119889; ent (X, Y) is small are close to uniform on a subgroup. We first handle the case X = Y (in which case we will establish Proposition 1.3 with the improved constant of 6). Assume henceforth that X is a G-valued random variable with &#119889; ent (X, X) = &#120576; &#8804; &#120576; 0 .</p><p>We first observe that a weak version of Proposition 1.3 (which in fact suffices for many applications) follows quickly from Proposition 1.2. As observed in item (ii) after Proposition 1.2, we see that there is S such that</p><p>and</p><p>where the last inequality holds if &#120576; 0 is small enough. A well-known classical observation of Freiman <ref type="bibr">[5]</ref> now implies that H &#8758;= S -S is a group. We recall the short proof here. For any x, y &#8712; S, x -S and y -S both lie in S -S and so</p><p>For each such pair, we have xy = uv. Now let x &#8242; , y &#8242; &#8712; S be any other elements. Similarly, there are</p><p>|S| values of u &#8242; , and all these values lie in S; therefore we must have v = u &#8242; for some pair of these elements. It then follows that</p><p>Since x, y, x &#8242; , y &#8242; were arbitrary, it follows that S -S is closed under addition. Since it contains 0 and is closed under taking inverses, it must be a group.</p><p>From (A16) (noting that S is contained in a single coset of H) and (5.2) we have</p><p>Therefore from (5.1) and (1.3) we have</p><p>This is weaker than Proposition 1.3 only in the non-linear dependency on &#120576;, which in many applications is not important. We will deduce the stronger statement of Proposition 1.3 by bootstrapping this bound. To do this, we require two lemmas which are essentially special cases of Proposition 1.3 itself. First, we consider the case in which X is highly concentrated near one point.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Lemma 5.1.</head><p>There is &#120575; 0 &gt; 0 such that the following is true. Suppose that X is a G-valued random variable and x 0 &#8712; G is a value such that P(X = x 0 ) &#8805; 1 -&#120575; 0 . Then H(X) &#8804; 2&#119889; ent (X, X).</p><p>Proof. By replacing X by Xx 0 if necessary, we may assume without loss of generality that x 0 = 0. Let X 1 , X 2 be independent copies of X. Then our task is equivalent to showing that</p><p>(5.4)</p><p>Write p &#8758;= P(X &#8800; 0) (thus p &#8804; &#120575; 0 ), and let A denote the indicator function of the event that X 1 , X 2 &#8800; 0; then P(A = 0) = 1p 2 and P(A = 1) = p 2 . As a consequence, we have</p><p>where we used (A13) in the last line. Now note that for any z, if A = 0 and X 1 -X 2 = z then (X 1 , X 2 ) can take only two values (z, 0) and (0, -z) if z &#8800; 0, and only one value (0, 0) if z = 0. Hence</p><p>Combining with (5.5), we obtain</p><p>We also observe that</p><p>(5.7)</p><p>We further note that, writing I for the indicator of X &#8800; 0,</p><p>.8) 10982418, 0, Downloaded from <ref type="url">https://onlinelibrary.wiley.com/doi/10.1002/rsa.21252</ref> by Princeton University, Wiley Online Library on [23/08/2024]. See the Terms and Conditions (<ref type="url">https://onlinelibrary.wiley.com/terms-and-conditions</ref>) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License</p><p>Taking (5.7) minus p times (5.8) gives</p><p>Combining this with (5.6), we obtain</p><p>(5.9)</p><p>Recall that our aim is to demonstrate (5.4). To get this from (5.9), we first note that expansion to leading order gives</p><p>provided &#120575; 0 is small enough: the LHS here is &#8764; 2p log 2, whilst the right hand side is &#8764; 1 2 p log 1  p . (A more careful analysis shows that &#120575; 0 = 1 20 is sufficient.) We also have</p><p>(5.11)</p><p>The desired bound (5.4) then follows immediately from (5.9), (5.10) and <ref type="bibr">(5.11)</ref>. &#9642;</p><p>Remark. The constant 2 in the statement of Lemma 5.1 can be replaced by anything larger than 1, at the expense of making &#120575; 0 smaller. This may be shown with very minor modifications of the above argument.</p><p>We next consider the case of a random variable supported on H.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Lemma 5.2. Suppose that X is an H-valued random variable with H(X)</head><p>To prove this we will use the following lemma concerning couplings of almost uniform random variables, which is plausibly of independent interest. Here, for a probability distribution p on a group</p><p>for the &#120001; 1 -distance of p from the uniform distribution (or, equivalently, twice the total variation distance of p from the uniform distribution).</p><p>(5.12)</p><p>Then there exists a pair of random variables (X, Y) on H (not necessarily independent) having the marginal distributions p X = p 1 , p Y = p 2 and p X-Y = p 3 .</p><p>Proof. We wish to show that the triple of distributions</p><p>Here (as usual) &#120575; t (u) = 1 if u = t, and &#120575; t (u) = 0 otherwise. By the (finite-dimensional) Hahn-Banach theorem, this is equivalent to showing that there is no hyperplane separating (p 1 , p 2 , p 3 ) from &#931;, or in other words whenever</p><p>for all x, y &#8712; H, one also has</p><p>Henceforth, assume (5.13). Note that (5.13) and (5.14) are both unaffected if we shift f 1 , f 2 , f 3 by constants c 1 , c 2 , c 3 summing to zero. Thus we may normalize so that</p><p>If this quantity is non-negative then (5.14) is immediate, so we may assume that it is negative. By rescaling we may thus normalize so that</p><p>(5.15)</p><p>In particular, there exists x 0 &#8712; H such that f 1 (x 0 ) = -1. From (5.13) and (5.15), we conclude that for every y &#8712; H one has</p><p>This implies that f 2 (y)p 2 (y) + f 3 (x 0y)p 3 (x 0y) &#8805; (f 2 (y) + f 3 (x 0y)) min(p 2 (y), p 3 (x 0y))</p><p>Summing over y, we conclude that</p><p>Cyclically permuting the roles of f 1 , f 2 , f 3 and p 1 , p 2 , p 3 and averaging, the desired bound (5.14) then follows from (5.12). &#9642;</p><p>Proof of Lemma 5.2. By (A10) and Pinsker's inequality (A12) it follows that</p><p>(5.16) Applying Lemma 5.3 (with p 1 = p 2 = p X and p 3 = 1 |H| ), it follows that there exists a pair of random variables (X 1 , X 2 ) such that X 1 , X 2 each have the same marginal distribution as X, and X 1 -X 2 is uniform on H. </p><p>which immediately implies the result. &#9642;</p><p>Proof of Proposition 1.3. We first establish the case X = Y (with the constant 12 replaced by 6). Suppose, as we have throughout the section, that &#119889; ent (X, X) = &#120576; &#8804; &#120576; 0 . Let &#120587; &#8758; G &#8594; G&#8725;H be the quotient projection. Recall from (A16) that</p><p>.17) From (5.3) we have the weak bound &#119889; ent (X, U H ) &#8810; &#120576; 1&#8725;2 log 1 &#120576; . Thus H(&#120587;(X)) &#8810; &#120576; 1&#8725;2 log 1 &#120576; (5.18) and H(X) &#8805; log |H| -O ( &#120576; 1&#8725;2 log 1 &#120576; ) . (5.19) Now by Proposition 1.4 (replacing H there by G&#8725;H, and recalling that &#119889; ent (X, X) = &#120576;) we obtain &#119889; ent (&#120587;(X), &#120587;(X)) &#8804; &#120576; (5.20) and &#8721; y 1 ,y 2 &#8712;G&#8725;H p &#120587;(X) (y 1 )p &#120587;(X) (y 2 )&#119889; ent (X y 1 , X y 2 ) &#8804; &#120576;, (5.21)</p><p>where X y denotes X conditioned to the event &#120587;(X) = y. By (5.18) and (A2), we see that there is some y 0 &#8712; G&#8725;H such that P(&#120587;(X)</p><p>. By translating X if necessary, we may assume without loss of generality that y 0 = 0, that is to say</p><p>where &#120575; 0 is the constant from Lemma 5.1, and we assume that &#120576; 0 from the statement of Proposition 1.3 is sufficiently small. Applying Lemma 5.1 to &#120587;(X) using (5.20), (5.22), we conclude that H(&#120587;(X)) &#8804; 2&#120576;.</p><p>(5.23)</p><p>Meanwhile, discarding all terms in the sum over y 1 in (5.21) except the term y 1 = 0, and using <ref type="bibr">(5.22)</ref>, it follows that</p><p>Using H(X) = H(X|&#120587;(X)) + H(&#120587;(X)) and (5.23), we conclude that</p><p>(5.24)</p><p>In particular, from (5.19) we deduce</p><p>) .</p><p>(5.25)</p><p>Now by discarding all terms in (5.21) except the one with y 1 = y 2 = 0, and using (5.22), we have</p><p>It follows from Lemma 5.2 that H(X 0 ) &#8805; log |H| -2.42&#120576;, and hence by (5.24) we obtain H(X) &#8805; log |H| -6.62&#120576;.</p><p>Combining this with (5.17) and <ref type="bibr">(5.23)</ref> gives &#119889; ent (X, U H ) &#8804; 5.31&#120576; &#8804; 6&#120576;, which is the statement of Proposition 1.3 (with a better constant) in the symmetric case X = Y.</p><p>Finally, we deduce the general case in which X and Y may be different. Suppose now that &#119889; ent (X, Y) = &#120576; &#8804; &#120576; &#8242; 0 , where &#120576; &#8242; 0 &#8758;= &#120576; 0 &#8725;2 with &#120576; 0 the constant above. By the triangle inequality, &#119889; ent (X, X) &#8804; 2&#120576; &#8804; &#120576; 0 , and so by the symmetric case of Proposition 1.3 established above we have &#119889; ent (X, U H ) &#8804; 12&#120576; for some subgroup H &#8804; G. Similarly, we have &#119889; ent (Y, U H &#8242; ) &#8804; 12&#120576; for some subgroup H &#8242; &#8804; G.</p><p>It remains to argue that H = H &#8242; . For this, we observe that by the triangle inequality we have</p><p>.26) If H &#8800; H &#8242; , then H+H &#8242; is a subgroup of G properly containing H, H &#8242; and therefore of size at least 2 max(|H|, |H &#8242; |). Since U H -U H &#8242; is uniform on H + H &#8242; , we have &#119889;(U H , U H &#8242; ) &#8805; log 2, which contradicts (5.26) if &#120576; 0 is small enough. Therefore we do indeed have H = H &#8242; , and this concludes the proof. &#9642;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">SKEW DIMENSION AND A RESULT OF P&#193;LV&#214;LGYI AND ZHELEZOV</head><p>In this section we give the proof of Theorem 1.6 (and thus Theorem 1.5).</p><p>Proof of Theorem 1.6. Let &#120576; &gt; 0 be a small constant to be specified later, and set C &#8758;= 2&#8725;&#120576;. We will prove Theorem 1.6 with this particular value of C.</p><p>We proceed by on |A||B| and on D. Denote by &#120587; &#8758; Z D &#8594; Z projection onto the first coordinate. We may assume that at least one of the sets &#120587;(A), &#120587;(B) is not a singleton (otherwise D may be reduced to D -1).</p><p>Let X 1 , X 2 be uniform random variables on A, B respectively, and let Y i = &#120587;(X i ). Applying Proposition 1.4 and rearranging, we obtain</p><p>where</p><p>and</p><p>We now divide into two cases, according to whether</p><p>Let &#120576; 0 be the constant from Proposition 1.3, and assume that &#120576; &#8804; min</p><p>. By Proposition 1.3 and the fact that H = {0} is the only finite subgroup of Z, we have</p><p>by the choice of C, &#120576;. Inserting this into (6.1) and rearranging, we obtain</p><p>In particular, there exist i, j such that</p><p>Invoking the induction hypothesis (with A, B replaced by A &#8745; &#120587; -1 ({i}) and B &#8745; &#120587; -1 ({j}) respectively), we see that there are sets</p><p>This closes the induction in Case 1.</p><p>In this case we see from (6.1) that</p><p>The contribution from those (i, j) with log</p><p>By the induction hypothesis, for each pair (i, j) &#8712; S there are sets</p><p>and all with skew-dimension at most</p><p>(Here we used the fact that C = 2&#8725;&#120576;.) For each i &#8712; Z, set A &#8242; i to be the largest of the sets A &#8242; i,j , (i, j) &#8712; S (or A &#8242; i = &#8709; if (i, j) &#8713; S for every j), and similarly for each j &#8712; Z set B &#8242; j to be the largest of the sets B &#8242; i,j , (i, j) &#8712; S, breaking ties arbitrarily. Finally, set A &#8242; &#8758;= &#8899; i&#8712;Z A &#8242; i and B &#8242; &#8758;=</p><p>By the definition of skew-dimension and the bound (6.6), we have dim</p><p>From the elementary inequality t &#8805; log t for t &#8805; 1 applied to t = (K&#8725;K i,j ) C (noting by (6.4) that we do indeed have t &#8805; 1), we have</p><p>for any (i, j) &#8712; S. From this and (6.5), ( <ref type="formula">6</ref>.3) we have</p><p>10982418, 0, Downloaded from <ref type="url">https://onlinelibrary.wiley.com/doi/10.1002/rsa.21252</ref> by Princeton University, Wiley Online Library on [23/08/2024]. See the Terms and Conditions (<ref type="url">https://onlinelibrary.wiley.com/terms-and-conditions</ref>) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License</p><p>This completes the induction, and the theorem is proved. &#9642;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">DIMENSION AND A RESULT OF THE SECOND AUTHOR</head><p>We turn now to the question of the dimension (as opposed to the weaker skew-dimension) of subsets of Z D with small doubling. Our aim in this section is to give an entropic proof of Theorem 1.8. In so doing, we will also lay the groundwork for the proof of Theorem 1.11, our improvement upon this result.</p><p>As in [14, slogan 2.5], a key idea is that a set A &#8838; Z D with small doubling must look rather singular under the projection map &#120601; &#8758; Z D &#8594; F D 2 . In Lemma 7.2 below, we give an entropic formulation of this principle. We isolate the following lemma from the proof. Lemma 7.1. Let G be torsion-free, and let X, Y be G-valued random variables. Then &#119889; ent (X, 2Y) &#8804; 5&#119889; (X, Y).</p><p>Proof. We assume X, Y are independent. Then<ref type="foot">foot_2</ref> </p><p>by definition of &#119889; * ent . By Lemma 1.1,</p><p>Letting Y 1 , Y 2 be independent copies of Y (which are also independent of X) we have</p><p>and</p><p>so applying the submodularity inequality (A5) gives</p><p>Combining this with <ref type="bibr">(7.3)</ref> gives</p><p>which, together with (7.1) and (7.2), yields</p><p>and so</p><p>where we used (A14) in the last step. &#9642; </p><p>However, &#120601;(2Y) is identically zero and so</p><p>Combining this with (7.4) gives the stated bound for H(&#120601;(X)). The bound for H(&#120601;(Y)) follows in the same way. &#9642; Remark. It is perhaps worth remarking on the meaning and proof of this statement. Supposing that A &#8834; Z D is a set with small (combinatorial) doubling K, it follows that the dilate 2 &#8901; A, which is contained in A + A, is commensurate (up to polynomial factors in K) with A. Projecting mod 2, one therefore expects the projection &#120587;(A) to be commensurate with the projection &#120587;(2 &#8901; A) = {0}. A version of this argument appears in <ref type="bibr">[14, appendix B]</ref>. In the entropy setting, Lemma 7.1 acts as a replacement for the trivial observation that 2 &#8901; A is contained in A + A.</p><p>Now we are ready for the proof of Theorem 1.8 itself. To make the argument work, we will in fact need to establish the following bipartite variant of the result.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Theorem 7.3.</head><p>There is an absolute constant C 1 such that, setting f (t) &#8758;= C 1 t(1 + t), the following is true. Let D &#8712; N, and suppose that A, B &#8838; Z D are finite non-empty sets. Then there exist nonempty</p><p>)</p><p>It is clear from (1.1) that this indeed implies Theorem 1.8. We first prove a simple lemma which will be used several times in what follows.</p><p>Lemma 7.4. Let &#120601; &#8758; G &#8594; H be a homomorphism, and A, B &#8838; G finite subsets. For x, y &#8712; H write A x = A &#8745; &#120601; -1 (x) and B y = B &#8745; &#120601; -1 (y) for the fibres of A and B, and write</p><p>Then there exist x, y &#8712; H such that A x , B y are non-empty and with</p><p>Proof. First observe that the random variables (U A |&#120601;(U A ) = x) and (U B |&#120601;(U B ) = y) are equal in distribution to U A x , U B y respectively, that is to say the uniform distributions on the fibres. It follows from Proposition 1.4 that</p><p>By definition,</p><p>It follows by the pigeonhole principle that there is at least one choice of x, y such that &#120572; x , &#120573; y &gt; 0 and</p><p>Rearranging gives <ref type="bibr">(7.6)</ref>. &#9642;</p><p>We now turn to the proof of Theorem 7.3.</p><p>Proof of Theorem 7.3. Let us begin by noting the simple inequality</p><p>Let us turn now to the main proof. We will proceed by induction on |A| + |B|. We may also assume that A, B do not sit inside cosets of a proper subgroup of Z D , else we may replace Z D by that subgroup. We also suppose D &#8805; 1, as the result is trivial otherwise.</p><p>Let &#120601; &#8758; Z D &#8594; F D 2 be the natural homomorphism. Then, by the preceding remark and the fact that ker &#120601; is a proper subgroup of Z D , we may assume that at least one of &#120601;(A), &#120601;(B) is not a singleton. For x, y &#8712; F D 2 , denote by A x &#8758;= A &#8745; &#120601; -1 (x) and B y &#8758;= B &#8745; &#120601; -1 (y) the fibres of A, B. Note that</p><p>for all x, y. Write k &#8758;= &#119889; ent (U A , U B ) and &#120576; &#8758;= &#119889; ent (&#120601;(U A ), &#120601;(U B )). Let &#120575; &gt; 0 be a small positive constant to be determined later, and set C 1 &#8758;= max(20&#8725;&#120575;, 100). We will divide into two cases, according to whether or not &#120576; &#8804; &#120575;.</p><p>Case 1: &#120576; &gt; &#120575;. By Lemma 7.2 with X = U A , Y = U B , we have H(&#120601;(U A )) + H(&#120601;(U B )) &#8804; 20k. (7.10) By Lemma 7.4 applied to G = Z D and H = F D 2 , we may find x, y &#8712; F D 2 such that (7.6) holds. Fix such x, y, and for brevity set k &#8242; &#8758;= &#119889; ent (U A x , U B y ). Then (7.6) implies that k &#8242; &#8804; k and log |A| |A x | + log |B| |B y | &#8804; 20k &#120576; (kk &#8242; ). (7.11)</p><p>Noting (7.9), we may apply the induction hypothesis to conclude that there are</p><p>This and (7.11) immediately imply log |A| |A &#8242; | + log |B| |B &#8242; | &#8804; f (k &#8242; ) + 20k &#120576; (kk &#8242; ). By (7.8), this is at most f (k), since C 1 &#8805; 20&#8725;&#120575; &#8805; 20&#8725;&#120576;. This closes the induction in Case 1. Case 2: &#120576; &#8804; &#120575;. Recall here that &#120576; = &#119889; ent (&#120601;(U A ), &#120601;(U B )), and note that &#120576; &#8804; k by Proposition 1.4. Let &#120576; 0 be the constant from Proposition 1.3 and suppose &#120575; &#8804; &#120576; 0 . By Proposition 1.3 there is some H &#8804; F D 2 such that &#119889; ent (&#120601;(U A ), U H ), &#119889; ent (&#120601;(U B ), U H ) &#8804; 12&#120576;. It is possible that H = F D 2 . In this case, we have by (A14) and Lemma 7.2 that log(2 D ) = H(U H ) &#8804; H(&#120601;(U A )) + 2&#119889; ent (&#120601;(U A ), U H ) &#8804; 10k + 24&#120576; &#8804; 34k, and so D &#8804; 100k. This gives Theorem 7.3 simply by taking A = A &#8242; , B = B &#8242; , since C 1 &#8805; 100. Alternatively, suppose that H is a proper subgroup of F D 2 . Denote by &#966; the composition of &#120601; with projection to F D 2 &#8725;H. By (A17) we have</p><p>By (A2) there is some x 0 such that P( &#966;(U A ) = x 0 ) &#8805; e -32&#120576; &#8805; e -32&#120575; . Choosing &#120575; sufficiently small, this is &#8805; 1 -&#120575; 0 where &#120575; 0 is the constant in Lemma 5.1, and so by Lemma 5.1</p><p>where the second inequality is by <ref type="bibr">(1.3)</ref>. The same bound holds for H( &#966;(U B )).</p><p>Hence by Lemma 7.4 applied to &#966;, A and B (noting that we cannot have H( &#966;(U A )) = H( &#966;(U B )) = 0, as then A, B would be contained in cosets of a proper subgroup) we deduce that there exist</p><p>where</p><p>We now finish the proof as before. Set k &#8242; = &#119889; ent (U A x , U A y ), which is &#8804; k by (7.12). Since A, B are not contained in cosets of a proper subgroup of Z D , we have</p><p>and so by induction we may find</p><p>By (7.8) (and since C 1 &#8805; 8) this is at most f (k). This closes the induction in Case 2 and the proof of Theorem 7.3 is complete. &#9642; Remark. For this argument, the full strength of Proposition 1.3 was not needed, and the weaker bound (5.3) would have sufficed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8">ENTROPY FORMULATION OF PFR OVER F 2</head><p>In this section we establish Proposition 1.10. Recall that the content of this proposition is that the following two statements are equivalent: (1) ) cosets of some subspace</p><p>Proof of Proposition 1.10. We first derive the entropic statement, that is to say Statement 2 above, from the combinatorial one (Statement 1). Write k &#8758;= &#119889; ent (X, Y) and set K &#8758;= e k . We may assume that k &#8805; &#120576; 0 , where &#120576; 0 is the constant in Proposition 1.</p><p>3, since the claim follows immediately from that proposition otherwise. Applying Proposition 1.2 with C = 4, we obtain a set S &#8838; F D 2 with &#119889; ent (X, U S ) &#8810; k (8.1) and (recalling that |S+S| |S| &#8804; ( |S-S| |S| ) 3 ; see for example, [22, corollary 2.12]) |S + S| &#8810; K O(1) |S|. (8.2) By Statement 1 there is a subgroup H &#8804; F D 2 , |H| &#8804; |S|, such that S is covered by O(K O (1) ) cosets of H. Note, in particular, that S + H is contained in the union of the aforementioned cosets, and so |S + H| &#8810; K O (1) min(|S|, |H|). Now for any sets A, B we have</p><p>) .</p><p>(This is the bipartite version of (</p><p>1.1).) Applying this with A = S and B = H (and noting H = -H) gives &#119889; ent (U S , U H ) &#8810; k, and so by the triangle inequality and (8.1) we have &#119889; ent (X, U H ) &#8810; k, which is the conclusion in Statement 2. We now to the reverse implication, deriving the combinatorial Statement 1 from the entropic Statement 2. Suppose that A &#8838; F D 2 is a set and write K &#8758;= &#120590;[A] and k&#8758;= log K. Then, by (1.1), we have &#119889; ent (A, -A) = log &#120590; ent [A] &#8804; k. Assuming Statement 2, there is some finite subgroup H &#8804; F D 2 with &#119889; ent (U A , U H ) &#8810; k. By (A14) and the fact that H(U A ) = log |A|, H(U H ) = log |H|, we have</p><p>Writing p(x) for the density function of U A -U H , thus p(x) = |A&#8745;(H+x)| |A||H| , it follows from (A2) that there is some x 0 such that</p><p>Recall the Ruzsa covering lemma (see e.g., <ref type="bibr">[22, lemma 2.14]</ref>), which states that if |U + V| &#8804; K|U| then V is covered by K translates of U -U. Applying this with U = A &#8745; (H + x 0 ) and V = A, and using the fact that U + V &#8838; A + A and U -U &#8838; H, it follows that A is covered by O(K O (1) ) translates of H.</p><p>If |H| &#8804; |A|, we are done. If |H| &gt; |A|, pass to a subgroup H &#8242; &#8804; H of size in the range ( 1 2 |A|, |A|]; then A is covered by O(K O(1) ) translates of H &#8242; , and the proof is complete in this case also. &#9642; A minor modification of the first part of the above proof, using the quantity C PFR from the introduction in place of Statement 1, gives the following statement. Proposition 8.1. Let X, Y be F D 2 -valued random variables, and suppose that &#119889; ent</p><p>), for some absolute constant C (which may depend on C PFR ).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9">DIMENSION AND THE WEAK PFR CONJECTURE</head><p>We now prove Theorem 1.11 (and hence Corollary 1.12). The proof is along somewhat similar lines to the proof of Theorem 1.8 given in Section 7, but more involved. An important ingredient will be the following lemma. Throughout this section, C will be the constant in Proposition 8.1 (but the precise nature of this constant is not important).</p><p>Lemma 9.1. Suppose that X and Y are F D 2 -valued random variables. Then there is a subgroup H &#8804; F D 2 such that, denoting by &#120595; &#8758; F D 2 &#8594; F D 2 &#8725;H the natural projection, and setting k &#8758;= &#119889; ent (&#120595;(X), &#120595;(Y)), we have log |H| &#8804; 2(H(X) + H(Y)) (9.1) and H(&#120595;(X)) + H(&#120595;(Y)) &#8804; 8Ck ( 1 + k C PFR -1 ) . (9.2) We isolate the following (sub-) lemma from the proof. Lemma 9.2. Let n &#8712; N. Let X, Y be F n 2 -valued random variables. Set k&#8758;= &#119889; ent (X, Y), suppose that H(X) + H(Y) &gt; 8Ck ( 1 + k C PFR -1 ) . (9.3) Then there is a nontrivial subgroup H &#8804; F n 2 such that log |H| &#8804; H(X) + H(Y) (9.4) and (writing &#120595; &#8758; F n 2 &#8594; F n 2 &#8725;H as above) H(&#120595;(X)) + H(&#120595;(Y)) &#8804; 1 2 (H(X) + H(Y)) . (9.5) Proof. Set k &#8758;= &#119889; ent (X, Y). Applying Proposition 8.1, we obtain a subgroup H such that &#119889; ent (X, U H ), &#119889; ent (Y, U H ) &#8804; Ck(1 + k C PFR -1 ). By (A17) and (9.3), it follows that</p><p>which is (9.5). To prove (9.4), an application of (A14) yields log</p><p>and similarly for Y. Therefore using (9.3) we have</p><p>which gives the required bound <ref type="bibr">(9.4)</ref>.</p><p>If H were trivial we would have &#120595;(X) = X, &#120595;(Y) = Y and so (9.5) would imply H(X)+ H(Y) = 0, which then contradicts <ref type="bibr">(9.3)</ref>. &#9642; Proof of Lemma 9.1. We iteratively define a sequence</p><p>2 &#8725;H i the ith associated projection operator, and set k i &#8758;= &#119889; ent (&#120595; i (X), &#120595; i (Y)). We stop the iteration at the ith stage if we have</p><p>Otherwise, we apply Lemma 9.2 to &#120595; i (X), &#120595; i (Y), obtaining a nontrivial subgroup</p><p>and</p><p>Clearly from iterated application of (9.8) we obtain</p><p>Then, from a telescoping application of (9.7) we get log |H i | &#8804; 2(H(X) + H(Y)). (9.9)</p><p>Since the groups H i form a strictly increasing sequence, the iteration does terminate at some time i. At this time we have both (9.6) and (9.9) and so, setting &#120595; = &#120595; i , the proof of Lemma 9.1 is concluded. &#9642;</p><p>Now we turn our attention to Theorem 1.11. It is a consequence of the following bipartite statement, which should be compared to Theorem 7.3. ). Applying Lemma 7.4 once again, and noting (9.13) and (9.15), we find x, y &#8712; F D 2 &#8725;H such that log |A| |A x | + |B| |B y | &#8804; 20C(1 + k 1-&#120574; ) ( k -&#119889; ent (U A x , U B y ) ) (9.16) Set k &#8242; = &#119889; ent (U A x , U B y ). By induction on A x , B y we may find A &#8242; &#8838; A x and B &#8242; &#8838; B y such that dim A &#8242; , dim B &#8242; &#8804; C 2 k &#8242; &#8804; C 2 k and log |A x | |A &#8242; | + log |B y | |B &#8242; | &#8804; f (k &#8242; ). Adding this to (9.16) yields log |A| |A &#8242; | + log |B| |B &#8242; | &#8804; f (k &#8242; ) + 20C(1 + k 1-&#120574; )(kk &#8242; ). (9.17) However,</p><p>10982418, 0, Downloaded from <ref type="url">https://onlinelibrary.wiley.com/doi/10.1002/rsa.21252</ref> by Princeton University, Wiley Online Library on [23/08/2024]. See the Terms and Conditions (<ref type="url">https://onlinelibrary.wiley.com/terms-and-conditions</ref>) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License</p><p>This, provided C 1 &#8805; 20C, the right-hand side of (9.17) is at most f (k), and this closes the induction. The proof is complete. &#9642;</p><p>We conclude with a simple example showing that both inequalities (B1) can be far from tight. Suppose that n = 2m is even and A = H &#8746; {x 1 , &#8230; , x m }, with H a subgroup of size m and x 1 , &#8230; , x m highly dissociated with respect to H, for instance with x i + x jx kx l &#8712; H only if {i, j} = {k, l}. Then we have |A| 3 &#8725;E[A] =</p><p>1 16 + o(1) as n &#8594; &#8734;. Turning to &#120590; ent [A], we of course have H(U A ) = log n. The variable U A + U &#8242; A may be conditioned to subvariables which are, respectively, uniformly distributed on H, on the set &#8899; m i=1 (x i + H), and on the multiset &#8899; m i,j=1 {x i + x j }, with the conditioning probabilities being 1 4 , 1 2 , 1 4 . One therefore computes that H(U A + U &#8242; A ) = ( 7 4 + o(1)) log n and so &#120590; ent [A] = n 3&#8725;4+o(1) . Finally, &#120590;[A] = ( 3 4 + o(1))n.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>10982418, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/rsa.21252 by Princeton University, Wiley Online Library on [23/08/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_1"><p>P&#225;lv&#246;lgyi and Zhelezov use the term query complexity instead of skew dimension. 10982418, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/rsa.21252 by Princeton University, Wiley Online Library on [23/08/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_2"><p>The use of &#119889; * ent here simplifies an earlier version of the argument, and was suggested to the authors by Noah Kravitz. 10982418, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/rsa.21252 by Princeton University, Wiley Online Library on [23/08/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_3"><p>We use the natural logarithm in this paper, but one could easily work with other bases of the logarithm if desired. 10982418, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/rsa.21252 by Princeton University, Wiley Online Library on [23/08/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License</p></note>
		</body>
		</text>
</TEI>
