<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Lower Bounds for Convexity Testing</title></titleStmt>
			<publicationStmt>
				<publisher>Society for Industrial and Applied Mathematics</publisher>
				<date>01/01/2025</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10592284</idno>
					<idno type="doi">10.1137/1.9781611978322.13</idno>
					
				<author>Xi Chen</author><author>Anindya De</author><author>Shivam Nadimpalli</author><author>Rocco A. Servedio</author><author>Erik Waingarten</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[We consider the problem of testing whether an unknown and arbitrary set S ⊆ R^n (given as a black-box membership oracle) is convex, versus ε-far from every convex set, under the standard Gaussian distribution. The current state-of-the-art testing algorithms for this problem make 2^{Õ(√n)} · poly(1/ε) non-adaptive queries, both for the standard testing problem and for tolerant testing. We give the first lower bounds for convexity testing in the black-box query model: • We show that any one-sided tester (which may be adaptive) must use at least n^{Ω(1)} queries in order to test to some constant accuracy ε > 0. • We show that any non-adaptive tolerant tester (which may make two-sided errors) must use at least 2^{Ω(n^{1/4})} queries to distinguish sets that are ε_1-close to convex versus ε_2-far from convex, for some absolute constants 0 < ε_1 < ε_2. Finally, we also show that for any constant c > 0, any non-adaptive tester (which may make two-sided errors) must use at least n^{1/4-c} queries in order to test to some constant accuracy ε > 0.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>High-dimensional convex geometry is a rich topic at the intersection of geometry, probability, and analysis (see [B+97, GW93, LL15, Tro18, Tko18, HW20], among many other works, for general overviews). Apart from its intrinsic interest, a strong motivation for the study of high-dimensional convex sets from the perspective of theoretical computer science is that convexity often translates into a form of mathematical niceness which facilitates efficient computation, as witnessed by the plethora of positive results in algorithms and optimization for convex functions and convex sets. In this work, the object of study is the convex set:</p><p>A set K ⊂ R^n is convex if and only if for every two points x, y ∈ R^n, any point z on the segment between x and y lies in K whenever x and y lie in K.</p><p>The above gives a "local" characterization of convex sets, where "local" refers to the fact that (aside from quantifying over all collinear points x, z, y) an algorithm may make three membership queries to check the condition; in particular, non-convexity can be verified with three queries. Can one relax the "for all" quantification to give a local condition which characterizes approximately convex sets? Is there an algorithm which, by making very few queries, can determine whether or not a set is (almost) convex?</p><p>A natural vantage point for this broad question is that of property testing <ref type="bibr">[BLR93,</ref><ref type="bibr">RS96]</ref>, which provides an algorithmic framework for studying the above questions. In our setting, we consider property testing of convex sets with respect to the standard Gaussian distribution, arguably the most natural distribution over R^n. 
Indeed, various learning, property testing, and other algorithmic problems in the Gaussian setting have been intensively studied in theoretical computer science [KOS08, Vem10, MORS10, Kan11, Kan12, Kan14, KK14, KNOW14, Kan15, CFSS17, CDS19, DMN19, OSTK21, DMN21, HSSV22, DNS23]. Furthermore, while a large body of mathematical work (e.g. [Bor75, Bal93, LO99, Lat02, Naz03, Bor03, CEFM04, LO05, Bor08, Roy14]) investigates the geometry of high-dimensional convex sets against the Gaussian distribution, convexity over Gaussian space arises naturally within theoretical computer science in the context of algorithmic discrepancy theory [Glu89, Ban10, LM15, Rot17, LRR17, Eld22, RR23b] and lattice problems <ref type="bibr">[RR23a,</ref><ref type="bibr">Rot23,</ref><ref type="bibr">RSD24]</ref>.</p><p>We consider the following algorithmic task: A (randomized) testing algorithm has black-box query access to an unknown and arbitrary function f : R^n → {0, 1} (the indicator function of a subset of R^n), and its goal is to make as few membership queries to f as possible while deciding whether f is convex or ε-far from convex (meaning that f and any indicator of a convex set g : R^n → {0, 1} disagree on x ∼ N(0, I_n) with probability at least ε). Thus, a testing algorithm gives an efficiently-checkable (randomized) condition which all convex sets satisfy, and furthermore, any set which satisfies this condition is "almost" convex (with respect to the standard Gaussian distribution). For example, the definition of a convex set naturally leads to the following property testing question, whose positive resolution would directly give a "constant-query" testing algorithm (i.e. 
an algorithm whose query complexity depends only on ε and not on the ambient dimension n): Does there exist a probability distribution over collinear points x, z, y in R^n such that the condition Pr[z ∈ K | x, y ∈ K] ≥ 1 − δ(ε) implies that the set K must be ε-close to convex with respect to the standard Gaussian?<ref type="foot">1</ref> In this work, we show the first non-trivial lower bounds for testing convexity under the standard Gaussian distribution. Our lower bounds not only give a negative resolution to the above question, they imply that, in a variety of property testing models (non-adaptive, adaptive, one-sided, two-sided, and tolerant), a dependence on the ambient dimension n is always necessary. Prior to this work, for all we knew, an O(1/ε)-query tester was possible in all of those models.<ref type="foot">2</ref> As further discussed in Section 1.3, a number of prior works have studied convexity testing in a range of different settings, yet large gaps remain in our understanding. Most closely related are the works of <ref type="bibr">[KOS08]</ref>, who study learning convex sets over N(0, I_n), and <ref type="bibr">[CFSS17]</ref>, who study testing convexity over N(0, I_n) when restricted to sample-based testers (i.e. the algorithm can only query a given number of random points independently drawn from N(0, I_n)). On the upper bound side, the best algorithm for convexity testing <ref type="bibr">[CFSS17]</ref> is based on <ref type="bibr">[KOS08]</ref> and queries 2^{Õ(√n)}/ε^2 randomly sampled points from N(0, I_n). Hence, this "sample-based" tester gives a non-adaptive property testing algorithm.<ref type="foot">3</ref> Turning to lower bounds, <ref type="bibr">[CFSS17]</ref> showed that, when restricted to sample-based testers, (i) algorithms which incur one-sided error must make 2^{Ω(n)} queries,<ref type="foot">4</ref> and (ii) algorithms which incur two-sided error must make 2^{Ω(√n)} queries. 
Importantly, lower bounds on sample-based testers do not imply any lower bounds for algorithms which are allowed to make unrestricted queries. There are many prominent property testing problems (e.g., linearity and monotonicity) where the complexity of sample-based testing is significantly higher than the complexity in the (standard) query-based model.<ref type="foot">5</ref></p><p>1.1 Our Results and Discussion. This work gives the first non-trivial lower bounds for query-based convexity testing. We prove three different lower bounds for three variants of the property testing model, which we now describe. As mentioned, the best known algorithm for convexity testing is the non-adaptive algorithm of <ref type="bibr">[CFSS17]</ref>, which makes 2^{Õ(√n)}/ε^2 non-adaptive queries (and makes two-sided error).</p><p>Our first result gives a polynomial lower bound for one-sided adaptive testers:</p><p>Theorem 1.1. (One-sided adaptive lower bound) For some absolute constant ε &gt; 0, any one-sided (potentially adaptive) ε-tester for convexity over N(0, I_n) must use n^{Ω(1)} queries.</p><p>We also consider a challenging and well-studied extension of the standard testing model which is known as tolerant testing <ref type="bibr">[PRR06]</ref>. Recall that an (ε_1, ε_2)-tolerant tester for a class of functions is a testing algorithm which must accept with high probability if the input is ε_1-close to some function in the class and reject with high probability if the input is ε_2-far from every function in the class; thus the standard property testing model corresponds to (0, ε)-tolerant testing.</p><p>The sample-based algorithm for convexity testing that is given in <ref type="bibr">[CFSS17]</ref> is based on agnostic learning results from <ref type="bibr">[KOS08]</ref>. 
It follows easily from the analysis in <ref type="bibr">[CFSS17]</ref> and results of <ref type="bibr">[KOS08]</ref> that for any 0 ≤ ε_1 &lt; ε_2 with ε_2 − ε_1 = ε, the <ref type="bibr">[CFSS17]</ref> approach gives a 2^{Õ(√n)}/ε^4-query sample-based algorithm for (ε_1, ε_2)-tolerant testing of convexity. As our second result, we give a mildly exponential lower bound on the query complexity of two-sided non-adaptive tolerant convexity testing:</p><p>Theorem 1.2. (Two-sided non-adaptive tolerant testing lower bound) There exist absolute constants 0 &lt; ε_1 &lt; ε_2 &lt; 0.5 such that any non-adaptive (ε_1, ε_2)-tolerant tester for convexity over N(0, I_n) (which may make two-sided errors) must use at least 2^{Ω(n^{1/4})} queries.</p><p>Returning to the standard testing model, our final result gives a polynomial lower bound for two-sided non-adaptive testers: Theorem 1.3. (Two-sided non-adaptive lower bound) For any constant c &gt; 0, there is a constant ε = ε_c &gt; 0 such that any non-adaptive ε-tester for convexity over N(0, I_n) (which may make two-sided errors) must use at least n^{1/4-c} queries.</p><p>Since q-query non-adaptive lower bounds imply (log q)-query adaptive lower bounds, Theorem 1.3 implies an Ω(log n) two-sided adaptive convexity testing lower bound. (This is in contrast to the n^{Ω(1)}-query lower bound against one-sided adaptive testers given by Theorem 1.1.)</p><p>1.2 Techniques. Our lower bounds rely on a wide range of techniques and constructions, and draw inspiration from prior work on monotonicity testing of Boolean functions f : {0,1}^n → {0,1} [CST14, BB16, CWX17, PRW22, CDL+24].<ref type="foot">foot_0</ref> Indeed, a conceptual contribution of our work is to highlight a (perhaps unexpected) connection between ideas in monotonicity testing and convexity testing. 
Our work thus adds to and strengthens a recently emerging analogy between monotone Boolean functions and high-dimensional convex sets <ref type="bibr">[DNS21,</ref><ref type="bibr">DNS22,</ref><ref type="bibr">DNS24]</ref>. Establishing this connection requires a number of technical and conceptual innovations for each of our main results; we highlight some of the key ideas below.</p></div>
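The three-query certificates of non-convexity discussed above can be made concrete with a toy membership-oracle experiment; the sampling scheme, dimension, and trial count below are arbitrary illustrative choices, not part of any tester analyzed in this paper:

<![CDATA[
```python
import numpy as np

def segment_certificate_search(member, n, num_trials=1000, seed=0):
    """Search for a certificate of non-convexity: collinear points x, z, y
    with x, y in the set but the point z between them outside it.  A
    one-sided tester may reject only when such a certificate is found."""
    rng = np.random.default_rng(seed)
    for _ in range(num_trials):
        x = rng.standard_normal(n)   # endpoints drawn from N(0, I_n)
        y = rng.standard_normal(n)
        z = (x + y) / 2.0            # a point on the segment between x and y
        if member(x) and member(y) and not member(z):
            return (x, z, y)         # witness that the set is not convex
    return None                      # no witness found: must accept

# A convex set (a ball) never yields a certificate ...
ball = lambda p: np.dot(p, p) <= 9.0
print(segment_certificate_search(ball, n=4) is None)         # True
# ... while an annulus, which is far from convex, quickly does.
annulus = lambda p: 1.0 <= np.dot(p, p) <= 16.0
print(segment_certificate_search(annulus, n=4) is not None)  # True
```
]]>

The asymmetry above is exactly the one-sided guarantee: on a convex set the search provably never finds a witness, so a lower bound against one-sided testers must exhibit far-from-convex sets whose witnesses are hard to locate with few queries.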
<div xmlns="http://www.tei-c.org/ns/1.0"><head>1.2.1</head><p>The Nazarov Body. A central role in our lower bounds in Theorem 1.1 and Theorem 1.2 is played by the so-called "Nazarov body" [Naz03, <ref type="bibr">KOS08,</ref><ref type="bibr">CFSS17]</ref>. This is a randomized construction of a convex set B which is a slight variation of a construction originally due to Nazarov <ref type="bibr">[Naz03]</ref>, and which is essentially as follows: we choose N ≈ 2^{√n} halfspaces H_1, . . . , H_N in the space R^n, where each halfspace H_i is a random halfspace at distance roughly n^{1/4} from the origin. In more detail, each halfspace is H_i(x) := 1[g_i · x ≤ r], where r ≈ n^{3/4} and g_i is drawn from N(0, I_n). The convex set B is obtained by taking the intersection of all N halfspaces with Ball(√n), the origin-centered ball of radius √n. <ref type="foot">7</ref> The exact parameters r and N are set carefully so that with high probability the "Gaussian volume" of B, i.e. Pr_{g∼N(0,I_n)}[g ∈ B], is a constant bounded away from 0 and 1. Note that for the Nazarov body B and any point x ∈ Ball(√n) \ B, there is a non-empty set J_x ⊆ [N] such that j ∈ J_x iff H_j(x) = 0, i.e., the point x violates the halfspace H_j for all j ∈ J_x. Now, define a point x to be uniquely violated if |J_x| = 1, and write U for the set of all uniquely violated points. A crucial property of the Nazarov construction is that the Gaussian volume of the set of points which are uniquely violated, i.e., the Gaussian volume of the set U, is "large" compared to the Gaussian volume of the set Ball(√n) \ B (see Lemma 3.5 for the precise statement).</p><p>The original construction of Nazarov may be viewed as a Gaussian-space analogue of Talagrand's random DNF formula <ref type="bibr">[Tal96]</ref> (see <ref type="bibr">[DNS24]</ref> for a discussion of this connection). Talagrand's random DNF has been very useful in lower bounds for monotonicity testing over the Boolean hypercube, as demonstrated by [BB16, CWX17, CDL+24]. 
We use our modified Nazarov body to obtain new lower bounds for convexity testing, as described below.</p><p>1.2.2 One-Sided Adaptive Lower Bound. Recall that a one-sided tester always outputs "accept" on convex sets and outputs "reject" on far-from-convex sets with probability at least 2/3; this requirement implies that the tester rejects only if a certificate of non-convexity is found (i.e. a set of queries x_1, . . . , x_t which lie in the body, and a query y in the convex hull of x_1, . . . , x_t which is not in the body). In order to argue a q-query lower bound, it suffices to (1) design a distribution D_no over sets which are far from convex with high probability, and (2) argue that no q-query deterministic algorithm can find a certificate of non-convexity.</p><p>The key will be to "hide" the non-convexity within the uniquely violated sets of the Nazarov body. Consider working in R^{2n}: we first randomly draw an n-dimensional "control subspace" C and the orthogonal n-dimensional "action subspace" A, and we embed the n-dimensional Nazarov body in the control subspace C. A point x ∈ R^{2n} lies in our (non-convex) body iff:</p><p>• It has Euclidean norm at most √(2n), and in addition, x_C (the projection onto the control subspace) has norm at most √n; and</p><p>• The point x_C lies within an n-dimensional Nazarov body that we randomly sample within the control subspace C; or, for every j ∈ [N] where H_j(x_C) = 0, the projection x_A onto the action subspace lies outside of a "strip" of width 1 along a randomly sampled direction v_j in the action subspace. (See Section 4.1.2.)</p><p>Consider a line through a point x ∈ R^{2n} in direction v_j, for j ∈ [N] such that x_C lies in the uniquely violated region U_j and x_A lies inside the strip along v_j (and therefore outside our body). 
Then, as the line proceeds out from x in directions v_j and −v_j, it remains in the uniquely violated region U_j (since v_j is orthogonal to C) but exits the strip, thereby entering the body and exhibiting non-convexity. By design, the uniquely violated regions and the strips are large enough to constitute a constant fraction of the space, giving the desired distance to convexity (Lemma 4.1). Intuitively, detecting non-convexity is hard because the algorithm does not know C, the halfspace H_j(·), or the direction v_j. In fact, we show that an algorithm which makes few queries cannot find, with probability at least 2/3, two points x, z outside the same halfspace H_j(·) such that x lies inside and z outside the strip in direction v_j. Roughly speaking, the proof proceeds as follows. First, we show that, except with o(1) probability, any two queries x, z which are far apart (at distance at least 1000√q · n^{1/4}) cannot lie outside the same halfspace H_j(·) while having projections onto C with norm at most √n (Lemma 4.4), and moreover it is extremely unlikely for a query to be falsified by more than q halfspaces (this follows from a calculation in Lemma 3.2). The argument is geometric in nature and is given in Section 4.5; it essentially argues that, since the algorithm knows neither the control subspace C nor the vector defining the halfspace, it is unlikely that two far-away queries happen to uniquely falsify the same halfspace.</p><p>On the other hand, consider the halfspaces which are falsified by some query (and notice that there are at most q^2 such halfspaces, since each query is falsified by at most q halfspaces). 
Since all such queries are within distance 1000√q · n^{1/4} of each other, the projection of any two such queries onto the direction defining the strip is a segment of length O(√q/n^{1/4}) with high probability, and the precise location of this segment is uniform-like (by Gaussian anti-concentration; see Section 4.5.1). Therefore, the probability that any particular segment of length O(√q/n^{1/4}) goes from inside to outside the strip of width 1 is roughly O(√q/n^{1/4}). We take a union bound over the q^2 possible halfspaces, each containing at most q queries which define segments which may "cross" the strip with probability O(√q/n^{1/4}), for a total probability of O(q^{3.5}/n^{1/4}). Since this must be at least 2/3 for the algorithm to succeed, this gives the n^{Ω(1)} lower bound.</p><p>1.2.3 Two-Sided Non-Adaptive Tolerant Lower Bound. Continuing the analogy with monotonicity testing lower bounds, the proof of Theorem 1.2 is inspired by recent lower bounds on tolerant monotonicity testing, namely [PRW22] and the follow-up work of [CDL+24]. The basic idea of [PRW22] is to construct a family of functions by randomly partitioning the space of variables into control variables and action variables: if the control variables are not balanced, i.e. there are more 1s than 0s (or vice versa), then the function is trivially set to 1 (resp. to 0) both for f ∼ D_yes and for f ∼ D_no. If the control variables are balanced, then, at a high level, 1. for f ∼ D_yes the function on the action variables is close to monotone; 2. 
for f ∼ D_no the function on the action variables is far from monotone.</p><p>Roughly speaking, the analysis in <ref type="bibr">[PRW22]</ref> shows that unless the algorithm queries two points such that both these points (a) have the same setting of the control variables, and (b) the control variables are balanced, the algorithm cannot distinguish between f ∼ D_yes and f ∼ D_no. As the control and action variables are partitioned at random, it turns out that satisfying both (a) and (b) is not possible for a non-adaptive algorithm unless the algorithm makes 2^{Ω(√n)} many queries. In particular, <ref type="bibr">[PRW22]</ref> shows that distinguishing between functions which are c_1/√n-close to monotone versus c_2/√n-far from monotone (where c_2 &gt; c_1 &gt; 0 are absolute constants) requires 2^{Ω(√n)} queries.</p><p>The main modification in [CDL+24] vis-à-vis <ref type="bibr">[PRW22]</ref> is the following: one can think of the balanced settings of the control variables in the construction described above as the "minimal satisfying assignments" of the Majority function. In [CDL+24], the Majority function is replaced by Talagrand's random monotone DNF <ref type="bibr">[Tal96]</ref>, a well-studied function in Boolean function analysis and related areas <ref type="bibr">[MO03,</ref><ref type="bibr">OW07]</ref>. The specific properties of Talagrand's monotone DNF allow [CDL+24] to obtain a 2^{n^{1/4}}-query lower bound for non-adaptive testers where the functions in D_yes are c_1-close to monotone and the functions in D_no are c_2-far from monotone, where c_2 &gt; c_1 are positive constants.</p><p>For Theorem 1.2, the goal is to obtain lower bounds for tolerant convexity testing rather than monotonicity testing. Towards that goal, let us assume that the ambient space is R^{n+1}. 
We choose a random n-dimensional subspace C and think of it as the control subspace, and we view its one-dimensional orthogonal complement as the action subspace A (analogous to the notion of control and action variables in [PRW22, CDL+24]). <ref type="foot">8</ref> We embed the Nazarov body B (described earlier) in the control subspace. We define the D_yes and D_no distributions in analogy with [CDL+24], roughly as follows: for x ∈ R^{n+1}, 1. If the projection x_C does not lie in the uniquely violated set of B, then f(x) is defined the same way for f ∼ D_yes and f ∼ D_no;</p><p>2. If the projection x_C lies in the uniquely violated set, then f(x) is set differently for f ∼ D_yes and f ∼ D_no (depending on the projection x_A onto the action subspace). In particular, for f ∼ D_yes, f is defined in such a way that f^{-1}(1) is close to a convex set, and for f ∼ D_no, f is defined in such a way that f^{-1}(1) is far from every convex set. This crucially uses the fact that the Gaussian volume of U is "large" compared to the Gaussian volume of the set Ball(√n) \ B, as mentioned in our earlier discussion of the Nazarov body.</p><p>At a high level, the indistinguishability argument showing that q = 2^{Ω(n^{1/4})} non-adaptive queries are required to distinguish f ∼ D_yes from f ∼ D_no is a case analysis based on the distance between any given pair of query vectors x and y (see Lemma 5.3 and Lemma 5.4), combined with a union bound over all q^2 possible pairs of query vectors. Roughly speaking, if ∥x − y∥ is small, then the way that f depends on the projection onto the action subspace makes it very unlikely to reveal a difference between f ∼ D_yes and f ∼ D_no. On the other hand, if ∥x − y∥ is large, then it is very unlikely for x and y to lie in the same set U_i, which must be the case for the pair x, y to reveal a difference between f ∼ D_yes and f ∼ D_no. 
There are many technical issues and geometric arguments required to carry out this rough plan, but when all the dust settles the argument gives a 2^{Ω(n^{1/4})} lower bound for tolerant convexity testing.</p><p>1.2.4 Two-Sided Non-Adaptive Lower Bound. Our approach to prove Theorem 1.3 is inspired by the lower bounds of <ref type="bibr">[CDST15]</ref> on non-adaptive monotonicity testing. As in most property testing lower bounds for non-adaptive algorithms, the high-level approach is to use Yao's principle; we follow <ref type="bibr">[CDST15]</ref> in that we use a suitable high-dimensional central limit theorem as the key technical ingredient for establishing indistinguishability between the yes- and no-distributions. In <ref type="bibr">[CDST15]</ref> both the yes- and no-functions are linear threshold functions over {−1, +1}^n, but since any linear threshold function is trivially a convex set, the <ref type="bibr">[CDST15]</ref> construction cannot be directly used to prove a convexity testing lower bound. Instead, in order to ensure that our no-functions are both indistinguishable from the yes-functions and far from every convex set, we work with degree-2 polynomial threshold functions (PTFs) over R^n rather than linear threshold functions over {−1, +1}^n. At a high level, degree-2 PTFs of the form ∑_i λ_i x_i^2 where each λ_i is positive (note that any such PTF is a convex set) play the "yes-function" role that monotone LTFs play in the <ref type="bibr">[CDST15]</ref> argument, and degree-2 PTFs of the form ∑_i λ_i x_i^2 where a constant fraction of the λ_i's are negative play the "no-function" role that far-from-monotone LTFs play in the <ref type="bibr">[CDST15]</ref> argument. 
We show that having a constant fraction of the λ_i's be negative is sufficient, in the context of our construction, to ensure that the no-functions are far from convex, and we show that the multi-dimensional central limit theorem used in <ref type="bibr">[CDST15]</ref> can be adapted to our context to establish indistinguishability and thereby prove the desired lower bound.</p></div>
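The contrast between the yes- and no-functions in this degree-2 PTF construction can be illustrated with a small numerical sketch; the threshold θ, the scale of the λ_i, and the fraction of negative coordinates below are illustrative assumptions, not the parameters used in the actual proof:

<![CDATA[
```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
theta = 10.0

# Yes-type function: f(x) = 1[sum_i lam_i x_i^2 <= theta] with every
# lam_i > 0, so f^{-1}(1) is an ellipsoid and hence convex.
lam_yes = rng.uniform(0.5, 1.5, size=n)

# No-type function: flip a constant fraction of the lam_i to be negative.
neg = rng.random(n) < 0.25
neg[0] = True                           # ensure at least one negative lam_i
lam_no = np.where(neg, -lam_yes, lam_yes)

def f(lam, x):
    return bool(np.dot(lam, x * x) <= theta)

# A three-point certificate of non-convexity for the no-type function:
jp = int(np.argmax(lam_no))             # a coordinate with lam_no[jp] > 0
jm = int(np.argmin(lam_no))             # a coordinate with lam_no[jm] < 0
d = np.sqrt(2.0 * theta / lam_no[jp])   # lam_no[jp] * d^2 =  2*theta
c = np.sqrt(2.0 * theta / -lam_no[jm])  # lam_no[jm] * c^2 = -2*theta
x = np.zeros(n); x[jp] = d; x[jm] = c
y = np.zeros(n); y[jp] = d; y[jm] = -c
z = (x + y) / 2.0                       # midpoint, equal to d times e_jp
print(f(lam_no, x), f(lam_no, y), f(lam_no, z))   # True True False
```
]]>

Here x and y both evaluate to λ_jp·d² + λ_jm·c² = 2θ − 2θ = 0 ≤ θ, while the point between them evaluates to λ_jp·d² = 2θ &gt; θ, so the no-type set contains x and y but not a point on the segment joining them; when every λ_i is positive no such triple exists.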
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.3">Related Work</head><p>A number of earlier papers have considered different aspects of convexity testing. One strand of work deals with testing convexity of (real-valued) functions f : [N] → R, where convexity means the second derivative is positive.<ref type="foot">foot_4</ref> This study was initiated by Parnas et al. <ref type="bibr">[PRR03]</ref>, and extended by Pallavoor et al. <ref type="bibr">[PRV18]</ref>, who gave an improved result parameterized by the image size of the function being tested; by Blais et al. <ref type="bibr">[BRY14b]</ref>, who gave lower bounds on testing convexity of real-valued functions over the hypergrid [N]^d; and by Belovs et al. <ref type="bibr">[BBB20]</ref>, who gave upper and lower bounds on the number of queries required to test convexity of real-valued functions over various discrete domains including the discrete line, the "stripe" [3] × [N], and the hypergrid [N]^d. (See also the work of Berman et al. <ref type="bibr">[BRY14a]</ref>, who investigated a notion of "L_1-testing" real-valued functions over [N]^d for convexity.)</p><p>A different body of work, which is closer to this paper, deals with testing convexity of high-dimensional sets (equivalently, Boolean indicator functions). The earliest work we are aware of along these lines is that of Rademacher and Vempala <ref type="bibr">[RV05]</ref>. <ref type="foot">10</ref> In their formulation, a body K ⊆ R^n is ε-far from convex if Leb(K △ C) ≥ ε · Leb(K) for every convex set C, where Leb(·) denotes the Lebesgue volume (note that, in contrast, our model uses absolute volume under the Gaussian measure, rather than relative volume under the Lebesgue measure). Moreover, <ref type="bibr">[RV05]</ref> allow the testing algorithm access to a black-box membership oracle (as in our model) as well as a "random sample" oracle which generates a uniform random point from K (for testing with respect to relative measures, such an oracle is necessary). 
The main positive result of <ref type="bibr">[RV05]</ref> is an algorithm using (cn/ε)^n samples and queries for testing convexity in their model. <ref type="bibr">[RV05]</ref> also give an exponential lower bound for a simple "line segment tester," which checks whether a line segment connecting two (uniformly random) points from the body is contained within the body. This lower bound was strengthened and extended to an exponential lower bound for a "convex hull tester" in recent work of Blais and Bommireddi <ref type="bibr">[BB20]</ref>. We note that the negative results of <ref type="bibr">[RV05]</ref> and <ref type="bibr">[BB20]</ref>, while they deal with natural and interesting candidate testing algorithms, only rule out very specific kinds of testers and do not provide lower bounds against general testing algorithms in their framework.</p><p>The most closely related work for us is the study of sample-based testing algorithms for convexity under the N(0, I_n) distribution <ref type="bibr">[CFSS17]</ref>. As was mentioned earlier, <ref type="bibr">[CFSS17]</ref> gave a 2^{Õ(√n)}/ε^2-sample algorithm for convexity testing and showed that any sample-based tester must use 2^{Ω(√n)} samples; we remark that lower bounds for sample-based testers do not have any implications for query-based testing.<ref type="foot">foot_6</ref> Finally, another closely related paper is the recent work of Blais et al. [BBH24], which gives nearly matching upper and lower bounds of 3^{Ω(√n)} queries for one-sided non-adaptive convexity testing over {−1, 0, 1}^n. [BBH24] cites the high-dimensional Gaussian testing problem as motivation for their study of the ternary cube, and asks "Can queries improve upon the bounds of <ref type="bibr">[CFSS17,</ref><ref type="bibr">HY22]</ref> for testing convex sets with samples in R^n under the Gaussian distribution?" (Question 1.15 of <ref type="bibr">[BBH24]</ref>). 
Our work makes progress on this question by establishing the first lower bounds for query-based testing under the Gaussian distribution.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Preliminaries</head><p>We use boldfaced letters such as x, X, etc. to denote random variables (which may be real- or vector-valued; the intended type will be clear from the context). We write x ∼ D to indicate that the random variable x is distributed according to probability distribution D. We will frequently identify a set K ⊆ R^n with its 0/1-valued indicator function, i.e., K(x) = 1 if x ∈ K and K(x) = 0 otherwise. We write ln to denote natural logarithm and log to denote base-two logarithm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Geometry</head><p>We write S^{n−1} for the unit sphere in R^n, i.e. S^{n−1} = {x ∈ R^n : ∥x∥ = 1}, where ∥x∥ denotes the ℓ_2-norm of x. We write Ball(r) ⊆ R^n to denote the ℓ_2-ball of radius r in R^n, i.e.</p><p>Ball(r) := {x ∈ R^n : ∥x∥ ≤ r}.</p><p>We will frequently write Ball := Ball(√n). We recall the following standard bound on the volume of spherical caps (see e.g. Lemma 2.2 of [B+97]): Pr[u_1 ≥ t] ≤ e^{−nt^2/2} for all 0 ≤ t ≤ 1, where u ∼ S^{n−1}, i.e. u is a Haar random vector drawn uniformly from the unit sphere S^{n−1}.</p><p>2.2 Gaussian and Chi-Squared Random Variables. For μ ∈ R^n and Σ ∈ R^{n×n}, we write N(μ, Σ) to denote the n-dimensional Gaussian distribution centered at μ and with covariance matrix Σ. In particular, identifying 0 ≡ 0^n and writing I_n for the n×n identity matrix, we will denote the n-dimensional standard Gaussian distribution by N(0, I_n). We write Vol(K) to denote the Gaussian measure of a (Lebesgue measurable) set K ⊆ R^n, i.e.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Vol(K) := Pr_{g∼N(0,I_n)}[g ∈ K].</p><p>We recall the following standard tail bound on Gaussian random variables: Proposition 2.1. Let g ∼ N(0, 1). Then for all r &gt; 0, we have φ(r) · r/(1 + r^2) ≤ Pr[g ≥ r] ≤ φ(r)/r, where φ is the one-dimensional standard Gaussian density, which is given by φ(r) = (1/√(2π)) · e^{−r^2/2}.</p><p>It is well known that if g ∼ N(0, I_n), then ∥g∥ is distributed according to the chi distribution with n degrees of freedom, i.e. ∥g∥ ∼ χ(n). It is well known (see e.g. <ref type="bibr">[Wik23]</ref>) that the mean of the χ^2(n) distribution is n, the median is n(1 − Θ(1/n)), and for n ≥ 2 the probability density function is everywhere at most 1. We note that an easy consequence of these facts is that the origin-centered ball Ball(√n) satisfies Vol(Ball(√n)) ≥ 1/2.</p><p>We will require the following tail bound on χ^2(n) random variables: Proposition 2.2. Let X ∼ χ^2(n). Then for any t &gt; 0, we have Pr[X ≥ n + 2√(nt) + 2t] ≤ e^{−t} and Pr[X ≤ n − 2√(nt)] ≤ e^{−t}.</p><p>2.3 Property Testing and Tolerant Property Testing. Let P_conv := P_conv(n) denote the class of convex subsets of R^n, i.e. P_conv := {K ⊆ R^n : K is convex}.</p><p>Given a set K ⊆ R^n, we define its distance to convexity as dist(K, P_conv) := inf_{L ∈ P_conv} Vol(K △ L), where K △ L = (K \ L) ∪ (L \ K) denotes the symmetric difference of K and L. In particular, we will say that K is ε-close to (resp. ε-far from) a convex set if dist(K, P_conv) ≤ ε (resp. ≥ ε).</p><p>Definition 1. 
(Property testers and tolerant property testers)</p><p>An algorithm A is an ε-tester for convexity if, given black-box query access to an unknown set K ⊆ R^n, it has the following performance guarantee:</p><p>• If K is convex, then A outputs "accept" with probability at least 2/3;</p><p>• If dist(K, P_conv) ≥ ε, then A outputs "reject" with probability at least 2/3.</p><p>An algorithm A is an (ε_1, ε_2)-tolerant tester (or simply an (ε_1, ε_2)-tester) for convexity if it has the following performance guarantee:</p><p>• If dist(K, P_conv) ≤ ε_1, then A outputs "accept" with probability at least 2/3;</p><p>• If dist(K, P_conv) ≥ ε_2, then A outputs "reject" with probability at least 2/3.</p><p>In particular, note that every ε-tester is a (0, ε)-tolerant tester.</p><p>Our query-complexity lower bounds for non-adaptive property testing algorithms are obtained via Yao's minimax principle <ref type="bibr">[Yao77]</ref>, which we recall below. (We remind the reader that an algorithm for the problem of (ε_1, ε_2)-tolerant testing is correct on an input function f provided that it outputs "yes" if f is ε_1-close to the property and outputs "no" if f is ε_2-far from the property; if the distance to the property is between ε_1 and ε_2 then the algorithm is correct regardless of what it outputs.) Theorem 2.1. (Yao's principle) To prove an Ω(q)-query lower bound on the worst-case query complexity of any non-adaptive randomized testing algorithm, it suffices to give a distribution D on instances such that for any q-query non-adaptive deterministic algorithm A, we have Pr_{f∼D}[A is correct on f] ≤ c, where 0 ≤ c &lt; 1 is a universal constant.</p></div>
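The quantities defined in this section (the Gaussian volume Vol(·) and the distance to convexity via the Gaussian measure of a symmetric difference) can be estimated by straightforward Monte Carlo sampling; a minimal sketch, with the dimension and sample counts chosen arbitrarily:

<![CDATA[
```python
import numpy as np

def gaussian_vol(indicator, n, samples=200_000, seed=0):
    """Monte-Carlo estimate of Vol(K) = Pr_{g ~ N(0, I_n)}[g in K]."""
    g = np.random.default_rng(seed).standard_normal((samples, n))
    return float(np.mean(indicator(g)))

def sym_diff_vol(ind_K, ind_L, n, samples=200_000, seed=0):
    """Monte-Carlo estimate of Vol(K symdiff L); dist(K, P_conv) is the
    infimum of this quantity over all convex sets L."""
    g = np.random.default_rng(seed).standard_normal((samples, n))
    return float(np.mean(ind_K(g) != ind_L(g)))

n = 10
# Ball(sqrt(n)) has Gaussian volume at least 1/2, since the median of the
# chi-squared(n) distribution lies below n.
ball = lambda g: np.sum(g * g, axis=1) <= n
print(gaussian_vol(ball, n) >= 0.5)              # True

# A halfspace through the origin has Gaussian volume exactly 1/2, and its
# symmetric difference with the ball has strictly positive Gaussian measure.
half = lambda g: g[:, 0] <= 0.0
print(0.0 < sym_diff_vol(ball, half, n) < 1.0)   # True
```
]]>

A tester only ever sees indicator values at its query points, so estimates of this kind are available to sample-based algorithms, while the lower bounds in this paper concern the stronger model in which query points may be chosen arbitrarily.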
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Nazarov's Body</head><p>Our constructions in Sections 4 and 5 will employ modifications of a probabilistic construction of a convex body due to Nazarov <ref type="bibr">[Naz03]</ref>. Nazarov's randomized construction yields a convex set with asymptotically maximal Gaussian surface area <ref type="bibr">[Bal93,</ref><ref type="bibr">Naz03]</ref>, and modifications thereof have found applications in learning theory and polyhedral approximation <ref type="bibr">[KOS08,</ref><ref type="bibr">DNS24]</ref>. Definition 2. (Nazarov's body) For r, N &gt; 0, we write Naz(r, N) to denote the distribution over convex subsets of R^n where a draw B ∼ Naz(r, N) is obtained as follows:</p><p>1. Draw g_1, . . . , g_N independently from N(0, I_n), and for each i ∈ [N] let H_i denote the halfspace H_i := {x ∈ R^n : ⟨g_i, x⟩ ≤ r}.</p><p>2. Output the convex set B ⊆ R^n where B := Ball(√n) ∩ ∩_{i∈[N]} H_i.</p><p>Note that for any fixed x ∈ R^n, the inner product ⟨g_i, x⟩ is distributed as N(0, ‖x‖^2), and hence Pr[x ∈ H_i] = Φ(r/‖x‖), where Φ(·) is the univariate Gaussian cumulative distribution function. Consequently, because of the independence of the g_i, we have Pr[x ∈ ∩_{i∈[N]} H_i] = Φ(r/‖x‖)^N.</p><p>Note that B can also be written as B = Ball(√n) \ ∪_{i∈[N]} F_i, where for each i ∈ [N] we define F_i (for "flap") to be the set of points in Ball(√n) which are falsified by H_i, i.e. F_i := {x ∈ Ball(√n) : ⟨g_i, x⟩ &gt; r}.</p><p>Given a non-empty T ⊆ [N], we write F_T := ∩_{i∈T} F_i. We will be interested in points in Ball(√n) that are falsified by a unique halfspace H_i and denote the set of such points as U_i (for "unique"): U_i := F_i \ ∪_{j∈[N]\{i}} F_j.</p><p>3.1 Useful Estimates Suppose N satisfies N = n^{ω_n(1)}; in both Sections 4 and 5 we will take N = 2^{√n}. Let c_1 &gt; 0 be a parameter; in Section 4, we will set c_1 = ln 2 ± O(1)/N, and in Section 5 we will take c_1 to be a suitable small absolute constant. Throughout this section we will take r to be the unique positive number such that Equation (3.4) holds.</p><p>Gaussian tail bounds allow us to relate r and N:</p><p>Lemma 3.1. We have</p><p>contradicting Equation (3.4). 
Next, it follows from Proposition 2.1 and Equation (3.4) that</p><p>The upper bound implies that</p><p>In particular, this implies that</p><p>Next, note that the lower bound from Equation (3.5) implies that</p><p>This in turn implies that</p><p>The result follows from Equations (3.6) and (3.7).</p><p>We need the following lemma, which will be useful in analyzing our construction in Section 4.</p><p>Proof. Note that</p><p>where the penultimate equality relies on Equation (3.4).</p><p>We will also require a lower bound on the expected volume of</p><p>where the penultimate inequality follows from Equation (3.4) and the final inequality relies on the fact that (1 − y)^z ≥ 1 − yz.</p><p>Next, at the cost of 0.01 probability mass (thanks to Proposition 2.2), we can assume that</p><p>It follows that</p><p>Standard Gaussian tail bounds give that</p><p>(Proposition 2.1)</p><p>where the last line uses our bounds on r from Lemma 3.1 and our bounds on c_1 from the statement of the current lemma. Putting everything together and recalling that c_1 &lt; 0.9, we get that for n large enough,</p><p>thanks to Equation (3.4). Consequently, we have</p><p>completing the proof.</p><p>Next we show that the volume of ∪_{i∈[N]} U_i is highly concentrated:</p><p>Proof. Let Naz*(r, N) be the same distribution as Naz(r, N) except that when drawing B, each g_i is drawn from</p><p>we have</p><p>Moreover, it suffices to show that when B ∼ Naz*(r, N), we have</p><p>with probability at least 1 − o(1). To this end, we recall McDiarmid's inequality:</p><p>Let X_1, . . . , X_S be independent random variables taking values in a set Ω. Let G : Ω^S → R be such that for all i ∈ [S] we have</p><p>Then for all τ &gt; 0, we have</p><p>We will take S = N, the X_i to be the halfspaces H_i, and G(·) to be the volume of ∪_{i∈[N]} U_i, as we draw B ∼ Naz*(r, N). 
Given the way g i is drawn in B &#8764; Naz * (r, N ), the volume of each H i is always at least (using</p><p>It follows from McDiarmid that Equation (3.12) holds with probability at least 1o(1).</p><p>Finally, the following lemma will allow us to obtain bounds on the distance to convexity of the "yes"-and "no"-distributions in Section 5: Lemma 3.5. For r satisfying Equation (3.4), we have</p><p>Proof. Fix x &#8712; R n and i &#8712; [N ]. Recall Equation (3.8). On the other hand, we have</p><p>where we once again used Equation (3.2). It follows from Equations (3.8) and (3.13) that for x &#8712; Ball( &#8730; n) (i.e</p><p>x &#8804; &#8730; n), we have</p><p>where Equation (3.14) relies on the fact that x &#8804; &#8730; n and &#934;(&#8226;) being increasing, Equation (3.15) relies on our definition of r from Equation (3.4), and Equation (3.16) relies on Bernoulli's inequality: (1y) z &#8805; 1yz. (Note that for x with x &gt; &#8730; n, we have Pr[x &#8712;</p><p>To conclude, we have</p><p>where Equation (3.17) follows from the earlier calculation, completing the proof.</p><p>4 One-Sided Adaptive Lower Bound For this section, it will be most convenient for us to work over R 2n . Let us restate Theorem 1.1 in this setting:</p><p>Theorem 4.1. (One-sided adaptive lower bound, restated) For some absolute constant &#949; &gt; 0, any onesided &#949;-tester for convexity over N (0, I 2n ) (which may be adaptive) must use n &#8486;(1) queries.</p><p>At a high level, the proof of Theorem 1.1 works by (1) first defining a distribution D no of "no-functions" (Boolean-valued functions over R 2n , or equivalently, subsets of R 2n ), and showing that an &#8486;(1) fraction of draws from D no yield sets which are &#8486;(1)-far from convex; and (2) then arguing that for a suitable absolute constant c &gt; 0, any n c -query algorithm (even an adaptive one) has only an o(1) probability of querying a set of points whose labels are inconsistent with every convex set in R 2n . 
In the next subsection we describe the distribution D no .</p><p>4.1 The distribution D no of far-from-convex sets 4.1.1 Setup We will see that every function f in the support of D no outputs 0 on every input point x &#8712; R 2n with x &gt; &#8730; 2n. To describe how f behaves within the &#8730; 2n-ball, denoted by</p><p>we require some more setup.</p><p>The "control subspace," the "action subspace," and the Nazarov body. Let C be a Haar random n-dimensional subspace of R 2n ; we call C the control subspace. Let A be the orthogonal complement of C (which is also an n-dimensional subspace); we call A the "action subspace." Given a vector x &#8712; R n , we write x C to denote the orthogonal projection of x onto C and we write x A to denote the orthogonal projection of x onto A, so every vector satisfies x = x A + x C . Fix N := 2 &#8730; n (we assume without loss of generality that n is a perfect square, so N is an integer). For this choice of N , let B &#8764; Naz(r, N, C) where Naz(r, N, C) is as defined in Definition 2 but with the n-dimensional control subspace C playing the role of R n . (We emphasize that B &#8764; Naz(r, N, C) is a subset of R 2n which is an "n-subspace junta," meaning that for any x &#8712; R 2n , membership of x in B depends only on x C .) We take r to be the unique positive number such that</p><p>In other words, we choose r to be the unique value such that any point x with</p><p>by the Taylor expansion of e -ln(2)/N and setting of r (Lemma 3.1).</p><p>The "action directions." For each i &#8712; [N ], draw a random vector v i from the standard Normal distribution N (0, I n ) over the n-dimensional action subspace A (independent of everything else). We say that v i is the action direction for the i-th flap F i of the Nazarov body B. We note that for every pair i, j &#8712; [N ], the vector g i defining the i-th halfspace H i of the Nazarov body is orthogonal to the vector v j (because g i &#8712; C and v j &#8712; A).</p></div>
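The decomposition x = x_C + x_A into orthogonal projections onto the control and action subspaces, which is used throughout this section, can be illustrated with a small numerical sketch. Sampling a Haar-random subspace by orthonormalizing i.i.d. Gaussian vectors is standard; the helper names (`gram_schmidt`, `project`) are ours.

```python
import random

def gram_schmidt(vectors):
    """Orthonormalize a list of (assumed linearly independent) vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            d = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - d * bi for wi, bi in zip(w, b)]
        norm = sum(wi * wi for wi in w) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis

def project(x, basis):
    """Orthogonal projection of x onto the span of an orthonormal basis."""
    out = [0.0] * len(x)
    for b in basis:
        c = sum(xi * bi for xi, bi in zip(x, b))
        out = [oi + c * bi for oi, bi in zip(out, b)]
    return out

# A Haar-random n-dimensional "control" subspace C of R^{2n}: orthonormalize
# n i.i.d. Gaussian vectors.  Its orthogonal complement plays the role of the
# action subspace A, and every x decomposes as x = x_C + x_A.
rng = random.Random(1)
n = 4
C_basis = gram_schmidt([[rng.gauss(0, 1) for _ in range(2 * n)] for _ in range(n)])
x = [rng.gauss(0, 1) for _ in range(2 * n)]
x_C = project(x, C_basis)
x_A = [xi - ci for xi, ci in zip(x, x_C)]
```

The two components are orthogonal by construction, mirroring the fact that membership of x in the Nazarov body depends only on x_C while the action directions live entirely in A.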
<div xmlns="http://www.tei-c.org/ns/1.0"><head>4.1.2</head><p>The distribution D no For a fixed setting of the control subspace C and the (orthogonal) action subspace A, of H := (H 1 , . . . , H N ) (which also specifies B and F i 's) and of v := (v 1 , . . . , v N ), we define the function f C,A, H, v : R 2n &#8594; {0, 1} as follows:</p><p>A random function f &#8764; D no is drawn as follows: first we draw a Haar random n-dimensional subspace C; then A is taken to be the n-dimensional (Haar random) orthogonal complement of C; then we draw B &#8764; Naz(r, N, C) (which gives a draw of H as in Equation (3.1)); then we draw v = (v 1 , . . . , v N ) from A as described above; then we set the function f to be f C,A, H, v .</p><p>4.2 Sets in D no are far from convex We need a constant fraction of the no-functions to be constant-far from convex. This is given by the following lemma:</p><p>Lemma 4.1. With probability &#8486;(1) over a draw of f &#8764; D no , we have that Vol(f g) = &#8486;(1) for every g : R 2n &#8594; {0, 1} that is the indicator function of a convex set in R 2n .</p><p>We require a few definitions. Define ThinShell : <ref type="bibr">(1)</ref>. We view the draw of f as taking place in two stages: in the first one C, A, and g = (g 1 , . . . , g N ) are drawn, and in the second one v = (v 1 , . . . , v N ) is drawn. Observe that the set U depends only on the first stage. Say that any outcome of the first stage for which Vol[U &#8745; ThinShell] = &#8486;(1) holds is a good outcome of the first stage, so an &#8486;(1) fraction of outcomes of the first stage are good.</p><p>Fix any good outcome C, A, g of the first stage (note that this fixes U 1 , . . . , U N and hence U ), and consider a draw of x &#8764; N (0, I 2n ). We have the following claim: Claim 3. For a suitable absolute constant a &gt; 0, we have</p><p>). Proof. 
Since we have fixed a good outcome C, A, g of the first stage, we have that Pr x&#8764;N (0,I2n) [x &#8712; U &#8745;ThinShell] &#8805; c for some absolute constant c &gt; 0. Moreover, every outcome of x &#8712; U &#8745; ThinShell has x C &#8804; &#8730; n, since U is a subset of B. So to prove the claim we need only show that Pr x&#8764;N (0,I2n)</p><p>We first observe that by standard bounds on the chi-square distribution (Proposition 2.2), we have that </p><p>Let us say that an outcome of v for which f (x) = 0, f (x + ) = 1, f (x -) = 1 is a fine outcome of v for x. We will use the following claim:</p><p>Claim 4. For any fixed x that lies in U &#8745; ThinShell and has</p><p>Proof. Since x &#8712; U i &#8745; ThinShell for some i, it must be the case that also x + , x -&#8712; U i (because every possible outcome of v i is orthogonal to every possible outcome of g j for every j &#8712; [N ]). So v is fine if and only if</p><p>Now since v i is drawn from a standard N (0, I n ) distribution over the subspace A, a routine calculation using (i) Equation (4.19); (ii) the fact that v iv i , x x x 2 and v i , x are independent and are distributed as a draw from the &#967; 2 (n -1) distribution and a draw from N (0, x A 2 ) respectively; and (iii) the fact that a draw from the &#967; 2 (n -1) distribution takes value n(1 &#177; o(1)) except with vanishingly small probability, gives that Equation (4.18) holds with &#8486;(1) probability.</p><p>As an immediate consequence of Claim 4, we get that an &#8486;(1) fraction of outcomes of v are such that (4.20)</p><p>Fix any outcome v of v for which Equation (4.20) holds. To conclude the proof of Lemma 4.1, we observe that since x &#8712; U i implies that x + , x -are also in U i , it follows that any z &#8712; R n can participate in at most three triples of the form (x, x -, x + ), so the maximum possible degree of overlap across all of the triples is at most a factor of three. 
Moreover, for any x &#8712; ThinShell, it holds that</p><p>, and hence the pdf of the &#967; 2 (2n) distribution is within a constant factor on each of the three inputs x , x + and x - (so the N (0, I 2n ) Gaussian's pdf is within a constant factor on each of the three inputs x, x + , x -). Combining this with Claim 3, we get that for an &#8486;(1) fraction of outcomes of f &#8764; D no , the value of f needs to be altered on at least an &#8486;(1) fraction of inputs drawn from N (0, I n ) in order to "repair" all of the violating triples (x, x + , x -) for which x &#8712; U &#8745; ThinShell and </p><p>&#8730; n], and consider the "completion" of the draw of f &#8764; D no (i.e. the draw of B &#8764; Naz(r, N, C) which induces an outcome of U ). We have</p><p>, so to complete the proof of Lemma 4.1 it suffices to show that (5.29) = &#8486;(1). We have</p><p>where the first equality is Equation (3.4) and the second is because c 1 = &#920;(1). Similar to the proof of Lemma 3.3, we have</p><p>(Proposition 2.1)</p><p>= &#920;(1), using Lemma 3.1.</p><p>So</p><p>where the first equality is by Equation (3.4). This concludes the proof of Lemma 4.1.</p></div>
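The repair argument in Lemma 4.1 counts violating triples (x, x⁻, x⁺): since x is the midpoint of x⁻ and x⁺, the labels f(x⁻) = f(x⁺) = 1 and f(x) = 0 are inconsistent with every convex set. A minimal illustrative check of this pattern (the function name is ours):

```python
def is_violating_triple(f, x_minus, x, x_plus, tol=1e-9):
    """Check the convexity-violation pattern: x is the midpoint of x_minus
    and x_plus, both endpoints lie in the set (f = 1), but the midpoint does
    not (f = 0).  No convex set is consistent with these three labels."""
    mid = [(a + b) / 2.0 for a, b in zip(x_minus, x_plus)]
    if any(abs(m - xi) > tol for m, xi in zip(mid, x)):
        return False  # not a midpoint triple at all
    return f(x_minus) == 1 and f(x_plus) == 1 and f(x) == 0

# Example: the complement of the unit ball is far from convex, and any
# symmetric pair of outside points witnesses a violation through the origin.
outside_ball = lambda p: 1 if sum(t * t for t in p) >= 1.0 else 0
```

For instance, `is_violating_triple(outside_ball, [-2.0, 0.0], [0.0, 0.0], [2.0, 0.0])` detects a violation, while a convex set such as a halfplane never produces one.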
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Proof of Theorem 1.1</head><p>Definition 5. (One-sided Adaptive Algorithms as Binary Trees) Fix n, q &#8712; N. A q-query one-sided deterministic algorithm, Alg, for testing convexity in R 2n is specified by a rooted binary tree of depth q where each node contains the following information:</p><p>&#8226; Each node v which is a not a leaf contains a query vector x v &#8712; R 2n , as well as two out-going edges, one labeled 0 and one labeled 1, to nodes which we label v(0) and v(1), respectively.</p><p>&#8226; Each leaf node v contains an output o v which is set to "accept" or "reject." Let Q 1 (or Q 0 ) denote the set of points queried along the path that are labelled 1 (or 0, respectively). Then o v is set to be "reject" if and only if</p><p>By adding nodes which repeat the queries, we may assume, without loss of generality, that the depth of every leaf of the tree is exactly q.</p><p>A q-query deterministic algorithm Alg executes on a function f : R 2n &#8594; {0, 1} by taking the natural root-to-leaf path given by following the function values which the oracle returns at the queries within each of the nodes. In particular, we will make repeated use of the following definitions which capture the execution of the algorithm Alg on a function f :</p><p>&#8226; The node v 0 is the root of the tree, which is the starting point of the root-to-leaf path. Then, the nodes v 1 , . . . , v q indicate the root-to-leaf path generated by executing the algorithm on the function f . In particular, at time step t &#8712; {0, . . . , q -1}, we have</p><p>&#8226; The set Q 0 is defined to be &#8709;, and for t &#8712; {0, . . . , q -1} the set Q t+1 is defined to be</p><p>is the set of vectors that are queried at time steps prior to t + 1.</p><p>Once the algorithm reaches the leaf node v q , the algorithm outputs o v q , and we will refer to Alg(f ) as the output ("accept" or "reject") produced by the algorithm. 
Since any q-query deterministic algorithm corresponds to a tree of depth q, the total number of query vectors x_v ∈ R^{2n} across all nodes of the tree is at most 2^q. Our goal is to show that, if Alg is a q-query deterministic algorithm which makes one-sided error, then</p><p>Recall that implicit in a fixed function f in the support of D_no are the control and action subspaces C, A ⊂ R^{2n}, as well as the vectors g_1, . . . , g_N ∈ C and v_1, . . . , v_N ∈ A, and that g_1, . . . , g_N define B and the regions H_i and F_i. In order to simplify our notation, for any k ≤ q we will often refer to the subset Qk of the queries Q_k whose norm on the control subspace is bounded, i.e. those x ∈ Q_k with ‖x_C‖_2 ≤ √n.</p><p>Toward showing the above upper bound, we define two important events (which will depend on the draw f ∼ D_no).</p><p>Definition 6. Given Alg and a function f from D_no, we consider the following two events:</p><p>• E_1(f): This event occurs if at the end of the execution of Alg on f, every point x ∈ Qq lies in at most q flaps, and for every flap F_i with Qq ∩ F_i ≠ ∅, (4.24)</p><p>‖x − y‖ ≤ 1000√q · n^{1/4} for all x, y ∈ Qq ∩ F_i.</p><p>• E_2(f): This event occurs if at the end of the execution of Alg on f, for every flap F_i with Qq ∩ F_i ≠ ∅ and every x, y ∈ Qq ∩ F_i, we have</p><p>Theorem 1.1 follows immediately from the following three lemmas:</p><p>Lemma 4.3. Let Alg be a one-sided, deterministic, q-query algorithm for testing convexity. Then, if Alg(f) outputs "reject," the event E_2(f) occurred.</p><p>Lemma 4.4. Let Alg be a one-sided, deterministic, q-query algorithm. Then,</p><p>Lemma 4.5. Let Alg be a one-sided, deterministic, q-query algorithm, where q ≤ n^{0.05}. Then,</p><p>Proof. [Proof of Theorem 1.1 Assuming Lemmas 4.3 to 4.5] We upper bound the expression</p><p>using Lemmas 4.4 and 4.5.</p></div>
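The binary-tree view of a deterministic tester from Definition 5 can be sketched in code as follows. The class and function names are ours, and the convexity-based reject rule at the leaves is elided; the sketch only shows how the root-to-leaf path v_0, v_1, . . . , v_q is driven by the oracle's answers.

```python
class Node:
    """One node of a deterministic tester's binary decision tree."""
    def __init__(self, query=None, child0=None, child1=None, output=None):
        self.query = query    # x_v: the point queried at an internal node
        self.child0 = child0  # subtree followed when the oracle answers 0
        self.child1 = child1  # subtree followed when the oracle answers 1
        self.output = output  # "accept" / "reject" at a leaf

def run_tester(root, f):
    """Follow the root-to-leaf path determined by the oracle answers f(x_v),
    returning the leaf's output together with the query transcript."""
    node, transcript = root, []
    while node.output is None:
        bit = f(node.query)
        transcript.append((node.query, bit))
        node = node.child1 if bit == 1 else node.child0
    return node.output, transcript

# A depth-1 toy tree: query the origin, reject iff the oracle answers 0 there.
toy = Node(query=(0.0, 0.0),
           child0=Node(output="reject"),
           child1=Node(output="accept"))
```

Since the tree has depth q, at most 2^q distinct query points appear across all of its nodes, which is the counting fact the proof uses.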
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Proof of Lemma 4.3</head><p>Since Alg is a q-query deterministic algorithm which has one-sided error, in order for the algorithm to output "reject," the set Q q queried by the root-to-leaf path obtained by executing Alg on f must contain x 1 , . . . , x &#8467; , y &#8712; Q q satisfying y &#8712; conv(x 1 , . . . , x &#8467; ), f (y) = 0, and f (x</p><p>In particular, from y &#8712; conv(x 1 , . . . , x &#8467; ), we must have that, for any vector u &#8712; R 2n , there exists a j &#8712; [&#8467;] such that x j , u &#8805; y, u . This implies that:</p><p>&#8226; We must have that all x 1 , . . . , x &#8467; satisfy (x i ) C 2 &#8804; &#8730; n, and y C 2 &#8804; &#8730; n, and this means these vectors lie in Qq . The part of (</p><p>On the other hand, if</p><p>&#8730; n, letting u &#8712; C be the unit vector u = y C / y C 2 , there exists an x j with</p><p>and hence f (x j ) = 0, which would be a contradiction with f (x j ) = 1.</p><p>&#8226; We must have y / &#8712; B since f (y) = 0. As a result, there is a nonempty T such that y &#8712; F T . In addition, f (y) = 0 implies that there exists an i &#8712; T such that y &#8712; F i but</p><p>Given that y &#8712; F i , setting u = g i , there exists an x j such that x j , g i &gt; y, g i &#8805; r and thus, x j &#8712; F i . It follows from f (x j ) = 1 and the construction that</p><p>This concludes the proof using i, y and x j .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5">Proof of</head><p>Lemma 4.4 To prove Lemma 4.4, we introduce five new, easy-to-analyze events E 1,1 , E 1,2 , E 1,3 , E 1,4 and E 1,5 , show that each happens with probability at least 1o(1), and that E 1,1 &#8745; E 1,2 &#8745; E 1,3 &#8745; E 1,4 &#8745; E 1,5 implies E 1 . For the n-dimensional subspace C &#8834; R 2n (in particular, the control subspace for f ), we denote Shell(C) := {x &#8712; R 2n : &#8730; n -100q &#8804; x C 2 &#8804; &#8730; n}, where x C denotes the orthogonal projection of x onto the subspace C. &#8226; E 1,1 (f ): This event occurs if no query x in Alg with x C 2 &#8804; &#8730; n lies in |T |&#8805;q F T defined by f ; &#8226; E 1,2 (f ): This event occurs if no query x in Alg satisfies x / &#8712; Shell(C) and x / &#8712; B (or equivalently, x C 2 &lt;</p><p>&#8730; n -100q and x &#8712; F i for some i &#8712; [N ]);</p><p>x, g i &#8805; r + 100qn 1/4 , for some i &#8712; [N ];</p><p>&#8226; E 1,4 (f ): This event does not occur if there exist i &#8712; [N ] and two queries x, z in Alg where (i) x C and z C are not scalar multiples of each other, z C = (1 + a)x C + by denotes the unique decomposition with x C &#8869; y, y 2 = 1 and b &gt; 0, such that x &#8712; F i and | y, g i | &#8805; 100 &#8730; q.</p><p>&#8226; E 1,5 (f ): The event occurs whenever every pair x, y &#8712; Alg satisfy xy 2 &#8804; 2 (xy) C 2 .</p><p>We first prove that E 1 (f ) is implied by the five events together. Then, we show that each of the events holds individually with probability 1o(1). By a union bound over the five events, this gives Lemma 4.4.</p><p>Lemma 4.6.</p><p>Proof. Recall that Qq denotes the set of (at most q) queries made by Alg when running on f whose orthogonal projections onto C each have norm at most &#8730; n. First E 1,1 (f ) implies that the number of "nonempty" flaps i &#8712; [N ],</p><p>i.e. flaps F i that have Qq &#8745; F i = &#8709;, is at most q 2 . 
Fix any nonempty flap F i and any two points x, z &#8712; Qq &#8745; F i . First consider the case that x C and z C are scalar multiples of each other. Note that we have x, z &#8712; Shell(C) by E 1,2 (f ) and thus, (xz) C 2 &#8804; 100q (since they are scalar multiples of each other). By E 1,5 (f ), xz 2 &#8804; 200q, which is consistent with the requirement of E 1 (f ) since q = o( &#8730; qn 1/4 ).</p><p>So consider the case when x C , z C are not scalar multiples of each other, and let z C = (1 + a)x C + by be the unique decomposition with x C &#8869; y and y &#8712; C with y 2 = 1 and b &gt; 0. Let &#945; :</p><p>so that we may use E 1,5 (f ) to deduce that (4.24) holds for x and z.</p><p>We have z C</p><p>By E 1,2 (f ), we have x &#8712; Shell(C) and thus, x C 2 &#8805; &#8730; n -100q. Plugging this in, we have</p><p>Let's consider two cases:</p><p>Case 1: a &#8805; -200q/ &#8730; n. We have from Equation (4.26) (note that the coefficient of a is negative and is a value larger than -2n)</p><p>and we get Equation (4.25).</p><p>Case 2: a &lt; -200q/ &#8730; n. In this case, using r &#8804; z, g i , x, g i &#8804; r + 100qn 1/4 (where the first inequality is because x, z &#8712; F i and the second is from E 1,3 (f )) gives</p><p>and hence,</p><p>Recalling that a &lt; 0, dividing through by -a we get</p><p>by the setting of r in Lemma 3.1. Recalling Equation (6.54), we get</p><p>as was to be shown.</p><p>Event E 1,1 (f ). We now show that with probability at least 1o(1) over the draw of f &#8764; D no , all 2 q queries specified by Alg avoid the region which is the intersection of at least q flaps. Consider any fixed query x and fix any setting of the control subspace C &#8834; R 2n with x C 2 &#8804; &#8730; n. Using Lemma 3.2 (and the fact</p><p>so that a union bound over 2 q queries gives (2c 1 ) q /q! = o(1) for large q.</p><p>Event E 1,2 (f ). Similarly to above, we proceed by a union bound over all 2 q queries. 
We consider a fixed control subspace C and we let x be a query with</p><p>by the setting of r from Lemma 3.1. The desired claim then follows from a union bound over all 2^q queries. Event E_1,3(f). Consider any query x, and consider a fixed setting of the control subspace C with ‖x_C‖_2 ≤ √n.</p><p>Then,</p><p>where the computation proceeds similarly to E_1,2(f).</p><p>Event E_1,4(f). For a fixed control subspace C, we may consider two arbitrary queries x, z among the set of all 2^q queries with ‖x_C‖_2, ‖z_C‖_2 ≤ √n. This gives 2^{2q} possible settings of the unit vector y which is orthogonal to x_C. In order for the event to fail, there must exist some i ∈ [N] where ⟨g_i, x_C⟩ ≥ r and |⟨g_i, y⟩| ≥ 100√q.</p><p>Furthermore, since x_C and y are orthogonal, these two events are independent:</p><p>hence, we may take a union bound over all i ∈ [N] and all 2^{2q} pairs of vectors x and z.</p><p>Event E_1,5(f). Finally, consider any two vectors x and y which are queries among the 2^q possible queries in Alg. The Johnson–Lindenstrauss lemma (see Theorem 5.3.1 in <ref type="bibr">[Ver18]</ref>) says that a random n-dimensional subspace C of R^{2n} will satisfy ‖(x − y)_C‖_2 ≥ (1/√2 − ε)·‖x − y‖_2 except with probability exp(−Ω(ε^2 n)). Thus, for large enough n, we have ‖x − y‖_2 ≤ 2‖(x − y)_C‖_2 except with probability exp(−Ω(n)), and since q ≪ n, we may union bound over all 2^{2q} pairs of queries in Alg. 4.5.1 Proof of Lemma 4.5 For E_2(f) ∩ E_1(f) to happen, there must exist a level k ∈ [q] such that • After the first k − 1 queries Q = Qk−1, E_1(f) holds, i.e., the number of flaps F_i with Q ∩ F_i ≠ ∅ is at most q^2. In every such F_i, every two points in Q ∩ F_i have distance at most 1000√q · n^{1/4} and share the same value of</p><p>which we denote by b_i ∈ {0, 1}.</p><p>• Let y be the k-th query. 
There exists an i such that Q &#8745; F i = &#8709; and y &#8712; F i such that</p><p>for all x &#8712; Q &#8745; F i (the number of such i is at most q) but</p><p>We prove below that when q &#8804; n 0.05 , the probability of the event above for a fixed k is o(1/q) and thus, Lemma 4.5 follows by applying a union bound on k. This follows from a union bound on all i &#8712; [N ] such that Q &#8745; F i = &#8709; and every x &#8712; Q &#8745; F i satisfies (Equation (4.27)), taking X (or z) below as Q &#8745; F i (or y, respectively) projected on the space orthogonal to g i and b as b i .</p><p>Lemma 4.7. Let b &#8712; {0, 1}, and let X be a set of at most q points in Ball( &#8730; 2n) and y &#8712; Ball( &#8730; 2n). Suppose that every pair x &#8712; X and y satisfy (xy) A 2 &#8804; 1000 &#8730; qn 1/4 .</p><p>Then over the draw of v &#8764; N (0, I n ) in A, the probability of</p><p>Before proving Lemma 4.7, we show how it implies Lemma 4.5 from a union bound. In particular, we have concluded that</p><p>q &#215; u.b over i q 2 &#215;O &#8730; q log n/n 1/4 = O(q 3.5 log n/n 1/4 ).</p><p>Proof. [Proof of Lemma 4.7] Fix a point x * &#8712; X. We show that (1) the probability of</p><p>b for all x is at least &#8486;(1); and (2) the probability of</p><p>is at most O( &#8730; q log n/n 1/4 ). The lemma then follows.</p><p>To analyze (2), we consider any 0 &lt; &#947; &#8804; &#8730; n/4 and any sign &#958; &#8712; {-1, 1}. We have that a draw of a Gaussian v lying in the action subspace A satisfies</p><p>where we have used Gaussian anti-concentration (to conclude it does not lie within an interval of width 2&#947;/ x * A 2 ), as well as Gaussian tail-bounds to say g is is larger than &#8730; n/(4 x * A 2 ). The minimum over &#964; &gt; 0 is meant to quantify over possible values of x * A 2 . 
Letting &#947; = 50000 &#8730; qn 1/4 log n allows us to conclude that the probability that v, x * lies within distance &#947; of -&#8730; n/2 or &#8730; n/2 is at most O( &#8730; q log n/n 1/4 ).</p><p>On the other hand, given that (yx * ) A &#8804; 1000 &#8730; qn 1/4 , we also have that</p><p>which is smaller than any inverse polynomial in n. Therefore, we have that, except with probability at most O( &#8730; q log n/n 1/4 ), for both &#958; &#8712; {-1, 1},</p><p>&#8804; 50000 &#8730; qn 1/4 log n; and</p><p>When these two events occur, event (2) cannot occur, which shows that (2) occurs with probability at most O( &#8730; q log n/n 1/4 ).</p><p>To conclude (1), we note that any</p><p>, which shows that conditioning on event (1) does not significantly affect the probability of (2).</p></div>
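The anti-concentration step in the proof of Lemma 4.7 bounds the mass that the one-dimensional Gaussian ⟨v, x*⟩ places in a short interval by (interval length) × (maximum density). The following sketch verifies this bound exactly via the error function; the helper names are ours.

```python
import math

def gaussian_interval_mass(center, half_width, sigma=1.0):
    """Exact N(0, sigma^2) mass of [center - half_width, center + half_width]."""
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / (sigma * math.sqrt(2.0))))
    return cdf(center + half_width) - cdf(center - half_width)

def anticoncentration_bound(half_width, sigma=1.0):
    """The N(0, sigma^2) density is at most 1/(sigma*sqrt(2*pi)) everywhere,
    so any interval of length 2*half_width carries at most this much mass."""
    return 2.0 * half_width / (sigma * math.sqrt(2.0 * math.pi))
```

The bound is tight for short intervals centered at the mode, and it is the reason the probability that ⟨v, x*⟩ lands within γ of ±√n/2 scales like O(γ/‖x*_A‖), as used above.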
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">A Mildly-Exponential Lower Bound for Non-Adaptive Tolerant Testers</head><p>We will prove the following:</p><p>Theorem 5.1. (Two-sided non-adaptive tolerant testing lower bound) There exist absolute constants 0 &lt; &#949; 1 &lt; &#949; 2 &lt; 0.5 such that any non-adaptive (&#949; 1 , &#949; 2 )-tolerant tester for convexity over N (0, I n ) (which may make two-sided errors) must use at least 2 &#8486;(n 1/4 ) queries.</p><p>5.1 The D yes and D no Distributions Before specifying the D yes and D no distributions, we first describe some necessary objects.</p><p>The Control and Action Subspaces. Throughout, we will work over R n+1 for convenience. Let A denote a random 1-dimensional subspace of R n+1 , i.e. A = {tv : t &#8712; R} where v &#8764; S n is a Haar-random unit vector.</p><p>Let C be the orthogonal complement of A; note that C is a random n-dimensional subspace of R n+1 . We call C the control subspace and we call A the action subspace.</p><p>Given a vector x &#8712; R n , we write x C to denote the projection of x onto C and we write x A to denote the projection of x onto A, so every vector satisfies x = x A + x C . Recalling that A is a 1-dimensional subspace, when there is no risk of confusion we write x A to denote the scalar value t such that x A = tv.</p><p>Constants. We will use four positive absolute constants c 0 , c 1 , c 2 and &#964; in the construction. Here c 0 is the constant hidden in the statement of Lemma 3.3. We set c 1 , c 2 and &#964; as follows: Functions on the Action Subspace. Let c 2 be the absolute constant given in Equation (5.28). Intuitively, 2c 2 will be the Gaussian measure of two symmetric intervals in the action subspace A. More formally we define</p><p>Using the absolute constant &#964; given in Equation (5.28), we also define the interval</p><p>and we observe that a random draw of x from N (0, I n+1 ) has x &#8712; I with probability at least 1&#964; . 
We write Shell n+1 to denote the corresponding spherical shell in R n+1 , i.e.</p><p>Finally, let P be a uniformly random subset of <ref type="bibr">[N ]</ref>.</p><p>For a fixed setting of C (which defines the complementary A), B (which in turn defines H i , F i , the F T 's, and U i ), and P , we define the function g C,B,P : R n+1 &#8594; {0, 1, 0 , 1 } as follows:</p><p>The D yes and D no Distributions. To sample a set from either D yes or D no , first draw C, B, P as described above; note that this induces a draw of A, H i , F i , the F T 's, and U i . Draws from D yes and D no are identical on points x &#8712; R n+1 where g C,B,P (x) &#8712; {0, 1}; on the other values, however,</p><p>&#8226; For functions in D yes , we set 0 &#8594; 0 and 1 &#8594; 1.</p><p>&#8226; For functions in D no , we set</p><p>where we define</p><p>See Figures <ref type="figure">2</ref> and <ref type="figure">3</ref> for illustrations of D yes and D no .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Distance to Convexity</head><p>Recall that a draw of a function from either D yes or D no induces a draw of B, C, and P . First, we give an upper bound on the expected distance to convexity of a function drawn from D yes . Let &#949; 1 be the constant given by</p><p>Proposition 5.1. We have dist(f yes , P conv ) &#8804; &#949; 1 with probability at least 0.5 when f yes &#8764; D yes .</p><p>Proof. Consider a fixed choice of C, B and P . We then consider the convex set G C,B,P &#8834; R n+1 defined as the intersection of H i (for i &#8712; P ) and the set {x : x C 2 &#8804; &#8730; n}. By construction, the set G C,B,P is a convex set. Let g C,B,P denote the corresponding indicator function. We now analyze the distance between the functions g C,B,P (where, as for functions in the support of D yes , we identify 0 with 0 and 1 with 1) and g C,B,P . First of all, by construction, if g C,B,P (x) = 1, then g C,B,P (x) is also 1. So to bound the distance, we note that there are three possible ways in which g C,B,P (x) can be 0 but g C,B,P (x) can be 1: 1. x &#8712; &#8746; |T |&#8805;2 F T ; 2. x &#8712; Shell n+1 ; 3. There is some i &#8712; [N ] such that x &#8712; U i and x A &#8712; Curb. By definition, (i) the Gaussian volume of the first set is Vol(&#8746; |T |&#8805;2 F T ); (ii) the Gaussian volume of the second set is bounded by &#964; ; (iii) the Gaussian volume of the third set is bounded by Vol(Curb) which by definition is 2c 2 Thus, for a specific instantiation of C, B and P , dist(g C,B,P , P conv ) &#8804; 2c 2 + &#964; + Vol &#63723; &#63725; |T |&#8805;2 F T &#63734; &#63736; .</p><p>The claim follows from Markov's inequality, that the last term on the RHS above is at most twice the expectation with probability at least 1/2. Next, we give a lower bound on the expected distance to convexity of a function drawn from D no . Let &#949; 2 be the constant given as</p><p>Proposition 5.2. 
We have dist(f_no, P_conv) ≥ ε_2 with probability at least 1 − o(1) when f_no ∼ D_no.</p><p>Proof. As before, consider a fixed choice of C, B and P. Take any x_C such that x_C ∈ U_i for some i ∈ P and ‖x_C‖ ≥ √(n + 1 − 2 ln(2/τ)) (which is the left end of I). For any such x_C, the line along x_A looks like the picture at the top of Figure <ref type="figure">3</ref>, except that the function is set to 0 when x_C makes x go outside of Shell_{n+1}. Given that ‖x_C‖ ≤ √n, this only happens when |x_A| is at least Ω(n^{1/4}), given that τ ≤ 1/10000 in Equation (5.28).</p><p>As a result, the distance to convexity along this line in the action space (with x_C fixed) is at least Vol(Middle) = (1 − 2c_2)/3. This follows immediately from the fact that Middle is a symmetric interval about 0 and the choice of parameters in the definition of Middle.</p><p>Given that the mass of x with</p><p>The result follows by a straightforward modification of Lemma 3.4 to show that with probability at least 1 − o(1),</p><p>Setting Parameters. We verify that ε_2 − ε_1 = Ω(1):</p><p>(where the last line is by Lemma 3.3) which is Ω(1) given the choices of c_0, c_1, c_2 and τ made in Equation (5.28).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">Proof of Theorem 1.2</head><p>We introduce some helpful notation and outline the high-level structure of the argument.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.1">Setup and Outline of Argument</head><p>We introduce the following notation: Notation 7. Given an outcome of the control subspace C and of Nazarov's body</p><p>within C as defined earlier, for x ∈ R n+1 we define the set S B (x) as</p><p>Note that if x and y have x C = y C , then S B (x) = S B (y), i.e. only the C-part of x affects S B .</p><p>We define the regions Left, Middle, Right ⊂ R as follows:</p><p>Note that Left ∪ Middle ∪ Right ∪ Curb = R (where as before we identify R with an outcome of the one-dimensional action subspace A).</p><p>To establish indistinguishability, we show that no non-adaptive deterministic algorithm A that makes q = 2 c3n 1/4 queries, for some sufficiently small constant c 3 , can distinguish D yes from D no . Specifically, for any non-adaptive deterministic algorithm A with query complexity q, we show that (5.30) Pr</p><p>To this end, we define Bad to be the following event:</p><p>Bad: There are x, y ∈ Shell n+1 queried by A that (i) satisfy S B (x) = S B (y) = {ℓ} for some ℓ ∈ [N ] (or equivalently, x, y ∈ U ℓ for some ℓ), and (ii) have x A , y A belonging to two distinct sets among Left, Middle, Right.</p><p>We will first show in Lemma 5.1 that A can distinguish D yes from D no only when Bad occurs. On the other hand, in Lemma 5.2, we show that Bad occurs with probability o(1) when the number of queries is q = 2 c3n 1/4 and c 3 is sufficiently small. Lemmas 5.1 and 5.2 together establish Equation (5.30); the proof of this is analogous to the proof of Theorem 1 in Section 4.2 of [CDL + 24] and we refer the reader to [CDL + 24] for full details. Theorem 1.2 then follows from Equation (5.30) via Yao's minimax principle (Theorem 2.1).</p></div>
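The set S B (x) of Notation 7 can be illustrated with a toy simulation (the sizes n, N and the threshold r below are assumptions, far smaller than the paper's 2^Θ(n^1/4)-scale construction): for independent random unit halfspace normals at a suitable threshold, a typical Gaussian query violates at most one halfspace, which is what makes the grouping of queries by U ℓ in the Bad event meaningful.

```python
import math, random

random.seed(1)
n, N = 40, 200        # toy sizes (assumed, purely illustrative)
r = 2.97              # threshold chosen so that N * Phi(-r) is ~0.3

def unit_vector(dim):
    v = [random.gauss(0.0, 1.0) for _ in range(dim)]
    s = math.sqrt(sum(t * t for t in v))
    return [t / s for t in v]

normals = [unit_vector(n) for _ in range(N)]  # halfspace normals g_1..g_N

def S_B(x):
    """indices of the halfspaces violated by x (toy analogue of S_B)"""
    return [i for i, g in enumerate(normals)
            if sum(gi * xi for gi, xi in zip(g, x)) >= r]

trials = 1000
multi = 0
total = 0
for _ in range(trials):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    s = S_B(x)
    total += len(s)
    multi += (len(s) >= 2)

avg_violations = total / trials
frac_multi = multi / trials
print(round(avg_violations, 3), frac_multi)
assert 0.05 < avg_violations < 1.0
assert frac_multi < 0.1   # two simultaneous violations are rare
```

With these toy parameters the number of violated halfspaces behaves roughly like a Poisson variable with small mean, so |S B (x)| ≤ 1 for the vast majority of queries.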
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.2">Indistinguishability of D yes and D no</head><p>We write A(f ) to denote the sequence of q answers to the queries made by A to f . We write view A (D yes ) (respectively view A (D no )) to be the distribution of A(f yes ) for f yes ∼ D yes (respectively f no ∼ D no ). The following claim asserts that conditioned on Bad not happening, the distributions view A (D yes | ¬Bad) and view A (D no | ¬Bad) are identical.</p><p>Proof. Let Q be the set of points queried by A. Recall that the distributions of the subspaces C and action variables A are identical for D yes and D no . So fix an arbitrary outcome of the n-dimensional subspace C and the orthogonal one-dimensional subspace A. As the distribution of the Nazarov body B ∼ Naz(r, N, C) is also identical for D yes and D no , we fix an arbitrary outcome B of B. Let f be a random function drawn from either D yes or D no .</p><p>Note that for any point x ∈ R n+1 such that |S B (x)| ≠ 1 or x ∉ Shell n+1 or x A ∈ Curb, by construction we have that f (x) can be determined directly in the same way for both D yes and D no (no query is required). So it suffices for us to consider the points x such that |S B (x)| = 1, x ∈ Shell n+1 , and x A ∉ Curb. We call these points important points.</p><p>We divide these important points into disjoint groups according to S B (x). More precisely, for every ℓ ∈</p><p>denote the function f (x) restricted to X ℓ (where as stated above, f denotes either a function drawn from D yes or from D no ). The condition that Bad does not happen implies that either x A ∈ Left for all x ∈ Q ∩ X ℓ , or x A ∈ Middle for all x ∈ Q ∩ X ℓ , or x A ∈ Right for all x ∈ Q ∩ X ℓ . 
In particular, this means f ℓ (x) = f ℓ (y) for all x, y ∈ Q ∩ X ℓ , and this holds for both D yes and D no .</p><p>Since f ℓ (x) is the same for all x ∈ Q ∩ X ℓ , the distribution of f ℓ on Q ∩ X ℓ is actually that of one random bit. Indeed, f ℓ (x) = 0 with probability 1/2 and f ℓ (x) = 1 with probability 1/2 (because each element ℓ ∈ [N ] belongs to P with probability 1/2) independently, and this holds for both D yes and D no . This completes the proof of the lemma.</p><p>Next, we show that Bad happens with probability o(1) (recall that q = 2 c3n 1/4 ). The proof of the following lemma follows the proof of an analogous lemma from [CDL + 24]: Lemma 5.2. For any fixed set of points</p><p>Proof. Fix a pair of query points x, y ∈ R n+1 that belong to Q. By the definition of Bad, we may assume without loss of generality that x, y ∈ Shell n+1 . Let Bad x,y be the event that the pair x, y witnesses Bad. Recall that each of the two intervals defining Curb (cf. Section 5.3.1) has the same width, which we will denote ρ(c 2 ) for succinctness, i.e.</p><p>On one hand, for (b) to happen on x, y, we must have ( )</p><p>On the other hand, (a) means Proof. Fix x, y such that x − y ≤ cn 3/8 . For succinctness we write z to denote x − y, so z ∈ R n+1 and z ≤ cn 3/8 ; our goal is to show that |z A | ≤ ρ(c 2 ) except with probability at most 2 −4c3n 1/4 . Since A is a Haar-random direction in R n+1 , the distribution of z A is the same as the distribution of z v 1 where v ∼ S n−1 . Hence by standard bounds on spherical caps (Lemma 2.1),</p><p>Taking t = √ 8c 3 • n 1/8 , this probability is at most e −4c3n 1/4 < 2 −4c3n 1/4 . So we set our threshold as</p><p>2c3 , and the lemma is established. By a union bound over all (at most 2 2c3n 1/4 ) pairs of points x, y from Q, we get that</p><p>which completes the proof.</p></div>
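The spherical-cap estimate used in the proof above (the A-projection of a fixed vector z is distributed as its norm times one coordinate of a uniform unit vector) admits a quick Monte Carlo sanity check. The dimension, the threshold t, and the 2e^(−t²/2)-style cap bound below are illustrative stand-ins for the Lemma 2.1 bound, not the paper's exact statement.

```python
import math, random

random.seed(2)
n = 400
t = 2.0
z = [random.gauss(0.0, 1.0) for _ in range(n)]   # an arbitrary fixed vector z
znorm = math.sqrt(sum(c * c for c in z))

trials, hits = 4000, 0
for _ in range(trials):
    g = [random.gauss(0.0, 1.0) for _ in range(n)]
    gnorm = math.sqrt(sum(c * c for c in g))
    # z . v for v = g/|g| uniform on the unit sphere; distributed as |z| * v_1
    proj = sum(zc * gc for zc, gc in zip(z, g)) / gnorm
    if abs(proj) >= t * znorm / math.sqrt(n):
        hits += 1
frac = hits / trials
cap_bound = 2.0 * math.exp(-t * t / 2.0)   # spherical-cap-style tail bound
print(round(frac, 4), round(cap_bound, 4))
assert frac <= cap_bound
```

The empirical tail is far below the cap bound, consistent with projections of a fixed vector onto a Haar-random direction having typical magnitude around ‖z‖/√n.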
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.3">Proof of Lemma 5.4</head><p>For the remainder of this section, we will always assume that x, y ∈ Shell n+1 satisfy x − y > cn 3/8 . Note that we can view the construction g C,B,P as a two-stage process:</p><p>• We first draw C, which is a Haar-random n-dimensional subspace of R n+1 .</p><p>• We then draw B ∼ Naz(r, N, C), and we draw P as a uniformly random subset of [N ].</p><p>We require the following claim:</p><p>Claim 8. Suppose x, y ∈ Shell n+1 satisfy x − y > cn 3/8 . Then with probability at least 1 − 2 −Ω(n 1/4 ) over the outcomes of C, we have</p><p>Proof. Fix x, y ∈ Shell n+1 such that x − y > cn 3/8 . Because A is drawn Haar-randomly (and since it defines C), it follows from Lemma 2.1 that</p><p>Let t = βn 1/8 for a suitable constant β > 0. The previous inequality gives</p><p>Thus, with probability 1 − e −β 2 n 1/4 /2 , we have</p><p>As x ∈ Shell n+1 and thus x ≤ 2 √ n, it follows that the last expression is at least x − 1. Identical calculations yield the corresponding lower bounds on y C and (x − y) C .</p><p>Fix an outcome C of C such that Equation (5.32) holds. For convenience, we will write x for x C , y for y C , both of which lie in R n . For the rest of the argument, we will work over C, i.e. we view sets such as H ℓ , H ℓ and B as lying in C (which we identify with R n ) rather than in R n+1 .</p><p>The following argument is analogous to (parts of) the proof of Lemma 15 of [CDL + 24]. Recall</p><p>By Claim 8, we have that</p><p>Pr [ ] ≤ 2 −Ω(n 1/4 ) + Pr Observe that the event "S B (y ) = {1}" that we are conditioning on is an event over the random draw of B, i.e. over the draw of g 1 , . . . , g N . 
To analyze this event it is helpful to introduce the following notation: For z ∈ R n , define</p><p>Consequently, the event "S B (y ) = {1}" is the same as the event g 1 ∈ hfsp(y ) ∧ g i ∉ hfsp(y ) for i ∈ {2, . . . , N } .</p><p>We may fix any outcome g 2 * , . . . , g N * of g 2 , . . . , g N all of which lie outside of hfsp(y ), and we get (writing g to denote (g 1 , . . . , g N )) that</p><p>where Equation (5.34) uses Pr[X | ∨ ℓ E ℓ ] ≤ sup ℓ Pr[X | E ℓ ] as earlier; Equation (5.35) uses that if g 1 ∈ hfsp(y ) then in order to have S B (x ) = S B (y ) it must be the case that g 1 ∈ hfsp(x ) ∩ hfsp(y ); and Equation (5.36) is because the event g 1 ∈ hfsp(x ) ∩ hfsp(y ) is independent of the outcome of g 2 , . . . , g N . So in what follows our goal is to upper bound (5.36). In other words, recalling that we write Vol(K) to denote the Gaussian measure of the set K (cf. Section 2.2), our goal is to obtain an upper bound on (5.37)</p><p>(5.36) = Vol(hfsp(x ) ∩ hfsp(y )) / Vol(hfsp(x )), which is a two-dimensional problem because the only thing that matters about the outcome of g ∼ N (0, I n ) vis-a-vis (5.37) is the projection of g in the directions of x and y . Towards this goal, we recall the following tail bound for bivariate Gaussian random variables:</p><p>Proposition 5.3. (Equation (2.11) of <ref type="bibr">[Wil05]</ref>) Suppose (Z 1 , Z 2 ) ∼ N (0, Σ) where</p><p>Then for h, k > 0, we have</p><p>Let g ∼ N (0, I n ). Define the random variables</p><p>and set h := r/ x and k := r/ y . It is immediate that</p><p>Furthermore, note that Var[Z 1 ] = Var[Z 2 ] = 1. We also have ρ := x • y /( x y ). 
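The reduction to a two-dimensional problem can be sanity-checked numerically: projecting a shared Gaussian vector g onto the directions of x and y produces unit-variance jointly Gaussian variables Z 1 , Z 2 whose correlation is exactly x • y /( x y ). A minimal Monte Carlo sketch, with an assumed toy dimension:

```python
import math, random

random.seed(3)
n = 200
x = [random.gauss(0.0, 1.0) for _ in range(n)]   # two arbitrary fixed vectors
y = [random.gauss(0.0, 1.0) for _ in range(n)]
nx = math.sqrt(sum(c * c for c in x))
ny = math.sqrt(sum(c * c for c in y))
rho = sum(a * b for a, b in zip(x, y)) / (nx * ny)   # rho = x.y / (|x||y|)

trials = 20000
s11 = s22 = s12 = 0.0
for _ in range(trials):
    g = [random.gauss(0.0, 1.0) for _ in range(n)]
    z1 = sum(a * b for a, b in zip(g, x)) / nx    # Z1 = g.x / |x|
    z2 = sum(a * b for a, b in zip(g, y)) / ny    # Z2 = g.y / |y|
    s11 += z1 * z1
    s22 += z2 * z2
    s12 += z1 * z2
emp_rho = s12 / math.sqrt(s11 * s22)
print(round(rho, 4), round(emp_rho, 4))
assert abs(emp_rho - rho) < 0.03                  # correlation matches rho
assert abs(s11 / trials - 1.0) < 0.05             # unit variances
assert abs(s22 / trials - 1.0) < 0.05
```

This is why only the projection of g onto the span of x and y matters for the volume ratio in (5.37).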
Thanks to Claim 8, we have (cn 3/8 − 1) 2 ≤ x − y 2 .</p><p>Using Claim 8 and the fact that x, y ∈ Shell n+1 , we have that x , y ≥ √(n + 1) − 2 ln(2/τ ) − 1, and combining this with Equation (5.38) gives</p><p>Note that Vol(hfsp(x )) = Φ(−h). Consequently, using Proposition 5.3 we get</p><p>and we will obtain an upper bound on this in the remainder of this section. In particular, note that</p><p>Recall that x , y ≤ √ n and that x ≥ √(n + 1) − 2 ln(2/τ ) − 1. Hence, for n large enough and an appropriate constant τ , we have (5.40)</p><p>and consequently, we get that</p><p>for an appropriate choice of τ . The final inequality relies on the above lower bound on x and Lemma 3.1. An identical calculation gives that ρk − h ≤ −Ω(1). It follows that</p><p>where the final inequality relies on a standard Gaussian tail bound (cf. Proposition 2.1). To see Equation (5.41), note that exp ( r 2 / x 2 ( 1 −</p><p>where we used Lemma 3.1 and Equation (5.38). Equation (5.42) immediately follows from Equation (5.39). Finally, note that Equation (5.43) completes the proof.</p><p>6 Two-Sided Non-Adaptive Lower Bound. Our goal in this section is to prove Theorem 1.3, restated below:</p><p>Theorem 6.1. (Two-sided non-adaptive lower bound) For any constant c > 0, there is a constant ε = ε c > 0 such that any non-adaptive ε-tester for convexity over N (0, I n ) (which may make two-sided errors) must use at least n 1/4−c queries.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Setup</head><p>We recall some necessary tools and results from <ref type="bibr">[CDST15]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.1">Distributions with Matching Moments</head><p>The first results we need, stated below as Propositions 6.1 and 6.2, establish the existence of two finitely supported random variables that match the first ℓ moments of a univariate Gaussian, for any ℓ. Crucially, one of the random variables is supported entirely on non-negative reals, while the other puts nonzero probability on negative values (so if ℓ is any fixed constant, it puts a constant amount of probability on negative values): As a result, we have</p><p>and</p><p>Combining these, we have the proposition.</p><p>We adopt the following notation: for J = (J 1 , . . . , J q ) ∈ N q a q-dimensional multi-index, we let |J| denote J 1 + • • • + J q and let J! denote J 1 !J 2 ! • • • J q !. We write #J to denote |{i ∈ [q] : J i ≠ 0}| (and we observe that #J ≤ |J|). Given X ∈ R q we write X J to denote ∏ q i=1 (X i ) Ji , and we write X| J ∈ R #J to denote the projection of X onto the coordinates for which J i ≠ 0. For f : R q → R, we write f (J) to denote the J-th partial derivative.</p><p>We recall the standard multivariate Taylor expansion: Fact 6.1. (Multivariate Taylor expansion) Given a smooth function f : R q → R and k ∈ N,</p><p>for X, ∆ ∈ R q , where τ is uniformly random over the interval [0, 1].</p><p>We recall the standard Berry-Esseen theorem for sums of independent real random variables (see for example, <ref type="bibr">[Fel68]</ref>), which is a quantitative form of the Central Limit Theorem: For g ∼ N (0, I n ), the value ∑ n i=1 g 2 i is distributed according to a chi-squared distribution with n degrees of freedom, denoted χ 2 (n). We recall the following tail bound: Lemma 6.1. 
(Tail bound for the chi-squared distribution, from</p><p>Following <ref type="bibr">[CDST15]</ref>, our proof will employ a carefully chosen "mollifier," i.e. a particular smooth function which approximates the indicator function of a set (the use of such mollifiers is standard in Lindeberg-type "replacement method" analyses). We will use a specific mollifier, given in <ref type="bibr">[CDST15]</ref>, whose properties are tailored to our sets of interest (unions of orthants). The key properties of this mollifier are as follows: Proposition 6.4. ( [CDST15] Proposition 4.3: "product mollifier") Let O be a union of orthants in R q . For all &#949; &gt; 0, there exists a smooth function &#936; O : R q &#8594; [0, 1] with the following properties:</p><p>6.1.3 Clipping Given C &gt; 0, we define the "clipping" function clip C : R n &#8594; {0, 1} which, on input a vector x &#8712; R n , outputs 1 if and only if x &#8804; &#8730; n + C.</p></div>
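Two of the Section 6.1 tools admit tiny self-contained checks. These are toy stand-ins, not the exact objects of Propositions 6.1, 6.2, and 6.4: a three-point distribution that matches the first five moments of N(0, 1) (but not the sixth), and a one-dimensional smooth mollifier that is [0, 1]-valued, tracks an indicator outside an O(ε)-width band, and has derivative bounded by O(1/ε).

```python
import math

# --- moment matching (toy analogue of Propositions 6.1/6.2):
# the 3-point distribution on {-sqrt(3), 0, sqrt(3)} with weights
# (1/6, 2/3, 1/6) matches the first five moments of N(0,1).
support = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
weights = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]

def moment(k):
    return sum(w * x ** k for x, w in zip(support, weights))

gaussian_moments = {1: 0.0, 2: 1.0, 3: 0.0, 4: 3.0, 5: 0.0, 6: 15.0}
for k in range(1, 6):
    assert abs(moment(k) - gaussian_moments[k]) < 1e-12
assert abs(moment(6) - gaussian_moments[6]) > 1.0   # 6th moment: 9 vs. 15

# --- mollifier (1-D stand-in for the product mollifier of Prop. 6.4):
# smooth, [0,1]-valued, tracks the indicator of {x >= 0} outside a
# width-O(eps) band, derivative at most 1/(2*eps).
eps = 0.05

def mollifier(x):
    return 0.5 * (1.0 + math.tanh(x / eps))

grid = [-1.0 + 0.002 * k for k in range(1001)]
assert all(0.0 <= mollifier(x) <= 1.0 for x in grid)
assert all(mollifier(x) > 0.99 for x in grid if x > 5 * eps)
assert all(mollifier(x) < 0.01 for x in grid if x < -5 * eps)
h = 1e-6
max_deriv = max((mollifier(x + h) - mollifier(x - h)) / (2 * h) for x in grid)
print(round(moment(6), 3), round(max_deriv, 2))
assert max_deriv <= 1.0 / (2 * eps) + 1e-3
```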
<div xmlns="http://www.tei-c.org/ns/1.0"><head>6.2</head><p>The Yes-and No-Distributions Let c &gt; 0 (this is the c of Theorem 1.3 ). Let u and v be the random variables given by Propositions 6.1 and 6.2, where we take &#8467; to be the smallest odd integer that is at least 1/c and take &#181; = &#181;(&#8467;).</p><p>A set K drawn from our "yes-distribution" D yes has indicator function defined as follows:</p><p>&#8226; First, choose a Haar random orthonormal basis normalized so that each vector has Euclidean length 1/ &#8730; n, and denote those vectors a (1) , . . . , a (n) . (So a (1) &#8712; R n is a Haar random unit vector in R n scaled by 1/ &#8730; n; a (2) is Haar random over the radius-(1/ &#8730; n) sphere in the (n -1)-dimensional subspace of R n that is orthogonal to a (1) ; and so on.)</p><p>&#8226; Then n independent draws u 1 , . . . , u n are made from the real random variable u of Proposition 6.1.</p><p>(Here C &gt; 0 is a suitable constant, depending only on c but not on n, that will be fixed later in our argument.)</p><p>A set K drawn from our "no-distribution" D no is defined very similarly, with the only difference being that v takes the place of u:</p><p>&#8226; The vectors a (1) , . . . , a (n) are chosen exactly as in the yes-case.</p><p>&#8226; Then n independent draws v 1 , . . . , v n are made from the real random variable v of Proposition 6.2.</p><p>(Here C &gt; 0 is the same constant as in the yes-case.)</p><p>We remark that our yes-and no-functions differ from the yes-and no-functions of <ref type="bibr">[CDST15]</ref> in a number of ways: Our functions involve a random orthonormal basis, they are degree-2 polynomial threshold functions rather than linear threshold functions, and they involve clipping. (In contrast the <ref type="bibr">[CDST15]</ref> functions do not involve choosing a random orthonormal basis, are LTFs, and do not incorporate any clipping.)</p></div>
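A minimal sketch of the geometric contrast between the two distributions, assuming the canonical basis in place of the Haar-random one (rotational invariance justifies this reduction, as the next subsection notes) and ignoring clipping: nonnegative coefficients give a convex quadratic sublevel set, closed under midpoints, while a single negative coefficient yields an explicit midpoint violation. The threshold mu below is an illustrative stand-in for the paper's µ = µ(ℓ).

```python
import random

random.seed(4)
n = 8
mu = float(n)   # toy threshold (illustrative stand-in for mu(l))

def member(coeffs, x):
    """indicator of the sublevel set {x : sum coeffs_i * x_i^2 <= mu}"""
    return sum(c * xi * xi for c, xi in zip(coeffs, x)) <= mu

# yes-case analogue: nonnegative coefficients -> convex sublevel set,
# so membership is closed under midpoints.
u = [random.choice([0.0, 2.0]) for _ in range(n)]
for _ in range(2000):
    x = [random.gauss(0, 2) for _ in range(n)]
    y = [random.gauss(0, 2) for _ in range(n)]
    if member(u, x) and member(u, y):
        midpt = [(a + b) / 2 for a, b in zip(x, y)]
        assert member(u, midpt)

# no-case analogue: one negative coefficient gives an explicit
# non-convexity witness.
v = [2.0] * n
v[0] = -1.0
s = 10.0
T = ((mu + s * s) / 2.0 - 1e-6) ** 0.5
x = [0.0] * n; x[0] = s;  x[1] = T
y = [0.0] * n; y[0] = -s; y[1] = T
midpt = [(a + b) / 2 for a, b in zip(x, y)]
assert member(v, x) and member(v, y)   # both endpoints lie in the set
assert not member(v, midpt)            # but their midpoint does not
print("non-convexity witness found")
```

The witness exploits exactly the mechanism in the text: mass far out along a negative-coefficient direction can compensate a positive-coefficient direction, but the midpoint loses the compensation.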
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.1">Distance to Convexity</head><p>We first consider yes-functions. Since u is supported on non-negative real values and a (1) , . . . , a (n) are orthogonal vectors, every outcome of 1 u 1 (a (1) </p><p>is the indicator function of a ball in R n , and the intersection of a ball and an ellipsoid is a convex set, we immediately have the following: Corollary 6.1. For every C &gt; 0, every K &#8834; R n in the support of D yes is convex.</p><p>The following lemma shows that a constant fraction of draws of K &#8764; D no are constant-far from being convex (intuitively, this is because with extremely high probability a constant fraction of the coefficients v 1 , . . . , v n are negative, which causes the degree-2 PTF to be far from an ellipsoid): Lemma 6.2. For a suitable choice of the constant C &gt; 0, with probability at least 1/2 a random K &#8764; D no is &#954;-far from convex (where &#954; &gt; 0 depends on &#181; and &#8467; and hence only on c).</p><p>Proof. By the rotational symmetry of the N (0, I n ) distribution, we may assume that the orthonormal basis a (1) , . . . , a (n) is the canonical basis e 1 , . . . , e n scaled by 1/ &#8730; n. Thus a draw of K &#8764; D no (after a suitable rotation)</p><p>Given this, it suffices to show that a random set (6.46)</p><p>is 2&#954;-far from convex with probability at least 1/2. If we have this, then since K has distance at most &#954; from K (which holds for a suitable choice of the constant C, using Lemma 6.1), the lemma follows.</p><p>To analyze Equation (6.46), we begin by recalling that by Proposition 6.2, the random variable v has probability p i &gt; 0 of taking value d i for 1 &#8804; i &#8804; &#8467; , where &#8467; is some value that is at most &#8467; + 1, and we have </p><p>Fix any outcome (v 1 , . . . , v n ) of (v 1 , . . . , v n ) such that the event on the LHS of Equation (6.48) is satisfied. 
In the rest of the proof we will argue that for such an outcome the set (6.49)</p><p>corresponding to Equation (6.46) is &#8486;(1)-far from convex.</p><p>For each i &#8712; [&#8467; ], let S i &#8834; [n] denote the set of indices j &#8712; [n] such that v j = d i . Let c i be such that |S i | = p i n + c i &#8730; n, and observe that |c i | &#8804; c i . For ease of notation we may suppose that S 1 consists of the first coordinates {1, . . . , p 1 n + c 1 &#8730; n} (this is without loss of generality by the rotational invariance of N (0, I n )).</p><p>Fix any i &#8712; {2, . . . , &#8467; } and consider the tuple of random Gaussian coordinates (x j ) j&#8712;Si for a draw of x = (x 1 , . . . , x n ) &#8764; N (0, I n ). We have</p><p>and by the Berry-Esseen theorem (Theorem 6.2), we get that (6.50) Pr</p><p>for suitable positive absolute constants A 2 , . . . , A &#8467; (depending on the p i 's and the c i 's but not on n).</p><p>Let A := &#8467; &#8226; max{|d 2 |, . . . , |d &#8467; |} &#8226; max{A 2 , . . . , A &#8467; }.. By a union bound applied to Equation (6.50) over all i &#8712; {2, . . . , &#8467; }, with probability at least 9/10 we have that (6.51)</p><p>let us say that any such outcome of (x j ) j&#8712;S2&#8746;</p><p>&#8226;&#8226;&#8226;&#8746;S &#8467; is good. Fix any good outcome (x j ) j&#8712;S2&#8746;&#8226;&#8226;&#8226;&#8746;S &#8467; of the last n-(p 1 n+c 1 &#8730; n) coordinates of x &#8764; N (0, I n ), and let A &#8712; [-A, A] be the value such that the LHS of Equation (6.51) is equal to &#8467; i=2 d i p i n + A &#8730; n. Recalling Equation (6.47), for this good outcome of the last n -(p 1 n + c 1 &#8730; n) coordinates, the set (6.49) (viewed as an indicator function of coordinates 1, . . . , p 1 n + c 1 &#8730; n) becomes (6.52) 1 &#63726; &#63728; p1n+c 1 &#8730; n j=1 d 1 x 2 j &#8804; p 1 d 1 n -A &#8730; n &#63737; &#63739; . sphere of radius r/ &#8730; n. 
Hence, writing u ∼ S n−1 to denote a Haar-random point from the n-dimensional unit sphere, we have Pr[ (a (j) • X t * ) 2 ≥ 10 ln n/n ] = Pr u∼S n−1 [ |u 1 | r/ √ n ≥ √(10 ln n)/ √ n ] ≤ Pr u∼S n−1 [ u 1 ≥ √(10 ln n)/r ] ≤ Pr u∼S n−1 [ u 1 ≥ 3 √(ln n)/ √ n ] (using r ≤ √ n + C) ≤ e −(9/2) ln n = 1/n 9/2 , using a standard bound on spherical caps (see Lemma 2.1). Since there are only qn < n 5/4 many pairs (t, j) ∈ [q] × [n], a union bound concludes the proof.</p><p>Fix a = (a (1) , . . . , a (n) ) to be any non-bad outcome of a. Recalling Equation (6.54), by Lemma 6.3 it suffices to show that d TV (R a yes , R a no ) ≤ o(1); this is our goal in the rest of the proof. Let S ∈ R q be the random column vector whose t-th entry is u 1 (a (1) • X t * ) 2 + • • • + u n (a (n) • X t * ) 2 − µ, and let T ∈ R q be the random column vector whose t-th entry is</p><p>The response vector R a yes is determined by the orthant of R q in which S lies and the response vector R a no is determined by the orthant of R q in which T lies. So to prove a q-query convexity testing lower bound for non-adaptive algorithms, it suffices to upper bound (6.56). In what follows, we will show that d UO (S, T ) ≤ o(1) when q = O(n 1/4−c ). To this end, let O denote a union of orthants such that and then apply Proposition 6.3 to bound (6.56).</p><p>For all i ∈ {0, 1, . . . , n} we introduce the R q -valued hybrid random variable Q (i) whose t-th coordinate is</p><p>Observe that Q (0) = S and Q (n) = T . Informally, we are considering a sequence of hybrid distributions between S and T obtained by swapping out each of the u-summands for a corresponding v-summand one by one. 
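A hedged illustration of why the replacement strategy works: if two summand distributions match their first several moments, a smooth test function cannot distinguish the resulting normalized sums after averaging. Below we compare a normalized Gaussian sum against a sum of three-point variables matching five moments; the sample sizes, the test function, and the tolerance are all assumptions of this sketch, not the paper's setting.

```python
import math, random

random.seed(5)

def three_point():
    # finitely supported; matches N(0,1) moments up to order 5
    r = random.random()
    if r < 1.0 / 6.0:
        return -math.sqrt(3.0)
    if r < 1.0 / 3.0:
        return math.sqrt(3.0)
    return 0.0

def smooth_test(x):
    # a bounded smooth test function standing in for the mollifier
    return math.exp(-x * x / 2.0)

n, trials = 30, 100000
acc_gauss = acc_three = 0.0
for _ in range(trials):
    sg = sum(random.gauss(0.0, 1.0) for _ in range(n)) / math.sqrt(n)
    st = sum(three_point() for _ in range(n)) / math.sqrt(n)
    acc_gauss += smooth_test(sg)
    acc_three += smooth_test(st)
diff = abs(acc_gauss - acc_three) / trials
print(round(diff, 4))
assert diff < 0.01   # smooth statistics of the two sums nearly agree
```

Swapping the summands one at a time, as in the hybrids Q (0) , . . . , Q (n) , bounds this difference by a telescoping sum of per-swap errors, each controlled by the first non-matching moment.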
The main idea is to bound the difference in expectations (6.58) E[Ψ O (Q (i−1) )] − E[Ψ O (Q (i) )] for each i, since summing (6.58) over all i ∈ [n] gives an upper bound on |E[Ψ O (S)] − E[Ψ O (T )]| using the triangle inequality.</p><p>To bound (6.58), we define the R q -valued random variable R −i whose t-th coordinate is (6.59) (R −i ) t = ∑ i−1 j=1 v j (a (j) • X t * ) 2 + ∑ n j=i+1 u j (a (j) • X t * ) 2 .</p><p>Writing Φ(v i , a (i) ) to denote the random vector in R q whose t-th coordinate is v i (a (i) • X t * ) 2 and likewise for Φ(u i , a (i) ), we have that</p><p>Truncating the Taylor expansion of Ψ O at the ℓ-th term (Fact 6.1), we get (6.60)</p><p>where τ is a random variable uniformly distributed on the interval [0, 1] (so the very last expectation is with respect to τ , v i and R −i ). Writing the analogous expression for E[Ψ O (R −i + Φ(v i , a (i) ))], we observe that by Propositions 6.1 and 6.2 the first sums are equal term by term, i.e. we have</p><p>E[Ψ (J) O (R −i ) • (Φ(u i , a (i) )) J ] = E[Ψ (J) O (R −i ) • (Φ(v i , a (i) )) J ]</p><p>for each |J| ≤ ℓ. Thus we may cancel all but the last terms to obtain</p><p>Observe that there are |{J ∈ N q : |J| = ℓ + 1}| = Θ(q ℓ+1 ) many terms in this sum. Recalling that each value of (a (j) • X t * ) 2 is at most 10 log n/n (because a is not bad), that both u i and v i are supported on at most ℓ + 1 real values that depend only on ℓ (by Propositions 6.1 and 6.2), and Proposition 6.4, we have that for any τ > 0 (we will choose a value for τ soon), (6.61)</p><p>Summing over all i ∈ [n] costs us a factor of n and so we get Equation (6.62), which gives us the desired bound on Equation (6.57); it remains only to apply Proposition 6.3 to finish the argument. To do this, let</p><p>(B τ corresponds to the region A \ A in of Proposition 6.3). 
Since both v and u are supported on values of magnitude O &#8467; (1), using the one-dimensional Berry-Esseen inequality (Theorem 6.2) and a union bound across the q coordinates we get that (6.63)</p><p>Pr[S &#8712; B &#964; ], Pr[T &#8712; B &#964; ] &#8804; O &#8467; (q&#964; ) + O &#8467; (q/ &#8730; n).</p><p>So by applying Proposition 6.3, we get that d UO (S, T ) &#8804; O &#8467; (q&#964; ) + O &#8467; (q/ &#8730; n) + O &#8467; (1) &#8226; q &#964; &#8467;+1 &#8226; (10 log n) (&#8467;+1)/2 n (&#8467;-1)/2 .</p><p>Choosing &#964; = 1/n 1/4 and recalling that &#8467; is the smallest odd integer that is at least 1/c, we get that for q = O(n 1/4-c ) the RHS above is O &#8467; ((10 log n) (&#8467;+1)/2 n -c ). This is o(1) for any constants c &gt; 0, &#8467; &#8712; N, and the proof of Theorem 1.3 is complete.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_0"><p>Recall that a Boolean function f : {0, 1} n &#8594; {0, 1} is monotone if whenever x, y &#8712; {0, 1} n satisfy x i &#8804; y i for i &#8712; [n], we have f (x) &#8804; f (y).</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_1"><p>We remark that the original construction of<ref type="bibr">[Naz03]</ref> differs from our construction in a number of ways: the distribution over random halfspaces is slightly different, and the body is not intersected with Ball( &#8730; n). For technical reasons, our specific construction</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_3"><p>We remark that in the one-sided adaptive lower bound described above, it would not have been possible to use a one-dimensional action subspace because an adaptive algorithm would be able to detect that "global structure," which is shared across all the U i 's; this is why the dimension of the action subspace A was n in the earlier construction, and there was a different random "action direction" v j from A for each j &#8712; [N ] in the earlier construction.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_4"><p>These works study discrete domains, where a discrete derivative is used.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_5"><p>The study of convexity testing in two dimensions was initiated in earlier work of Raskhodnikova<ref type="bibr">[Ras03]</ref> for the domain [N ] 2 , and has since been extended to sample-based testing<ref type="bibr">[BMR16]</ref>, testing over the continuous domain [0, 1] 2 [BMR19], and tolerant testing<ref type="bibr">[BMR22]</ref>; see also<ref type="bibr">[BF18]</ref>.</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_6"><p>[CFSS17] also gave a 2 O(n log(n/&#949;)) -sample one-sided algorithm, which was generalized to testing under arbitrary product distributions by<ref type="bibr">[HY22]</ref>.</p></note>
		</body>
		</text>
</TEI>
