Creators/Authors contains: "Lange, J."


  1. The joint analysis of different cosmological probes, such as galaxy clustering and weak lensing, can potentially yield invaluable insights into the nature of the primordial Universe, dark energy, and dark matter. However, the development of high-fidelity theoretical models is a necessary stepping stone. Here, we present public high-resolution weak lensing maps on the light-cone, generated using the N-body simulation suite AbacusSummit, and accompanying weak lensing mock catalogues, tuned to the Early Data Release small-scale clustering measurements of the Dark Energy Spectroscopic Instrument. Available in this release are maps of the cosmic shear, deflection angle, and convergence fields at source redshifts ranging from z = 0.15 to 2.45, as well as cosmic microwave background convergence maps, for each of the 25 base-resolution simulations ($L_{\rm box} = 2000\, h^{-1}\, {\rm Mpc}$ and $N_{\rm part} = 6912^3$) and for the two huge simulations ($L_{\rm box} = 7500\, h^{-1}\, {\rm Mpc}$ and $N_{\rm part} = 8640^3$) at the fiducial AbacusSummit cosmology. The pixel resolution of each map is 0.21 arcmin, corresponding to a HEALPix $N_{\rm side}$ of 16 384. The sky coverage of the base simulations is an octant until z ≈ 0.8 (decreasing to about 1800 deg$^2$ at z ≈ 2.4), whereas the huge simulations offer full-sky coverage until z ≈ 2.2. Mock lensing source catalogues are sampled matching the ensemble properties of the Kilo-Degree Survey, Dark Energy Survey, and Hyper Suprime-Cam data sets. The mock catalogues are validated against theoretical predictions for various clustering and lensing statistics, such as correlation multipoles, galaxy–shear, and shear–shear, showing excellent agreement. All products can be downloaded via a Globus endpoint (see Data Availability section).
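The quoted pixel resolution can be checked from the map parameters. The sketch below is not part of the release; it only assumes the standard HEALPix relation N_pix = 12 · N_side² for equal-area pixels and takes the resolution to be the square root of the pixel area. The function names are illustrative.

```python
import math


def healpix_npix(nside: int) -> int:
    """Number of equal-area pixels covering the full sky for a given Nside."""
    return 12 * nside * nside


def healpix_resolution_arcmin(nside: int) -> float:
    """Side length, in arcmin, of a square with the same area as one pixel."""
    arcmin_per_radian = math.degrees(1.0) * 60.0
    full_sky_arcmin2 = 4.0 * math.pi * arcmin_per_radian ** 2
    return math.sqrt(full_sky_arcmin2 / healpix_npix(nside))


# Area of the full sky in square degrees (about 41253 deg^2).
FULL_SKY_DEG2 = 4.0 * math.pi * math.degrees(1.0) ** 2

if __name__ == "__main__":
    nside = 16384
    print("pixels per map:", healpix_npix(nside))
    print("resolution [arcmin]:", round(healpix_resolution_arcmin(nside), 2))
    print("octant area [deg^2]:", round(FULL_SKY_DEG2 / 8))
```

With N_side = 16 384 this gives roughly 3.2 billion pixels per full-sky map and a resolution that rounds to the 0.21 arcmin stated above; an octant of the sky is about 5157 deg².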

  2. We give the first reconstruction algorithm for decision trees: given queries to a function f that is opt-close to a size-s decision tree, our algorithm provides query access to a decision tree T where (i) T has size S := s^{O((log s)²/ε³)}; (ii) dist(f, T) ≤ O(opt) + ε; and (iii) every query to T is answered with poly((log s)/ε) · log n queries to f and in poly((log s)/ε) · n log n time. This yields a tolerant tester that distinguishes functions that are close to size-s decision trees from those that are far from size-S decision trees. The polylogarithmic dependence on s in the efficiency of our tester is exponentially smaller than that of existing testers. Since decision tree complexity is well known to be related to numerous other boolean function properties, our results also provide a new algorithm for reconstructing and testing these properties.
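To make the quantities in this abstract concrete, here is a minimal sketch of tree size and dist(f, T). It is not the reconstruction algorithm. It assumes that size counts leaves and that dist is the fraction of inputs in {0,1}^n on which f and T disagree (uniform distribution); the abstract does not spell out either convention. The tuple encoding of trees and all names are hypothetical, and the brute-force enumeration is only feasible for small n.

```python
from itertools import product

# Hypothetical encoding: a tree is a leaf (0 or 1) or a tuple (i, left, right),
# meaning "query x[i]; go left if it is 0, right if it is 1".


def evaluate(tree, x):
    """Follow the root-to-leaf path selected by input x and return the leaf."""
    while isinstance(tree, tuple):
        i, left, right = tree
        tree = right if x[i] else left
    return tree


def size(tree):
    """Number of leaves of the tree (assumed size measure)."""
    if not isinstance(tree, tuple):
        return 1
    _, left, right = tree
    return size(left) + size(right)


def dist(f, tree, n):
    """Fraction of the 2^n inputs on which f and the tree disagree."""
    disagreements = sum(
        f(x) != evaluate(tree, x) for x in product((0, 1), repeat=n)
    )
    return disagreements / 2 ** n


def majority3(x):
    """Majority of three bits, used here as the target function f."""
    return int(x[0] + x[1] + x[2] >= 2)


# A small tree computing x0 AND x1, and an exact tree for majority3.
and_tree = (0, 0, (1, 0, 1))
maj_tree = (0, (1, 0, (2, 0, 1)), (1, (2, 0, 1), 1))

if __name__ == "__main__":
    print(size(and_tree), dist(majority3, and_tree, 3))
    print(size(maj_tree), dist(majority3, maj_tree, 3))
```

In this toy case the 3-leaf tree is 0.25-far from majority3, while the 6-leaf tree computes it exactly.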

  3. Combining different observational probes, such as galaxy clustering and weak lensing, is a promising technique for unveiling the physics of the Universe with upcoming dark energy experiments. The galaxy redshift sample from the Dark Energy Spectroscopic Instrument (DESI) will have a significant overlap with major ongoing imaging surveys specifically designed for weak lensing measurements: the Kilo-Degree Survey (KiDS), the Dark Energy Survey (DES), and the Hyper Suprime-Cam (HSC) survey. In this work, we analyse simulated redshift and lensing catalogues to establish a new strategy for combining high-quality cosmological imaging and spectroscopic data, in view of the first-year data assembly analysis of DESI. In a test case fitting for a reduced parameter set, we employ an optimal data compression scheme able to identify those aspects of the data that are most sensitive to cosmological information and amplify them with respect to other aspects of the data. We find this optimal compression approach is able to preserve all the information related to the growth of structures.

  4. We initiate the study of a fundamental question concerning adversarial noise models in statistical problems where the algorithm receives i.i.d. draws from a distribution D. The definitions of these adversaries specify the type of allowable corruptions (noise model) as well as when these corruptions can be made (adaptivity); the latter differentiates between oblivious adversaries that can only corrupt the distribution D and adaptive adversaries that can have their corruptions depend on the specific sample S that is drawn from D. We investigate whether oblivious adversaries are effectively equivalent to adaptive adversaries, across all noise models studied in the literature, under a unifying framework that we introduce. Specifically, can the behavior of an algorithm A in the presence of oblivious adversaries always be well-approximated by that of an algorithm A′ in the presence of adaptive adversaries? Our first result shows that this is indeed the case for the broad class of statistical query algorithms, under all reasonable noise models. We then show that in the specific case of additive noise, this equivalence holds for all algorithms. Finally, we map out an approach towards proving this statement in its fullest generality, for all algorithms and under all reasonable noise models.
  5. We design an algorithm for finding counterfactuals with strong theoretical guarantees on its performance. For any monotone model f : X^d → {0,1} and instance x⋆, our algorithm makes S(f)^{O(Δ_f(x⋆))} · log d queries to f and returns an optimal counterfactual for x⋆: a nearest instance x′ to x⋆ for which f(x′) ≠ f(x⋆). Here S(f) is the sensitivity of f, a discrete analogue of the Lipschitz constant, and Δ_f(x⋆) is the distance from x⋆ to its nearest counterfactuals. The previous best known query complexity was d^{O(Δ_f(x⋆))}, achievable by brute-force local search. We further prove a lower bound of S(f)^{Ω(Δ_f(x⋆))} + Ω(log d) on the query complexity of any algorithm, thereby showing that the guarantees of our algorithm are essentially optimal.
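For orientation, the brute-force baseline mentioned in this abstract can be sketched as follows. This is the naive local search, not the paper's algorithm. It assumes a boolean domain X = {0,1} with Hamming distance, which is narrower than the abstract's general X^d; the function and variable names are illustrative.

```python
from itertools import combinations


def nearest_counterfactual(f, x):
    """Brute-force search for a nearest x' with f(x') != f(x).

    Tries all inputs at Hamming distance 1, then 2, and so on, so the first
    hit is a nearest counterfactual. Returns (x', distance, queries), or None
    if f is constant. The number of queries grows like d^O(distance).
    """
    d = len(x)
    label = f(tuple(x))
    queries = 1
    for r in range(1, d + 1):
        for idxs in combinations(range(d), r):
            y = list(x)
            for i in idxs:
                y[i] = 1 - y[i]  # flip the chosen coordinates
            queries += 1
            if f(tuple(y)) != label:
                return tuple(y), r, queries
    return None


def at_least_three(x):
    """A monotone example model: 1 iff at least three features are set."""
    return int(sum(x) >= 3)


if __name__ == "__main__":
    print(nearest_counterfactual(at_least_three, (0, 0, 0, 0, 0)))
```

On the all-zeros instance with d = 5, the search exhausts every input at distance 1 and 2 before finding a counterfactual at distance 3, which is the d^{O(Δ)} behaviour the abstract contrasts with.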
  6. We study the problem of certification: given queries to a function f : {0,1}^n → {0,1} with certificate complexity ≤ k and an input x⋆, output a size-k certificate for f’s value on x⋆. For monotone functions, a classic local search algorithm of Angluin accomplishes this task with n queries, which we show is optimal for local search algorithms. Our main result is a new algorithm for certifying monotone functions with O(k^8 log n) queries, which comes close to matching the information-theoretic lower bound of Ω(k log n). The design and analysis of our algorithm are based on a new connection to threshold phenomena in monotone functions. We further prove exponential-in-k lower bounds when f is non-monotone, and when f is monotone but the algorithm is only given random examples of f. These lower bounds show that assumptions on the structure of f and query access to it are both necessary for the polynomial dependence on k that we achieve.
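A greedy local search of the kind attributed to Angluin above might look like the sketch below. It is one common way to write such a procedure and may differ in detail from the original formulation; it is not the O(k^8 log n) algorithm. It assumes f is monotone and counts the initial evaluation of f(x⋆), so it uses at most n + 1 queries. All names are illustrative.

```python
def certify_monotone(f, x):
    """Greedy certificate for a monotone f at input x.

    If f(x) = 1, try lowering each 1-coordinate to 0 and keep the change
    whenever f stays 1; the surviving 1-coordinates certify the value.
    If f(x) = 0, do the same with 0-coordinates raised to 1.
    Returns (value, certificate coordinates, number of queries).
    """
    y = list(x)
    value = f(tuple(y))
    queries = 1
    for i in range(len(y)):
        if y[i] != value:
            continue  # only coordinates equal to f(x) can be in the certificate
        y[i] = 1 - value
        queries += 1
        if f(tuple(y)) != value:
            y[i] = value  # flipping changed the output, so coordinate i is needed
    certificate = [i for i, bit in enumerate(y) if bit == value]
    return value, certificate, queries


def example_f(x):
    """Monotone example: (x0 AND x1) OR (x2 AND x3 AND x4)."""
    return int((x[0] and x[1]) or (x[2] and x[3] and x[4]))


if __name__ == "__main__":
    print(certify_monotone(example_f, (1, 1, 1, 0, 1)))
    print(certify_monotone(example_f, (0, 1, 1, 0, 1)))
```

Monotonicity is what makes the single pass sufficient: once flipping a coordinate changes the output, it would still do so after later coordinates are flipped, so the final set is a minimal certificate and any input agreeing with x on it has the same value.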
  7. Using the framework of boosting, we prove that all impurity-based decision tree learning algorithms, including the classic ID3, C4.5, and CART, are highly noise tolerant. Our guarantees hold under the strongest noise model of nasty noise, and we provide near-matching upper and lower bounds on the allowable noise rate. We further show that these algorithms, which are simple and have long been central to everyday machine learning, enjoy provable guarantees in the noisy setting that are unmatched by existing algorithms in the theoretical literature on decision tree learning. Taken together, our results add to an ongoing line of research that seeks to place the empirical success of these practical decision tree algorithms on firm theoretical footing. 
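As a reminder of what "impurity-based" means here, the sketch below shows the splitting step shared by such learners: pick the feature whose split most reduces an impurity function. It is a simplified illustration for boolean features, not an implementation of ID3, C4.5, or CART, and it says nothing about the noise-tolerance analysis. Gini impurity and entropy are used as two standard impurity functions; the helper names are hypothetical.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class frequencies, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(X, y, impurity=gini):
    """Return (feature index, impurity reduction) of the best boolean split.

    Returns (None, 0.0) if no feature reduces the impurity.
    """
    n = len(y)
    base = impurity(y)
    best = (None, 0.0)
    for j in range(len(X[0])):
        left = [yi for xi, yi in zip(X, y) if xi[j] == 0]
        right = [yi for xi, yi in zip(X, y) if xi[j] == 1]
        gain = (
            base
            - (len(left) / n) * impurity(left)
            - (len(right) / n) * impurity(right)
        )
        if gain > best[1]:
            best = (j, gain)
    return best


if __name__ == "__main__":
    X = [(0, 0), (0, 1), (1, 0), (1, 1)]
    y = [0, 0, 1, 1]  # the label equals feature 0
    print(best_split(X, y, gini))
    print(best_split(X, y, entropy))
```

A full learner would apply this step recursively to the two resulting subsets until a stopping rule is met.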