-
Understanding generalization and robustness of machine learning models fundamentally relies on assuming an appropriate metric on the data space. Identifying such a metric is particularly challenging for non-Euclidean data such as graphs. Here, we propose a pseudometric for attributed graphs, the Tree Mover’s Distance (TMD), and study its relation to generalization. Via a hierarchical optimal transport problem, TMD reflects the local distribution of node attributes as well as the distribution of local computation trees, which are known to be decisive for the learning behavior of graph neural networks (GNNs). First, we show that TMD captures properties relevant to graph classification: a simple TMD-SVM performs competitively with standard GNNs. Second, we relate TMD to generalization of GNNs under distribution shifts, and show that it correlates well with performance drop under such shifts.
-
Understanding the generalization of deep neural networks is one of the most important tasks in deep learning. Although much progress has been made, theoretical error bounds still often behave disparately from empirical observations. In this work, we develop margin-based generalization bounds, where the margins are normalized with optimal transport costs between independent random subsets sampled from the training distribution. In particular, the optimal transport cost can be interpreted as a generalization of variance which captures the structural properties of the learned feature space. Our bounds robustly predict the generalization error, given training data and network parameters, on large scale datasets. Theoretically, we demonstrate that the concentration and separation of features play crucial roles in generalization, supporting empirical results in the literature. The code is available at https://github.com/chingyaoc/kV-Margin.
-
Abstract The Dark Energy Spectroscopic Instrument (DESI) completed its 5 month Survey Validation in 2021 May. Spectra of stellar and extragalactic targets from Survey Validation constitute the first major data sample from the DESI survey. This paper describes the public release of those spectra, the catalogs of derived properties, and the intermediate data products. In total, the public release includes good-quality spectral information from 466,447 objects targeted as part of the Milky Way Survey, 428,758 as part of the Bright Galaxy Survey, 227,318 as part of the Luminous Red Galaxy sample, 437,664 as part of the Emission Line Galaxy sample, and 76,079 as part of the Quasar sample. In addition, the release includes spectral information from 137,148 objects that expand the scope beyond the primary samples as part of a series of secondary programs. Here, we describe the spectral data, data quality, data products, Large-Scale Structure science catalogs, access to the data, and references that provide relevant background to using these spectra.
-
Abstract The Dark Energy Spectroscopic Instrument (DESI) embarked on an ambitious 5 yr survey in 2021 May to explore the nature of dark energy with spectroscopic measurements of 40 million galaxies and quasars. DESI will determine precise redshifts and employ the baryon acoustic oscillation method to measure distances from the nearby universe to beyond redshift z > 3.5, and employ redshift space distortions to measure the growth of structure and probe potential modifications to general relativity. We describe the significant instrumentation we developed to conduct the DESI survey. This includes: a wide-field, 3.2° diameter prime-focus corrector; a focal plane system with 5020 fiber positioners on the 0.812 m diameter, aspheric focal surface; 10 continuous, high-efficiency fiber cable bundles that connect the focal plane to the spectrographs; and 10 identical spectrographs. Each spectrograph employs a pair of dichroics to split the light into three channels that together record the light from 360–980 nm with a spectral resolution that ranges from 2000–5000. We describe the science requirements, their connection to the technical requirements, the management of the project, and interfaces between subsystems. DESI was installed at the 4 m Mayall Telescope at Kitt Peak National Observatory and has achieved all of its performance goals. Some performance highlights include an rms positioner accuracy of better than 0.1″ and a median signal-to-noise ratio of 7 of the [O II] doublet at 8 × 10⁻¹⁷ erg s⁻¹ cm⁻² in 1000 s for galaxies at z = 1.4–1.6. We conclude with additional highlights from the on-sky validation and commissioning, key successes, and lessons learned.