Descriptors are physically inspired, symmetry-preserving schemes for representing atomistic systems that play a central role in the construction of models of potential energy surfaces. Although physical intuition can be flexibly encoded into descriptor schemes, they are ultimately guided only by the spatial or topological arrangement of atoms in the system. However, since interatomic potential models aim to capture the variation of the potential energy with respect to atomic configurations, it is conceivable that they would benefit from descriptor schemes that implicitly encode both structural and energetic information rather than structural information alone. Therefore, we propose a novel approach for the optimisation of descriptors based on encoding information about geodesic distances along potential energy manifolds into the hyperparameters of commonly used descriptor schemes. To accomplish this, we combine two ideas: (1) a differential-geometric approach for the fast estimation of approximate geodesic distances [Zhu et al., J. Chem. Phys. 150, 164103 (2019)]; and (2) an information-theoretic evaluation metric – information imbalance – for measuring the shared information between two distance measures [Glielmo et al., PNAS Nexus 1, 1 (2022)]. Using three example molecules – ethanol, malonaldehyde, and aspirin – from the MD22 dataset, we first show that Euclidean (in Cartesian coordinates) and geodesic distances are inequivalent distance measures, indicating the need for updated ground-truth distance measures that go beyond the Euclidean (or, more broadly, spatial) distance. We then use a Bayesian optimisation framework to show that descriptors (in this case, atom-centred symmetry functions) can be optimised to maximally express a certain type of distance information, such as Euclidean or geodesic information.
We also show that modifying the Bayesian optimisation algorithm to minimise a combined objective function – the sum of the descriptor↔Euclidean and descriptor↔geodesic information imbalances – can yield descriptors that not only optimally express both Euclidean and geodesic distance information simultaneously, but in fact resolve substantial disagreements between descriptors optimised to encode only one type of distance measure. We discuss the relevance of our approach to the design of more physically rich and informative descriptors that can encode useful, alternative information about molecular systems.
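As an illustration of the information-imbalance idea mentioned in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes the commonly quoted rank-based estimator Δ(A→B) = (2/N) × (mean rank under B of each point's nearest neighbour under A); the function name `information_imbalance` and its matrix-based interface are hypothetical choices made for this example.

```python
import numpy as np


def information_imbalance(dist_a, dist_b):
    """Estimate Delta(A -> B) from two (N, N) pairwise-distance matrices.

    Assumed estimator: (2 / N) * mean rank, under B, of the nearest
    neighbour of each point under A. Values near 0 suggest that A
    predicts B's neighbourhoods; values near 1 suggest it does not.
    """
    a = np.array(dist_a, dtype=float)  # copies, so inputs are not modified
    b = np.array(dist_b, dtype=float)
    n = a.shape[0]
    # Exclude self-distances so a point is never its own neighbour.
    np.fill_diagonal(a, np.inf)
    np.fill_diagonal(b, np.inf)
    # Nearest neighbour of each point according to distance A.
    nn_a = np.argmin(a, axis=1)
    # Row-wise ranks under distance B (rank 1 = nearest neighbour).
    ranks_b = np.argsort(np.argsort(b, axis=1), axis=1) + 1
    return 2.0 / n * ranks_b[np.arange(n), nn_a].mean()


# Small usage example on five points on a line (no tied distances).
x = np.array([0.0, 1.0, 3.0, 7.0, 15.0])
d = np.abs(x[:, None] - x[None, :])
print(information_imbalance(d, d))  # a distance compared with itself
```

Under this estimator a distance compared with itself gives the minimum value 2/N, while two unrelated distances give a value close to 1. A combined objective of the kind described above would add two such terms, one per reference distance.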
Transport inequalities on Euclidean spaces for non-Euclidean metrics

We explore upper bounds on Kantorovich transport distances between probability measures on Euclidean spaces in terms of their Fourier-Stieltjes transforms, with a focus on non-Euclidean metrics. The results are illustrated on empirical measures in the optimal matching problem on the real line.
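For orientation, the optimal matching problem on the real line has a simple closed form in the standard setting, sketched below. This is an illustrative example and not the paper's method: it assumes two empirical measures with the same number of atoms and the cost |x − y|^p with p ≥ 1, for which matching sorted samples is optimal. The function name `kantorovich_1d` is hypothetical, and the sketch does not cover the Fourier-Stieltjes bounds or the non-Euclidean metrics discussed in the abstract.

```python
import numpy as np


def kantorovich_1d(x, y, p=1.0):
    """Kantorovich (Wasserstein) distance W_p between two empirical
    measures on the real line with equally many, equally weighted atoms.

    Assumes p >= 1, where the sorted (monotone) matching is optimal.
    """
    xs = np.sort(np.asarray(x, dtype=float))
    ys = np.sort(np.asarray(y, dtype=float))
    if xs.shape != ys.shape:
        raise ValueError("both samples must have the same number of points")
    # Average the p-th powers of matched gaps, then take the p-th root.
    return float(np.mean(np.abs(xs - ys) ** p) ** (1.0 / p))


# Usage: shifting every atom by 1 moves the measure a distance of 1.
print(kantorovich_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

Sorting makes the cost O(n log n), whereas a general optimal matching in higher dimensions requires solving an assignment problem.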
- Award ID(s): 1855575
- PAR ID: 10222271
- Date Published:
- Journal Name: Journal of Fourier Analysis and Applications
- Volume: 26
- Issue: 4
- ISSN: 1531-5851
- Page Range / eLocation ID: 1-27
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            We study the problem of representing all distances between n points in R^d, with arbitrarily small distortion, using as few bits as possible. We give asymptotically tight bounds for this problem, for Euclidean metrics, for ℓ1 (a.k.a. Manhattan) metrics, and for general metrics. Our bounds for Euclidean metrics mark the first improvement over compression schemes based on discretizing the classical dimensionality reduction theorem of Johnson and Lindenstrauss (Contemp. Math. 1984). Since it is known that no better dimension reduction is possible, our results establish that Euclidean metric compression is possible beyond dimension reduction.
- 
            Meka, Raghu (Ed.) Recent years have seen great progress in the approximability of fundamental clustering and facility location problems on high-dimensional Euclidean spaces, including k-Means and k-Median. While they admit strictly better approximation ratios than their general metric versions, their approximation ratios are still higher than the hardness ratios for general metrics, leaving the possibility that the ultimate optimal approximation ratios will be the same between Euclidean and general metrics. Moreover, such an improved algorithm for Euclidean spaces is not known for Uncapacitated Facility Location (UFL), another fundamental problem in the area. In this paper, we prove that for any γ ≥ 1.6774 there exists ε > 0 such that Euclidean UFL admits a (γ, 1 + 2e^{-γ} - ε)-bifactor approximation algorithm, improving the result of Byrka and Aardal [Byrka and Aardal, 2010]. Together with the (γ, 1 + 2e^{-γ}) NP-hardness in general metrics, it shows the first separation between general and Euclidean metrics for the aforementioned basic problems. We also present an (α_Li - ε)-(unifactor) approximation algorithm for UFL for some ε > 0 in Euclidean spaces, where α_Li ≈ 1.488 is the best-known approximation ratio for UFL by Li [Li, 2013].
- 
            Euclidean geometry is among the earliest forms of mathematical thinking. While the geometric primitives underlying its constructions, such as perfect lines and circles, do not often occur in the natural world, humans rarely struggle to perceive and reason with them. Will computer vision models trained on natural images show the same sensitivity to Euclidean geometry? Here we explore this question by studying few-shot generalization in the universe of Euclidean geometry constructions. We introduce Geoclidean, a domain-specific language for Euclidean geometry, and use it to generate two datasets of geometric concept learning tasks for benchmarking generalization judgements of humans and machines. We find that humans are indeed sensitive to Euclidean geometry and generalize strongly from a few visual examples of a geometric concept. In contrast, low-level and high-level visual features from standard computer vision models pretrained on natural images do not support correct generalization. Thus Geoclidean represents a novel few-shot generalization benchmark for geometric concept learning, where the performance of humans and of AI models diverges. The Geoclidean framework and dataset are publicly available for download.
- 
            Lightness and sparsity are two natural parameters for Euclidean (1+ε)-spanners. Classical results show that, when the dimension d ∈ ℕ and ε > 0 are constant, every set S of n points in d-space admits a (1+ε)-spanner with O(n) edges and weight proportional to that of the Euclidean MST of S. Tight bounds on the dependence on ε > 0 for constant d ∈ ℕ have been established only recently. Le and Solomon (FOCS 2019) showed that Steiner points can substantially improve the lightness and sparsity of a (1+ε)-spanner. They gave upper bounds of Õ(ε^{-(d+1)/2}) for the minimum lightness in dimensions d ≥ 3, and Õ(ε^{-(d-1)/2}) for the minimum sparsity in d-space for all d ≥ 1. They obtained lower bounds only in the plane (d = 2). Le and Solomon (ESA 2020) also constructed Steiner (1+ε)-spanners of lightness O(ε^{-1}logΔ) in the plane, where Δ ∈ Ω(log n) is the spread of S, defined as the ratio between the maximum and minimum distance between a pair of points. In this work, we improve several bounds on the lightness and sparsity of Euclidean Steiner (1+ε)-spanners. Using a new geometric analysis, we establish lower bounds of Ω(ε^{-d/2}) for the lightness and Ω(ε^{-(d-1)/2}) for the sparsity of such spanners in Euclidean d-space for all d ≥ 2. We use the geometric insight from our lower bound analysis to construct Steiner (1+ε)-spanners of lightness O(ε^{-1}log n) for n points in the Euclidean plane.
 An official website of the United States government