Title: Data efficiency of classification strategies for chemical and materials design
We benchmark the performance of space-filling and active learning algorithms on classification problems in materials science, revealing trends in optimally data-efficient algorithms.
Award ID(s):
2118861 2237470
PAR ID:
10647294
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Royal Society of Chemistry
Date Published:
Journal Name:
Digital Discovery
Volume:
4
Issue:
1
ISSN:
2635-098X
Page Range / eLocation ID:
135 to 148
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Online Allocation of Reusable Resources: New Algorithms and Analytical Tools In the paper “Asymptotically Optimal Competitive Ratio for Online Allocation of Reusable Resources,” the authors develop novel algorithms and analysis techniques for online allocation of reusable resources. Their approach leads to an algorithm with the highest possible competitive ratio, a result that was previously out of reach with the algorithms and techniques that are used in classic settings in which resources are nonreusable. More generally, their LP-free analysis approach is useful for analyzing the performance of online algorithms for various other settings in which the standard primal-dual approach fails. 
  2. We present dynamic algorithms with polylogarithmic update time for estimating the size of the maximum matching of a graph undergoing edge insertions and deletions with approximation ratio strictly better than 2. Specifically, we obtain a \(1+\frac{1}{\sqrt{2}}+\epsilon \approx 1.707+\epsilon\) approximation in bipartite graphs and a \(1.973+\epsilon\) approximation in general graphs. We thus answer in the affirmative the value version of the major open question repeatedly asked in the dynamic graph algorithms literature. Our randomized algorithms’ approximation and worst-case update time bounds both hold w.h.p. against adaptive adversaries. Our algorithms are based on simulating new two-pass streaming matching algorithms in the dynamic setting. Our key new idea is to invoke the recent sublinear-time matching algorithm of Behnezhad (FOCS’21) in a white-box manner to efficiently simulate the second pass of our streaming algorithms, while bypassing the well-known vertex-update barrier.
  3. Stencil computations are widely used to simulate the change of state of physical systems across a multidimensional grid over multiple timesteps. The state-of-the-art techniques in this area fall into three groups: cache-aware tiled looping algorithms, cache-oblivious divide-and-conquer trapezoidal algorithms, and Krylov subspace methods. In this article, we present two efficient parallel algorithms for performing linear stencil computations. Current direct solvers in this domain are computationally inefficient, and Krylov methods require manual labor and mathematical training. We solve these problems for linear stencils by using discrete Fourier transforms preconditioning on a Krylov method to achieve a direct solver that is both fast and general. Indeed, while all currently available algorithms for solving general linear stencils perform Θ(NT) work, where N is the size of the spatial grid and T is the number of timesteps, our algorithms perform o(NT) work. To the best of our knowledge, we give the first algorithms that use fast Fourier transforms to compute final grid data by evolving the initial data for many timesteps at once. Our algorithms handle both periodic and aperiodic boundary conditions and achieve polynomially better performance bounds (i.e., computational complexity and parallel runtime) than all other existing solutions. Initial experimental results show that implementations of our algorithms that evolve grids of roughly \(10^7\) cells for around \(10^5\) timesteps run orders of magnitude faster than state-of-the-art implementations for periodic stencil problems, and 1.3× to 8.5× faster for aperiodic stencil problems. Code Repository: https://github.com/TEAlab/FFTStencils
  4. Expert-based ensemble learning algorithms often serve as online learning algorithms for an unknown, possibly time-varying, probability distribution. Their simplicity allows flexibility in design choices, leading to variations that balance adaptiveness and consistency. This article provides an analytical framework to quantify the adaptiveness and consistency of expert-based ensemble learning algorithms. With properly selected states, the algorithms are modeled as Markov chains. Quantitative metrics of adaptiveness and consistency can then be calculated through mathematical formulas rather than by relying on numerical simulations. Results are derived for several popular ensemble learning algorithms. Success of the method has also been demonstrated in both simulation and experimental results.
  5. In this article, we advance divide-and-conquer strategies for solving the community detection problem in networks. We propose two algorithms that perform clustering on several small subgraphs and finally patch the results into a single clustering. The main advantage of these algorithms is that they significantly bring down the computational cost of traditional algorithms, including spectral clustering, semidefinite programs, modularity-based methods, likelihood-based methods, etc., without losing accuracy, and even improving accuracy at times. These algorithms are also, by nature, parallelizable. Since most traditional algorithms are accurate, and the corresponding optimization problems are much simpler in small problems, our divide-and-conquer methods provide an omnibus recipe for scaling traditional algorithms up to large networks. We prove the consistency of these algorithms under various subgraph selection procedures and perform extensive simulations and real-data analysis to understand the advantages of the divide-and-conquer approach in various settings. 