Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design

Nguyen, Quan; Dieng, Adji_Bousso

Citation Details

Experimental design techniques such as active search and Bayesian optimization are widely used in the natural sciences for data collection and discovery. However, existing techniques tend to favor exploitation over exploration of the search space, which causes them to get stuck in local optima. This collapse problem prevents experimental design algorithms from yielding diverse high-quality data. In this paper, we extend the Vendi scores—a family of interpretable similarity-based diversity metrics—to account for quality. We then leverage these quality-weighted Vendi scores to tackle experimental design problems across various applications, including drug discovery, materials discovery, and reinforcement learning. We found that quality-weighted Vendi scores allow us to construct policies for experimental design that flexibly balance quality and diversity, and ultimately assemble rich and diverse sets of high-performing data points. Our algorithms led to a 70%–170% increase in the number of effective discoveries compared to baselines.

Award ID(s): 2118201

PAR ID: 10535334

Author(s) / Creator(s): Nguyen, Quan; Dieng, Adji_Bousso

Publisher / Repository: MLR

Date Published: 2024-07-27

ISSN: 2640-3498

Format(s): Medium: X

Location: International Conference on Machine Learning

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
The DOI is not currently available.
