Surrogate- and invariance-boosted contrastive learning for data-scarce applications in science

Loh, Charlotte; Christensen, Thomas; Dangovski, Rumen; Kim, Samuel; Soljačić, Marin

doi:10.1038/s41467-022-31915-y

Citation Details

Abstract: Deep learning techniques have been increasingly applied to the natural sciences, e.g., for property prediction and optimization or material discovery. A fundamental ingredient of such approaches is the vast quantity of labeled data needed to train the model. This poses severe challenges in data-scarce settings where obtaining labels requires substantial computational or labor resources. Noting that problems in natural sciences often benefit from easily obtainable auxiliary information sources, we introduce surrogate- and invariance-boosted contrastive learning (SIB-CL), a deep learning framework which incorporates three inexpensive and easily obtainable auxiliary information sources to overcome data scarcity. Specifically, these are: abundant unlabeled data, prior knowledge of symmetries or invariances, and surrogate data obtained at near-zero cost. We demonstrate SIB-CL’s effectiveness and generality on various scientific problems, e.g., predicting the density-of-states of 2D photonic crystals and solving the 3D time-independent Schrödinger equation. SIB-CL consistently results in orders of magnitude reduction in the number of labels needed to achieve the same network accuracies.

Award ID(s): 2019786

PAR ID: 10376790

Author(s) / Creator(s): Loh, Charlotte; Christensen, Thomas; Dangovski, Rumen; Kim, Samuel; Soljačić, Marin

Date Published: 2022-12-01

Journal Name: Nature Communications

Volume: 13

Issue: 1

ISSN: 2041-1723

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article: https://doi.org/10.1038/s41467-022-31915-y