Out-of-Distribution Detection through Soft Clustering with Non-Negative Kernel Regression

Gulati, Aryan; Dong, Xingjian; Hurtado, Carlos; Shekkizhar, Sarath; Swayamdipta, Swabha; Ortega, Antonio

This content will become publicly available on November 6, 2025

As language models become more general purpose, increased attention needs to be paid to detecting out-of-distribution (OOD) instances, i.e., those not belonging to any of the distributions seen during training. Existing methods for detecting OOD data are computationally complex and storage-intensive. We propose a novel soft clustering approach for OOD detection based on non-negative kernel regression. Our approach greatly reduces computational and space complexities (up to 11× improvement in inference time and 87% reduction in storage requirements). It outperforms existing approaches by up to 4 AUROC points on four benchmarks. We also introduce an entropy-constrained version of our algorithm, leading to further reductions in storage requirements (up to 97% lower than comparable approaches) while retaining competitive performance. Our soft clustering approach for OOD detection highlights its potential for detecting tail-end phenomena in extreme-scale data settings. Our source code is available on GitHub.

Award ID(s): 2009032

PAR ID: 10555789

Author(s) / Creator(s): Gulati, Aryan; Dong, Xingjian; Hurtado, Carlos; Shekkizhar, Sarath; Swayamdipta, Swabha; Ortega, Antonio

Publisher / Repository: ACL

Date Published: 2024-11-06

Format(s): Medium: X

Location: https://aclanthology.org/2024.findings-emnlp.758/

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
The DOI is not currently available.
