

Title: On The Effectiveness of Active Learning by Uncertainty Sampling in Classification of High-Dimensional Gaussian Mixture Data
Active learning aims to reduce the cost of labeling through selective sampling. Despite reported empirical success over passive learning, many popular active learning heuristics such as uncertainty sampling still lack satisfying theoretical guarantees. Towards closing the gap between practical use and theoretical understanding in active learning, we propose to characterize the exact behavior of uncertainty sampling for high-dimensional Gaussian mixture data, in a modern regime of big data where the numbers of samples and features are commensurately large. Through a sharp characterization of the learning results, our analysis sheds light on the important question of when uncertainty sampling works better than passive learning. Our results show that the effectiveness of uncertainty sampling is not always ensured. In fact, it depends crucially on the choice of i) an adequate initial classifier used to start the active sampling process and ii) a proper loss function that allows an adaptive treatment of samples queried at various steps.
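The uncertainty-sampling loop studied in the abstract can be pictured with a minimal sketch: train an initial classifier on a small seed set, then repeatedly query the unlabeled point with the smallest decision margin and retrain. The sketch below uses scikit-learn logistic regression on synthetic two-component Gaussian mixture data; the dimension, pool size, seed size, query budget, and choice of logistic loss are illustrative assumptions, not the setup analyzed in the paper.

```python
# Minimal uncertainty-sampling sketch on two-component Gaussian mixture data.
# All sizes and the logistic loss are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, n_pool, n_seed, budget = 50, 5000, 20, 200

# Two-component Gaussian mixture: labels +/-1, class means +/-mu, identity covariance.
mu = rng.normal(size=d) / np.sqrt(d)
y = rng.choice([-1, 1], size=n_pool)
X = y[:, None] * mu[None, :] + rng.normal(size=(n_pool, d))

seed = rng.choice(n_pool, size=n_seed, replace=False)
labeled = list(seed)
labeled_set = set(labeled)
unlabeled = [i for i in range(n_pool) if i not in labeled_set]

clf = LogisticRegression(max_iter=1000)
clf.fit(X[labeled], y[labeled])              # initial classifier from the seed set

for _ in range(budget):
    # Query the pool point the current classifier is least certain about
    # (smallest absolute decision margin), then retrain on the enlarged labeled set.
    margins = np.abs(clf.decision_function(X[unlabeled]))
    picked = unlabeled.pop(int(np.argmin(margins)))
    labeled.append(picked)
    clf.fit(X[labeled], y[labeled])

print("accuracy on the full pool:", clf.score(X, y))
```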
Award ID(s):
1813877 1846369
PAR ID:
10483554
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
ISBN:
978-1-6654-0540-9
Page Range / eLocation ID:
4238 to 4242
Format(s):
Medium: X
Location:
Singapore, Singapore
Sponsoring Org:
National Science Foundation
More Like This
  1. We show how to obtain improved active learning methods in the agnostic (adversarial noise) setting by combining marginal leverage score sampling with non-independent sampling strategies that promote spatial coverage. In particular, we propose an easily implemented method based on the pivotal sampling algorithm, which we test on problems motivated by learning-based methods for parametric PDEs and uncertainty quantification. In comparison to independent sampling, our method reduces the number of samples needed to reach a given target accuracy by up to 50%. We support our findings with two theoretical results. First, we show that any non-independent leverage score sampling method that obeys a weak one-sided ℓ∞ independence condition (which includes pivotal sampling) can actively learn d-dimensional linear functions with O(d log d) samples, matching independent sampling. This result extends recent work on matrix Chernoff bounds under ℓ∞ independence, and may be of interest for analyzing other sampling strategies beyond pivotal sampling. Second, we show that, for the important case of polynomial regression, our pivotal method obtains an improved bound of O(d) samples.
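A minimal sketch of the combination this abstract describes: compute marginal leverage scores of the design matrix, convert them into inclusion probabilities that sum to the label budget, then resolve those probabilities into a concrete sample with the pivotal method. The sketch below pairs units uniformly at random; the local, spatially aware pairing that drives the coverage gains and the downstream reweighted regression are not reproduced, and all sizes are illustrative assumptions.

```python
import numpy as np

def leverage_scores(A):
    """Statistical leverage scores: squared row norms of an orthonormal basis of col(A)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def pivotal_sample(pi, rng):
    """Deville-Tille pivotal method: resolve inclusion probabilities pi into a 0/1 draw.
    Pairs are chosen uniformly at random here; a local pivotal method would pair
    nearest neighbours instead to promote spatial coverage."""
    pi = np.asarray(pi, dtype=float).copy()
    active = [i for i in range(len(pi)) if 0.0 < pi[i] < 1.0]
    while len(active) > 1:
        i, j = rng.choice(active, size=2, replace=False)
        s = pi[i] + pi[j]
        if s < 1.0:
            # One unit collects the combined mass, the other is rejected.
            if rng.random() < pi[i] / s:
                pi[i], pi[j] = s, 0.0
            else:
                pi[i], pi[j] = 0.0, s
        else:
            # One unit is accepted, the other keeps the leftover probability.
            if rng.random() < (1.0 - pi[j]) / (2.0 - s):
                pi[i], pi[j] = 1.0, s - 1.0
            else:
                pi[i], pi[j] = s - 1.0, 1.0
        active = [k for k in active if 0.0 < pi[k] < 1.0]
    if active:  # at most one unresolved unit remains: resolve it with a Bernoulli draw
        pi[active[0]] = 1.0 if rng.random() < pi[active[0]] else 0.0
    return np.flatnonzero(pi == 1.0)

rng = np.random.default_rng(0)
A = rng.normal(size=(2000, 20))    # design matrix (illustrative)
budget = 100                       # target number of labels
lev = leverage_scores(A)
pi = np.minimum(1.0, budget * lev / lev.sum())
rows = pivotal_sample(pi, rng)
print(len(rows), "rows selected for labeling")
```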
  2. Within machine learning, active learning studies the gains in performance made possible by adaptively selecting data points to label. In this work, we show through upper and lower bounds that, for the simple benign setting of well-specified logistic regression on a uniform distribution over a sphere, the expected excess errors of both active learning and random sampling have the same inversely proportional dependence on the number of samples. Importantly, because a lower bound for this benign setting also applies to any setting that contains it, no more general setting allows a better dependence on the number of samples. Additionally, we show that a variant of uncertainty sampling can achieve a faster rate of convergence than random sampling by a factor of the Bayes error, matching a recent empirical observation made in other work. Qualitatively, this work is pessimistic with respect to the asymptotic dependence on the number of samples, but optimistic with respect to finding performance gains in the constants.
  3. We consider the problem of active learning for single neuron models, also sometimes called "ridge functions", in the agnostic setting (under adversarial label noise). Such models have been shown to be broadly effective in modeling physical phenomena, and for constructing surrogate data-driven models for partial differential equations. Surprisingly, we show that for a single neuron model with any Lipschitz non-linearity (such as the ReLU, sigmoid, absolute value, or low-degree polynomials, among others), strong provable approximation guarantees can be obtained using a well-known active learning strategy for fitting linear functions in the agnostic setting. Namely, we can collect samples via statistical leverage score sampling, which has been shown to be near-optimal in other active learning scenarios. We support our theoretical results with empirical simulations showing that our proposed active learning strategy based on leverage score sampling outperforms (ordinary) uniform sampling when fitting single neuron models.
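A minimal sketch of the strategy this abstract describes, under illustrative assumptions (a ReLU nonlinearity, synthetic data, scipy's L-BFGS for the fit): query labels for rows drawn in proportion to their leverage scores, then fit the single neuron by inverse-probability-weighted least squares on the queried rows.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d, k = 5000, 30, 300      # pool size, dimension, label budget (all illustrative)

# Synthetic data: a ReLU single neuron plus label noise (a stand-in for agnostic noise).
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d) / np.sqrt(d)
y = np.maximum(X @ w_true, 0.0) + 0.1 * rng.normal(size=n)

# Leverage scores of X define the query distribution over rows.
Q, _ = np.linalg.qr(X)
lev = np.sum(Q**2, axis=1)
p = lev / lev.sum()
idx = rng.choice(n, size=k, replace=True, p=p)     # rows whose labels we "query"
w_imp = 1.0 / (k * p[idx])                         # inverse-probability weights

def loss_and_grad(w):
    """Weighted squared loss of the ReLU neuron on the queried rows, with a subgradient."""
    z = X[idx] @ w
    resid = np.maximum(z, 0.0) - y[idx]
    f = np.sum(w_imp * resid**2)
    g = 2.0 * X[idx].T @ (w_imp * resid * (z > 0))
    return f, g

# Warm start from the weighted linear least-squares fit, then refine with L-BFGS.
w0 = np.linalg.lstsq(np.sqrt(w_imp)[:, None] * X[idx],
                     np.sqrt(w_imp) * y[idx], rcond=None)[0]
w_hat = minimize(loss_and_grad, w0, jac=True, method="L-BFGS-B").x

print("population squared error:", np.mean((np.maximum(X @ w_hat, 0.0) - y)**2))
```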
  4. We study active learning methods for single index models of the form $$F({\bm x}) = f(\langle {\bm w}, {\bm x}\rangle)$$, where $$f:\mathbb{R} \to \mathbb{R}$$ and $${\bm x}, {\bm w} \in \mathbb{R}^d$$. In addition to their theoretical interest as simple examples of non-linear neural networks, single index models have received significant recent attention due to applications in scientific machine learning, such as surrogate modeling for partial differential equations (PDEs). These applications require sample-efficient active learning methods that are robust to adversarial noise, i.e., that work even in the challenging agnostic learning setting. We provide two main results on agnostic active learning of single index models. First, when $$f$$ is known and Lipschitz, we show that $$\tilde{O}(d)$$ samples collected via statistical leverage score sampling are sufficient to learn a near-optimal single index model. Leverage score sampling is simple to implement, efficient, and already widely used for actively learning linear models. Our result requires no assumptions on the data distribution, is optimal up to log factors, and improves quadratically on a recent $${O}(d^{2})$$ bound of \cite{gajjar2023active}. Second, we show that $$\tilde{O}(d)$$ samples suffice even in the more difficult setting when $$f$$ is \emph{unknown}. Our results leverage tools from high-dimensional probability, including Dudley's inequality and dual Sudakov minoration, as well as a novel, distribution-aware discretization of the class of Lipschitz functions.
  5. Active learning is a label-efficient approach to training highly effective models while interactively selecting only small subsets of unlabelled data for labelling and training. In "open world" settings, the classes of interest can make up a small fraction of the overall dataset -- most of the data may be viewed as an out-of-distribution or irrelevant class. This leads to extreme class imbalance, and our theory and methods focus on this core issue. We propose a new strategy for active learning called GALAXY (Graph-based Active Learning At the eXtrEme), which blends ideas from graph-based active learning and deep learning. GALAXY automatically and adaptively selects more class-balanced examples for labeling than most other methods for active learning. Our theory shows that GALAXY performs a refined form of uncertainty sampling that gathers a much more class-balanced dataset than vanilla uncertainty sampling. Experimentally, we demonstrate GALAXY's superiority over existing state-of-the-art deep active learning algorithms in class-imbalanced vision classification settings generated from popular datasets.
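GALAXY's published algorithm builds one linear graph per class from model margins and runs a graph-based bisection procedure over those graphs; the simplified, binary-class sketch below is meant only to convey the core idea of ordering the pool along an uncertainty-sorted linear graph and bisecting toward label transitions, which concentrates queries near the boundary and tends to keep them class-balanced. The function names, seeding, and stopping rule are assumptions for illustration, not the authors' implementation.

```python
# Simplified, binary-class illustration of the graph-based bisection idea behind GALAXY.
import numpy as np

def bisection_queries(scores, oracle, budget):
    """Order the pool along a linear 'graph' by model score, then repeatedly query the
    midpoint of the widest unlabeled run between two oppositely-labeled nodes.
    Queries concentrate near label transitions, keeping them relatively class-balanced."""
    order = np.argsort(scores)              # linear graph: pool indices sorted by score
    labels = {}                              # position along the graph -> queried label
    # Seed with the two endpoints (most confident negative / positive predictions).
    for pos in (0, len(order) - 1):
        labels[pos] = oracle(order[pos])
    queried = 2
    while queried < budget:
        known = sorted(labels)
        # Prefer the widest gap whose endpoints disagree; bisect it.
        gaps = [(b - a, a, b) for a, b in zip(known, known[1:])
                if b - a > 1 and labels[a] != labels[b]]
        if not gaps:   # no disagreeing gap left: fall back to the widest gap overall
            gaps = [(b - a, a, b) for a, b in zip(known, known[1:]) if b - a > 1]
            if not gaps:
                break
        _, a, b = max(gaps)
        mid = (a + b) // 2
        labels[mid] = oracle(order[mid])
        queried += 1
    return [order[p] for p in labels]        # pool indices selected for labeling

# Example: an imbalanced pool where only ~5% of points are positive.
rng = np.random.default_rng(0)
x = rng.normal(size=3000)
true_y = (x > 1.65).astype(int)              # ~5% positives
scores = x + 0.3 * rng.normal(size=x.size)   # noisy model scores
picked = bisection_queries(scores, oracle=lambda i: true_y[i], budget=40)
print("fraction of positives among queried points:", np.mean(true_y[picked]))
```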