Systematic Testing of the Data-Poisoning Robustness of KNN

Li, Yannan; Wang, Jingbo; Wang, Chao

doi:10.1145/3597926.3598129

This content will become publicly available on July 12, 2024

Data poisoning aims to compromise a machine-learning-based software component by contaminating its training set so as to change its prediction results for test inputs. Existing methods for deciding data-poisoning robustness are either inaccurate or slow. More importantly, they can only certify some of the truly robust cases and remain inconclusive when certification fails; in other words, they cannot falsify the truly non-robust cases. To overcome this limitation, we propose a systematic-testing-based method that can falsify as well as certify data-poisoning robustness for a widely used supervised-learning technique, k-nearest neighbors (KNN). Our method is faster and more accurate than the baseline enumeration method because it combines a novel over-approximate analysis in the abstract domain, which quickly narrows down the search space, with systematic testing in the concrete domain, which finds the actual violations. We have evaluated our method on a set of supervised-learning datasets. Our results show that the method significantly outperforms state-of-the-art techniques and can decide data-poisoning robustness of KNN prediction results for most of the test inputs.
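
To make the notion of a baseline enumeration check concrete, the following is a minimal sketch and not the authors' implementation. It assumes a poisoning model in which up to n training elements may have been maliciously inserted, so a prediction counts as robust if it is unchanged after removing any subset of at most n elements. It also assumes a fixed k, squared Euclidean distance, and n smaller than the training-set size. The names knn_predict and check_robustness are hypothetical.

```python
from collections import Counter
from itertools import combinations


def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    # Squared Euclidean distance; sorted() is stable, so distance ties
    # keep their original training-set order.
    nearest = sorted(
        train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )[:k]
    votes = Counter(label for _, label in nearest)
    top = max(votes.values())
    # Break vote ties deterministically by choosing the smallest label.
    return min(label for label, count in votes.items() if count == top)


def check_robustness(train, x, k, n):
    """Enumerate every removal of 1..n training points (assumed threat model).

    Returns (True, None) if the prediction for x never changes, otherwise
    (False, removed_indices) as a concrete counterexample.
    """
    baseline = knn_predict(train, x, k)
    for r in range(1, n + 1):
        for removed in combinations(range(len(train)), r):
            drop = set(removed)
            subset = [p for i, p in enumerate(train) if i not in drop]
            if knn_predict(subset, x, k) != baseline:
                return False, removed  # falsified: a violating subset exists
    return True, None  # certified: no subset of size <= n flips the label


# Toy one-dimensional data set: three "a" points near the input, two "b" far away.
train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"),
         ((10.0,), "b"), ((11.0,), "b")]
print(check_robustness(train, (0.5,), k=3, n=1))
print(check_robustness(train, (0.5,), k=3, n=2))
```

The number of candidate subsets grows combinatorially with n and the training-set size, which is why exhaustive enumeration of this kind is only practical on very small inputs.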

Award ID(s): 2220345

NSF-PAR ID: 10467190

Author(s) / Creator(s): Li, Yannan; Wang, Jingbo; Wang, Chao

Publisher / Repository: ACM

Date Published: 2023-07-12

Page Range / eLocation ID: 1207 to 1218

Format(s): Medium: X

Location: Seattle WA USA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
https://doi.org/10.1145/3597926.3598129
