Multi-probe random projection clustering to secure very large distributed datasets

Carraher, Lee A.; Wilsey, Philip A.; Moitra, Anindya; Dey, Sayantan

doi:10.1109/BigData.2015.7363964

This paper presents a solution to the approximate k-means clustering problem for very large distributed datasets. Distributed data models have gained popularity in recent years following the efforts of commercial, academic, and government organizations to make data more widely accessible. Due to the sheer volume of available data, in-memory single-core computation quickly becomes infeasible, requiring distributed multi-processing. Our solution achieves clustering performance comparable to that of other popular clustering algorithms, with improved overall complexity growth, while being amenable to distributed processing frameworks such as Map-Reduce. Our solution also maintains certain guarantees regarding data privacy deanonymization.

Award ID(s): 1440420

PAR ID: 10193710

Author(s) / Creator(s): Carraher, Lee A.; Wilsey, Philip A.; Moitra, Anindya; Dey, Sayantan

Date Published: 2015-10-01

Journal Name: 2015 IEEE International Conference on Big Data

Page Range / eLocation ID: 1891 to 1900

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper: https://doi.org/10.1109/BigData.2015.7363964
