  Abstract

    Cancer cell lines serve as model in vitro systems for investigating therapeutic interventions. Recent advances in high‐throughput genomic profiling have enabled the systematic comparison between cell lines and patient tumor samples. The highly interconnected nature of biological data, however, presents a challenge when mapping patient tumors to cell lines. Standard clustering methods can be particularly susceptible to the high level of noise present in these datasets and only output clusters at one unknown scale of the data. In light of these challenges, we present NetCellMatch, a robust framework for network‐based matching of cell lines to patient tumors. NetCellMatch first constructs a global network across all cell line‐patient samples using their genomic similarity. Then, a multi‐scale community detection algorithm integrates information across topologically meaningful (clustering) scales to obtain Network‐Based Matching Scores (NBMS). NBMS are measures of cluster robustness which map patient tumors to cell lines. We use NBMS to determine representative “avatar” cell lines for subgroups of patients. We apply NetCellMatch to reverse‐phase protein array data obtained from The Cancer Genome Atlas for patients and the MD Anderson Cell Line Project for cell lines. Along with avatar cell line identification, we evaluate connectivity patterns for breast, lung, and colon cancer and explore the proteomic profiles of avatars and their corresponding top matching patients. Our results demonstrate our framework's ability to identify both patient‐cell line matches and potential proteomic drivers of similarity. Our methods are general and can be easily adapted to other 'omic datasets.

