- Award ID(s):
- 1936968
- NSF-PAR ID:
- 10287982
- Date Published:
- Journal Name:
- IEEE Symposium on Security and Privacy
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Today’s mobile apps employ third-party advertising and tracking (A&T) libraries, which may pose a threat to privacy. State-of-the-art detects and blocks outgoing A&T HTTP/S requests by using manually curated filter lists (e.g. EasyList), and recently, using machine learning approaches. The major bottleneck of both filter lists and classifiers is that they rely on experts and the community to inspect traffic and manually create filter list rules that can then be used to block traffic or label ground truth datasets. We propose NoMoATS – a system that removes this bottleneck by reducing the daunting task of manually creating filter rules, to the much easier and scalable task of labeling A&T libraries. Our system leverages stack trace analysis to automatically label which network requests are generated by A&T libraries. Using NoMoATS, we collect and label a new mobile traffic dataset. We use this dataset to train decision tree classifiers, which can be applied in real-time on the mobile device and achieve an average F-score of 93%. We show that both our automatic labeling and our classifiers discover thousands of requests destined to hundreds of different hosts, previously undetected by popular filter lists. To the best of our knowledge, our system is the first to (1) automatically label which mobile network requests are engaged in A&T, while requiring to only manually label libraries to their purpose and (2) apply on-device machine learning classifiers that operate at the granularity of URLs, can inspect connections across all apps, and detect not only ads, but also tracking.more » « less
-
null (Ed.)Existing Graph Neural Network (GNN) methods that learn inductive unsupervised graph representations focus on learning node and edge representations by predicting observed edges in the graph. Although such approaches have shown advances in downstream node classification tasks, they are ineffective in jointly representing larger k-node sets, k > 2. We propose MHM-GNN, an inductive unsupervised graph representation approach that combines joint k-node representations with energy-based models (hypergraph Markov networks) and GNNs. To address the intractability of the loss that arises from this combination, we endow our optimization with a loss upper bound using a finite-sample unbiased Markov Chain Monte Carlo estimator. Our experiments show that the unsupervised MHM-GNN representations of MHM-GNN produce better unsupervised representations than existing approaches from the literature.more » « less
-
Abstract Recent work has demonstrated that geometric deep learning methods such as graph neural networks (GNNs) are well suited to address a variety of reconstruction problems in high-energy particle physics. In particular, particle tracking data are naturally represented as a graph by identifying silicon tracker hits as nodes and particle trajectories as edges, given a set of hypothesized edges, edge-classifying GNNs identify those corresponding to real particle trajectories. In this work, we adapt the physics-motivated interaction network (IN) GNN toward the problem of particle tracking in pileup conditions similar to those expected at the high-luminosity Large Hadron Collider. Assuming idealized hit filtering at various particle momenta thresholds, we demonstrate the IN’s excellent edge-classification accuracy and tracking efficiency through a suite of measurements at each stage of GNN-based tracking: graph construction, edge classification, and track building. The proposed IN architecture is substantially smaller than previously studied GNN tracking architectures; this is particularly promising as a reduction in size is critical for enabling GNN-based tracking in constrained computing environments. Furthermore, the IN may be represented as either a set of explicit matrix operations or a message passing GNN. Efforts are underway to accelerate each representation via heterogeneous computing resources towards both high-level and low-latency triggering applications.
-
Influence operations are large-scale efforts to manipulate public opinion. The rapid detection and disruption of these operations is critical for healthy public discourse. Emergent AI technologies may enable novel operations that evade detection and influence public discourse on social media with greater scale, reach, and specificity. New methods of detection with inductive learning capacity will be needed to identify novel operations before they indelibly alter public opinion and events. To this end, we develop an inductive learning framework that: (1) determines content- and graph-based indicators that are not specific to any operation; (2) uses graph learning to encode abstract signatures of coordinated manipulation; and (3) evaluates generalization capacity by training and testing models across operations originating from Russia, China, and Iran. We find that this framework enables strong cross-operation generalization while also revealing salient indicators-illustrating a generic approach which directly complements transductive methodologies, thereby enhancing detection coverage.more » « less
-
Despite the prevalence of hypergraphs in a variety of high-impact applications, there are relatively few works on hypergraph representation learning, most of which primarily focus on hyperlink prediction, and are often restricted to the transductive learning setting. Among others, a major hurdle for effective hypergraph representation learning lies in the label scarcity of nodes and/or hyperedges. To address this issue, this paper presents an end-to-end, bi-level pre-training strategy with Graph Neural Networks for hypergraphs. The proposed framework named HyperGRL bears three distinctive advantages. First, it is mainly designed in the self-supervised fashion which has broad applicability, and meanwhile it is also capable of ingesting the labeling information when available. Second, at the heart of the proposed HyperGRL are two carefully designed pretexts, one on the node level and the other on the hyperedge level, which enable us to encode both the local and the global context in a mutually complementary way. Third, the proposed framework can work in both transductive and inductive settings. When applying the two proposed pretexts in tandem, it can accelerate the adaptation of the knowledge from the pre-trained model to downstream applications in the transductive setting, thanks to the bi-level nature of the proposed method. Extensive experiments demonstrate that: (1) HyperGRL achieves up to 5.69% improvements in hyperedge classification, and (2) improves pre-training efficiency by up to 42.80% on averagemore » « less