NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Harnessing Public Code Repositories to Develop Production-Ready ML Artifacts for Networking

https://doi.org/10.1145/3673422.3674898

Khan, Punnal Ismail; Guthula, Satyandra; Beltiukov, Roman; Schmid, Roland; Bühler, Tobias; Gupta, Arpit; Vanbever, Laurent; Willinger, Walter (July 2024, ACM)

Full Text Available
Leveraging Prefix Structure to Detect Volumetric DDoS Attack Signatures with Programmable Switches

https://doi.org/10.1109/SP54263.2024.00267

Misa, Chris; Durairajan, Ramakrishnan; Gupta, Arpit; Rejaie, Reza; Willinger, Walter (May 2024, IEEE)

As increasingly complex and dynamic volumetric DDoS attacks continue to wreak havoc on edge networks, two recent developments promise to bolster DDoS defense at the edge. First, programmable switches have emerged as promising means for achieving scalable and cost-effective attack signature detection. However, their practical application in edge networks remains a challenging open problem. Second, machine learning (ML)-based solutions have demonstrated potential in accurately detecting attack signatures based on per-flow traffic features. Yet, their inability to effectively scale to the traffic volumes and number of flows in actual production edge networks has largely excluded them from practical considerations.In this paper, we introduce ZAPDOS, a novel approach to accurately, quickly, and scalably detect volumetric DDoS attack signatures at the source prefix level. ZAPDOS is the first to utilize a key characteristic of the observed structure of measured attack and benign source prefixes (i.e., a pronounced cluster-within-cluster property) and effectively apply it in practice against modern attacks. ZAPDOS operates by monitoring aggregate prefix-level features in switch hardware, employing a learning model to identify prefixes suspected of containing attack sources, and using several innovative algorithmic methods to pinpoint attack sources efficiently. We have built a hardware prototype of ZAPDOS and a packet-level software simulator which achieves comparable accuracy results. Since existing datasets are inadequate for training and evaluating prefix-level models, we have developed a new data-fusion methodology for training and evaluating ZAPDOS. We use our prototype and simulator to show that ZAPDOS can detect volumetric DDoS attack signatures with orders of magnitude lower error rates than state-of-the-art under comparable monitoring resource budgets and for a range of different attack scenarios.
more » « less
Full Text Available
Query Planning for Robust and Scalable Hybrid Network Telemetry Systems

https://doi.org/10.1145/3649471

Shou, Chaofan; Bhatia, Rohan; Gupta, Arpit; Harrison, Rob; Lokshtanov, Daniel; Willinger, Walter (March 2024, Proceedings of the ACM on Networking)

Network telemetry systems have become hybrid combinations of state-of-the-art stream processors and modern programmable data-plane devices. However, the existing designs of such systems have not focused on ensuring that these systems are also deployable in practice, i.e., able to scale and deal with the dynamics in real-world traffic and query workloads. Unfortunately, efforts to scale these hybrid systems are hampered by severe constraints on available compute resources in the data plane (e.g., memory, ALUs). Similarly, the limited runtime programmability of existing hardware data-plane targets critically affects efforts to make these systems robust. This paper presents the design and implementation of DynaMap, a new hybrid telemetry system that is both robust and scalable. By planning for telemetry queries dynamically, DynaMap allows the remapping of stateful dataflow operators to data-plane registers at runtime. We model the problem of mapping dataflow operators to data-plane targets formally and develop a new heuristic algorithm for solving this problem. We implement our algorithm in prototype and demonstrate its feasibility with existing hardware targets based on Intel Tofino. Using traffic workloads from different real-world production networks, we show that our prototype of DynaMap improves performance on average by 1-2 orders of magnitude over state-of-the-art hybrid systems that use only static query planning.
more » « less
Full Text Available
In Search of netUnicorn: A Data-Collection Platform to Develop Generalizable ML Models for Network Security Problems

https://doi.org/10.1145/3576915.3623075

Beltiukov, Roman; Guo, Wenbo; Gupta, Arpit; Willinger, Walter (November 2023, ACM)

The remarkable success of the use of machine learning-based solutions for network security problems has been impeded by the developed ML models’ inability to maintain efficacy when used in different network environments exhibiting different network behaviors. This issue is commonly referred to as the generalizability problem of ML models. The community has recognized the critical role that training datasets play in this context and has developed various techniques to improve dataset curation to overcome this problem. Unfortunately, these methods are generally ill-suited or even counterproductive in the network security domain, where they often result in unrealistic or poor-quality datasets. To address this issue, we propose a new closed-loop ML pipeline that leverages explainable ML tools to guide the network data collection in an iterative fashion. To ensure the data’s realism and quality, we require that the new datasets should be endogenously collected in this iterative process, thus advocating for a gradual removal of data-related problems to improve model generalizability. To realize this capability, we develop a data-collection platform, netUnicorn, that takes inspiration from the classic “hourglass” model and is implemented as its “thin waist" to simplify data collection for different learning problems from diverse network environments. The proposed system decouples data-collection intents from the deployment mechanisms and disaggregates these high-level intents into smaller reusable, self-contained tasks. We demonstrate how netUnicorn simplifies collecting data for different learning problems from multiple network environments and how the proposed iterative data collection improves a model’s generalizability
more » « less
Full Text Available
Estimating WebRTC Video QoE Metrics Without Using Application Headers

https://doi.org/10.1145/3618257.3624828

Sharma, Taveesh; Mangla, Tarun; Gupta, Arpit; Jiang, Junchen; Feamster, Nick (October 2023, ACM)

Full Text Available
A NetAI Manifesto (Part II): Less Hubris, more Humility

https://doi.org/10.1145/3626570.3626610

Willinger, Walter; Gupta, Arpit; Beltiukov, Roman; Guo, Wenbo (September 2023, ACM SIGMETRICS Performance Evaluation Review)

The application of the latest techniques from artificial intelligence (AI) and machine learning (ML) to improve and automate the decision-making required for solving real-world network security and performance problems (NetAI, for short) has generated great excitement among networking researchers. However, network operators have remained very reluctant when it comes to deploying NetAIbased solutions in their production networks. In Part I of this manifesto, we argue that to gain the operators' trust, researchers will have to pursue a more scientific approach towards NetAI than in the past that endeavors the development of explainable and generalizable learning models. In this paper, we go one step further and posit that this opening up of NetAI research will require that the largely self-assured hubris about NetAI gives way to a healthy dose humility. Rather than continuing to extol the virtues and magic of black-box models that largely obfuscate the critical role of the utilized data play in training these models, concerted research efforts will be needed to design NetAI-driven agents or systems that can be expected to perform well when deployed in production settings and are also required to exhibit strong robustness properties when faced with ambiguous situations and real-world uncertainties. We describe one such effort that is aimed at developing a new ML pipeline for generating trained models that strive to meet these expectations and requirements.
more » « less
Full Text Available
A NetAI Manifesto (Part I): Less Explorimentation, More Science

https://doi.org/10.1145/3626570.3626609

Willinger, Walter; Gupta, Arpit; Jacobs, Arthur S.; Beltiukov, Roman; Ferreira, Ronaldo A.; Granville, Lisandro (September 2023, ACM SIGMETRICS Performance Evaluation Review)

The application of the latest techniques from artificial intelligence (AI) and machine learning (ML) to improve and automate the decision-making required for solving real-world network security and performance problems (NetAI, for short) has generated great excitement among networking researchers. However, network operators have remained very reluctant when it comes to deploying NetAI-based solutions in their production networks, mainly because the black-box nature of the underlying learning models forces operators to blindly trust these models without having any understanding of how they work, why they work, or when they don't work (and why not). Paraphrasing [1], we argue that to overcome this roadblock and ensure its future success in practice, NetAI has to get past its current stage of explorimentation, or the practice of poking around to see what happens and has to start employing tools of the scientific method.
more » « less
Full Text Available
PINOT: Programmable Infrastructure for Networking

https://doi.org/10.1145/3606464.3606485

Beltiukov, Roman; Chandrasekaran, Sanjay; Gupta, Arpit; Willinger, Walter (July 2023, ACM)

Full Text Available
SPG: Structure-Private Graph Database via SqueezePIR

https://doi.org/10.14778/3587136.3587138

Liang, Ling; Lin, Jilan; Qu, Zheng; Ahmad, Ishtiyaque; Tu, Fengbin; Gupta, Trinabh; Ding, Yufei; Xie, Yuan (March 2023, Proceedings of the VLDB Endowment)

Many relational data in our daily life are represented as graphs, making graph application an important workload. Because of the large scale of graph datasets, moving graph data to the cloud becomes a popular option. To keep the confidential and private graph secure from an untrusted cloud server, many cryptographic techniques are leveraged to hide the content of the data. However, protecting only the data content is not enough for a graph database. Because the structural information of the graph can be revealed through the database accessing track. In this work, we study the graph neural network (GNN), an important graph workload to mine information from a graph database. We find that the server is able to infer which node is processing during the edge retrieving phase and also learn its neighbor indices during GNN's aggregation phase. This leads to the leakage of the information of graph structure data. In this work, we present SPG, a structure-private graph database with SqueezePIR. Our SPG is built on top of Private Information Retrieval (PIR), which securely hides which nodes/neighbors are accessed. In addition, we propose SqueezePIR, a compression technique to overcome the computation overhead of PIR. Based on our evaluation, our SqueezePIR achieves 11.85× speedup on average with less than 2% accuracy loss when compared to the state-of-the-art FastPIR protocol.
more » « less
Full Text Available
AI/ML for Network Security: The Emperor has no Clothes

Jacobs, Arthur; Beltiukov, Roman; Willinger, Walter; Ferreira, Ronaldo; Gupta, arpit; Granville Lisandro (January 2022, ACM Conference on Computer and Communications Security (CCS))

Several recent research efforts have proposed Machine Learning (ML)-based solutions that can detect complex patterns in network traffic for a wide range of network security problems. However, without understanding how these black-box models are making their decisions, network operators are reluctant to trust and deploy them in their production settings. One key reason for this reluctance is that these models are prone to the problem of underspecification, defined here as the failure to specify a model in adequate detail. Not unique to the network security domain, this problem manifests itself in ML models that exhibit unexpectedly poor behavior when deployed in real-world settings and has prompted growing interest in developing interpretable ML solutions (e.g., decision trees) for “explaining” to humans how a given black-box model makes its decisions. However, synthesizing such explainable models that capture a given black-box model’s decisions with high fidelity while also being practical (i.e., small enough in size for humans to comprehend) is challenging. In this paper, we focus on synthesizing high-fidelity and low-complexity decision trees to help network operators determine if their ML models suffer from the problem of underspecification. To this end, we present TRUSTEE, a framework that takes an existing ML model and training dataset generate a high-fidelity, easy-to-interpret decision tree, and associated trust report. Using published ML models that are fully reproducible, we show how practitioners can use TRUSTEE to identify three common instances of model underspecification, i.e., evidence of shortcut learning, spurious correlations, and vulnerability to out-of-distribution samples.
more » « less
Full Text Available

« Prev Next »

Search for: All records