Although there are tools to help developers understand the matching behaviors between a regular expression and a string, regular-expression related faults are still common. Learning developers’ behavior through the change history of regular expressions can identify common edit patterns, which can inform the creation of mutation and repair operators to assist with testing and fixing regular expressions.
In this work, we explore how regular expressions evolve over time, focusing on the characteristics of regular expression edits, the syntactic and semantic difference of the edits, and the feature changes of edits. Our exploration uses two datasets. First, we look at GitHub projects that have a regular expression in their current version and look back through the commit logs to collect the regular expressions’ edit history. Second, we collect regular expressions composed by study participants during problem- solving tasks. Our results show that 1) 95% of the regular expressions from GitHub are not edited, 2) most edited regular expressions have a syntactic distance of 4-6 characters from their predecessors, 3) over 50% of the edits in GitHub tend to expand the scope of regular expression, and 4) the number of features used indicates the regular expression language usage increases over time. This work has implications for supporting regular expression repair and mutation to ensure test suite quality.
more »
« less
This content will become publicly available on April 16, 2025
Sequence Abstractions for Flexible, Line-Rate Network Monitoring
We develop FLM, a high-level language that enables network operators to write programs that recognize and react to specific packet sequences. To be able to examine every packet, our compilation procedure can transform FLM programs into P4 code that can run on programmable switch ASICs. It first splits FLM programs into a state management component and a classical regular expression, then generates an efficient implementation of the regular expression using SMT-based program synthesis. Our experiments find that FLM can express 15 sequence monitoring tasks drawn from prior literature. Our compiler can convert all of these programs to run on switch hardware in way that fit within available pipeline stages and consume less than 15% additional header fields and instruction words when run alongside switch programs.
more »
« less
- Award ID(s):
- 2219862
- PAR ID:
- 10534019
- Publisher / Repository:
- USENIX Association
- Date Published:
- ISBN:
- 978-1-939133-39-7
- Page Range / eLocation ID:
- 1593-1620
- Format(s):
- Medium: X
- Location:
- Santa Clara, CA
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We propose the design of an energy packet switch for forwarding and delivery of energy in digital power grids in this paper. The proposed switch may receive energy from one or multiple power sources, store and forward it in the form of energy packets to requesting loads connected to one or multiple ports of the switch. Energy packets carry discrete amounts of energy for a finely controlled supply. Loads receive discrete amounts of energy through packets rather than a continuing and discretionary energy flow. Using energy packets may help manage the delivery of power in a more reliable, robust, and economical form than that used by the present power grid. The control and management of the proposed switch are based on a request-grant protocol. The switch uses a data network for the transmission of these requests and grants. The energy packet switch may be the centerpiece for creating infrastructure in the realization of the digital power grid. The design of the energy packet switch is based on shared supercapacitors to shape and manage discretization of energy. We introduce the design and analysis of the electrical properties of the proposed switch and describe the procedure used in the switch to determine the amount of energy transmitted to requesting loads.more » « less
-
Regular expressions are frequently found in programming projects. Studies have found that developers can accurately determine whether a string matches a regular expression. However, we still do not know the challenges associated with composing regular expressions. We conduct an exploratory case study to reveal the tools and strategies developers use during regular expression composition. In this study, 29 students are tasked with composing regular expressions that pass unit tests illustrating the intended behavior. The tasks are in Java and the Eclipse IDE was set up with JUnit tests. Participants had one hour to work and could use any Eclipse tools, web search, or web-based tools they desired. Screen- capture software recorded all interactions with browsers and the IDE. We analyzed the videos quantitatively by transcribing logs and extracting personas. Our results show that participants were 30% successful (28 of 94 attempts) at achieving a 100% pass rate on the unit tests. When participants used tools frequently, as in the case of the novice tester and the knowledgeable tester personas, or when they guess at a solution prior to searching, they are more likely to pass all the unit tests. We also found that compile errors often arise when participants searched for a result and copy/pasted the regular expression from another language into their Java files. These results point to future research into making regular expression composition easier for programmers, such as integrating visualization into the IDE to reduce context switching or providing language migration support when reusing regular expressions written in another language to reduce compile errors.more » « less
-
The growing number of network functions drives the need to install increasing numbers of fine-grained packet classification rules in the network switches. However, this demand for rules is outstripping the size of switch memory. While much work has focused on compressing classification rules, most of this work proposes solutions operating in the memory of a single switch. This paper proposed, instead, a collaborative approach encompassing switches and network functions: we couple approximate classification at switches with fine-grained filtering where needed at network functions to accomplish overall classification. This architecture enables a trade-off between usage of (expensive) switch memory and (cheaper) downstream network bandwidth and network function resources. Our implementation uses approximate classification and Prefiltering to reduce switch memory usage. Our system can reduce memory significantly compared to a strawman approach, as shown by simulations of real traffic traces and rules.more » « less
-
null (Ed.)The Jellyfish network has recently been proposed as an alternative to the fat-tree network for data centers and high-performance computing clusters. Jellyfish uses a random regular graph as its switch-level topology and has shown to be more cost-effective than fat-trees. Effective routing on Jellyfish is challenging. It is known that shortest path routing and equal cost multi-path routing (ECMP) do not work well on Jellyfish. Existing schemes use variations of k-shortest path routing (KSP). In this work, we study two routing components for Jellyfish: path selection that decides the paths to route traffic, and routing mechanisms that decide which path to be used for each packet. We show that the performance of the existing KSP can be significantly improved by incorporating two heuristics, randomization and edge-disjointness. We evaluate a range of routing mechanisms, including traffic oblivious and traffic adaptive schemes, and identify an adaptive routing scheme with noticeably higher performance than others.more » « less