skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: WebRobot: Web Robotic Process Automation using Interactive Programming-by-Demonstration
It is imperative to democratize robotic process automation (RPA), as RPA has become a main driver of the digital transformation but is still technically very demanding to construct, especially for non-experts. In this paper, we study how to automate an important class of RPA tasks, dubbed web RPA, which are concerned with constructing software bots that automate interactions across data and a web browser. Our main contributions are twofold. First, we develop a formal foundation which allows semantically reasoning about web RPA programs and formulate its synthesis problem in a principled manner. Second, we propose a web RPA program synthesis algorithm based on a new idea called speculative rewriting. This leads to a novel speculate-and-validate methodology in the context of rewrite-based program synthesis, which has also shown to be both theoretically simple and practically efficient for synthesizing programs from demonstrations. We have built these ideas in a new interactive synthesizer called WebRobot and evaluate it on 76 web RPA benchmarks. Our results show that WebRobot automated a majority of them effectively. Furthermore, we show that WebRobot compares favorably with a conventional rewrite-based synthesis baseline implemented using egg. Finally, we conduct a small user study demonstrating WebRobot is also usable.  more » « less
Award ID(s):
2123654
PAR ID:
10358525
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation
Page Range / eLocation ID:
152 to 167
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A key challenge in program synthesis is synthesizing programs that use libraries, which most real-world software does. The current state of the art is to model libraries with mock library implementations that perform the same function in a simpler way. However, mocks may still be large and complex, and must include many implementation details, both of which could limit synthesis performance. To address this problem, we introduce JLibSketch, a Java program synthesis tool that allows library behavior to be described with algebraic specifications, which are rewrite rules for sequences of method calls, e.g., encryption followed by decryption (with the same key) is the identity. JLibSketch implements rewrite rules by compiling JLibSketch problems into problems for the Sketch program synthesis tool. More specifically, after compilation, library calls are represented by abstract data types (ADTs), and rewrite rules manipulate those ADTs. We formalize compilation and prove it sound and complete if the rewrite rules are ordered and non-unifiable. We evaluated JLibSketch by using it to synthesize nine programs that use libraries from three domains: data structures, cryptography, and file systems. We found that algebraic specifications are, on average, about half the size of mocks. We also found that algebraic specifications perform better than mocks on seven of the nine programs, sometimes significantly so, and perform equally well on the last two programs. Thus, we believe that JLibSketch takes an important step toward synthesis of programs that use libraries. 
    more » « less
  2. Intermediate data structures are a common cause of inefficiency in functional programming.Fusionattempts to eliminate intermediate data structures by combining adjacent data traversals into one; existing fusion techniques, however, are based on predefined rewrite rules and hence are limited in expressiveness. In this work we explore a different approach to eliminating intermediate data structures, based on inductive program synthesis. We dub this approachsuperfusion(by analogy withsuperoptimization, which uses inductive synthesis for program optimization). Starting from a reference program annotated with data structures to be eliminated, superfusion first generates asketchwhere program fragments operating on those data structures are replaced with holes; it then fills the holes with constant-time expressions such that the resulting program is equivalent to the reference. The main technical challenge here is scalability because optimized programs are often complex, making the search space intractably large for naive enumeration. To address this challenge, our key insight is to first synthesize aghost functionthat describes the relationship between the original intermediate data structure and its compressed version; this function, although not used in the final program, serves to decompose the joint sketch filling problem into independent simpler problems for each hole. We implement superfusion in a tool calledSuFuand evaluate it on a dataset of 290 tasks collected from prior work on deductive fusion and program restructuring. The results show that SuFu solves 264 out of 290 tasks, exceeding the capabilities of rewriting-based fusion systems and achieving comparable performance with specialized approaches to program restructuring on their respective domains. 
    more » « less
  3. We propose a new synthesis algorithm that canefficientlysearch programs withlocalvariables (e.g., those introduced by lambdas). Prior bottom-up synthesis algorithms are not able to evaluate programs withfree local variables, and therefore cannot effectively reduce the search space of such programs (e.g., using standard observational equivalence reduction techniques), making synthesis slow. Our algorithm can reduce the space of programs with local variables. The key idea, dubbedlifted interpretation, is to lift up the program interpretation process, from evaluating one program at a time to simultaneously evaluating all programs from a grammar. Lifted interpretation provides a mechanism to systematically enumerate all binding contexts for local variables, thereby enabling us to evaluate and reduce the space of programs with local variables. Our ideas are instantiated in the domain of web automation. The resulting tool,Arborist, can automate a significantly broader range of challenging tasks more efficiently than state-of-the-art techniques includingWebRobotand Helena. 
    more » « less
  4. Using program synthesis to select instructions for and optimize input programs is receiving increasing attention. However, existing synthesis-based compilers are faced by two major challenges that prohibit the deployment of program synthesis in production compilers: exorbitantly long synthesis times spanning several minutes and hours; and scalability issues that prevent synthesis of complex modern compute and data swizzle instructions, which have been found to maximize performance of modern tensor and stencil workloads. This paper proposes MISAAL, a synthesis-based compiler that employs a novel strategy to use formal semantics of hardware instructions to automatically prune a large search space of rewrite rules for modern complex instructions in an offline stage. MISAAL also proposes a novel methodology to make term-rewriting process in the online stage (at compile-time) extremely lightweight so as to enable programs to compile in seconds. Our results show that MISAAL reduces compilation times by up to a geomean of 16x compared to the state-of-the-art synthesis-based compiler, HYDRIDE. MISAAL also delivers competitive runtime performance against the production compiler for image processing and deep learning workloads, Halide, as well as HYDRIDE across x86, Hexagon and ARM. 
    more » « less
  5. Ta-Shma, Amnon (Ed.)
    In a recent work, Gryaznov, Pudlák and Talebanfard (CCC '22) introduced a linear variant of read-once branching programs, with motivations from circuit and proof complexity. Such a read-once linear branching program is a branching program where each node is allowed to make 𝔽₂-linear queries, and is read-once in the sense that the queries on each path is linearly independent. As their main result, they constructed an explicit function with average-case complexity 2^{n/3-o(n)} against a slightly restricted model, which they call strongly read-once linear branching programs. The main tool in their lower bound result is a new type of extractor, called directional affine extractors, that they introduced. Our main result is an explicit function with 2^{n-o(n)} average-case complexity against the strongly read-once linear branching program model, which is almost optimal. This result is based on a new connection from this problem to sumset extractors, which is a randomness extractor model introduced by Chattopadhyay and Li (STOC '16) as a generalization of many other well-studied models including two-source extractors, affine extractors and small-space extractors. With this new connection, our lower bound naturally follows from a recent construction of sumset extractors by Chattopadhyay and Liao (STOC '22). In addition, we show that directional affine extractors imply sumset extractors in a restricted setting. We observe that such restricted sumset sources are enough to derive lower bounds, and obtain an arguably more modular proof of the lower bound by Gryaznov, Pudlák and Talebanfard. We also initiate a study of pseudorandomness against linear branching programs. Our main result here is a hitting set generator construction against regular linear branching programs with constant width. We derive this result based on a connection to Kakeya sets over finite fields. 
    more » « less