skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, September 13 until 2:00 AM ET on Saturday, September 14 due to maintenance. We apologize for the inconvenience.


Title: Horcrux: Automatic JavaScript Parallelism for Resource-Efficient Web Computation
Web pages today commonly include large amounts of JavaScript code in order to offer users a dynamic experience. These scripts often make pages slow to load, partly due to a fundamental inefficiency in how browsers process JavaScript content: browsers make it easy for web developers to reason about page state by serially executing all scripts on any frame in a page, but as a result, fail to leverage the multiple CPU cores that are readily available even on low-end phones. In this paper, we show how to address this inefficiency without requiring pages to be rewritten or browsers to be modified. The key to our solution, Horcrux, is to account for the non-determinism intrinsic to web page loads and the constraints placed by the browser’s API for parallelism. Horcrux-compliant web servers perform offline analysis of all the JavaScript code on any frame they serve to conservatively identify, for every JavaScript function, the union of the page state that the function could access across all loads of that page. Horcrux’s JavaScript scheduler then uses this information to judiciously parallelize JavaScript execution on the client-side so that the end-state is identical to that of a serial execution, while minimizing coordination and offloading overheads. Across a wide range of pages, phones, and mobile networks covering web workloads in both developed and emerging regions, Horcrux reduces median browser computation delays by 31-44% and page load times by 18-37%.  more » « less
Award ID(s):
2152313
NSF-PAR ID:
10323103
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
15th USENIX Symposium on Operating Systems Design and Implementation (OSDI)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. By repeatedly crawling and saving web pages over time, web archives (such as the Internet Archive) enable users to visit historical versions of any page. In this paper, we point out that existing web archives are not well designed to cope with the widespread presence of JavaScript on the web. Some archives store petabytes of JavaScript code, and yet many pages render incorrectly when users load them. Other archives which store the end-state of page loads (e.g., screen captures) break post-load interactions implemented in JavaScript. To address these problems, we present Jawa, a new design for web archives which significantly reduces the storage necessary to save modern web pages while also improving the fidelity with which archived pages are served. Key to enabling Jawa’s use at scale are our observations on a) the forms of non-determinism which impair the execution of JavaScript on archived pages, and b) the ways in which JavaScript’s execution fundamentally differs between live web pages and their archived copies. On a corpus of 1 million archived pages, Jawa reduces overall storage needs by 41%, when compared to the techniques currently used by the Internet Archive. 
    more » « less
  2. Ali, Karim ; Salvaneschi, Guido (Ed.)
    Browsers are the main way in which most users experience the internet, which makes them a prime target for malicious entities. The best defense for the common user is to keep their browser always up-to-date, installing updates as soon as they are available. Unfortunately, updating a browser is disruptive as it results in loss of user state. Even though modern browsers reopen all pages (tabs) after an update to minimize inconvenience, this approach still loses all local user state in each page (e.g., contents of unsubmitted forms, including associated JavaScript validation state) and assumes that pages can be refreshed and result in the same contents. We believe this is an important barrier that keeps users from updating their browsers as frequently as possible. In this paper, we present the design, implementation, and evaluation of Sinatra, which supports instantaneous browser updates that do not result in any data loss through a novel Multi-Version eXecution (MVX) approach for JavaScript programs, combined with a sophisticated proxy. Sinatra works in pure JavaScript, does not require any browser support, thus works on closed-source browsers, and requires trivial changes to each target page, that can be automated. First, Sinatra captures all the non-determinism available to a JavaScript program (e.g., event handlers executed, expired timers, invocations of Math.random). Our evaluation shows that Sinatra requires 6MB to store such events, and the memory grows at a modest rate of 253KB/s as the user keeps interacting with each page. When an update becomes available, Sinatra transfer the state by re-executing the same set of non-deterministic events on the new browser. During this time, which can be as long as 1.5 seconds, Sinatra uses MVX to allow the user to keep interacting with the old browser. Finally, Sinatra changes the roles in less than 10ms, and the user starts interacting with the new browser, effectively performing a browser update with zero downtime and no loss of state. 
    more » « less
  3. Mobile web browsing remains slow despite many efforts to accelerate page loads. Like others, we find that client-side computation (in particular, JavaScript execution) is a key culprit. Prior solutions to mitigate computation overheads, however, suffer from security, privacy, and deployability issues, hindering their adoption. To sidestep these issues, we propose a browser-based solution in which every client reuses identical computations from its prior page loads. Our analysis across roughly 230 pages reveals that, even on a modern smartphone, such an approach could reduce client-side computation by a median of 49% on pages which are most in need of such optimizations. 
    more » « less
  4. JavaScript has become the most popular programming language for web front-end development. With such popularity, there is a great demand for thorough testing of client-side JavaScript web applications. In this paper, we present a novel approach to concolic testing of front-end JavaScript web applications. This approach leverages widely used JavaScript testing frameworks such as Jest and Puppeteer and conducts concolic execution on JavaScript functions in web applications for unit testing. The seamless integration of concolic testing with these testing frameworks allows injection of symbolic variables within the native execution context of a JavaScript web function and precise capture of concrete execution traces of the function under test. Such concise execution traces greatly improve the effectiveness and efficiency of the subsequent symbolic analysis for test generation. We have implemented our approach on Jest and Puppeteer. The application of our Jest implementation on Metamask, one of the most popular Crypto wallets, has uncovered 3 bugs and 1 test suite improvement, whose bug reports have all been accepted by Metamask developers on Github. We also applied our Puppeteer implementation to 21 Github projects and detected 4 bugs. 
    more » « less
  5. null (Ed.)
    Monetizing websites and web apps through online advertising is widespread in the web ecosystem, creating a billion-dollar market. This has led to the emergence of a vast network of tertiary ad providers and ad syndication to facilitate this growing market. Nowadays, the online advertising ecosystem forces publishers to integrate ads from these third-party domains. On the one hand, this raises several privacy and security concerns that are actively being studied in recent years. On the other hand, the ability of today's browsers to load dynamic web pages with complex animations and Javascript has also transformed online advertising. This can have a significant impact on webpage performance. The latter is a critical metric for optimization since it ultimately impacts user satisfaction. Unfortunately, there are limited literature studies on understanding the performance impacts of online advertising which we argue is as important as privacy and security. In this paper, we apply an in-depth and first-of-a-kind performance evaluation of web ads. Unlike prior efforts that rely primarily on adblockers, we perform a fine-grained analysis on the web browser's page loading process to demystify the performance cost of web ads. We aim to characterize the cost by every component of an ad, so the publisher, ad syndicate, and advertiser can improve the ad's performance with detailed guidance. For this purpose, we develop a tool, adPerf, for the Chrome browser that classifies page loading workloads into ad-related and main-content at the granularity of browser activities. Our evaluations show that online advertising entails more than 15% of browser page loading workload and approximately 88% of that is spent on JavaScript. On smartphones, this additional cost of ads is 7% lower since mobile pages include fewer and well-optimized ads. We also track the sources and delivery chain of web ads and analyze performance considering the origin of the ad contents. We observe that 2 of the well-known third-party ad domains contribute to 35% of the ads performance cost and surprisingly, top news websites implicitly include unknown third-party ads which in some cases build up to more than 37% of the ads performance cost. 
    more » « less