Web pages today commonly include large amounts of JavaScript code in order to offer users a dynamic experience. These scripts often make pages slow to load, partly due to a fundamental inefficiency in how browsers process JavaScript content: browsers make it easy for web developers to reason about page state by serially executing all scripts on any frame in a page, but as a result, fail to leverage the multiple CPU cores that are readily available even on low-end phones. In this paper, we show how to address this inefficiency without requiring pages to be rewritten or browsers to be modified. The key to our solution, Horcrux, is to account for the non-determinism intrinsic to web page loads and the constraints placed by the browser’s API for parallelism. Horcrux-compliant web servers perform offline analysis of all the JavaScript code on any frame they serve to conservatively identify, for every JavaScript function, the union of the page state that the function could access across all loads of that page. Horcrux’s JavaScript scheduler then uses this information to judiciously parallelize JavaScript execution on the client-side so that the end-state is identical to that of a serial execution, while minimizing coordination and offloading overheads. Across a wide range of pages, phones, and mobile networks covering web workloads in both developed and emerging regions, Horcrux reduces median browser computation delays by 31-44% and page load times by 18-37%.
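To make the scheduling idea concrete, here is a minimal TypeScript sketch of conflict-aware scheduling, assuming the offline analysis has already produced a conservative read/write set of page state for each function. All names (AnalyzedFn, conflicts, schedule) are hypothetical illustrations rather than Horcrux's actual API, and plain promises stand in for the real system's offloading to workers.

```typescript
// Minimal sketch of conflict-aware JavaScript scheduling in the spirit described
// above. Assumption: an offline analysis has already produced, for each function,
// a conservative set of page-state keys it may read or write. All names here
// (AnalyzedFn, conflicts, schedule) are hypothetical.

type StateSet = Set<string>;

interface AnalyzedFn {
  id: number;                    // position in the original serial execution order
  reads: StateSet;               // state the function may read (from offline analysis)
  writes: StateSet;              // state the function may write (from offline analysis)
  run: () => Promise<void>;      // the actual work (offloaded to workers in a real system)
}

// Two functions conflict if either may write state that the other may touch.
function conflicts(a: AnalyzedFn, b: AnalyzedFn): boolean {
  const overlaps = (x: StateSet, y: StateSet) => [...x].some((k) => y.has(k));
  return (
    overlaps(a.writes, b.writes) ||
    overlaps(a.writes, b.reads) ||
    overlaps(a.reads, b.writes)
  );
}

// Start each function as soon as every earlier, conflicting function has finished.
// Non-conflicting functions overlap freely, but the final state matches a serial run.
async function schedule(fns: AnalyzedFn[]): Promise<void> {
  const done: Promise<void>[] = [];          // completion promise per function id
  for (const fn of fns) {                    // fns must be ordered by id
    const predecessors = fns
      .filter((other) => other.id < fn.id && conflicts(other, fn))
      .map((other) => done[other.id]);
    done[fn.id] = Promise.all(predecessors).then(() => fn.run());
  }
  await Promise.all(done);
}

// Example: fn 1 must wait for fn 0 (both touch "cart"), while fn 2 ("ads") overlaps both.
void schedule([
  { id: 0, reads: new Set<string>(), writes: new Set(["cart"]), run: async () => {} },
  { id: 1, reads: new Set(["cart"]), writes: new Set<string>(), run: async () => {} },
  { id: 2, reads: new Set<string>(), writes: new Set(["ads"]), run: async () => {} },
]);
```

Because a function waits on every earlier function whose state sets overlap with its own, no function observes or clobbers state out of its serial order, which is the property behind the "end-state identical to a serial execution" guarantee described above.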
Rethinking Client-Side Caching for the Mobile Web
Mobile web browsing remains slow despite many efforts to accelerate page loads. Like others, we find that client-side computation (in particular, JavaScript execution) is a key culprit. Prior solutions to mitigate computation overheads, however, suffer from security, privacy, and deployability issues, hindering their adoption. To sidestep these issues, we propose a browser-based solution in which every client reuses identical computations from its prior page loads. Our analysis across roughly 230 pages reveals that, even on a modern smartphone, such an approach could reduce client-side computation by a median of 49% on pages which are most in need of such optimizations.
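To illustrate the reuse idea at its simplest, the sketch below (hypothetical names; not the mechanism proposed in the paper) memoizes a pure function's results in localStorage so that a later load of the same page can replay them instead of recomputing. A real browser-based design would additionally have to handle DOM reads and writes, non-deterministic inputs, and cache invalidation.

```typescript
// Toy sketch of reusing identical computations across page loads. Assumption:
// the computation is pure and its result is JSON-serializable. The names
// (ComputeCache, cached) are hypothetical and do not correspond to the paper's design.

class ComputeCache {
  // localStorage persists across page loads of the same origin, so a result
  // stored during one visit can be replayed on the next one.
  constructor(private storage: Storage = window.localStorage) {}

  // The cache key is the function's source text plus its serialized arguments,
  // so only byte-identical computations with identical inputs ever hit.
  private key(fn: Function, args: unknown[]): string {
    return `cc:${fn.toString()}:${JSON.stringify(args)}`;
  }

  // Wrap a function so repeated invocations (including on later loads) replay
  // the stored result instead of recomputing it.
  cached<A extends unknown[], R>(fn: (...args: A) => R): (...args: A) => R {
    return (...args: A): R => {
      const k = this.key(fn, args);
      const hit = this.storage.getItem(k);
      if (hit !== null) {
        return JSON.parse(hit) as R;                   // reuse the prior load's result
      }
      const result = fn(...args);                      // compute once
      this.storage.setItem(k, JSON.stringify(result)); // persist for future loads
      return result;
    };
  }
}

// Example: an expensive computation becomes a storage lookup on the next page load.
const cache = new ComputeCache();
const expensive = cache.cached((n: number) => {
  let acc = 0;
  for (let i = 0; i < n; i++) acc += Math.sqrt(i);
  return acc;
});
expensive(1_000_000);
```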
- Award ID(s): 2152313
- PAR ID: 10323105
- Date Published:
- Journal Name: Proceedings of the 22nd International Workshop on Mobile Computing Systems and Applications
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
By repeatedly crawling and saving web pages over time, web archives (such as the Internet Archive) enable users to visit historical versions of any page. In this paper, we point out that existing web archives are not well designed to cope with the widespread presence of JavaScript on the web. Some archives store petabytes of JavaScript code, and yet many pages render incorrectly when users load them. Other archives that store the end state of page loads (e.g., screen captures) break post-load interactions implemented in JavaScript. To address these problems, we present Jawa, a new design for web archives that significantly reduces the storage necessary to save modern web pages while also improving the fidelity with which archived pages are served. Key to enabling Jawa’s use at scale are our observations on a) the forms of non-determinism which impair the execution of JavaScript on archived pages, and b) the ways in which JavaScript’s execution fundamentally differs between live web pages and their archived copies. On a corpus of 1 million archived pages, Jawa reduces overall storage needs by 41% compared to the techniques currently used by the Internet Archive. (A brief illustration of the kind of JavaScript non-determinism at issue appears in the sketch after this list.)
-
Commodity operating system (OS) kernels, such as Windows, Mac OS X, Linux, and FreeBSD, are susceptible to numerous security vulnerabilities. Their monolithic design gives successful attackers complete access to all application data and system resources. Shielding systems such as InkTag, Haven, and Virtual Ghost protect sensitive application data from compromised OS kernels. However, such systems are still vulnerable to side-channel attacks. Worse yet, compromised OS kernels can leverage their control over privileged hardware state to exacerbate existing side channels; recent work has shown that a compromised OS kernel can steal entire documents via side channels. This paper presents defenses against page table and last-level cache (LLC) side-channel attacks launched by a compromised OS kernel. Our page table defenses restrict the OS kernel’s ability to read and write page table pages and defend against page allocation attacks, and our LLC defenses utilize Intel Cache Allocation Technology along with memory isolation primitives. We prototype our solution in a system we call Apparition, building on an optimized version of Virtual Ghost. Our evaluation shows that our side-channel defenses add 1% to 18% overhead (up to 86% for one application) to the optimized Virtual Ghost (relative to the native kernel) on real-world applications.
-
There is a rich body of literature on measuring and optimizing nearly every aspect of the web, including characterizing the structure and content of web pages, devising new techniques to load pages quickly, and evaluating such techniques. Virtually all of this prior work used a single page, namely the landing page (i.e., root document, "/"), of each web site as the representative of all pages on that site. In this paper, we characterize the differences between landing and internal (i.e., non-root) pages of 1000 web sites to demonstrate that the structure and content of internal pages differ substantially from those of landing pages, as well as from one another. We review more than a hundred studies published at top-tier networking conferences between 2015 and 2019, and highlight how, in light of these differences, the insights and claims of nearly two-thirds of the relevant studies would need to be revised for them to apply to internal pages. Going forward, we urge the networking community to include internal pages for measuring and optimizing the web. This recommendation, however, poses a non-trivial challenge: How do we select a set of representative internal web pages from a web site? To address the challenge, we have developed Hispar, a "top list" of 100,000 pages updated weekly comprising both the landing pages and internal pages of around 2000 web sites. We make Hispar and the tools to recreate or customize it publicly available.
-
The demand for memory is ever increasing. Many prior works have explored hardware memory compression to increase effective memory capacity. However, prior works compress and pack/migrate data at a small (memory block-level) granularity; this introduces an additional block-level translation after the page-level virtual address translation. In general, the smaller the granularity of address translation, the higher the translation overhead. As such, this additional block-level translation exacerbates the well-known address translation problem for large and/or irregular workloads. A promising solution is to save memory only from cold (i.e., less recently accessed) pages without saving memory from hot (i.e., more recently accessed) pages (e.g., keep the hot pages uncompressed); this avoids block-level translation overhead for hot pages. However, it still faces two challenges. First, after a compressed cold page becomes hot again, migrating the page to a full 4KB DRAM location still adds another level (albeit page-level, instead of block-level) of translation on top of existing virtual address translation. Second, compressing only cold data requires compressing it very aggressively to achieve high overall memory savings; decompressing very aggressively compressed data is very slow (e.g., > 800ns assuming the latest Deflate ASIC in industry). This paper presents Translation-optimized Memory Compression for Capacity (TMCC) to tackle these two challenges. To address the first challenge, we propose compressing page table blocks in hardware to opportunistically embed compression translations into them in a software-transparent manner, effectively prefetching compression translations during a page walk instead of serially fetching them after the walk. To address the second challenge, we perform a large design space exploration across many hardware configurations and diverse workloads to derive and implement in HDL an ASIC Deflate that is specialized for memory; for memory pages, it is 4X as fast as the state-of-the-art ASIC Deflate, with little to no sacrifice in compression ratio. Our evaluations show that for large and/or irregular workloads, TMCC can either improve performance by 14% without sacrificing effective capacity or provide 2.2x the effective capacity without sacrificing performance compared to state-of-the-art hardware memory compression for capacity.
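The following sketch is the illustration referenced in the Jawa summary above. It shows, with hypothetical names, the kind of JavaScript non-determinism that can make an archived page diverge from what the crawler recorded, along with one naive way to pin it down by replaying values captured at crawl time. It is only a minimal sketch of the problem; Jawa's actual handling of non-determinism is described in the paper.

```typescript
// Toy illustration (hypothetical names, not Jawa's design) of JavaScript
// non-determinism that can make an archived page behave differently from the
// page the crawler saw: Date.now and Math.random return fresh values on every
// load. One naive mitigation is to record their outputs at crawl time and
// replay them when the archived copy is loaded.

interface RecordedValues {
  now: number[];      // Date.now() results observed during the crawl (assumed non-empty)
  random: number[];   // Math.random() results observed during the crawl (assumed non-empty)
}

function pinNonDeterminism(recorded: RecordedValues): void {
  let nowIdx = 0;
  let randIdx = 0;

  // Replay recorded timestamps; repeat the last one once the log is exhausted.
  Date.now = () => recorded.now[Math.min(nowIdx++, recorded.now.length - 1)];

  // Replay recorded random draws the same way.
  Math.random = () => recorded.random[Math.min(randIdx++, recorded.random.length - 1)];
}

// Example: any timestamp- or random-dependent code path in the archived page now
// matches what the crawler originally observed.
pinNonDeterminism({ now: [1650000000000], random: [0.42] });
```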