Traces from production caching systems of users accessing con- tent are seldom made available to the public as they are considered private and proprietary. The dearth of realistic trace data makes it difficult for system designers and researchers to test and validate new caching algorithms and architectures. To address this key problem, we present TRAGEN, a tool that can generate a synthetic trace that is “similar” to an original trace from the production system in the sense that the two traces would result in similar hit rates in a cache simulation. We validate TRAGEN by first proving that the synthetic trace is similar to the original trace for caches of arbitrary size when the Least-Recently-Used (LRU) policy is used. Next, we empirically validate the similarity of the synthetic trace and original trace for caches that use a broad set of commonly-used caching policies that include LRU, SLRU, FIFO, RANDOM, MARKERS, CLOCK and PLRU. For our empirical validation, we use original request traces drawn from four different traffic classes from the world’s largest CDN, each trace consisting of hundreds of millions of requests for tens of millions of objects. TRAGEN is publicly available and can be used to generate synthetic traces that are similar to actual pro- duction traces for a number of traffic classes such as videos, social media, web, and software downloads. Since the synthetic traces are similar to the original production ones, cache simulations performed using the synthetic traces will yield similar results to what might be attained in a production setting, making TRAGEN a key tool for cache system developers and researchers. 
                        more » 
                        « less   
                    
                            
                            Representing Integer Sequences Using Piecewise-Affine Loops
                        
                    
    
            A formal, high-level representation of programs is typically needed for static and dynamic analyses performed by compilers. However, the source code of target applications is not always available in an analyzable form, e.g., to protect intellectual property. To reason on such applications, it becomes necessary to build models from observations of its execution. This paper details an algebraic approach which, taking as input the trace of memory addresses accessed by a single memory reference, synthesizes an affine loop with a single perfectly nested reference that generates the original trace. This approach is extended to support the synthesis of unions of affine loops, useful for minimally modeling traces generated by automatic transformations of polyhedral programs, such as tiling. The resulting system is capable of processing hundreds of gigabytes of trace data in minutes, minimally reconstructing 100% of the static control parts in PolyBench/C applications and 99.99% in the Pluto-tiled versions of these benchmarks. As an application example of the trace modeling method, trace compression is explored. The affine representations built for the memory traces of PolyBench/C codes achieve compression factors of the order of 106 and 103 with respect to gzip for the original and tiled versions of the traces, respectively. 
        more » 
        « less   
        
    
    
                            - PAR ID:
- 10333804
- Date Published:
- Journal Name:
- Mathematics
- Volume:
- 9
- Issue:
- 19
- ISSN:
- 2227-7390
- Page Range / eLocation ID:
- 2368
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            null (Ed.)Python has become one of the most used and taught languages nowadays. Its expressiveness, cross-compatibility and ease of use have made it popular in areas as diverse as finance, bioinformatics or machine learning. However, Python programs are often significantly slower to execute than an equivalent native C implementation, especially for computation-intensive numerical kernels. This work presents PolyBench/Python, implementing the 30 kernels in PolyBench/C, one of the standard benchmark suites for polyhedral optimization, in Python. In addition to the benchmark kernels, a functional wrapper including mechanisms for performance measurement, testing, and execution configuration has been developed. The framework includes support for different ways to translate C-array codes into Python, offering insight into the tradeoffs of Python lists and NumPy arrays. The benchmark performance is thoroughly evaluated on different Python interpreters, and compared against its PolyBench/C counterpart to highlight the profitability (or lack thereof) of using Python for regular numerical codes.more » « less
- 
            Manually writing parallel programs is difficult and error-prone. Automatic parallelization could address this issue, but profitability can be limited by not having facts known only to the programmer. A parallelizing compiler that collaborates with the programmer can increase the coverage and performance of parallelization while reducing the errors and overhead associated with manual parallelization. Unlike collaboration involving analysis tools that report program properties or make parallelization suggestions to the programmer, decompiler-based collaboration could leverage the strength of existing parallelizing compilers to provide programmers with a natural compiler-parallelized starting point for further parallelization or refinement. Despite this potential, existing decompilers fail to do this because they do not generate portable parallel source code compatible with any compiler of the source language. This paper presents SPLENDID, an LLVM-IR to C/OpenMP decompiler that enables collaborative parallelization by producing standard parallel OpenMP code. Using published manual parallelization of the PolyBench benchmark suite as a reference, SPLENDID's collaborative approach produces programs twice as fast as either Polly-based automatic parallelization or manual parallelization alone. SPLENDID's portable parallel code is also more natural than that from existing decompilers, obtaining a 39x higher average BLEU score.more » « less
- 
            Memory system is critical to architecture design which can significantly impact application performance. Concurrent Average Memory Access Time (C-AMAT) is a model for analyzing and optimizing memory system performance using a recursive definition of the memory access latency along the memory hierarchy. The original C-AMAT model, however, does not provide the necessary granularity and flexibility for handling modern memory architectures with heterogeneous memory technologies and diverse system topology. We propose to augment C-AMAT to take into consideration the idiosyncrasies of individual cache/memory components as well as their topological arrangement in the memory architecture design. Through trace-based simulation, we validate the augmented model and examine the memory system performance with insight unavailable using the original C-AMAT model.more » « less
- 
            The restricted active space spin–flip (RAS-SF) formalism is a particular form of single-reference configuration interaction that can describe some forms of strong correlation at a relatively low cost and which has recently been formulated for the description of charge-transfer excited states. Here, we introduce both equilibrium and nonequilibrium versions of a state-specific solvation correction for vertical transition energies computed using RAS-SF wave functions, based on the framework of a polarizable continuum model (PCM). Ground-state polarization is described using the solvent’s static dielectric constant and in the nonequilibrium solvation approach that polarization is modified upon vertical excitation using the solvent’s optical dielectric constant. Benchmark calculations are reported for well-studied models of photo-induced charge transfer, including naphthalene dimer, C 2 H 4 ⋯C 2 F 4 , pentacene dimer, and perylene diimide (PDI) dimer, several of which are important in organic photovoltaic applications. For the PDI dimer, we demonstrate that the charge-transfer character of the excited states is enhanced in the presence of a low-dielectric medium (static dielectric constant ɛ 0 = 3) as compared to a gas-phase calculation ( ɛ 0 = 1). This stabilizes mechanistic traps for singlet fission and helps to explain experimental singlet fission rates. We also examine the effects of nonequilibrium solvation on charge-separated states in an intramolecular singlet fission chromophore, where we demonstrate that the energetic ordering of the states changes as a function of solvent polarity. The RAS-SF + PCM methodology that is reported here provides a framework to study charge-separated states in solution and in photovoltaic materials.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    