We present FireSim, an open-source simulation platform that enables cycle-exact microarchitectural simulation of large scale-out clusters by combining FPGA-accelerated simulation of silicon-proven RTL designs with a scalable, distributed network simulation. Unlike prior FPGA-accelerated simulation tools, FireSim runs on Amazon EC2 F1, a public cloud FPGA platform, which greatly improves usability, provides elasticity, and lowers the cost of large-scale FPGA-based experiments. We describe the design and implementation of FireSim and show how it can provide sufficient performance to run modern applications at scale, to enable true hardware-software co-design. As an example, we demonstrate automatically generating and deploying a target cluster of 1,024 3.2 GHz quad-core server nodes, each with 16 GB of DRAM, interconnected by a 200 Gbit/s network with 2 microsecond latency, which simulates at a 3.4 MHz processor clock rate (less than 1,000x slowdown over real-time). In aggregate, this FireSim instantiation simulates 4,096 cores and 16 TB of memory, runs ~ 14 billion instructions per second, and harnesses 12.8 million dollars worth of FPGAs-at a total cost of only ~ $100 per simulation hour to the user. We present several examples to show how FireSim can be used to explore various research directions in warehouse-scale machine design, including modelingmore »
EdgeNet: A Lightweight Scalable Edge Cloud
This paper describes EdgeNet, a lightweight cloud infrastructure for the edge. We aim to bring as much of the flexibility of open cloud computing as possible to a very lightweight, easily-deployed, software-only edge infrastructure.
EdgeNet has been informed by the advances of cloud computing and the successes of such distributed systems as PlanetLab, GENI, G-Lab, SAVI, and V-Node: a large number of small points-of-presence, designed for the deployment of highly distributed experiments and applications. EdgeNet differs from its predecessors in two significant areas: first, it is a software-only infrastructure, where each worker node is designed to run part- or full-time on existing hardware at the local site; and, second, it uses modern, industry-standard software both as the node agent and the control framework. The first innovation permits rapid and unlimited scaling: whereas GENI and PlanetLab required the installation and maintenance of dedicated hardware at each site, EdgeNet requires only a software download, and a node can be added to the EdgeNet infrastructure in 15 minutes. The second offers performance, maintenance, and training benefits; rather than maintaining bespoke kernels and control frameworks, and developing training materials on using the latter, we are able to ride the wave of open-source and industry development, more »
- Award ID(s):
- 1820901
- Publication Date:
- NSF-PAR ID:
- 10097311
- Journal Name:
- Symposium on Edge Computing
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The Internet of Things (IoT) requires distributed, large scale data collection via geographically distributed devices. While IoT devices typically send data to the cloud for processing, this is problematic for bandwidth constrained applications. Fog and edge computing (processing data near where it is gathered, and sending only results to the cloud) has become more popular, as it lowers network overhead and latency. Edge computing often uses devices with low computational capacity, therefore service frameworks and middleware are needed to efficiently compose services. While many frameworks use a top-down perspective, quality of service is an emergent property of the entire system and often requires a bottom up approach. We define services as multi-modal, allowing resource and performance tradeoffs. Different modes can be composed to meet an application's high level goal, which is modeled as a function. We examine a case study for counting vehicle traffic through intersections in Nashville. We apply object detection and tracking to video of the intersection, which must be performed at the edge due to privacy and bandwidth constraints. We explore the hardware and software architectures, and identify the various modes. This paper lays the foundation to formulate the online optimization problem presented by the system whichmore »
-
The rapid growth in technology and wide use of internet has increased smart applications such as intelligent transportation control system, and Internet of Things, which heavily rely on an efficient and reliable connectivity network. To overcome high bandwidth work load on the network, as well as minimize latency for real-time applications, the computation can be moved from the central cloud to a distributed edge cloud. The edge computing benefits various smart applications that uses distributed network for data analytics and services. Different from the existing cloud management solutions, edge computing needs to move cloud management services towards distributed heterogeneous edge nodes for multi-tenant user applications. However, existing cloud management services do not offer remote deployment of multi-tenant user applications on the cloud of edge nodes. In this paper, we propose a practical edge cloud software framework for deploying multi-tenant distributed smart applications. Having multiple distributed end nodes, auto discovery of all active end nodes is required for deploying multi-tenant user applications. However, existing cloud solutions require either private network or fixed IP address, which is not achievable for the distributed edge nodes. Most of the edge nodes connected through the public internet without fixed IP, and some of them evenmore »
-
The migration of infrastructure from on premise installation and maintenance of computing resources to cloud based systems by business of all sizes has been an ongoing event for several years. To minimize capital expenses and allow for demand based operational expenses has increased the need for cloud practitioners with the ability to create and control these resources. The demand for skilled cloud workers ranging from developers to architects has been increasing, and one way to increase the technicians available for these job skills is to start recruitment as early as high school. For high school students interested in the technical side of STEM pathways, the ability to understand, design and work in a cloud environment is now part of critical technical skills. Fluency in cloud and cloud environments, the ability to understand the capabilities of all these modern technologies are necessary technical skills. To support this growing demand of cloud skills, the institution partnered with Amazon Web Services (AWS), the industry leader in cloud computing solutions, to train high school students as early cloud adopters and to be well-prepared for the computing/IT workforce of tomorrow. This academic-industry partnership aims to raise cloud literacy in K-12 by offering a two-week cloudmore »
-
Edge application’s distributed nature presents significant challenges for developers in orchestrating and managing the multitenant applications. In this paper, we propose a practical edge cloud software framework for deploying multitenant distributed smart applications. Here we exploit commodity, a low cost embedded board to form distributed edge clusters. The cluster of geo-distributed and wireless edge nodes not only power multitenant IoT applications that are closer to the data source and the user, but also enable developers to remotely deploy and orchestrate application containers over the cloud. Specifically, we propose building a software platform to manage the distributed edge nodes along with support services to deploy and launch isolated and multitenant user applications through a lightweight container. In particular, we propose an architectural solution to improve the resilience of edge cloud services through peer collaborated service migration when the failures happen or when resources are overburdened. We focus on giving the developers a single point control of the infrastructure over the intermittent and lossy wide area networks (WANs) and enabling the remote deployment of multitenant applications.