Snafu: An Ultra-Low-Power, Energy-Minimal CGRA-Generation Framework and Architecture

Gobieski, Graham; Atli, Ahmet Oguz; Mai, Kenneth; Lucia, Brandon; Beckmann, Nathan

doi:10.1109/ISCA52012.2021.00084

Ultra-low-power (ULP) devices are becoming pervasive, enabling many emerging sensing applications. Energy-efficiency is paramount in these applications, as efficiency determines device lifetime in battery-powered deployments and performance in energy-harvesting deployments. Unfortunately, existing designs fall short because ASICs’ upfront costs are too high and prior ULP architectures are too inefficient or inflexible.We present Snafu, the first framework to flexibly generate ULP coarse-grain reconfigurable arrays (CGRAs). Snafu provides a standard interface for processing elements (PE), making it easy to integrate new types of PEs for new applications. Unlike prior high-performance, high-power CGRAs, Snafu is designed from the ground up to minimize energy consumption while maximizing flexibility. Snafu saves energy by configuring PEs and routers for a single operation to minimize switching activity; by minimizing buffering within the fabric; by implementing a statically routed, bufferless, multi-hop network; and by executing operations in-order to avoid expensive tag-token matching.We further present Snafu-Arch, a complete ULP system that integrates an instantiation of the Snafu fabric alongside a scalar RISC-V core and memory. We implement Snafu in RTL and evaluate it on an industrial sub-28 nm FinFET process across a suite of common sensing benchmarks. Snafu-Arch operates at <1 mW, orders-of-magnitude less power than most prior CGRAs. Snafu-Arch uses 41% less energy and runs 4.4× faster than the prior state-of-the-art general-purpose ULP architecture. Moreover, we conduct three comprehensive case-studies to quantify the cost of programmability in Snafu. We find that Snafu-Arch is close to ASIC designs built in the same technology, using just 2.6× more energy on average.

More Like this