AI applications powered by deep learning inference are
increasingly run natively on edge devices to provide a better
interactive user experience. This often necessitates fitting a
model originally designed and trained in the cloud to edge
devices with a range of hardware capabilities, which so far
has relied on time-consuming manual effort.
In this paper, we quantify the challenges of manually generating
a large number of compressed models and then build
a system framework, Mistify, to automatically port a cloud-based
model to a suite of models for edge devices targeting
various points in the design space. Mistify adds an intermediate
“layer” that decouples the model design and deployment
phases. By exposing configuration APIs to obviate the need
for code changes deeply embedded into the original model,
Mistify hides run-time issues from model designers and hides
the model internals from model users, reducing the expertise
needed for either role. For better scalability, Mistify consolidates
multiple model tailoring requests to minimize repeated
computation. Further, Mistify leverages locally available edge
data in a privacy-aware manner, and performs run-time model
adaptation to provide scalable edge support and accurate inference
results. Extensive evaluation shows that Mistify reduces
the DNN porting time needed by over 10x to cater to a wide
spectrum of edge deployment scenarios, incurring orders of
magnitude less manual effort.