Los puntos clave no están disponibles para este artículo en este momento.
Machine learning models are becoming the primary work-horses for many applications. Services deploy models through prediction serving systems that take in queries and return predictions by performing inference on models. Prediction serving systems are commonly run on many machines in cluster settings, and thus are prone to slowdowns and failures that inflate tail latency. Erasure coding is a popular technique for achieving resource-efficient resilience to data unavailability in storage and communication systems. However, existing approaches for imparting erasure-coded resilience to distributed computation apply only to a severely limited class of functions, precluding their use for many serving workloads, such as neural network inference.
Building similarity graph...
Analyzing shared references across papers
Loading...
Jack Kosaian
University of Michigan
K. V. Rashmi
ASA College
Shivaram Venkataraman
University of Wisconsin System
University of Wisconsin–Madison
Carnegie Mellon University
Building similarity graph...
Analyzing shared references across papers
Loading...
Kosaian et al. (Mon,) studied this question.
synapsesocial.com/papers/6a0fd17692676d5461fd1ee4 — DOI: https://doi.org/10.1145/3341301.3359654