Research Article: Adaptive Caching for Model Inference Routers
Abstract
We introduce feedback-aware cache replacement for model routers, achieving 17% lower latency under burst traffic.
Cite this article
Lee, N., Jackson, O., & Smith, S. (2022). Research Article: Adaptive Caching for Model Inference Routers. Research Explorations in Global Knowledge & Technology (REGKT), 1(11). Retrieved from https://regkt.com/article.php?id=661&slug=adaptive-caching-for-model-inference-routers