Case Study: GPU Quota Markets for Shared Model Serving
Abstract
A marketplace introduced internal GPU quota markets with preemption and buybacks, improving utilization by 14% while protecting latency SLOs.
Cite this article
Smith, M. & Brown, O. (2024). Case Study: GPU Quota Markets for Shared Model Serving. Research Explorations in Global Knowledge & Technology (REGKT), 3(10). Retrieved from https://regkt.com/article.php?id=284&slug=case-study-gpu-quota-markets-for-shared-model-serving