Short Communication: GPU-Aware Scheduling for Kubernetes Clusters
Abstract
We add GPU topology awareness to Kubernetes scheduler plugins, reducing inter-GPU communication latency by 19% in multi-node AI workloads.
Cite this article
O'Leary, G. & Xu, C. (2025). Short Communication: GPU-Aware Scheduling for Kubernetes Clusters. Research Explorations in Global Knowledge & Technology (REGKT), 4(1). Retrieved from https://regkt.com/article.php?id=185&slug=short-communication-gpu-aware-scheduling-for-kubernetes-clusters