Perspective: Energy-Aware LLM Inference in Edge Environments
Abstract
We discuss scheduling, quantization, and speculative decoding for energy-aware LLM inference at the edge. A cost model shows 21–38% energy savings with negligible accuracy impact across multilingual assistants.
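To illustrate how such a cost model can combine the three levers (quantization, speculative decoding, and per-token energy), here is a minimal toy sketch. All parameters (`j_per_token`, the draft-token cost, the acceptance rate, and the quantization factor) are invented for illustration and are not the article's calibrated values:

```python
# Toy energy cost model: illustrative only; all parameters are assumptions,
# not the article's calibrated measurements.
def inference_energy_j(tokens, j_per_token, acceptance_rate=0.0, quant_factor=1.0):
    """Estimate energy (joules) to generate `tokens` tokens.

    acceptance_rate: fraction of tokens produced by accepted speculative
        drafts, each modeled as costing a fraction of a full forward pass.
    quant_factor: relative per-token cost after quantization (<1 = cheaper).
    """
    draft_cost = 0.2  # assumed relative cost of one draft-model token
    effective_tokens = tokens * ((1 - acceptance_rate) + acceptance_rate * draft_cost)
    return effective_tokens * j_per_token * quant_factor

baseline = inference_energy_j(1000, j_per_token=0.5)
optimized = inference_energy_j(1000, j_per_token=0.5,
                               acceptance_rate=0.3, quant_factor=0.9)
savings = 1 - optimized / baseline
print(f"savings: {savings:.0%}")
```

Under these assumed parameters the toy model lands in the savings range the abstract reports, but the point is only the structure: savings multiply across quantization and speculative decoding rather than adding.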
Cite this article
Müller, F. & Kowalska, J. (2025). Perspective: Energy-Aware LLM Inference in Edge Environments. Research Explorations in Global Knowledge & Technology (REGKT), 3 (3). Retrieved from https://regkt.com/article.php?id=108&slug=perspective-energy-aware-llm-inference-in-edge-environments