Research Article: Compiler-Guided Quantization for Transformer Inference

Received: Oct 10, 2025
Published: Oct 30, 2025
Authors: Aisha Rivera

Abstract

We present a compiler pass that selects per-layer quantization schemes via profile-guided search, yielding a 1.32× speedup with <0.3% accuracy loss across five LLMs.

Cite this article

Rivera, A. (2025). Research Article: Compiler-Guided Quantization for Transformer Inference. Research Explorations in Global Knowledge & Technology (REGKT), 4 (1). Retrieved from https://regkt.com/article.php?id=206&slug=compiler-guided-quantization-transformer-inference
