Research Article: Latency-Aware RAG with Freshness-Constrained Caches

Received: Oct 2, 2024
Published: Nov 1, 2024
Author: Emma Hill

Abstract

We design a retrieval-augmented generation (RAG) controller that routes each query among cache hits, stale-but-usable snippets, and live retrieval according to freshness SLAs, cutting mean cost by 17% while holding accuracy stable.
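
The three-way routing described in the abstract can be illustrated with a minimal sketch. It is not the article's implementation: the names Route, CacheEntry, FreshnessSLA and route, and the two-threshold form of the SLA (a "fresh" age limit and a looser "usable" age limit), are assumptions made here for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Route(Enum):
    CACHE_HIT = "cache_hit"
    STALE_OK = "stale_but_usable"
    LIVE = "live_retrieval"


@dataclass
class CacheEntry:
    snippet: str
    fetched_at: float  # timestamp in seconds, same clock as `now`


@dataclass
class FreshnessSLA:
    # Hypothetical two-threshold SLA; the article may define it differently.
    fresh_for: float   # max age (s) at which an entry still counts as fresh
    usable_for: float  # max age (s) at which a stale entry may still be served


def route(entry: Optional[CacheEntry], sla: FreshnessSLA, now: float) -> Route:
    """Pick a route for one query from the age of its cached snippet."""
    if entry is None:
        return Route.LIVE       # nothing cached: must retrieve live
    age = now - entry.fetched_at
    if age <= sla.fresh_for:
        return Route.CACHE_HIT  # within the freshness SLA
    if age <= sla.usable_for:
        return Route.STALE_OK   # stale, but inside the tolerated window
    return Route.LIVE           # too old: fall back to live retrieval


if __name__ == "__main__":
    sla = FreshnessSLA(fresh_for=60.0, usable_for=300.0)
    entry = CacheEntry(snippet="example", fetched_at=100.0)
    for now in (130.0, 300.0, 500.0):
        print(now, route(entry, sla, now).value)
```

Under these assumptions, the cheapest route is tried first and live retrieval is the fallback, which is one plausible way a cost reduction could arise; the actual controller may also weigh latency or per-query cost, which this sketch omits.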

Cite this article

Hill, E. (2024). Research Article: Latency-Aware RAG with Freshness-Constrained Caches. Research Explorations in Global Knowledge & Technology (REGKT), 3(10). Retrieved from https://regkt.com/article.php?id=281&slug=latency-aware-rag-freshness-constrained-caches
