Short Communication: UTF-8 Fast Path for JSON Tokenizers

Type: Short communication
Received: Dec 14, 2023
Published: Dec 31, 2023
Authors: Ishan Sanchez

Abstract

A branchless UTF-8 validation fast path reduces the tokenizer's CPU usage by roughly 10% on multilingual logs.
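
The abstract does not describe the implementation, so the following is only a minimal sketch of one common way to build such a fast path, not the author's method. It assumes a word-at-a-time ASCII check: eight bytes are tested with a single mask operation, and only chunks containing a non-ASCII byte fall through to a full scalar validator. The names `HIGH_BITS`, `_validate_scalar`, and `is_valid_utf8` are hypothetical. Python is used to show the logic only; the sketch is not itself branchless, and a production tokenizer would more plausibly do this in C or Rust, possibly with SIMD.

```python
# Hypothetical sketch: word-at-a-time ASCII fast path with a scalar fallback.
HIGH_BITS = 0x8080808080808080  # top bit of each byte in a 64-bit word


def _validate_scalar(buf: bytes, i: int) -> int:
    """Validate one UTF-8 sequence at buf[i]; return its length, or 0 if invalid."""
    b0 = buf[i]
    if b0 < 0x80:
        return 1
    # For each lead byte: number of continuation bytes and the allowed
    # range of the FIRST continuation byte (per the UTF-8 well-formedness table).
    if 0xC2 <= b0 <= 0xDF:
        need, lo, hi = 1, 0x80, 0xBF
    elif b0 == 0xE0:
        need, lo, hi = 2, 0xA0, 0xBF  # rejects overlong 3-byte forms
    elif b0 == 0xED:
        need, lo, hi = 2, 0x80, 0x9F  # rejects UTF-16 surrogates
    elif 0xE1 <= b0 <= 0xEF:
        need, lo, hi = 2, 0x80, 0xBF
    elif b0 == 0xF0:
        need, lo, hi = 3, 0x90, 0xBF  # rejects overlong 4-byte forms
    elif 0xF1 <= b0 <= 0xF3:
        need, lo, hi = 3, 0x80, 0xBF
    elif b0 == 0xF4:
        need, lo, hi = 3, 0x80, 0x8F  # rejects code points above U+10FFFF
    else:
        return 0  # 0x80..0xC1 and 0xF5..0xFF are never valid lead bytes
    if i + need >= len(buf):
        return 0  # sequence is truncated
    if not lo <= buf[i + 1] <= hi:
        return 0
    for k in range(2, need + 1):
        if not 0x80 <= buf[i + k] <= 0xBF:
            return 0
    return need + 1


def is_valid_utf8(buf: bytes) -> bool:
    """Return True if buf is well-formed UTF-8."""
    i, n = 0, len(buf)
    while i < n:
        # Fast path: if no byte in the next 8 has its high bit set,
        # the whole chunk is ASCII and needs no further checks.
        if i + 8 <= n:
            word = int.from_bytes(buf[i:i + 8], "little")
            if (word & HIGH_BITS) == 0:
                i += 8
                continue
        # Slow path: validate a single sequence, then retry the fast path.
        step = _validate_scalar(buf, i)
        if step == 0:
            return False
        i += step
    return True
```

On mostly-ASCII JSON, nearly all input takes the eight-byte path. On multilingual text the scalar fallback runs more often, which is where a wider or vectorized check would matter most.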


Cite this article

Sanchez, I. (2023). Short Communication: UTF-8 Fast Path for JSON Tokenizers. Research Explorations in Global Knowledge & Technology (REGKT), 2(10). Retrieved from https://regkt.com/article.php?id=575&slug=short-communication-utf8-fast-path-json-tokenizers
