ztok — a fast multithreaded tokenizer in Zig that loads tiktoken / HF / SentencePiece and is 2–5× faster
Signal: 78
Hype: 25
In three lines
ztok is a multithreaded tokenizer library written in Zig, 2–5× faster than tiktoken/HF/SentencePiece.
It loads the tiktoken, HF tokenizer.json, SentencePiece, TokenMonster, and Mistral Tekken formats.
Its output is bit-identical to the reference implementations, it has 8 language bindings, and it is optimized for RAG and dataset tokenization.
Summary generated by Claude — human-verified