arXiv cs.AI·19 May 2026

StyleText: A Large-Scale Dataset and Benchmark for Stylized Scene Text Inpainting

In three lines: StyleText is a dataset of 28,518 image-mask-prompt triplets for scene text inpainting with style preservation. An automated pipeline combines LLM templating, Flux with KV-cache injection, OCR, polygon mask extraction, and FluxFill augmentation. A FluxFill+LoRA baseline substantially improves OCR accuracy while maintaining scene style consistency.
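
To make the "image-mask-prompt triplet" and "polygon mask extraction" ideas concrete, here is a minimal sketch in plain Python. It assumes an OCR step has already returned a text polygon as a list of (x, y) vertices. The names `Triplet`, `polygon_to_mask`, and `point_in_polygon` are hypothetical and chosen for illustration; this is not the paper's implementation, and a real pipeline would more likely rasterize with an imaging library.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(polygon: List[Point], width: int, height: int) -> List[List[int]]:
    """Rasterize a polygon into a binary mask by testing each pixel center."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


@dataclass
class Triplet:
    """Hypothetical container for one image-mask-prompt sample."""
    image_path: str         # path to the scene image (illustrative)
    mask: List[List[int]]   # 1 marks the text region to inpaint
    prompt: str             # text describing what to render in the region


# Illustrative values only: a rectangular text region in a 6x4 image.
text_polygon = [(1, 1), (4, 1), (4, 3), (1, 3)]
sample = Triplet(
    image_path="scene_0001.png",
    mask=polygon_to_mask(text_polygon, width=6, height=4),
    prompt='a shop sign that reads "OPEN"',
)
```

Using a polygon rather than an axis-aligned box lets the mask follow rotated or perspective-distorted text, so less of the surrounding scene is overwritten during inpainting.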

Summary generated by Claude — human-verified
