Reddit r/LocalLLaMA

numind/NuExtract3 · Hugging Face

Signal: 75
Hype: 25
In three lines: NuExtract3 is a 4B vision-language model for document understanding. It combines structured extraction (text/images + JSON template → JSON output) and image-to-Markdown conversion, with multilingual support and reasoning/non-reasoning modes. Available in GGUF, NVFP4, MLX, and vLLM.
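To make the "JSON template → JSON output" contract concrete, here is a minimal Python sketch of what checking a model response against a template could look like. It does not call the model. The field names, the type labels ("string", "number", "integer"), and the helper `matches_template` are all hypothetical and chosen for illustration; they are not taken from NuExtract3's documentation, so check the model card for the actual template syntax.

```python
import json

# Hypothetical template: keys name the fields to extract, values are
# illustrative type labels (an assumption, not NuExtract3's documented syntax).
TEMPLATE = {
    "invoice_number": "string",
    "total": "number",
    "line_items": [{"description": "string", "quantity": "integer"}],
}


def matches_template(template, output):
    """Return True if `output` has the same nested shape as `template`.

    Only structure is checked: dict keys must match exactly, a one-element
    template list describes every item of the output list, and leaves may be
    any JSON scalar or null. The type labels themselves are not enforced.
    """
    if isinstance(template, dict):
        return (
            isinstance(output, dict)
            and set(output) == set(template)
            and all(matches_template(template[k], output[k]) for k in template)
        )
    if isinstance(template, list):
        if not isinstance(output, list):
            return False
        if not template:  # empty template list: accept any list
            return True
        return all(matches_template(template[0], item) for item in output)
    # Leaf value: accept null or any JSON scalar.
    return output is None or isinstance(output, (str, int, float, bool))


# Example response text, written by hand for this sketch (not real model output).
response_text = (
    '{"invoice_number": "A-17", "total": 42.5, '
    '"line_items": [{"description": "widget", "quantity": 2}]}'
)
is_valid = matches_template(TEMPLATE, json.loads(response_text))
```

A shape check like this is one way to catch a response that drops or renames a field before it reaches downstream code. Whether it is needed depends on how strictly the model follows the template in practice.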
Tags: Vision, RAG, Code generation, Open source, Tools

Summary generated by Claude — human-verified