arXiv cs.CL · 1 June 2026

ImmigrationQA: A Source-Grounded Dataset and Small-Model Adaptation for U.S. Immigration Law

In three lines: ImmigrationQA is a source-grounded QA dataset of 17,058 pairs across 13 U.S. immigration law subdomains. Llama 3.2 3B was fine-tuned with LoRA on a corpus of 10,056 validated documents. The fine-tuned model scores 1.08/3.0 (16.8% fully correct) versus 0.85/3.0 (4% fully correct) for the Llama 3 8B base model, a 27% relative improvement in mean score. Reported cost: ~$29. The dataset, model, and code are publicly released.
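The 27% figure can be checked from the two mean scores quoted above. The snippet below is a minimal sketch of that arithmetic, assuming the relative improvement is computed on the mean scores (out of 3.0) and not on the fully-correct rates; the function and variable names are illustrative and do not come from the paper's code.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Return (new - baseline) / baseline as a fraction."""
    if baseline == 0:
        raise ValueError("baseline must be nonzero")
    return (new - baseline) / baseline


# Figures quoted in the summary (mean score out of 3.0).
fine_tuned_mean = 1.08  # fine-tuned Llama 3.2 3B
base_mean = 0.85        # Llama 3 8B base

# Fully-correct rates quoted in the summary, in percent.
fine_tuned_full = 16.8
base_full = 4.0

rel = relative_improvement(fine_tuned_mean, base_mean)
full_gap_points = fine_tuned_full - base_full  # percentage-point gap
full_ratio = fine_tuned_full / base_full       # ratio of fully-correct rates

print(f"Relative improvement in mean score: {rel:.1%}")
print(f"Fully-correct gap: {full_gap_points:.1f} percentage points")
print(f"Fully-correct ratio: {full_ratio:.1f}x")
```

The mean-score comparison rounds to the quoted 27%. The fully-correct rates differ by a much larger factor, so the two metrics should not be read interchangeably.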

Llama Fine-tuning RAG Benchmarks Open source

Summary generated by Claude — human-verified
