Reddit r/MachineLearning

I scraped over 2 million job postings across 100,000+ company career sites into a unified, daily-updated dataset. [P]

Signal: 65
Hype: 25
In three lines
A user built a large-scale scraping pipeline that aggregates 2M+ active job postings from 100,000+ company career sites. The dataset is in Parquet format, refreshed daily, and freely accessible, with standard fields (title, company, description, location, URL).
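To make the record shape concrete, here is a minimal sketch of one row with the five fields named in the summary, plus a simple title filter. The class name `JobPosting`, the helper `filter_by_keyword`, the lower-case column names, and the sample rows are all assumptions for illustration; the real dataset's schema may differ.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class JobPosting:
    """Hypothetical row type using the five fields named in the summary."""
    title: str
    company: str
    description: str
    location: str
    url: str


def filter_by_keyword(postings, keyword):
    """Return postings whose title contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [p for p in postings if needle in p.title.lower()]


# Made-up sample rows for illustration; not taken from the real dataset.
sample = [
    JobPosting("ML Engineer", "ExampleCo", "Train models.", "Remote",
               "https://example.com/jobs/1"),
    JobPosting("Data Analyst", "ExampleCo", "Build dashboards.", "Berlin",
               "https://example.com/jobs/2"),
]

column_names = [f.name for f in fields(JobPosting)]
ml_jobs = filter_by_keyword(sample, "ml")
```

With the actual Parquet file, the same columns could be loaded with `pandas.read_parquet` (which needs `pyarrow` or `fastparquet` installed) and filtered in much the same way.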
Tools · Infrastructure · Open source

Summary generated by Claude — human-verified