1,000 Scientist AI Jam Session
OpenAI and nine national labs host first-of-its-kind event bringing together 1,000 leading scientists. No technical details or concrete outcomes disclosed in the announcement.
37 articles
Arize Phoenix enables tracing and evaluation of AI agents. The tool provides visibility into API calls, agent decisions, and performance metrics, and integrates with popular frameworks for production monitoring.
Mercari integrates GPT-4o mini and GPT-4 to enhance product listings and assist sellers. New features include AI Listing Support and Mercari AI Assistant, designed to boost sales on the marketplace platform.
OpenAI releases research preview of GPT-4.5, described as its largest and most capable model to date. No technical details, benchmarks, or general availability timeline provided in excerpt.
Endex builds an autonomous financial analyst using OpenAI's o1 and o3-mini reasoning models. The models enable advanced financial analysis without manual intervention.
Hugging Face and Indian Institute of Science (IISc) partner to advance AI model development for Indian languages. The collaboration aims to create resources and tools tailored to India's linguistic diversity.
Google DeepMind makes Gemini 2.0 Flash-Lite generally available for production use through the Gemini API in Google AI Studio and, for enterprise customers, in Vertex AI.
OpenAI releases a System Card for "deep research" detailing safety work: external red teaming, frontier risk evaluations per Preparedness Framework, and mitigations for key risk areas prior to launch.
Hugging Face releases FastRTC, a Python library for real-time communication. It simplifies building audio/video applications with native WebRTC support and AI model integration.
Hugging Face integrates remote VAEs (Variational Autoencoders) into Inference Endpoints for decoding. This feature enables using remotely hosted VAE models without local loading, optimizing resource usage and latency.
OpenAI announces measures against malicious AI use, including deepfake detection, abuse filtering, and law enforcement collaboration. No technical details or impact metrics provided in excerpt.
Google's SigLIP 2, an improved multilingual vision-language encoder, arrives on Hugging Face. The model delivers better performance on vision and multilingual understanding tasks compared to its predecessor.
Hugging Face releases SmolVLM2, a lightweight multimodal vision model capable of processing videos and images. Optimized for mobile and edge devices, it provides an accessible alternative to large vision models.
Google releases PaliGemma 2 Mix, a family of instruction-tuned vision-language models based on Gemma 2. Three variants (3B, 10B, 28B) combine visual and textual capabilities for multimodal tasks. The models are openly available on Hugging Face.
OpenAI introduces SWE-Lancer, a benchmark measuring frontier LLMs' ability to complete real-world freelance software engineering tasks. The tasks carry $1 million in total real-world payouts, so a model is scored by how much of that money it would have earned.
Hugging Face adds three new serverless inference providers: Hyperbolic, Nebius AI Studio, and Novita. These integrations expand model deployment options on the Hugging Face platform.
OpenAI and Guardian Media Group announce content partnership to integrate Guardian news articles into ChatGPT. Users will access Guardian journalism directly within ChatGPT's interface.
Fireworks.ai joins Hugging Face Hub. The inference platform, which specializes in open-source models, integrates with the Hub ecosystem to streamline model deployment and access.
Hugging Face fixes math scoring on its Open LLM Leaderboard by integrating Math-Verify, a library for parsing and verifying mathematical answers, so that models' reasoning capabilities are evaluated more accurately. The change addresses limitations of the previous metrics.
Fanatics Betting and Gaming leverages AI to enhance financial strategy and operations. CFO Andrea Ellis discusses how the company deploys AI tools to analyze large-scale betting and gaming data, improving strategic decision-making across the business.
Rogo scales AI-driven financial research using OpenAI o1 for deeper analysis of complex financial data. The reasoning model enables automated financial analysis at scale. The piece is a production deployment case study of o1 adoption in fintech.
Hugging Face announces crossing 1 billion classifications on its platform. This milestone reflects growing adoption of AI models for classification tasks in production.
OpenAI releases an updated Model Spec, the document defining expected behaviors for its models. This specification guides development and evaluation of capabilities and safety boundaries.
Hugging Face publishes a guide for building high-quality datasets for training video generation models. The article covers best practices for data curation, annotation, and organization.
Hugging Face optimizes uploads and downloads on the Hub by replacing chunks with blocks. This architecture reduces latency and improves stability for large file transfers.
Hugging Face releases Open R1 Update #2, advancing its open-source reasoning model. The update improves performance and reasoning capabilities on complex tasks.
OpenAI partners with Schibsted Media Group to integrate Schibsted's news and archive content into ChatGPT. The deal is a media content distribution partnership.
Hugging Face releases the Open Arabic LLM Leaderboard 2, a ranking system evaluating Arabic language models on standardized benchmarks. The initiative measures performance in Arabic comprehension, generation, and reasoning tasks.
OpenAI introduces data residency in Europe, strengthening its enterprise-grade data privacy, security, and compliance programs for customers worldwide.
OpenAI and the California State University (CSU) system deploy ChatGPT to 500,000 students and faculty. The largest ChatGPT deployment to date aims to expand AI use in education and build an AI-ready workforce in the United States.
Hugging Face brings π0 and π0-FAST, Physical Intelligence's vision-language-action models for general robot control, to its LeRobot library. These models unify visual perception, natural language understanding, and action generation, trained on diverse robotic data to execute complex tasks without task-specific fine-tuning.
OpenAI showcases a custom math tutor powered by ChatGPT, demonstrating how to adapt the model for education through specialized prompts and conversational interactions. No performance metrics or benchmarks included in the excerpt.
Hugging Face releases open-source DeepResearch, an autonomous research agent that conducts in-depth investigations on complex topics. The tool integrates web search, information synthesis, and multi-step reasoning to generate detailed reports without human intervention.
Hugging Face introduces DABStep (Data Agent Benchmark for Multi-step Reasoning), a benchmark for evaluating AI agents on multi-step reasoning. It measures models' ability to decompose complex tasks and iteratively use tools to solve problems.
OpenAI launches Deep Research, an agent that uses reasoning to synthesize online information and complete multi-step research tasks. It is available to Pro users at launch, with Plus and Team users to follow.
OpenAI showcases deep research feature helping Bain & Company analyze complex industry trends. No technical details, benchmarks, or quantified results provided in excerpt.
Hugging Face releases Update #1 on Open-R1, an open-source reasoning model project. The update covers progress and future directions for the initiative.