February 2025

37 articles

1,000 Scientist AI Jam Session

OpenAI and nine national labs host a first-of-its-kind event bringing together 1,000 leading scientists. The announcement discloses no technical details or concrete outcomes.

OpenAI Business

Hugging Face Blog·Feb 28

Trace & Evaluate your Agent with Arize Phoenix

Arize Phoenix enables tracing and evaluation of AI agents. The tool provides visibility into API calls, agent decisions, and performance metrics, and integrates with popular frameworks for production monitoring.

AI Agents Evals Tools

OpenAI Blog·Feb 27

Supporting sellers with enhanced product listings

Mercari integrates GPT-4o mini and GPT-4 to enhance product listings and assist sellers. New features include AI Listing Support and Mercari AI Assistant, designed to boost sales on the marketplace platform.

GPT OpenAI Business

OpenAI Blog·Feb 27

OpenAI GPT-4.5 System Card

OpenAI releases research preview of GPT-4.5, described as its largest and most capable model to date. No technical details, benchmarks, or general availability timeline provided in excerpt.

OpenAI GPT

OpenAI Blog·Feb 27

Building an autonomous financial analyst with o1 and o3-mini

Endex builds an autonomous financial analyst using OpenAI's o1 and o3-mini reasoning models. The models enable advanced financial analysis without manual intervention.

OpenAI Reasoning AI Agents

Hugging Face Blog·Feb 27

HuggingFace, IISc partner to supercharge model building on India's diverse languages

Hugging Face and Indian Institute of Science (IISc) partner to advance AI model development for Indian languages. The collaboration aims to create resources and tools tailored to India's linguistic diversity.

Open source

Google DeepMind·Feb 25

Start building with Gemini 2.0 Flash and Flash-Lite

Google DeepMind makes Gemini 2.0 Flash-Lite generally available for production use through the Gemini API in Google AI Studio and, for enterprise customers, through Vertex AI.

Gemini Tools

OpenAI Blog·Feb 25

Deep research System Card

OpenAI releases a System Card for "deep research" detailing safety work: external red teaming, frontier risk evaluations per Preparedness Framework, and mitigations for key risk areas prior to launch.

OpenAI AI safety Evals

Hugging Face Blog·Feb 25

FastRTC: The Real-Time Communication Library for Python

Hugging Face releases FastRTC, a Python library for real-time communication. It simplifies building audio/video applications with native WebRTC support and AI model integration.

Tools Infrastructure Voice

Hugging Face Blog·Feb 24

Remote VAEs for decoding with Inference Endpoints 🤗

Hugging Face integrates remote VAEs (Variational Autoencoders) into Inference Endpoints for decoding. This feature enables using remotely hosted VAE models without local loading, optimizing resource usage and latency.

Infrastructure Tools Open source

OpenAI Blog·Feb 21

Disrupting malicious uses of AI

OpenAI announces measures against malicious AI use, including deepfake detection, abuse filtering, and law enforcement collaboration. No technical details or impact metrics provided in excerpt.

OpenAI AI safety Regulation

Hugging Face Blog·Feb 21

SigLIP 2: A better multilingual vision language encoder

Google releases SigLIP 2, an improved multilingual vision-language encoder, available on Hugging Face. The model outperforms its predecessor on vision and multilingual understanding tasks.

Vision Embeddings Open source

Hugging Face Blog·Feb 20

SmolVLM2: Bringing Video Understanding to Every Device

Hugging Face releases SmolVLM2, a lightweight multimodal vision model capable of processing videos and images. Optimized for mobile and edge devices, it provides an accessible alternative to large vision models.

Vision Open source Tools

Hugging Face Blog·Feb 19

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

Google releases PaliGemma 2 Mix, a family of instruction-tuned vision-language models based on Gemma 2. Three variants (3B, 10B, 28B) combine visual and textual capabilities for multimodal tasks. Available open-source on Hugging Face.

Gemini Vision Open source

OpenAI Blog·Feb 18

Introducing the SWE-Lancer benchmark

OpenAI introduces SWE-Lancer, a benchmark measuring how well frontier LLMs complete real-world freelance software engineering tasks. Each task carries a real payout, so the benchmark asks how much of roughly $1 million in total task value a model can earn.

OpenAI Benchmarks Code generation
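
A payout-weighted score of the kind the SWE-Lancer summary describes can be sketched as follows. This is an illustrative sketch only, not OpenAI's evaluation harness: the `TaskResult` type, its field names, and the sample payouts are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TaskResult:
    """Hypothetical record for one freelance task attempt."""
    task_id: str
    payout_usd: float  # what the task paid on the freelance market
    passed: bool       # whether the model's solution passed the task's checks


def earnings(results: Sequence[TaskResult]) -> float:
    """Total payout of the tasks the model solved."""
    return sum(r.payout_usd for r in results if r.passed)


def earn_rate(results: Sequence[TaskResult]) -> float:
    """Share of the total available payout that the model earned."""
    total = sum(r.payout_usd for r in results)
    return earnings(results) / total if total else 0.0


def pass_rate(results: Sequence[TaskResult]) -> float:
    """Plain fraction of tasks solved, ignoring payouts."""
    return sum(r.passed for r in results) / len(results) if results else 0.0


# Made-up sample data for illustration.
sample = [
    TaskResult("bugfix-a", 250.0, True),
    TaskResult("feature-b", 1000.0, False),
    TaskResult("refactor-c", 750.0, True),
]
```

Weighting by payout means that solving many cheap tasks does not hide failures on expensive ones: in the sample above the model solves two of three tasks but earns only half of the available money.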

Hugging Face Blog·Feb 18

Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥

Hugging Face adds three new serverless inference providers: Hyperbolic, Nebius AI Studio, and Novita. These integrations expand model deployment options on the Hugging Face platform.

Infrastructure Tools Open source

OpenAI Blog·Feb 14

OpenAI and Guardian Media Group launch content partnership

OpenAI and Guardian Media Group announce content partnership to integrate Guardian news articles into ChatGPT. Users will access Guardian journalism directly within ChatGPT's interface.

OpenAI Business

Hugging Face Blog·Feb 14

Welcome Fireworks.ai on the Hub 🎆

Fireworks.ai joins the Hugging Face Hub. The inference platform, which specializes in open-source models, integrates with the Hugging Face ecosystem to streamline model deployment and access.

Open source Infrastructure Tools

Hugging Face Blog·Feb 14

Fixing Open LLM Leaderboard with Math-Verify

Hugging Face fixes its Open LLM Leaderboard by integrating Math-Verify, a mathematical verification method to more accurately evaluate language models' reasoning capabilities. This improvement addresses limitations of previous metrics.

Benchmarks Evals Reasoning
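
Why verification-based grading can change leaderboard scores is easiest to see in a small example. The sketch below is not the Math-Verify library or its API; it is a standard-library illustration, with hypothetical function names, of comparing answers by value instead of by exact string.

```python
from fractions import Fraction
from typing import Optional


def normalize_answer(text: str) -> Optional[Fraction]:
    """Parse answers such as '1/2', '0.5', or '50%' into an exact Fraction."""
    cleaned = text.strip().rstrip(".").replace(",", "").replace("$", "")
    is_percent = cleaned.endswith("%")
    if is_percent:
        cleaned = cleaned[:-1]
    try:
        value = Fraction(cleaned)
    except (ValueError, ZeroDivisionError):
        return None  # not a plain number; the caller falls back to strings
    return value / 100 if is_percent else value


def string_match(pred: str, gold: str) -> bool:
    """Naive grading: the answer must match the reference character by character."""
    return pred.strip() == gold.strip()


def verified_match(pred: str, gold: str) -> bool:
    """Grade by numeric value when both sides parse, else by exact string."""
    p, g = normalize_answer(pred), normalize_answer(gold)
    if p is None or g is None:
        return string_match(pred, gold)
    return p == g
```

With exact string matching, a model that answers `0.5` when the reference is `1/2` is marked wrong; comparing parsed values marks it correct. A real verifier also has to handle symbolic expressions and LaTeX, which this sketch does not attempt.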

OpenAI Blog·Feb 13

Fanatics Betting and Gaming uses AI to focus on the big picture

Fanatics Betting and Gaming leverages AI to enhance financial strategy and operations. CFO Andrea Ellis discusses how the company deploys AI tools to analyze large-scale betting and gaming data, improving strategic decision-making across the business.

Business OpenAI

OpenAI Blog·Feb 13

Using OpenAI o1 for financial analysis

Rogo scales AI-driven financial research using OpenAI o1 for deeper analysis of complex financial data. The reasoning model enables automated financial analysis at scale. The post is a case study of a production o1 deployment in fintech.

OpenAI Reasoning Business

Hugging Face Blog·Feb 13

1 Billion Classifications

Hugging Face announces crossing 1 billion classifications on its platform. This milestone reflects growing adoption of AI models for classification tasks in production.

Open source Benchmarks

OpenAI Blog·Feb 12

Sharing the latest Model Spec

OpenAI releases an updated Model Spec, the document defining expected behaviors for its models. This specification guides development and evaluation of capabilities and safety boundaries.

OpenAI AI safety Alignment

Hugging Face Blog·Feb 12

Build awesome datasets for video generation

Hugging Face publishes a guide for building high-quality datasets for training video generation models. The article covers best practices for data curation, annotation, and organization.

Video generation Tools Open source

Hugging Face Blog·Feb 12

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

Hugging Face optimizes uploads and downloads on the Hub by replacing chunks with blocks. This architecture reduces latency and improves stability for large file transfers.

Infrastructure Tools

Hugging Face Blog·Feb 10

Open R1: Update #2

Hugging Face releases Open R1 Update #2, advancing its open-source reasoning model. The update improves performance and reasoning capabilities on complex tasks.

Open source Reasoning Benchmarks

OpenAI Blog·Feb 10

OpenAI partners with Schibsted Media Group

OpenAI partners with Schibsted Media Group to integrate news and archive content from Schibsted's publications into ChatGPT. The deal is a media content distribution partnership.

OpenAI Business

Hugging Face Blog·Feb 10

The Open Arabic LLM Leaderboard 2

Hugging Face releases the Open Arabic LLM Leaderboard 2, a ranking system evaluating Arabic language models on standardized benchmarks. The initiative measures performance in Arabic comprehension, generation, and reasoning tasks.

Benchmarks Open source Tools

OpenAI Blog·Feb 5

Introducing data residency in Europe

OpenAI introduces data residency in Europe, strengthening its enterprise-grade data privacy, security, and compliance programs for customers worldwide.

OpenAI Regulation Business

OpenAI Blog·Feb 4

OpenAI and the CSU system bring AI to 500,000 students & faculty

OpenAI and the CSU system deploy ChatGPT to 500,000 students and faculty. The largest ChatGPT deployment to date aims to expand AI use in education and build an AI-ready workforce in the United States.

GPT Business

Hugging Face Blog·Feb 4

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Hugging Face brings π0 and π0-FAST, vision-language-action models from Physical Intelligence for general robot control, to its LeRobot library. These models unify visual perception, natural language understanding, and action generation, and are trained on diverse robotic data to execute complex tasks without task-specific fine-tuning.

Robotics Vision AI Agents

OpenAI Blog·Feb 4

Building a custom math tutor powered by ChatGPT

OpenAI showcases a custom math tutor powered by ChatGPT, demonstrating how to adapt the model for education through specialized prompts and conversational interactions. No performance metrics or benchmarks included in the excerpt.

GPT Prompt engineering

Hugging Face Blog·Feb 4

Open-source DeepResearch – Freeing our search agents

Hugging Face releases open-source DeepResearch, an autonomous research agent that conducts in-depth investigations on complex topics. The tool integrates web search, information synthesis, and multi-step reasoning to generate detailed reports without human intervention.

AI Agents Open source Reasoning

Hugging Face Blog·Feb 4

DABStep: Data Agent Benchmark for Multi-step Reasoning

Hugging Face introduces DABStep, a benchmark for evaluating AI agents on multi-step reasoning. The tool measures models' ability to decompose complex tasks and iteratively use tools to solve problems.

AI Agents Benchmarks Reasoning

OpenAI Blog·Feb 2

Introducing deep research

OpenAI launches deep research, an agent that uses reasoning to synthesize online information and complete multi-step research tasks. Available to Pro users at launch, with Plus and Team to follow.

OpenAI AI Agents Reasoning

OpenAI Blog·Feb 2

Understanding complex trends with deep research

OpenAI showcases its deep research feature helping Bain & Company analyze complex industry trends. No technical details, benchmarks, or quantified results provided in excerpt.

OpenAI Business

Hugging Face Blog·Feb 2

Open-R1: Update #1

Hugging Face releases Update #1 on Open-R1, an open-source reasoning model project. The update covers progress and future directions for the initiative.

Open source Reasoning
