December 2023

14 articles

Speculative Decoding for 2x Faster Whisper Inference

Hugging Face shows how speculative decoding speeds up Whisper inference by 2x. A lightweight assistant model drafts candidate tokens, which the full model then verifies in a single parallel pass, so latency drops while the output stays identical to that of the full model alone.
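
The draft-then-verify loop in the summary above can be sketched in a few lines. This is a minimal, greedy-only illustration and not the Hugging Face or Whisper implementation: `target_model` and `draft_model` are hypothetical toy functions that map a token sequence to the next token, and the verification loop stands in for what a real model does in one batched forward pass.

```python
from typing import Callable, List

# A "model" here is any function from a token sequence to the next token (toy assumption).
NextToken = Callable[[List[int]], int]


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Plain greedy decoding: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(model(seq))
    return seq


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: the draft proposes k tokens, the target checks them."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    while len(seq) < goal:
        # 1) The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))
        # 2) The target checks each proposed position (one parallel pass in a real model).
        for i in range(k):
            expected = target(seq + proposal[:i])
            if expected != proposal[i]:
                # Keep the agreed prefix, then the target's own token at the mismatch.
                seq.extend(proposal[:i])
                seq.append(expected)
                break
        else:
            # Every draft token was accepted; the target adds one more token.
            seq.extend(proposal)
            seq.append(target(seq))
    return seq[:goal]


def target_model(seq: List[int]) -> int:
    """Hypothetical 'large' model: a deterministic toy rule."""
    return (seq[-1] * 3 + len(seq)) % 17


def draft_model(seq: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except at every fourth position."""
    guess = target_model(seq)
    return guess if len(seq) % 4 else (guess + 1) % 17
```

Because every appended token is either a draft token the target agreed with or the target's own choice, `speculative_decode(target_model, draft_model, [1, 2, 3], 20)` returns the same sequence as `greedy_decode(target_model, [1, 2, 3], 20)`. The speedup in practice comes from replacing many sequential target calls with one batched verification.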

Code generation Infrastructure Open source

Hugging Face Blog·Dec 18

2023, year of open LLMs

2023 marked the emergence of open-source LLMs as viable alternatives to proprietary models. Llama, Mistral and others democratized access to large language models, reducing dependence on OpenAI and Google.

Open source Llama Mistral

OpenAI Blog·Dec 14

Practices for Governing Agentic AI Systems

OpenAI releases governance practices for agentic AI systems, covering monitoring, operational boundaries, and control mechanisms. The document proposes frameworks for safely deploying autonomous agents in production.

AI Agents AI safety Alignment

OpenAI Blog·Dec 14

Superalignment Fast Grants

OpenAI launches $10M in grants for technical research on alignment and safety of superhuman AI systems, covering weak-to-strong generalization, interpretability, and scalable oversight.

OpenAI Alignment AI safety

OpenAI Blog·Dec 14

Increasing accuracy of pediatric visit notes

Summer Health uses OpenAI's GPT to improve accuracy of pediatric visit notes. The system automates medical documentation during pediatrician visits, reducing errors and administrative time.

GPT OpenAI Business

OpenAI Blog·Dec 14

Weak-to-strong generalization

OpenAI explores whether deep learning's generalization properties can be used to control strong models with weak supervisors. The work opens a new research direction for superalignment and reports promising initial results.

OpenAI Alignment Reasoning

OpenAI Blog·Dec 13

Partnership with Axel Springer to deepen beneficial use of AI in journalism

OpenAI and Axel Springer announce a partnership to integrate journalism into AI technologies. Axel Springer becomes the first global publishing house to sign such a deal with OpenAI.

OpenAI Business

Hugging Face Blog·Dec 11

Mixture of Experts Explained

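Hugging Face explains the Mixture of Experts (MoE) architecture, in which a router sends each input to a few specialized experts instead of running it through all parameters. Because only the selected experts are active per input, model capacity grows without a proportional increase in compute, and inference is faster than in a dense model with the same parameter count.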
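
The routing step described above can be illustrated with a small top-k gating sketch. It is an assumption-laden toy, not code from the article: `ROUTER`, `EXPERTS`, `route_top_k`, and `moe_layer` are hypothetical names, the experts are simple scaling functions, and the gates are a softmax over the selected experts' logits (one common choice).

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
Expert = Callable[[Vector], Vector]


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(x: Vector, router_weights: List[Vector], k: int) -> List[Tuple[int, float]]:
    """Score every expert with a linear router, keep the k best, normalize their gates."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    gates = softmax([logits[i] for i in top])
    return list(zip(top, gates))


def moe_layer(x: Vector, router_weights: List[Vector],
              experts: List[Expert], k: int = 2) -> Vector:
    """Run only the selected experts and mix their outputs by gate weight."""
    out = [0.0] * len(x)
    for idx, gate in route_top_k(x, router_weights, k):
        y = experts[idx](x)  # the other experts are never evaluated
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out


# Hypothetical toy setup: four experts that scale the input, one router row per expert.
EXPERTS: List[Expert] = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
ROUTER: List[Vector] = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
```

With `k=2`, the input `[2.0, 1.0]` is routed to experts 0 and 1 only, so two of the four experts do any work. That is the sense in which compute per input stays roughly constant while total parameters grow.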

Open source Infrastructure Benchmarks

Hugging Face Blog·Dec 11

Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face

Mixtral, a state-of-the-art Mixture of Experts (MoE) model, is now available on Hugging Face. The model delivers superior performance with improved computational efficiency through its specialized expert architecture.

Open source Benchmarks Infrastructure

Hugging Face Blog·Dec 6

SetFitABSA: Few-Shot Aspect Based Sentiment Analysis using SetFit

Hugging Face introduces SetFitABSA, a few-shot method for aspect-based sentiment analysis. It builds on SetFit, which fine-tunes a Sentence Transformer on a handful of labeled examples and requires no prompts.

Fine-tuning Prompt engineering Open source

Hugging Face Blog·Dec 5

AMD + 🤗: Large Language Models Out-of-the-Box Acceleration with AMD GPU

Hugging Face and AMD announce native integration to accelerate LLMs on AMD GPUs. Models run out-of-the-box without manual optimization, supporting RDNA and CDNA architectures.

Tools Infrastructure Open source

Hugging Face Blog·Dec 5

Optimum-NVIDIA Unlocking blazingly fast LLM inference in just 1 line of code

Hugging Face and NVIDIA release Optimum-NVIDIA, a library that accelerates LLM inference by changing a single line of code. Native integration of NVIDIA optimizations (TensorRT-LLM, cuDNN) reduces latency and increases throughput with no further code changes.

Tools Infrastructure Code generation

Hugging Face Blog·Dec 5

Goodbye cold boot - how we made LoRA Inference 300% faster

Hugging Face optimized LoRA inference to achieve a 300% speed improvement. The optimizations target cold boot and reduce overall latency when serving low-rank adapters.

Fine-tuning

Hugging Face Blog·Dec 1

Open LLM Leaderboard: DROP deep dive

Hugging Face provides a detailed analysis of the DROP benchmark in the Open LLM Leaderboard, which evaluates reading comprehension and information extraction. The article examines model performance on this specific task and the challenges it presents.

Benchmarks Evals Open source
