January 2021

6 articles

Faster TensorFlow models in Hugging Face Transformers

Hugging Face optimizes TensorFlow models in its Transformers library to accelerate inference. Performance improvements enable faster deployments without accuracy loss.

Tools Infrastructure

OpenAI Blog·Jan 25

Scaling Kubernetes to 7,500 nodes

OpenAI scaled its Kubernetes clusters to 7,500 nodes, providing infrastructure for training large models (GPT-3, CLIP, DALL·E) as well as for rapid, small-scale iterative research.

Infrastructure OpenAI Benchmarks

Hugging Face Blog·Jan 19

Fit More and Train Faster With ZeRO via DeepSpeed and FairScale

Hugging Face integrates ZeRO (Zero Redundancy Optimizer) from DeepSpeed and FairScale to reduce GPU memory usage and accelerate model training. ZeRO partitions optimizer states, gradients, and parameters across GPUs instead of replicating them on every device, enabling training of larger models with fewer resources.
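
As a rough illustration of why partitioning helps, the sketch below estimates per-GPU memory for model states at each ZeRO stage. It assumes mixed-precision Adam (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states) and ignores activations and buffers. The function name `zero_memory_per_gpu_gb` is hypothetical and is not part of DeepSpeed or FairScale.

```python
def zero_memory_per_gpu_gb(num_params, num_gpus, stage):
    """Estimate model-state memory per GPU in GB (1 GB = 1e9 bytes).

    Assumption: mixed-precision Adam, so each parameter costs
    2 bytes (fp16 weights) + 2 bytes (fp16 gradients)
    + 12 bytes (fp32 weights, momentum, variance).
    Stage 0 means plain data parallelism with no partitioning.
    """
    param_bytes = 2.0 * num_params
    grad_bytes = 2.0 * num_params
    optim_bytes = 12.0 * num_params

    if stage >= 1:  # stage 1: partition optimizer states
        optim_bytes /= num_gpus
    if stage >= 2:  # stage 2: also partition gradients
        grad_bytes /= num_gpus
    if stage >= 3:  # stage 3: also partition parameters
        param_bytes /= num_gpus

    return (param_bytes + grad_bytes + optim_bytes) / 1e9


if __name__ == "__main__":
    # Illustrative numbers only: a 1B-parameter model on 8 GPUs.
    for stage in range(4):
        gb = zero_memory_per_gpu_gb(1_000_000_000, 8, stage)
        print(f"stage {stage}: {gb:.2f} GB per GPU")
```

Under these assumptions, a 1B-parameter model on 8 GPUs drops from 16 GB of model states per GPU without partitioning to 2 GB with all three partitioned.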

Infrastructure Fine-tuning Open source

Hugging Face Blog·Jan 18

How we sped up transformer inference 100x for 🤗 API customers

Hugging Face reports a 100x speedup in transformer inference for its API customers, attributing the gains to optimizations such as quantization, dynamic batching, and KV cache optimization.

Infrastructure Benchmarks

OpenAI Blog·Jan 5

CLIP: Connecting text and images

OpenAI introduces CLIP, a neural network that efficiently learns visual concepts from natural language supervision. CLIP enables zero-shot visual classification by simply providing category names, without task-specific training.
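
The zero-shot step can be sketched in a few lines: embed the image and one text prompt per category, then pick the category whose text embedding is most similar to the image embedding. The example below is a minimal sketch that uses hand-made toy vectors in place of real encoder outputs. The function `zero_shot_classify` and the `logit_scale` value of 100 are assumptions for illustration, not OpenAI's API.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_classify(image_embedding, text_embeddings, labels, logit_scale=100.0):
    """Return (best_label, probabilities) from image/text embedding similarity.

    Assumption: embeddings come from some image and text encoder that
    map into a shared space; here they are plain Python lists.
    """
    img = normalize(image_embedding)
    logits = []
    for text_vec in text_embeddings:
        txt = normalize(text_vec)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(logit_scale * cosine)

    # Numerically stable softmax over the category logits.
    max_logit = max(logits)
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


if __name__ == "__main__":
    # Toy embeddings, made up for illustration only.
    labels = ["a photo of a dog", "a photo of a cat"]
    text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    image_embedding = [0.9, 0.1, 0.0]
    label, probs = zero_shot_classify(image_embedding, text_embeddings, labels)
    print(label, probs)
```

Adding a new category only requires a new text prompt and its embedding; no classifier weights are retrained.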

OpenAI Vision Benchmarks

OpenAI Blog·Jan 5

DALL·E: Creating images from text

OpenAI introduces DALL·E, a neural network that generates images from text captions in natural language, covering a wide range of expressible concepts.

OpenAI Image generation Vision
