Policy Gradient with PyTorch
Hugging Face publishes a guide on policy gradients with PyTorch. The article covers reinforcement learning fundamentals and implements classical algorithms. Reproducible code and examples included.
Hugging Face practical guide to starting your first ML project. Covers fundamental steps: problem definition, data collection, training, and deployment. Includes recommended resources and tools.
OpenAI outlines safety mitigations implemented for DALL·E 2 to prevent generation of images violating its content policy. The goal is to make the model broadly accessible while reducing risks associated with powerful image generation models.
Hugging Face launches an evaluation platform integrated into the Hub for benchmarking models directly. Users can create custom evaluations, compare performance, and share results without external infrastructure.
Hugging Face integrates DeepSpeed to accelerate large model training. The solution optimizes memory and speed through model partitioning, gradient optimization, and mixed precision.
OpenAI trains a neural network to play Minecraft using Video PreTraining (VPT) on unlabeled human gameplay videos. The model learns to craft diamond tools (a 24,000-action task) with minimal labeled data. It uses the native human interface (keyboard and mouse) and represents progress toward general computer-using agents.
Introductory guide to embeddings: vector representations of text, images, or data. Covers use cases (RAG, semantic search, clustering) and how to use embedding models via Hugging Face.
Hugging Face Optimum enables conversion of Transformer models to ONNX format to optimize inference performance. The tool automates conversion and provides quantization and optimization options for efficient deployment on CPU and GPU.
OpenAI examines how large language models evolve and improve. The article investigates learning mechanisms and emergent capabilities of LLMs across different model scales.
Intel and Hugging Face partner to democratize machine learning hardware acceleration. The partnership aims to make optimization and inference tools more accessible to developers by integrating Intel technologies into the Hugging Face ecosystem.
Hugging Face releases part 3 of a series on ML insights applied to the finance sector. The article explores how language models and ML techniques transform financial analysis, fraud detection, and risk management.
OpenAI trains critique-writing models to describe flaws in summaries. Human evaluators detect significantly more flaws when shown AI-generated critiques. Larger models are better at self-critique, and scale improves critique writing more than summary writing.
OpenAI publishes techniques for training large neural networks, highlighting challenges of orchestrating GPU clusters for synchronized large-scale computations.
Hugging Face releases an annotated guide to diffusion models, explaining the mathematics and a practical implementation of the progressive noise-addition process and its learned reversal for image generation.
Deep Q-Learning implementation on Space Invaders. Uses a neural network to approximate Q-values and optimize agent policy. Demonstrates reinforcement learning applied to classic video games.
Cohere, OpenAI, and AI21 Labs release preliminary best practices for deploying language models, applicable to any organization developing or deploying LLMs.