Topic

#Image generation

Image generation refers to an AI model's ability to create visuals from a text description. Stable Diffusion, for example, produces realistic or artistic images in seconds from a simple prompt.

40 Articles
8 Sources
66 Avg. signal
GitHub Trending·

Comfy-Org / ComfyUI

ComfyUI is a modular GUI for diffusion models with a node/graph-based interface, providing API and backend capabilities for image generation.

Image generation · Open source · Tools
SIG 75 · HYP 00
GitHub Trending·

yossTheDev / removerized

Removerized is an AI image toolkit that runs entirely in the browser. It is free, private, and offline-first, with no server dependency.

Image generation · Open source · Tools
SIG 45 · HYP 00
Reddit r/LocalLLaMA·

Small comparison on full compute performance (Anima) of 5090 (600,475 and 400W) vs 6000 PRO MaxQ (325W), and 6000 PRO WS/SE (600W).

Compute performance benchmark for text-to-image diffusion, comparing the RTX 5090 (400-600W) against the RTX 6000 PRO MaxQ (325W) and the 6000 PRO WS (600W). Tests ran on Forge Neo with SageAttention 2.1 at 896x1088 resolution and batch size 4. The 5090 was undervolted and overclocked (2930MHz, +4400MHz VRAM); the 6000 PRO MaxQ was modified (+550MHz core).

Image generation · Benchmarks · Infrastructure
SIG 45 · HYP 00
GitHub Trending·

NVlabs / Sana

NVIDIA Labs releases Sana, a linear diffusion transformer for efficient high-resolution image synthesis. The architecture reduces computational complexity while maintaining visual quality.

Image generation · Open source · Papers
SIG 75 · HYP 00
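
The Sana summary above says a linear architecture reduces computational complexity but does not spell out the mechanism. As a rough illustration only, the sketch below assumes kernelized linear attention with a ReLU feature map; the function name, shapes, and `eps` are hypothetical and are not taken from the Sana code. The point is that keys and values are summarized once, so the cost grows linearly with the number of tokens instead of building an n x n score matrix.

```python
def linear_attention(Q, K, V, eps=1e-6):
    """Illustrative ReLU-kernel linear attention (not Sana's actual code).

    Q: list of query rows (each of length d)
    K: list of key rows (each of length d)
    V: list of value rows (each of length dv), one per key
    Returns one output row of length dv per query.
    """
    d, dv = len(K[0]), len(V[0])

    # One pass over keys/values builds two small summaries:
    #   kv[i][j] = sum over tokens of relu(k)[i] * v[j]   (d x dv)
    #   k_sum[i] = sum over tokens of relu(k)[i]          (d)
    kv = [[0.0] * dv for _ in range(d)]
    k_sum = [0.0] * d
    for k_row, v_row in zip(K, V):
        phi_k = [max(x, 0.0) for x in k_row]
        for i in range(d):
            k_sum[i] += phi_k[i]
            for j in range(dv):
                kv[i][j] += phi_k[i] * v_row[j]

    # Each query reuses the summaries; no token-by-token score matrix is formed.
    out = []
    for q_row in Q:
        phi_q = [max(x, 0.0) for x in q_row]
        norm = sum(a * b for a, b in zip(phi_q, k_sum)) + eps
        out.append(
            [sum(phi_q[i] * kv[i][j] for i in range(d)) / norm for j in range(dv)]
        )
    return out


# Tiny usage example with made-up numbers.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = linear_attention(Q, K, V)
```

The result matches the quadratic form that first computes every query-key score and then normalizes, up to floating-point error; only the order of the sums changes.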
arXiv cs.AI·

Perception-based Image Denoising via Generative Compression

The paper proposes a generative compression framework for perception-based image denoising. It presents two approaches: a conditional WGAN-based denoiser that explicitly controls the rate-distortion-perception trade-off, and a conditional diffusion-based iterative reconstruction guided by compressed latents. Theoretical guarantees and perceptual improvements are demonstrated on synthetic and real-noise benchmarks.

Image generation · Papers · Benchmarks
SIG 72 · HYP 00
arXiv cs.AI·

Curriculum Group Policy Optimization: Adaptive Sampling for Unleashing the Potential of Text-to-Image Generation

CGPO (Curriculum Group Policy Optimization) improves text-to-image model training via an adaptive curriculum based on reward variance. The method prioritizes partially mastered prompts (those with high reward variance) and balances categories through proportional fairness optimization. Gains are validated on GenEval, T2I-CompBench++, and DPG Bench.

Image generation · Reinforcement learning · Benchmarks
SIG 72 · HYP 00
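
The CGPO summary above describes prioritizing prompts whose rewards vary the most. A minimal sketch of that one idea follows, assuming sampling weights directly proportional to per-prompt reward variance. That weighting rule, the function names, and the reward history are all hypothetical, and the category balancing via proportional fairness mentioned in the summary is not modeled here.

```python
import random
from statistics import pvariance


def variance_weights(rewards_by_prompt, eps=1e-6):
    """Return normalized sampling weights proportional to reward variance.

    eps keeps zero-variance prompts from being excluded entirely
    (an assumption of this sketch, not a detail from the paper).
    """
    raw = {p: pvariance(r) + eps for p, r in rewards_by_prompt.items()}
    total = sum(raw.values())
    return {p: w / total for p, w in raw.items()}


def sample_prompts(rewards_by_prompt, k, seed=0):
    """Draw k prompts for the next training batch, favoring high variance."""
    weights = variance_weights(rewards_by_prompt)
    prompts = list(weights)
    rng = random.Random(seed)
    return rng.choices(prompts, weights=[weights[p] for p in prompts], k=k)


# Hypothetical reward history from earlier rollouts of each prompt.
history = {
    "a red cube on a blue sphere": [0.0, 1.0, 0.0, 1.0],  # mixed: partially mastered
    "a cat": [1.0, 1.0, 1.0, 1.0],                        # always succeeds
    "seven identical clocks": [0.0, 0.0, 0.0, 0.0],       # always fails
}
weights = variance_weights(history)
batch = sample_prompts(history, k=8)
```

Prompts that always succeed or always fail have zero variance and so receive almost no weight, while the mixed-outcome prompt dominates the batch.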
arXiv cs.AI·

SIPO: Stabilized and Improved Preference Optimization for Aligning Diffusion Models

SIPO stabilizes diffusion model alignment to human preferences by addressing training instability and off-policy bias. The method introduces DPO-C&M to clip uninformative timesteps and applies timestep-aware importance reweighting. Experiments on SD1.5, SDXL, CogVideoX-2B/5B, and Wan2.1-1.3B demonstrate improvements over Diffusion-DPO.

Image generation · Video generation · Reinforcement learning
SIG 72 · HYP 00