Topic

#Gemini

Gemini is Google DeepMind's family of multimodal AI models, designed to process text, images, audio, and video together. For example, Gemini 1.5 Pro can analyze long documents and videos within a single prompt.

40 Articles

9 Sources

60 Avg. signal

Le Big Data·Jun 18

Noam Shazeer: the brain behind Gemini leaves Google for OpenAI

Noam Shazeer, a key researcher in Gemini's development at Google, is leaving the company to join OpenAI. The departure marks a significant shift in the competition between the two AI giants.

Gemini OpenAI Business

Reddit r/LocalLLaMA·Jun 17

GLM 5.2 Release Video [Made with GLM 5.2]

GLM 5.2 generates videos via Remotion, comparable to Fable but below Gemini 3.1 Pro. Server overload observed on OpenRouter with timeouts on long outputs.

Video generation Gemini Qwen

Reddit r/LocalLLaMA·Jun 17

Gemma 4 E2B running in-browser at 255 tok/s using WebGPU kernels written by Fable 5

Gemma 4 E2B runs in-browser at 255 tokens/sec using WebGPU kernels optimized by Fable 5. Demo and kernels released on Hugging Face.

Gemini Code generation Open source

arXiv cs.LG·Jun 17

When the Next Step Is Not One Step: Distribution-Aware Execution Modeling for Concurrent Go Programs

7B model fine-tuned to predict next step in concurrent Go programs by learning event distributions rather than single labels. On 798 predictions from real bugs (CockroachDB, Kubernetes, gRPC, etcd), achieves 36.2% accuracy with <1000 traces, outperforming Gemini 3.5 Flash zero-shot (34.8%). Dataset, adapters, and tooling released.

Code generation Benchmarks Fine-tuning

Reddit r/LocalLLaMA·Jun 16

Why might DiffusionGemma be better at tool calls than its benchmark quality suggests

DiffusionGemma generates 256 tokens in parallel with bidirectional attention, enabling self-correction before finalization. Unlike autoregressive models locked after each token, this architecture could improve structured tool calls despite lower base quality than Gemma 4. Testing needed to confirm if bidirectional correction compensates for lower quality.

Gemini Code generation Reasoning

Reddit r/LocalLLaMA·Jun 16

Gemma 12b - Reasoning hardening instructions

A user shares a system instruction to improve reasoning in Gemma 12b QAT. The technique aims to reduce cognitive bias and adapt reasoning depth to context. It works well on trick questions but partially fails on certain problems depending on framing.

Gemini Prompt engineering Reasoning

Le Big Data·Jun 16

These Chinese hackers are using Gemini to trap lots of people: Google strikes back!

The FBI and Google dismantled a Chinese cybercriminal network using Gemini for attacks. Google responded to these platform abuses.

Gemini AI safety Regulation

Le Big Data·Jun 15

Oops… Amazon revealed Google's Pixel Drop ahead of time

Amazon accidentally revealed Google's Pixel Drop ahead of its official announcement. Three new AI features for Pixel smartphones were exposed prematurely.

Gemini

Reddit r/LocalLLaMA·Jun 15

React Native ExecuTorch now runs Gemma 4 (Vulkan and MLX accelerated)

ExecuTorch integrates Gemma 4 into React Native with GPU acceleration: Vulkan on Android, MLX on Apple Silicon. Fully offline execution.

Gemini Code generation Tools

arXiv cs.CL·Jun 15

LoSoNA: A Benchmark for Local Social Norm Adaptation in Group Conversations

LoSoNA is a benchmark measuring LLM ability to recognize and adapt to local social norms in group chats. Eight frontier and open-weight models tested under four prompting conditions: Gemini 3.1 Pro reaches 84.2%, Claude Fable 5 81.6%. Explicit norm-aware prompting helps unevenly.

Benchmarks Claude Gemini

arXiv cs.AI·Jun 15

Dense Coordinate-List Fine-Tuning Induces a Controllable Interference Surface in Vision-Language Models

Fine-tuning vision-language models on dense coordinate lists improves visual grounding but induces spurious repetitions. On Gemma 4 12B, high-capacity LoRA raises F1@0.3 from 0.007 to 0.448 but creates duplicate rate 0.080. Object-level control removes repetitions (rate 0.000) while preserving performance (F1 0.490).

Fine-tuning Vision Benchmarks

arXiv cs.LG·Jun 15

Can Editing 1 Neuron Fix Repetition Loops in LLMs?

Gemma 4 models exhibit repetition loops on long enumerations (up to 95% failure rate). Per-neuron ablation identifies a few MLP neurons responsible: suppressing them via weight edits removes simple loops but not 'doom loops' (infinite self-correction), limited by knowledge gaps rather than removable circuits.

Gemini Papers Evals

arXiv cs.CL·Jun 15

Which Models Perform Better in Inheritance Reasoning?

Evaluation of commercial vs open-source LLMs on Islamic inheritance legal reasoning (QIAS 2026 shared task). Gemini 2.5 Flash achieves best performance (MRE 0.989), while open-source models show greater instability in dependent legal decisions and fractional share adjustments.

Benchmarks Reasoning Gemini

Reddit r/LocalLLaMA·Jun 14

Gemma 12b less than 10 watts 6.5pp 1.3tg

Gemma 12B running on Google Pixel 10 Pro via Termux and llama.cpp (v9639) consumes under 10W. Performance: 6.5 tokens/s prompt, 1.3 tokens/s generation with 32k context and Q3_K_XL quantization.

Gemini Open source Infrastructure

Reddit r/LocalLLaMA·Jun 14

Gemma 4 models benchmarked with Triple GPU

Gemma 4 benchmarked on a triple-GPU setup (3× GTX-1070, 24 GiB VRAM total). Gemma-4-26B-A4B-qat achieves 123.5 t/s prompt processing and 53.08 t/s generation. Gemma-4-E4B-BF16 reaches 302.16 t/s prompt processing but is limited to 11.54 t/s generation. Tests run on llama.cpp build 9204 with GGUF quantizations.

Gemini Benchmarks Open source

Reddit r/LocalLLaMA·Jun 13

Yay got Gemma 12B QAT working on old 1080ti (maybe with speculative decoding?)

User runs Gemma 12B QAT on GTX 1080 Ti (9 years old) at 50 tok/sec. Setup includes speculative decoding with MTP draft model and Q4_K_XL quantization. Seeking further optimizations.

Gemini Code generation Open source

The Decoder·Jun 13

Google Research's Gemini-SQL2 tops text-to-SQL benchmarks by a wide margin

Google Research's Gemini-SQL2, built on Gemini 3.1 Pro, achieves 80.04% accuracy on the BIRD benchmark for natural language-to-SQL conversion, significantly outperforming OpenAI and Anthropic. Google plans to integrate this technology into its data services.

Gemini Benchmarks Code generation

Reddit r/LocalLLaMA·Jun 12

Diffusion Gemma is 4x faster, but makes 6x more mistakes!

H100 (FP8) benchmark: DiffusionGemma 26B generates 763 tok/s (3.7s) vs Gemma 4 at 218 tok/s (15.1s), but produces 28 factual errors across 61 tested facts vs 5 for Gemma 4. DiffusionGemma invents names, dates, numbers (Clara Clley as Jobs' mother, BeBox at $9,999 instead of $1,600). The diffusion model generates 256 tokens simultaneously and polishes text without verifying factuality.

Gemini Benchmarks Evals

Le Big Data·Jun 12

Gemini can now adjust the picture on Google TV… but there's a catch

Google integrates Gemini into Google TV to adjust image settings. The feature enables AI to control visual parameters, but limitations remain according to the article.

Gemini Tools

ActuIA·Jun 12

Siri AI: Gemini as teacher, not as engine - what WWDC didn't say

Apple integrates Google's Gemini into Siri at WWDC on June 8, but not as the primary engine. The article challenges the dominant interpretation of this partnership and explores Gemini's actual role in Siri AI's architecture.

Gemini Business

Reddit r/LocalLLaMA·Jun 12

Open Dungeon: local roleplay with Gemma 4 QAT + inline Uncen-FLUX images, running at full 256K context under 8GB RAM (OS)

Open Dungeon is a local roleplay game using Gemma 4 QAT (12B) via Ollama for narration and FLUX for image generation. Runs on 7.7GB RAM with full 256K context, no APIs or cloud. Features Do/Say/Story modes, line editing, model selection. MIT licensed, source available.

Gemini Open source Image generation

arXiv cs.CL·Jun 12

Shopping Reasoning Bench: An Expert-Authored Benchmark for Multi-Turn Conversational Shopping Assistants

Shopping Reasoning Bench: expert-authored benchmark of 525 missions (232 single-turn, 293 multi-turn) with 10,863 importance-weighted binary rubrics for evaluating conversational shopping assistants. Evaluation of 9 models (GPT, Claude, Gemini): pass rates 57–77%, performance degrades 4–18 points across conversation turns, 13–29 point gap between required and optional criteria.

Benchmarks GPT Claude

Reddit r/LocalLLaMA·Jun 12

Some contrived tests comparing the accuracy of different Gemma and Qwen quantizations

Empirical comparison of Gemma and Qwen quantizations across three tasks (arithmetic, presidential dates, attention). Gemma-4-31B-Q4_K_S reaches 83.8% arithmetic and 87% attention accuracy. Qwen3.6-27B-Q4_K_S achieves 95.5% arithmetic and 100% presidents. Results demonstrate major impact of model size and quantization scheme on accuracy.

Gemini Qwen Evals

Reddit r/LocalLLaMA·Jun 11

Gemma 4 Quadruple Release, 12B, 12B QAT, 26B-A4B QAT and 31B QAT Uncensored Heretics!

Four quantized and uncensored Gemma 4 variants released: 12B, 12B QAT, 26B-A4B QAT, and 31B QAT. Multiple formats available (Safetensors, GGUF, NVFP4, GPTQ-Int4) on Hugging Face. Uncensored 'heretic' versions.

Gemini Open source

Reddit r/LocalLLaMA·Jun 11

DiffusionGemma 26B A4B results on my 5090

DiffusionGemma 26B A4B quantization benchmarks (Q6_K 22GB, Q4_K_M 16GB) on RTX 5090. Stable context: 6,144 tokens (Q6_K) and 10,240 tokens (Q4_K_M) limited by disabled Flash Attention on SM120. Optimal parameters and llama.cpp invocations documented.

Gemini Code generation Benchmarks

Le Big Data·Jun 11

DiffusionGemma: Google's AI steps on the accelerator for text generation

Google introduces DiffusionGemma, an experimental text generation model 4x faster than standard approaches. The model rethinks the architecture of text generation.

Gemini Code generation

Le Big Data·Jun 11

Gemini 3.5 Translate will bring down the language barrier

Google launches Gemini 3.5 Translate to translate conversations in real-time across 70+ languages without altering original content.

Gemini

arXiv cs.CL·Jun 11

APEX: Automated Prompt Engineering eXpert with Dynamic Data Selection

APEX automatically optimizes prompts by dynamically selecting training data. The framework stratifies the dataset into Easy, Hard, and Mixed tiers, prioritizing the Mixed frontier to generate informative mutations. Within 5,000 evaluation calls, APEX improves performance by 11.2% on Gemini 2.5 Flash and 6.8% on Gemma 3 27B.

Prompt engineering Benchmarks Gemini

Simon Willison·Jun 10

DiffusionGemma

Google releases DiffusionGemma-26B, an open-weight Gemma model (Apache 2 license) based on its May 2025 Gemini Diffusion research. The model generates text at 500+ tokens/second. NVIDIA hosts it free on NIM cloud API.

Gemini Open source Code generation

The Decoder·Jun 10

Google's new open model DiffusionGemma generates text from noise instead of word by word

Google releases DiffusionGemma, a 26B-parameter model generating text via diffusion (noise → text) rather than token-by-token. Achieves ~1,000 tokens/sec on H100 (4× faster than comparable autoregressive models), but lower output quality. Positioned as experimental tool for developers.

Gemini Open source Code generation

Reddit r/LocalLLaMA·Jun 10

Google Drops Diffusion Version of Gemma

Google releases DiffusionGemma, a diffusion-based variant of Gemma with 26B parameters and 4B active. Claims 700+ tokens/sec throughput on RTX 5090.

Gemini Open source Code generation

Reddit r/LocalLLaMA·Jun 10

DiffusionGemma: 4x faster text generation

DiffusionGemma achieves 4x faster text generation by using diffusion instead of autoregressive decoding. Built on Gemma, the model applies diffusion techniques to parallelize generation and reduce latency.

Gemini Code generation Benchmarks

Reddit r/LocalLLaMA·Jun 10

DiffusionGemma: The Developer Guide- Google Developers Blog

Google releases a developer guide for DiffusionGemma, its diffusion-based text generation model. The guide covers integration, optimization, and practical use cases for developers.

Gemini Image generation Tools

Le Big Data·Jun 10

Google Gemini is running into major problems: what is really going on?

Gemini experiences outage affecting multiple users with error messages and mobile bugs. Google maintains official silence on the incident.

Gemini

The Decoder·Jun 10

Google's NotebookLM now runs its own cloud computer with code execution and agent-based research

Google upgrades NotebookLM with Gemini 3.5 Flash, cloud-based code execution, and autonomous source discovery via Google Search. Internal tests show the new version outperforms the previous one 78.2% of the time.

Gemini AI Agents Code generation

Reddit r/LocalLLaMA·Jun 10

Anyone gotten Gemma 4 12B (unified audio) to actually attend to speech with a large system prompt?

User reports Gemma 4 12B (unified audio/vision/text model) ignores audio input when system prompt exceeds ~21k tokens. Model works well with minimal prompt but generates generic/hallucinated responses with dense context. Behavior reproduced across vLLM, llama.cpp, and LiteRT-LM. Appears to be an inherent attention saturation limit.

Gemini Voice Multi-agent

arXiv cs.CL·Jun 10

Gaming AI-Assisted Peer Reviews Poses New Risks to the Scientific Community

Researchers demonstrate that AI systems used for scientific peer review are vulnerable to simple manipulation: superficially rephrasing a manuscript abstract improves acceptance scores by 38% without changing scientific content. The attack costs ~$1 and takes 5 minutes, affecting Gemini 3 Flash and GPT 5.4 Mini reviewers.

GPT Gemini Evals

Hacker News (AI)·Jun 10

German ruling declares Google liable for false answers in AI Overviews

A German court ruled Google liable for false answers generated by its AI Overviews system. The decision establishes that Google must verify the reliability of AI-generated content before displaying it to users.

Regulation AI safety Gemini

Reddit r/LocalLLaMA·Jun 9

Watch agents fight: a live challenge to speed up Gemma 4 E4B inference on a single A10G

Community challenge to optimize Gemma 4 E4B inference on A10G GPU. Participants test acceleration techniques in real-time to reduce latency and increase throughput on a single card.

Gemini Benchmarks

The Decoder·Jun 9

Google's Gemini 3.5 Live Translate delivers real-time voice translation across 70+ languages

Google releases Gemini 3.5 Live Translate, an audio model for real-time voice translation across 70+ languages. The system translates continuously without waiting for sentence completion and preserves speaker tone, pace, and pitch. Google Meet support expands from 5 to 70+ languages.

Gemini Voice
