☀️ AI Morning Minute: Gemma

Google has been keeping its best AI models behind a paywall. Gemma is what happens when it doesn't.

Jun 30, 2026

Most people know Google’s AI through Gemini, the model that powers Google Search, Gmail, and Workspace. Gemma is something different. It’s Google’s family of open-weights models, meaning you can download the actual files, run them on your own hardware, and use them commercially without asking Google’s permission or paying Google’s prices. The current version, Gemma 4, released in April 2026, is the most capable Google has made available this way.

What it means

Gemma is built by Google DeepMind and trained on the same research infrastructure as Gemini, but released publicly under the Apache 2.0 license. That license matters. Apache 2.0 is about as permissive as it gets: you can download the model, fine-tune it on your own data, build a product on top of it, and charge customers for that product, all without licensing fees and with few conditions beyond keeping the license and attribution notices intact. You own what you deploy.

Gemma 4 comes in several sizes, from small models designed to run on a phone or Raspberry Pi all the way up to a 31-billion-parameter model that runs on a single high-end GPU. The 31B model currently ranks third among all open-weights models on the Arena AI leaderboard, competing against models with far more parameters. Part of the family’s efficiency comes from architecture: a 26-billion-parameter version uses a Mixture of Experts design, meaning it activates only about 4 billion parameters per token during inference. The result is the knowledge base of a 26B model at roughly the compute cost of a 4B one.
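To make the Mixture of Experts trade-off concrete, here is a back-of-the-envelope sketch using only the two figures quoted above. The "2 FLOPs per active parameter per token" rule is a common approximation and an assumption here, not a number Google has published for Gemma, and the function name is illustrative.

```python
# Rough MoE arithmetic using the figures quoted in the text.
TOTAL_PARAMS = 26e9   # parameters stored in the model (from the text)
ACTIVE_PARAMS = 4e9   # approximate parameters used per token (from the text)


def flops_per_token(params: float) -> float:
    """Assumed rule of thumb: about 2 FLOPs per parameter used, per token."""
    return 2 * params


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
dense_cost = flops_per_token(TOTAL_PARAMS)   # if every weight fired on every token
moe_cost = flops_per_token(ACTIVE_PARAMS)    # only the routed experts fire

print(f"Share of weights active per token: {active_fraction:.0%}")
print(f"Compute saving vs. a dense 26B model: {dense_cost / moe_cost:.1f}x")
```

The saving applies to compute only. All 26 billion parameters still have to sit in memory, because any expert may be selected for the next token.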

The whole family handles text, images, audio, and video natively, across 140 languages, with a context window up to 256,000 tokens on the larger models.

Why it matters

The licensing shift is the real story. Earlier open-weights models from major labs often came with custom licenses that created legal friction for commercial use. Apache 2.0 removes that. A healthcare company that can’t send patient data to a third-party API can now run Gemma locally, on its own servers, with no data leaving its infrastructure.

The performance numbers closed the gap. On AIME 2026 math benchmarks, Gemma 3’s 27B model scored 20.8%. Gemma 4’s 31B scored 89.2%. The Gemma family had also been downloaded over 400 million times across all versions before this latest release, with more than 100,000 variants built on top of it by the developer community.

It puts competitive AI on hardware most people already own. The 4B model runs on a consumer GPU with 12GB of memory. The smallest edge models run offline on a phone. For developers, researchers, and small companies that can’t afford proprietary API costs at scale, that changes what’s buildable.
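For a feel of why a 4B model fits on a 12GB card, a minimal sketch of the weights-only memory math follows. The precisions listed are assumptions for illustration (the post does not say which one Gemma ships in), the parameter counts are the nominal sizes from the text, and the estimate ignores the extra memory needed for the context cache and activations.

```python
# Weights-only memory estimate: parameters x bytes per parameter.
# The precision options below are illustrative assumptions.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, precision: str) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for name, params in [("4B", 4e9), ("31B", 31e9)]:
    for precision in BYTES_PER_PARAM:
        size = weight_memory_gb(params, precision)
        print(f"{name} at {precision}: about {size:.1f} GB of weights")
```

Under these assumptions the 4B model needs about 8 GB at 16-bit precision, which leaves some headroom on a 12GB GPU, while the 31B model needs roughly 62 GB at the same precision and only fits on smaller hardware if it is quantized.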
