What OpenRouter doesn't tell you about free models
OpenRouter offers 30+ LLMs for free. What you won’t find on the landing page: you need to top up at least $10 to unlock the actually useful tier, models disappear without warning, and you can accidentally spend money you never intended to.
Why this matters
If you’ve ever thought about testing LLMs without committing to a provider, OpenRouter’s free tier probably caught your eye. It sounds simple: free models, one API, done. The reality is messier. There’s a hidden cost of entry, models can vanish overnight, and one wrong setting can drain your balance. In this article we break down how the free tier actually works, what the gotchas are, and how to use it without getting burned.
The $10.50 nobody mentions
The difference between “free” and “actually free” on OpenRouter comes down to two numbers: 50 and 1000. If you’ve never topped up your account, you get 50 requests per day on free models. If you’ve topped up at least $10, the daily limit jumps to 1000 requests. So the real cost of entry for using OpenRouter productively isn’t zero: it’s $10.50.
The extra fifty cents is the top-up fee. Pay with crypto and the fee is 5% of the top-up amount, so $0.50 on $10. Pay with a credit card and there’s an $0.80 minimum fee, making the effective entry cost $10.80. Crypto is the better deal: lower fees and more control. But the real takeaway is that “free” on OpenRouter means “free after an initial investment.”
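The fee arithmetic is easy to check with a short sketch. The function name `entry_cost` and its parameters are ours, purely illustrative, and not part of any OpenRouter API. The card percentage rate isn’t covered here, so the card case is modeled with the $0.80 minimum only.

```python
from decimal import Decimal


def entry_cost(top_up: Decimal, fee_rate: Decimal, min_fee: Decimal = Decimal("0")) -> Decimal:
    """Total charged for a top-up: the amount plus the larger of the
    percentage fee and the minimum fee."""
    fee = max(top_up * fee_rate, min_fee)
    return top_up + fee


# Crypto: 5% fee, no minimum (figures from the text above).
crypto_total = entry_cost(Decimal("10"), Decimal("0.05"))

# Card: only the $0.80 minimum is known, so the rate is left at 0 (assumption).
card_total = entry_cost(Decimal("10"), Decimal("0"), Decimal("0.80"))

print(crypto_total, card_total)  # prints: 10.50 10.80
```

`Decimal` is used instead of `float` so the cents come out exact.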
Models come and go
Open the free model list today and you’ll find roughly 30 options. Open it next week and some of them might be gone. Step 3.5 Flash was one of the fastest and most popular free models: vanished. Qwen 3.6 Coder was free for about a week, then went paid with no warning.
The point isn’t that models change. It’s that if you’re using OpenRouter as a backend for an agent or automated workflow, a model that exists today might not exist tomorrow. Your code breaks and nobody tells you. This is the structural trade-off of the free tier: availability isn’t contractual, it’s temporary. Providers make models free for promotion, testing, or to drive adoption, and they pull them when priorities shift.
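One way to soften this failure mode is to give your agent an ordered list of models instead of a single one. The sketch below is a minimal version of that idea, not an OpenRouter feature: `complete_with_fallback`, `ModelUnavailable`, and the `send` callable are hypothetical names, and `send` stands in for whatever function actually performs the API call in your code.

```python
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by the caller-supplied `send` when a model can't serve the request."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    send: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order; return (model_id, reply) from the first that works."""
    errors = []
    for model_id in models:
        try:
            return model_id, send(model_id, prompt)
        except ModelUnavailable as exc:
            # Remember why this model failed, then move on to the next one.
            errors.append(f"{model_id}: {exc}")
    raise RuntimeError("no model available: " + "; ".join(errors))
```

Returning the model ID alongside the reply matters: if a fallback kicked in, you want to see that in your logs rather than discover it weeks later.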
The current list as of April 2026 includes models like Nemotron 3 Super (NVIDIA), Qwen3 Coder, Gemma 4, MiniMax M2.5, GLM 4.5 Air, and OpenAI GPT-OSS 120B. But nobody guarantees they’ll still be there next week.
The mistake that costs $1.50
If you plug your OpenRouter API key into an agent without setting the spend limit to zero, you’re playing Russian roulette with your wallet. The video’s author spent $1.50 on Gemini 3 Flash without intending to: his agent, configured to use only free models, started routing requests to paid models without him noticing.
The fix is simple: create an API key with a $0 credit limit. That way, if the system tries to use a paid model, the request fails instead of charging you. It’s a 10-second setting in the API Keys section of the OpenRouter dashboard, and it saves you from surprise charges.
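As a second line of defense, you can also refuse paid models on the client side before any request leaves your machine. The sketch below relies on two things mentioned elsewhere in this article: free model IDs carry the :free suffix, and the Free Models Router uses the ID openrouter/free. The function name `require_free_model` is hypothetical, and this check complements the $0 limit rather than replacing it.

```python
FREE_ROUTER_ID = "openrouter/free"


def require_free_model(model_id: str) -> str:
    """Return model_id unchanged if it targets a free variant, else raise."""
    if model_id == FREE_ROUTER_ID or model_id.endswith(":free"):
        return model_id
    raise ValueError(f"refusing to send request to non-free model: {model_id!r}")


# Hypothetical IDs, for illustration only.
require_free_model("example/some-model:free")   # passes
# require_free_model("example/some-model")      # would raise ValueError
```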
Rate limiting: the invisible bottleneck
Even with 1000 daily requests unlocked, the 20 requests-per-minute limit is a tight leash. If you’re using a popular model like Nemotron Super during peak hours, getting 429 errors on simple requests isn’t unusual. The server is overloaded because everyone is hammering the same free model at the same time.
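The usual way to live with 429s is to retry with exponential backoff. Here is a minimal sketch under stated assumptions: `send` is a hypothetical zero-argument callable that performs your request and raises `RateLimited` when the API answers 429, and the delay values are arbitrary starting points, not anything OpenRouter recommends.

```python
import random
import time
from typing import Callable


class RateLimited(Exception):
    """Raised by the caller-supplied `send` when the API answers HTTP 429."""


def with_backoff(
    send: Callable[[], str],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send(), retrying on RateLimited with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the 429
            # Wait 1s, 2s, 4s, ... plus up to 1s of random jitter so that
            # many clients don't all retry at the same instant.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so the function can be tested without actually waiting.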
Rate limiting is applied per model: the 20 RPM cap counts requests to each individual model, not your traffic in aggregate. In theory, you could spread requests across multiple models to increase throughput. In practice, if your workflow depends on a specific model, that model’s rate limit is your real constraint.
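If you’d rather not hit the cap at all, you can track it yourself. The sketch below is a client-side sliding-window counter keyed by model ID. The class name `PerModelLimiter` is hypothetical, and the 60-second window is our assumption about how a per-minute cap is counted; the server’s own accounting may differ.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class PerModelLimiter:
    """Sliding-window limiter that tracks request timestamps per model ID."""

    def __init__(self, max_per_minute: int = 20, clock: Callable[[], float] = time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock
        self._sent = defaultdict(deque)  # model_id -> timestamps of recent requests

    def try_acquire(self, model_id: str) -> bool:
        """Record a request and return True if the model is under its cap, else False."""
        now = self.clock()
        window = self._sent[model_id]
        # Forget requests that are 60 seconds old or older.
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) >= self.max_per_minute:
            return False
        window.append(now)
        return True
```

Because each model has its own window, a `False` for one model says nothing about the others, which is exactly the per-model behavior described above.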
OpenRouter in the API landscape: why it exists and what it gets out of it
OpenRouter isn’t a model provider. It’s a router: an intermediary that aggregates access to models from dozens of providers (OpenAI, Google, Meta, NVIDIA, Qwen, MiniMax, and many others) through a single API. Its business model rests on two pillars:
- A small fee on every paid request (typically 5% of the provider’s cost)
- The ability to route traffic intelligently across providers
Free models are a customer acquisition investment. When NVIDIA makes Nemotron Super available for free on OpenRouter, it’s because they want developers to get familiar with the model, integrate it into their workflows, and then upgrade to the paid version when free tier limits become untenable. OpenRouter earns from the fee on paid transactions that follow. It’s the freemium playbook: the free product is marketing, not the product.
Compared to the competition:
- Together AI: serverless inference with competitive pricing but no meaningful free tier
- Fireworks AI: focuses on custom deployments
- Groq: generous free tier but a much more limited model catalog
OpenRouter’s uniqueness is the breadth of its catalog and the convenience of a single API for everything.
How to verify a model is actually free
Don’t trust what your client tells you. OpenCode, for instance, shows a list of “free” OpenRouter models that sometimes doesn’t match current reality. A model can appear in your tool’s list but no longer be available on OpenRouter’s free tier. The result: the request fails silently, or worse, gets routed to the paid version of the same model.
The safe procedure: go to openrouter.ai/models, filter by “free”, and verify that the model you want to use is actually on the list. It’s tedious but necessary if you don’t want surprises. Alternatively, use OpenRouter’s Free Models Router (openrouter/free), which automatically selects the best available free model based on the request’s requirements (tool calling, vision, etc.).
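The manual check can also be scripted against OpenRouter’s public model catalog. Treat the following as a sketch under assumptions: it expects the endpoint at https://openrouter.ai/api/v1/models to return JSON with a `data` list whose entries carry an `id` and a `pricing` object with string-valued `prompt` and `completion` prices. Verify those field names against the current API reference before relying on them. The function names are ours.

```python
import json
import urllib.request
from decimal import Decimal, InvalidOperation

MODELS_URL = "https://openrouter.ai/api/v1/models"


def fetch_models(url: str = MODELS_URL) -> dict:
    """Download the model catalog as parsed JSON."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def _is_zero(price) -> bool:
    """True only if price parses as a number equal to zero."""
    try:
        return Decimal(str(price)) == 0
    except InvalidOperation:
        return False  # missing or malformed prices are treated as not free


def free_model_ids(catalog: dict) -> set[str]:
    """IDs of models whose prompt and completion prices are both zero."""
    free = set()
    for model in catalog.get("data", []):
        pricing = model.get("pricing", {})
        if _is_zero(pricing.get("prompt")) and _is_zero(pricing.get("completion")):
            free.add(model["id"])
    return free
```

A typical use is `model_id in free_model_ids(fetch_models())` at startup, failing loudly if the model you configured is no longer priced at zero.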
OpenCode: the easiest way to get started
For those who want to configure providers and models without hand-rolling API calls, OpenCode is the most straightforward interface. It’s available as both a desktop app and a terminal tool. Connecting OpenRouter is a matter of pasting your API key, and free models are pre-filtered and ready to go. The model ID is visible and copyable, making it easy to configure agents or workflows targeting the right model.
The terminal approach is equally simple: opencode, then /models, Ctrl+A to connect a provider, and you’re operational. It’s not the only OpenRouter-compatible client, but it’s one of the few that makes free model configuration genuinely transparent.
Free alternatives for LLM APIs
OpenRouter isn’t the only way to use LLM models without spending money. Here are the main alternatives:
- Ollama (local): runs open-source models directly on your hardware. Zero cost, zero rate limits, but constrained by your GPU’s power. Ideal for models up to ~14B parameters on an M-series MacBook.
- Google AI Studio: offers Gemini Flash for free with generous limits (around 1500 requests/day). Great for prototyping and testing.
- Groq free tier: ultra-fast inference on Llama, Mixtral, and other open source models. Rate limits exist but are reasonable for personal use.
- NVIDIA NIM: free models on build.nvidia.com, roughly 40 requests per minute. Covers Kimi K2.5, MiniMax M2.5, GLM-5, DeepSeek V3.2, and others. Good for prototyping, not for production.
- Hugging Face Inference: free tier with popular models, though limits are tighter.
The choice depends on the use case. If you need a broad catalog and routing flexibility, OpenRouter is still the best pick. If you need speed and predictability, Groq or a local setup is the better choice. If you need reliability without surprises, Google AI Studio is hard to beat at the price of zero.
From here on we get into the technical details. If you’re just looking for the practical takeaways, you can skip to the conclusion.
Free models as of April 2026: what’s actually available
The free model list on OpenRouter changes frequently, but as of April 2026 it includes roughly 30 models. The most practically useful ones:
| Model | Provider | Context | Notes |
|---|---|---|---|
| Nemotron 3 Super 120B | NVIDIA | 262K | Most used on the free tier, good generalist |
| Qwen3 Coder | Qwen | 262K | Excellent for coding, but may disappear |
| MiniMax M2.5 | MiniMax | 197K | Good agentic, tool calling |
| GPT-OSS 120B | OpenAI | 131K | Open-weight model from OpenAI |
| GLM 4.5 Air | Z.ai | 131K | Lightweight, solid performance |
| Llama 3.3 70B | Meta | 66K | Reliable generalist |
| Gemma 4 26B | Google | 262K | Multimodal with vision |
| Ling 2.6 1T | InclusionAI | 262K | Chinese model, tool calling |
All have the :free suffix in the model ID and are subject to rate limiting (20 RPM, 1000 req/day with $10+ topped up). Availability is not guaranteed: providers can remove them at any time.
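To get a feel for how the two limits interact, here is a small back-of-the-envelope calculation. It assumes the 20 RPM cap applies on both tiers and that you send requests at the maximum rate against a single model; the function name is ours.

```python
def minutes_to_exhaust(daily_quota: int, requests_per_minute: int) -> float:
    """Minutes of continuous max-rate traffic before the daily quota runs out."""
    return daily_quota / requests_per_minute


topped_up = minutes_to_exhaust(1000, 20)       # 50.0 minutes
never_topped_up = minutes_to_exhaust(50, 20)   # 2.5 minutes

print(topped_up, never_topped_up)  # prints: 50.0 2.5
```

In other words, an agent running flat out burns through the full 1000-request day in under an hour, so the daily quota, not the per-minute cap, is what you budget around for long-running workflows.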
The bottom line
Key points:
- OpenRouter’s “free” tier requires a $10+ top-up to be genuinely useful (50 vs 1000 daily requests)
- Free models can disappear without warning, so never depend on one for production workflows
- Setting a $0 spend limit on your API key prevents accidental charges on paid models
OpenRouter’s real value isn’t the free tier. It’s the flexibility to switch providers with a single API. The free models are how you discover which provider works best for your use case, before you start paying.
🔗 Resources
- OpenRouter - Free Models - Official updated list
- OpenRouter - Pricing - Tier details, limits, and costs
- OpenRouter - Free Models Router - Documentation on the automatic router
- OpenCode - Desktop/terminal client for managing providers and models