SL
Skeptik Log
skeptik-log

What OpenRouter doesn't tell you about free models

By Skeptik Log

OpenRouter offers 30+ LLM models for free. What you won’t find on the landing page: you need at least $10 to unlock the actually useful tier, models disappear without warning, and you can accidentally spend money you never intended to.

Source: YouTube video, OpenRouter
📺 Source: YouTube video, OpenRouter

Why this matters

If you’ve ever thought about testing LLMs without committing to a provider, OpenRouter’s free tier probably caught your eye. It sounds simple: free models, one API, done. The reality is messier. There’s a hidden cost of entry, models can vanish overnight, and one wrong setting can drain your balance. In this article we break down how the free tier actually works, what the gotchas are, and how to use it without getting burned.

The $10.50 nobody mentions

The difference between “free” and “actually free” on OpenRouter comes down to two numbers: 50 and 1000. If you’ve never topped up your account, you get 50 requests per day on free models. If you’ve topped up at least $10, the daily limit jumps to 1000 requests. So the real cost of entry for using OpenRouter productively isn’t zero: it’s $10.50.

The extra fifty cents is the top-up fee. Pay with crypto and it’s 5% of the total, so $0.50 on $10. Pay with a credit card and there’s an $0.80 minimum fee, making the effective entry cost $10.80. Crypto is the better deal: lower fees and more control. But the real takeaway is that “free” on OpenRouter means “free after an initial investment.”

Models come and go

Open the free model list today and you’ll find roughly 30 options. Open it next week and some of them might be gone. Step 3.5 Flash was one of the fastest and most popular free models: vanished. Qwen 3.6 Coder was free for about a week, then went paid with no warning.

The point isn’t that models change. It’s that if you’re using OpenRouter as a backend for an agent or automated workflow, a model that exists today might not exist tomorrow. Your code breaks and nobody tells you. This is the structural trade-off of the free tier: availability isn’t contractual, it’s temporary. Providers make models free for promotion, testing, or to drive adoption, and they pull them when priorities shift.

The current list as of April 2026 includes models like Nemotron 3 Super (NVIDIA), Qwen3 Coder, Gemma 4, MiniMax M2.5, GLM 4.5 Air, and OpenAI GPT-OSS 120B. But nobody guarantees they’ll still be there next week.

The mistake that costs $1.50

If you plug your OpenRouter API key into an agent without setting the spend limit to zero, you’re playing Russian roulette with your wallet. The video’s author spent $1.50 on Gemini 3 Flash without intending to: his agent, configured to use only free models, started routing requests to paid models without him noticing.

The fix is simple: create an API key with a $0 credit limit. That way, if the system tries to use a paid model, the request fails instead of charging you. It’s a 10-second setting in the API Keys section of the OpenRouter dashboard, and it saves you from surprise charges.

Rate limiting: the invisible bottleneck

Even with 1000 daily requests unlocked, the 20 requests-per-minute limit is a tight leash. If you’re using a popular model like Nemotron Super during peak hours, getting 429 errors on simple requests isn’t unusual. The server is overloaded because everyone is hammering the same free model at the same time.

The per-model rate limiting means the 20 RPM cap applies to each individual model, not in aggregate. In theory, you could spread requests across multiple models to increase throughput. In practice, if your workflow depends on a specific model, that model’s rate limit is your real constraint.

OpenRouter in the API landscape: why it exists and what it gets out of it

OpenRouter isn’t a model provider. It’s a router: an intermediary that aggregates access to models from dozens of providers (OpenAI, Google, Meta, NVIDIA, Qwen, MiniMax, and many others) through a single API. Its business model rests on two pillars:

  • A small fee on every paid request (typically 5% of the provider’s cost)
  • The ability to route traffic intelligently across providers

Free models are a customer acquisition investment. When NVIDIA makes Nemotron Super available for free on OpenRouter, it’s because they want developers to get familiar with the model, integrate it into their workflows, and then upgrade to the paid version when free tier limits become untenable. OpenRouter earns from the fee on paid transactions that follow. It’s the freemium playbook: the free product is marketing, not the product.

Compared to the competition:

  • Together AI: serverless inference with competitive pricing but no meaningful free tier
  • Fireworks AI: focuses on custom deployments
  • Groq: generous free tier but a much more limited model catalog

OpenRouter’s uniqueness is the breadth of its catalog and the convenience of a single API for everything.

How to verify a model is actually free

Don’t trust what your client tells you. OpenCode, for instance, shows a list of “free” OpenRouter models that sometimes doesn’t match current reality. A model can appear in your tool’s list but no longer be available as a free tier on OpenRouter. The result: the request fails silently, or worse, gets routed to the paid version of the same model.

The safe procedure: go to openrouter.ai/models, filter by “free”, and verify that the model you want to use is actually on the list. It’s tedious but necessary if you don’t want surprises. Alternatively, use OpenRouter’s Free Models Router (openrouter/free), which automatically selects the best available free model based on the request’s requirements (tool calling, vision, etc.).

OpenCode: the easiest way to get started

For those who want to configure providers and models without hand-rolling API calls, OpenCode is the most straightforward interface. Available as both a desktop app and a terminal tool, connecting OpenRouter is a matter of pasting your API key, and free models are pre-filtered and ready to go. The model ID is visible and copyable, making it easy to configure agents or workflows targeting the right model.

The terminal approach is equally simple: opencode, then /models, Ctrl+A to connect a provider, and you’re operational. It’s not the only OpenRouter-compatible client, but it’s one of the few that makes free model configuration genuinely transparent.

Quick setup - Create an OpenRouter account and top up $10+ with crypto - Generate an API key with $0 spend limit - Connect in OpenCode: paste key, filter free models, done - Always verify at openrouter.ai/models before relying on a specific model

Free alternatives for LLM APIs

OpenRouter isn’t the only way to use LLM models without spending money. Here are the main alternatives:

  • Ollama (local): runs open source models directly on your hardware. Zero cost, zero rate limits, but constrained by your GPU’s power. Ideal for models up to ~14B parameters on an M-series MacBook.
  • Google AI Studio: offers Gemini Flash for free with generous limits (around 1500 requests/day). Great for prototyping and testing.
  • Groq free tier: ultra-fast inference on Llama, Mixtral, and other open source models. Rate limits exist but are reasonable for personal use.
  • NVIDIA NIM: free models on build.nvidia.com, roughly 40 requests per minute. Covers Kimi K2.5, MiniMax M2.5, GLM-5, DeepSeek V3.2, and others. Good for prototyping, not for production.
  • Hugging Face Inference: free tier with popular models, though limits are tighter.

The choice depends on the use case. If you need a broad catalog and routing flexibility, OpenRouter is still the best pick. If you need speed and predictability, Groq or local is better. If you need reliability without surprises, Google AI Studio is hard to beat at the price of zero.

From here on we get into the technical details. If you’re just looking for the practical takeaways, you can skip to the conclusion.

Free models as of April 2026: what’s actually available

The free model list on OpenRouter changes frequently, but as of April 2026 it includes roughly 30 models. The most practically useful ones:

Model Provider Context Notes
Nemotron 3 Super 120B NVIDIA 262K Most used on the free tier, good generalist
Qwen3 Coder Qwen 262K Excellent for coding, but may disappear
MiniMax M2.5 MiniMax 197K Good agentic, tool calling
GPT-OSS 120B OpenAI 131K Open source version from OpenAI
GLM 4.5 Air Z.ai 131K Lightweight, solid performance
Llama 3.3 70B Meta 66K Reliable generalist
Gemma 4 26B Google 262K Multimodal with vision
Ling 2.6 1T InclusionAI 262K Chinese model, tool calling

All have the :free suffix in the model ID and are subject to rate limiting (20 RPM, 1000 req/day with $10+ topped up). Availability is not guaranteed: providers can remove them at any time.

The bottom line

Key points:

  • OpenRouter’s “free” tier requires a $10+ top-up to be genuinely useful (50 vs 1000 daily requests)
  • Free models can disappear without warning, so never depend on one for production workflows
  • Setting a $0 spend limit on your API key prevents accidental charges on paid models

OpenRouter’s real value isn’t the free tier. It’s the flexibility to switch providers with a single API. The free models are how you discover which provider works best for your use case, before you start paying.

🔗 Resources

skeptik-log By Skeptik Log