Claude Code for Free: Three Ways to Run It Without Opening Your Wallet
Claude Code costs nothing if you run it on alternative models. Ollama gives you GLM-5.1 from the cloud or Gemma 4 locally, OpenRouter hands you Elephant Alpha for free. The trade-off? No Claude under the hood, but surprisingly solid results at zero cost.
Where we’re going
If you’re paying $20/month for Claude Code, you might not know there are three ways to get a similar experience for free. Julian Goldie walks through them in his video: Ollama with a cloud model, Ollama with a local model, and OpenRouter with a free alpha model. The video is heavy on pitches for his community, but the technical content is legit.
Claude Code, minus Claude
Let’s clear this up first. “Claude Code for free” doesn’t mean using Anthropic’s Claude model. It means using the Claude Code client, the command-line coding agent, and pointing it at different models. Claude Code is a client. The model is a separate decision.
That means the experience shifts. Smaller or less capable models will make mistakes that Claude wouldn’t. But for plenty of day-to-day coding tasks, the gap is smaller than you’d think.
Method 1: Ollama + GLM-5.1 (cloud)
The simplest approach: run a cloud model through Ollama.
ollama run glm-5.1:cloud
ollama launch claude --model glm-5.1:cloud
GLM-5.1 comes from Z.AI with solid credentials: SWE-Bench Pro SOTA (at release time) and 68.5% on Terminal-Bench 2.0. It’s built for coding and agentic work, and it shows.
Pros:
- Zero local setup: no beefy hardware needed, the model runs on Ollama’s cloud
- Good speed: latency is API-call level, not local-inference level
- High quality: GLM-5.1 ranks among the strongest open models for coding tasks
Cons:
- Token limits: the free tier has a per-session and per-week token budget. Long refactors might exhaust it
- Connection required: no internet, no coding agent
Method 2: Ollama + Gemma 4 (local)
The second approach is for people who want everything on their own machine. No cloud, no costs, no logs.
ollama run gemma4:31b
ollama launch claude --model gemma4:31b
Gemma 4 by Google is an open-weight model that runs locally. The 31B dense version is the most capable, but there’s also a 26B MoE variant that needs fewer resources.
Pros:
- Zero cost, unlimited tokens: the model lives on your disk, tokens are infinite
- Full privacy: your code never leaves your machine
- Offline: works without an internet connection
Cons:
- Hardware: the 31B dense needs at least 16GB unified RAM (24GB to be comfortable). The 26B MoE drops to 6GB
- Speed: on a Mac Mini M4 Pro with 24GB, the 31B runs at roughly 15 tokens/s. Fine for thoughtful work, slow for rapid iteration
- Quality: Gemma 4 is good, but it’s not Claude. On complex multi-file tasks, the gap shows
Method 3: OpenRouter + Elephant Alpha (free API)
The third one is the most curious. Elephant Alpha is a stealth model that appeared on OpenRouter on April 13, 2026. Nobody knows who trained it: the provider is OpenRouter itself. 100 billion parameters, 256K context, zero cost.
To use it with Claude Code, configure OpenRouter as your API provider and point it at Elephant Alpha.
Pros:
- 100B parameters: the largest model of the three, potentially the most capable
- 256K context: plenty of memory for large files and extended codebases
- Function calling and structured output: natively supported
- Free: $0/M for both input and output
Cons:
- Alpha: it’s in testing. It could change, disappear, or become paid at any point
- Privacy: prompts may be logged by the provider. The model page states: “Prompts and completions may be logged by the provider and used to improve the model.” If you’re working on proprietary code, think twice
- Mystery: no documentation on architecture, training data, or who’s behind it. “Elephant Alpha” could be anything
Which one? Depends on context
There’s no clear winner. Each method has its ideal use case:
| Scenario | Method | Why |
|---|---|---|
| Quick prototyping, single tasks | GLM-5.1 cloud | Fast, zero setup, high quality |
| Sensitive code, offline work | Gemma 4 local | Full privacy, unlimited tokens |
| Large codebase, long context | Elephant Alpha | 256K context, 100B params |
| Complex multi-file refactoring | GLM-5.1 cloud | Best agentic reasoning |
| Open-ended experimentation | Gemma 4 local | No token limits, iterate as much as you want |
For the technically inclined
From here on it gets technical. If you care about the idea more than the implementation, skip to the conclusion.
Ollama’s compatibility layer
Since Ollama v0.14 (January 2026), the daemon exposes an endpoint compatible with the Anthropic Messages API at localhost:11434/v1/messages. Any client that speaks the Anthropic protocol, Claude Code included, can point to Ollama as if it were the official API.
The flow:
- Ollama downloads and loads the model (local or cloud)
- It exposes the compatible endpoint
ollama launch claudeautomatically setsANTHROPIC_BASE_URLandANTHROPIC_API_KEY- Claude Code talks to Ollama as if talking to Anthropic
No proxy, no litellm, no manual configuration. It just works.
Elephant Alpha: what we know
Not much, honestly. The specs published by OpenRouter:
| Parameter | Value |
|---|---|
| Parameters | 100B |
| Context | 256K tokens |
| Input | $0/M |
| Output | $0/M |
| Function calling | Yes |
| Structured output | Yes |
| Prompt caching | Yes |
| Provider | OpenRouter (unknown) |
The “intelligence efficiency” label suggests a model optimized to produce quality responses with fewer tokens. Think reasoning efficiency: think well, waste little. But without a paper, without independent benchmarks, it’s all unverified.
Configuring OpenRouter with Claude Code
To use Elephant Alpha (or any OpenRouter model) with Claude Code:
export ANTHROPIC_BASE_URL=https://openrouter.ai/api/v1
export ANTHROPIC_API_KEY=your-openrouter-api-key
claude --model openrouter/elephant-alpha
Or through the project’s configuration file.
The bottom line
Key points:
- Claude Code is a client: you can plug any model that speaks the Anthropic protocol
- Ollama + GLM-5.1 cloud is the easiest path to immediate quality at zero cost
- Ollama + Gemma 4 local is the privacy and offline choice, at the cost of speed
- Elephant Alpha is the wildcard: powerful, free, but alpha and with privacy caveats
The free coding agent isn’t an experiment anymore. It’s a real choice with real trade-offs. The question isn’t “can I afford Claude Code?” anymore, it’s “which compromise am I willing to make?”