Free LLM models exist right now. Tomorrow the list will be different. Token Scout discovers what's free and cheap in real time — across OpenRouter, Groq, Cerebras, and your local Ollama GPUs — so your agent always routes to the cheapest safe option.
Dynamic pricing. Compatibility filtering. Cost ceiling. One tool you wire in once — your agent does the rest.
Token Scout combines cloud, local, and fallback discovery — every query reflects what's available right now.
| Layer | What it discovers | How |
|---|---|---|
| OpenRouter Live | Hundreds of models with real-time pricing. Free tiers come and go — Token Scout catches them. | GET /api/v1/models — no key needed |
| Ollama Constellation | Every model on your local network. Free, unlimited. | Probes configured hosts via /api/tags |
| Static Fallback | Groq, Cerebras, Mistral, GitHub, Google — curated free tiers. | Always available, even offline |
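
The three layers in the table can be pictured as one merged lookup. The following is a minimal sketch, not Token Scout's actual code: the function names, the `STATIC_FALLBACK` entry, and the choice to always append the static list are assumptions made for illustration. The response field names (`data`, `pricing.prompt`, `pricing.completion` for OpenRouter, `models[].name` for Ollama) reflect how those two endpoints are commonly documented and should be checked against the live APIs. Fetchers are passed in as plain callables so the sketch runs without network access.

```python
from typing import Callable, Iterable

# Hypothetical placeholder; the real curated free-tier list is not shown in this document.
STATIC_FALLBACK = [
    {"id": "example-provider/example-free-model", "source": "static"},
]


def free_openrouter_models(payload: dict) -> list[dict]:
    """Keep models whose prompt and completion prices are both zero."""
    free = []
    for model in payload.get("data", []):
        pricing = model.get("pricing") or {}
        try:
            prompt = float(pricing.get("prompt", "nan"))
            completion = float(pricing.get("completion", "nan"))
        except (TypeError, ValueError):
            continue  # unparseable price: treat as not free
        if "id" in model and prompt == 0.0 and completion == 0.0:
            free.append({"id": model["id"], "source": "openrouter"})
    return free


def ollama_models(payload: dict, host: str) -> list[dict]:
    """Turn an Ollama /api/tags style response into candidate entries."""
    return [
        {"id": m["name"], "source": f"ollama@{host}"}
        for m in payload.get("models", [])
        if "name" in m
    ]


def discover(
    fetch_openrouter: Callable[[], dict],
    fetch_ollama: Callable[[str], dict],
    ollama_hosts: Iterable[str],
) -> list[dict]:
    """Merge cloud, local, and static layers; a failing layer is skipped."""
    found: list[dict] = []
    try:
        found += free_openrouter_models(fetch_openrouter())
    except Exception:
        pass  # cloud layer unreachable, continue with the others
    for host in ollama_hosts:
        try:
            found += ollama_models(fetch_ollama(host), host)
        except Exception:
            continue  # this host is down, try the next one
    return found + list(STATIC_FALLBACK)  # static layer works offline
```

In real use, `fetch_openrouter` would issue the `GET /api/v1/models` request and `fetch_ollama` would query `/api/tags` on each configured host. Keeping them injectable also makes the offline path easy to exercise.
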
Free models on OpenRouter change hourly. Token Scout catches them as they appear and disappear.
You could check OpenRouter's site yourself, but your AI agent can't browse a website. Token Scout gives your agent the same market awareness you have: what's free, what's cheap, what just appeared, and what disappeared since yesterday. Plus three things OpenRouter's UI doesn't tell you:
- […] min_context filter prevents it.
- […] reasoning_format so you match correctly.

A model that's free today gets priced tomorrow. A new model drops with a 48-hour promotional window. A provider quietly adds a trillion-parameter model to their free tier. If your agent is hardcoding model IDs or reading from a static config, it's missing these windows entirely.
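
One way to notice those windows is to compare two price snapshots. The sketch below is illustrative only: `diff_free_models` and the model IDs are hypothetical, and how Token Scout itself tracks changes is not described here. It assumes each snapshot is a simple mapping from model ID to price, with `0` meaning free.

```python
def diff_free_models(
    previous: dict[str, float], current: dict[str, float]
) -> dict[str, list[str]]:
    """Report which models became free and which stopped being free.

    A model missing from a snapshot counts as "not free" in that snapshot,
    so newly listed free models show up under "newly_free".
    """
    prev_free = {model for model, price in previous.items() if price == 0}
    curr_free = {model for model, price in current.items() if price == 0}
    return {
        "newly_free": sorted(curr_free - prev_free),
        "no_longer_free": sorted(prev_free - curr_free),
    }


# Hypothetical snapshots taken a day apart.
yesterday = {"vendor/alpha": 0.0, "vendor/beta": 0.0, "vendor/gamma": 0.000001}
today = {"vendor/alpha": 0.0, "vendor/beta": 0.000002,
         "vendor/gamma": 0.0, "vendor/delta": 0.0}

changes = diff_free_models(yesterday, today)
# changes["newly_free"]     -> ["vendor/delta", "vendor/gamma"]
# changes["no_longer_free"] -> ["vendor/beta"]
```
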
Token Scout checks the market on every query. Your agent always routes to the cheapest option that's compatible with the task. The savings compound: running 50 subagent calls a day on free models instead of paid ones adds up fast.
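
"Cheapest option that's compatible with the task" can be read as a filter followed by a minimum. Here is a rough sketch under stated assumptions: the `Candidate` type, `pick_cheapest`, the per-million-token price unit, and the tie-break on context length are all hypothetical. Only the ideas of a `min_context` filter and a cost ceiling come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    model_id: str
    price_per_mtok: float  # assumed unit: USD per million tokens; 0.0 = free
    context_length: int    # maximum context window in tokens


def pick_cheapest(
    candidates: Iterable[Candidate],
    min_context: int,
    max_price_per_mtok: float,
) -> Optional[Candidate]:
    """Return the cheapest candidate that fits the task, or None.

    A candidate is eligible when its context window is at least
    `min_context` and its price does not exceed the cost ceiling.
    Ties on price go to the larger context window.
    """
    eligible = [
        c for c in candidates
        if c.context_length >= min_context
        and c.price_per_mtok <= max_price_per_mtok
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda c: (c.price_per_mtok, -c.context_length))


# Hypothetical market snapshot.
market = [
    Candidate("vendor/free-small", 0.0, 8_192),
    Candidate("vendor/cheap-large", 0.2, 131_072),
    Candidate("vendor/premium", 3.0, 200_000),
]

choice = pick_cheapest(market, min_context=32_000, max_price_per_mtok=1.0)
# The free model is too small for a 32k-token task, so the cheap large one wins.
```

Returning `None` instead of silently falling back to an expensive model keeps the cost ceiling a hard limit; the caller decides what to do when nothing qualifies.
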
One tool. Five minutes. Free and cheap inference, found live.
MIT License · Rust + Python · Works with Claude Code, OpenClaw, LangChain, CrewAI & any MCP client