Launched on March 23, 2026, the MiniMax Token Plan redefines the developer AI subscription market. For the first time, a single subscription grants access to five families of models — text/code, video, voice, image, and music — through one unified API key. At $10/month for the entry tier, it stands as the best value-for-money offering in 2026 for developers using OpenClaw, Hermes Agent, and vibe coding tools like Kilo Code, OpenCode, and Claude Code.

🎁 Exclusive Partner Offer: -10% Until June 1st, 2026
Get 10% off your Token Plan subscription through our exclusive partner link. The Starter plan at $10/month comes down to an effective $7.50/month by combining the annual subscription (2 months free) with the promo code. ⏰ Offer valid until June 1st, 2026 only.
→ Activate Your Discount Now
MiniMax M2.7: Architecture and Performance
At the core of the Token Plan is MiniMax M2.7, released on March 18, 2026. Built on a Mixture-of-Experts (MoE) architecture with roughly 230 billion total parameters — but only ~10 billion active per inference — M2.7 delivers high speed (~100 tokens/second) at minimal compute cost. Its context window reaches 205,000 tokens (~150,000 words), enabling entire codebases to be loaded without truncation.

Key benchmark highlights:
- SWE-bench Verified: 78% (Claude Opus 4.6: 80.8%)
- GPQA Diamond: 87.4%
- τ²-Bench (conversational agents): 84.8%
- Hallucination rate: just 34% (vs 46% for Claude Sonnet 4.6)
- PinchBench OpenClaw: 71.2% — 100% in Coding, File Ops, Data Analysis, and Document Summarization
- Max output: 131,072 tokens per response
The bottom line: 90% of Claude Opus 4.6’s performance at 6% of the price. A real-world comparison produced equivalent results for $0.27 (M2.7) vs ~$3.85 (Opus). On a pay-as-you-go basis, M2.7 costs $0.279/M input tokens and $1.20/M output tokens — analyzing a 10,000-line codebase costs roughly 2 cents.
Token Plan Pricing

The Token Plan comes in two tiers: Standard (~50 TPS) and High-Speed (~100 TPS). Key distinction: the Token Plan Key is separate from the standard pay-as-you-go API key. Rather than counting tokens consumed, it tracks the number of requests within a rolling 5-hour window.
Standard Plans — M2.7 (~50 TPS)
- 🟢 Starter — $10/month ($100/year): 1,500 M2.7 requests/5h + Music 2.6 included — ideal for solo developers
- 🔵 Plus — $20/month ($200/year): 4,500 requests/5h + 50 image-01 images/day + Speech 2.8 (4,000 chars/day)
- 🟣 Max — $50/month ($500/year): 15,000 requests/5h + Hailuo video generation (2 videos/day) included
High-Speed Plans — M2.7-highspeed (~100 TPS)
- ⚡ Plus-HS — $40/month: 4,500 requests/5h at full speed + 100 images/day
- ⚡ Max-HS — $80/month: 15,000 requests/5h + 200 images/day + Speech 19,000 chars/day
- ⚡ Ultra-HS — $150/month: 30,000 requests/5h for high-intensity usage

💡 With our partner promo code: the Starter plan comes down to an effective $7.50/month (annual billing + 10% discount). The most affordable high-performance AI solution on the market. → Subscribe with -10% (valid until June 1st, 2026)
The 5-Hour Window Explained
The M2.7 quota resets on a rolling 5-hour window. The day is effectively split into blocks: 0–5h, 5–10h, 10–15h, 15–20h, 20–24h. If you hit the limit, three options are available:
- Use Credits (pre-paid balance on the same Token Plan key)
- Switch temporarily to pay-as-you-go with a standard API key
- Wait for the automatic reset of the 5-hour window
For 95% of developers in everyday solo usage, the 1,500 requests/5h of the Starter plan are more than sufficient — you’ll rarely come close to the limit.
Token Plan for OpenClaw
MiniMax has integrated OpenClaw directly into its official documentation — a clear signal of the synergy between the two products. M2.7 excels in multi-step agentic loops thanks to several key advantages:
- MoE architecture with ~10B active parameters: fast response times, critical in agent loops
- 34% hallucination rate: lower than Claude Sonnet 4.6 (46%) — fewer silent errors in autonomous pipelines
- 205K token context window: load an entire codebase without truncation
- PinchBench: 100% in Basic, Coding, File Ops, Data Analysis, and Document Summarization
- No per-request token limit: the key counts requests, not tokens — ideal for long-running tasks
A community testimonial sums it up well: « Get MiniMax M2.7 coding plan and setup MiniMax as main agent. You will probably never have API rate limit issues. When I used Gemini or Claude API, rate limit was always happening. OpenClaw works like a charm. »
Token Plan for Hermes Agent
Hermes Agent (Nous Research) is an open-source agent framework with persistent cross-session memory, an integrated learning loop, 40+ built-in tools, and multi-platform access (CLI, Telegram, Discord, Slack, WhatsApp). MiniMax has integrated Hermes directly into its Token Plan documentation. M2.7 is the recommended model for its 131,072-token output capacity — essential for long memory-rich conversations — and its τ²-Bench score of 84.8%.
# One-liner install
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
# Model selection
hermes model
# → Choose "MiniMax (global endpoint)"
# → Paste your Token Plan Key
# → Select MiniMax-M2.7
Token Plan for Vibe Coding
The Token Plan is officially compatible with Claude Code, Cursor, Kilo Code, Cline, Roo Code, OpenCode, Codex CLI, TRAE, Grok CLI and any OpenAI-compatible tool. The Token Plan Key is configured as a BYOK (Bring Your Own Key) provider in each tool, with minimax-m2.7 as the target model via the official MiniMax endpoint.
A popular 2026 workflow combines OpenCode with M2.7 for building and GLM-5 for planning — all at a fraction of the cost of premium alternatives. Developers who’ve migrated from Claude Pro report: « MiniMax M2.7 offers a much more generous limit — 1,500 requests every 5 hours for just $9/month. » The OpenCode community consensus is clear: the $10/month Starter plan is enough for 95% of developers in daily use.
Quick Setup Guide
- Subscribe via our partner link: go to our exclusive link to get -10% (valid until June 1st, 2026). Choose the Starter at $10/month or $100/year.
- Retrieve your Token Plan Key: in the MiniMax dashboard, navigate to Account > Token Plan. This key is separate from the standard API key.
- OpenClaw: use the native MiniMax provider in
~/.openclaw/openclaw.jsonwith the Token Plan Key and selectminimax-m2.7. - Hermes Agent: run
hermes model, choose « MiniMax (global endpoint) », paste the Token Plan Key, and select MiniMax-M2.7. - Kilo Code / OpenCode / Claude Code: add MiniMax as a BYOK provider. Model:
minimax-m2.7. Base URL: official MiniMax endpoint.
Known Limitations
- Peak hour throttling: during weekday peak hours, high-concurrency tasks may be redirected to pay-as-you-go
- Token Plan Key ≠ standard API key: the two are not interchangeable; pay-as-you-go remains necessary for full-resolution Hailuo or vision models
- Limited video quota: the Max standard plan only includes 2 Hailuo videos/day — separate resource packages are needed for intensive video generation
- M2.5 remains relevant for simple, repetitive tasks (10x lower cost per run than M2.7)
Conclusion
The MiniMax Token Plan at $10/month is, as of May 2026, the best performance-to-cost ratio available for developers building on OpenClaw and Hermes, or practicing vibe coding. MiniMax M2.7 delivers 78% SWE-bench Verified performance at 6% of Claude Opus 4.6’s price, with a lower hallucination rate than Sonnet and a 205K token context window. The addition of multimodal models under the same key — Speech 2.8, image-01, Music 2.6, Hailuo video — turns this subscription into a complete infrastructure for automated content pipelines, with no surprise invoices.
🎁 Last chance: Get 10% off the Token Plan via our exclusive partner link — valid until June 1st, 2026 only. The Starter plan comes down to an effective $7.50/month with annual billing.
→ Subscribe Now with -10% Off

