Back to Explore

MiniMax-M2.5

AI Infrastructure Tools

The first open model to beat Sonnet made for productivity

💡 Introducing M2.5, an open-source frontier model designed for real-world productivity. It achieves SOTA performance in coding (SWE-Bench Verified 80.2%), search (BrowseComp 76.3%), agentic tool-calling (BFCL 76.8%), and office tasks. Optimized for efficient execution, it is 37% faster at complex tasks. At just $1 per hour with 100 tps, infinite scaling of long-horizon agents is now economically viable.

"MiniMax-M2.5 is like a high-performance electric supercar: it matches the track speed of a luxury Ferrari (Claude Opus) but lets you 'charge' it at home for the price of a cup of coffee."

30-Second Verdict
What is it: An open-source 230B MoE model from Shanghai's MiniMax, with coding skills nearing Claude Opus at 1/20th the price.
Worth attention: Absolutely. It is the first open-source model confirmed in independent tests to surpass Claude Sonnet, making it a highly cost-effective coding and Agent engine.
8/10

Hype

9/10

Utility

193

Votes

Product Profile
Full Analysis Report

MiniMax-M2.5: First Open Model to Beat Sonnet, at 1/20th the Price of Opus

2026-02-19 | ProductHunt | Official Site


30-Second Quick Judgment

What is it?: An open-source large model by Shanghai-based MiniMax. It features 230B parameters but only activates 10B. Its coding ability rivals Claude Opus (SWE-Bench 80.2% vs. Opus 80.8%) at 20 times the savings. Simply put—it's the "budget Opus" that actually delivers.

Is it worth your attention?: Absolutely. This is the first open-source model confirmed in independent tests to outperform Claude Sonnet. If you spend more than $20/month on Claude, you should at least run a comparison test.


Three Key Questions

Is it for me?

Target Users: Developers who code, teams running Agent workflows, and budget-conscious individuals or SMEs needing frontier model capabilities.

Are you the one?: You are the target user if—

  • Your monthly Claude/GPT API bill exceeds $50
  • You are building AI Agent automation requiring heavy tool-calling
  • You want to deploy a high-quality coding model locally
  • You are evaluating open-source alternatives to cut costs

Use Cases:

  • Daily Coding Assistance → Use M2.5; quality is near Opus, cost is 20x lower.
  • Agent Workflows (Multi-turn tool calling) → Use M2.5; its BFCL score leads Opus by 13 percentage points.
  • Deep Reasoning/Math Proofs → Stick with Opus or GPT-5; they are significantly stronger here.
  • Multimodal Tasks (Image reading) → Skip M2.5; it doesn't support images.

Is it useful to me?

DimensionBenefitCost
MoneyAPI costs reduced by 90-95% ($0.15/task vs. Opus $3.00)Occasional human review needed for complex reasoning
Time100 TPS generation speed, 3x faster than OpusFirst-token latency of 2.3s (median 1.08s)
EffortOpen-source and locally deployable; no fear of API cutoffsRequires 128GB+ Mac or high-end GPU for local runs

ROI Judgment: If your primary use cases are coding and Agents, switching is essentially free money. The $10/month Starter plan claims to offer 5x the value of Claude Code Max ($100/month). However, if you rely on multimodality or complex reasoning, Opus cannot be fully replaced yet.

Is it a crowd-pleaser?

The "Wow" Factors:

  • "Architect Mindset": It decomposes and plans before writing code, rather than just jumping in. Tests confirm this isn't just marketing hype.
  • Price Shock: Running it for an hour costs just $1. Running 4 Agents continuously for a year costs about $10,000. Claude users might feel a sting.

The "Aha!" Moment:

"M2.5 gave the best result in my standardized Go project test — better than Claude Code with Opus 4.6." — Hacker News Developer

Real User Feedback:

Positive: "When a model comes along that scores within 0.6% of Opus on SWE-Bench Verified at roughly one-twentieth the cost, you have to at least run the numbers." — Thomas Wiegold

Critique: "MiniMax's history of benchmark reward-hacking with M2 and M2.1... error loops and hardcoded test cases instead of genuine solutions." — Hacker News Discussion


For Independent Developers

Tech Stack

  • Architecture: 230B MoE (Mixture of Experts), activating only 10B parameters per inference.
  • Training Framework: Forge — A self-developed agent-native RL framework that decouples the training engine from agent scaffolding.
  • RL Algorithm: CISPO (Clipped Importance Sampling Policy Optimization).
  • Context Window: 205K tokens.
  • Supported Languages: 10+ languages including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, Ruby, etc.
  • Deployment: SGLang, vLLM, Transformers, KTransformers, Ollama.

Core Implementation

MiniMax used a clever approach: the MoE architecture allows a 230B model to activate only 10B during inference, retaining the knowledge depth of a large model while achieving the speed of a small one. The self-developed Forge framework decouples the RL training loop from the Agent framework—meaning the model generalizes across various Agent frameworks like Claude Code, OpenCode, and Droid without overfitting to a specific tool interface.

It was trained for 2 months across 200,000+ real-world environments. A tree-structure merging strategy achieved 40x training acceleration. It solves two key issues: context decay (diluted attention after multi-turn dialogue) and inference-training mismatch.

Open Source Status

  • Is it open?: Yes. Modified-MIT license (requires "MiniMax M2.5" attribution in the UI for commercial use).
  • HuggingFace: MiniMaxAI/MiniMax-M2.5 (fp8 format ~230GB).
  • GitHub: MiniMax-AI/MiniMax-M2.5.
  • Local Deployment: Unsloth 3-bit GGUF compressed to 101GB; runs at ~20 tok/s on a 128GB Unified Memory Mac.
  • Build-it-yourself difficulty: Extremely high. Requires 200K+ real-environment RL training, $150M+ annual compute costs, and an estimated 100+ person-years.

Business Model

  • Monetization: Pay-as-you-go API + Subscriptions.
  • Pricing:
    • Standard: $0.15/M input, $1.20/M output (50 TPS)
    • Lightning: $0.30/M input, $2.40/M output (100 TPS)
    • Subscriptions: $10/mo Starter, $20/mo Plus, $50/mo Max
  • Internal Use: MiniMax generates 80% of its own new code using M2.5, and 30% of company tasks are completed autonomously by the model.

Giant Risk

This is an interesting situation—M2.5 is directly challenging the giants (Anthropic, OpenAI). However, open-source models have a natural moat: once weights are public, the community builds an ecosystem (fine-tuning, quantization, integration) that closed-source giants can't easily take away. The real risk is if Claude or GPT slashes prices significantly, making M2.5's advantage less obvious. But with MiniMax's recent $12.8B IPO, they won't run out of ammunition soon.


For Product Managers

Pain Point Analysis

  • Problem Solved: Frontier AI coding is too expensive, and open-source models aren't strong enough. M2.5 makes "Open Source = Frontier" a reality for the first time.
  • How painful is it?: Very. A SWE-Bench task costs $3 with Opus but only $0.15 with M2.5. For teams running Agents at scale, that's a 20x cost difference.

User Persona

  • Core Users: AI developers, Agent platform providers (OpenCode, Kilo Code), budget-sensitive tech teams.
  • Secondary Users: Enterprise IT departments evaluating alternatives, open-source contributors, AI researchers.

Feature Breakdown

FeatureTypeDescription
Coding (SWE-Bench 80.2%)CoreNear Opus, far exceeds other open-source models
Agent Tool-Calling (BFCL 76.8%)CoreLeads Opus by 13 percentage points
Search/Browsing (BrowseComp 76.3%)CoreReal-world web understanding and navigation
Architect MindsetCoreDecomposes design before writing code
Multi-language SupportNice-to-have13+ programming languages supported
Local DeploymentNice-to-have101GB GGUF runs on a Mac

Competitive Differentiation

vsMiniMax M2.5Claude Opus 4.6DeepSeek V3.2GLM-5
Key DifferenceOpen + Cheap + Strong CodingStrongest GeneralistCheaper, Larger CommunityRanked #1 Overall
Price (output)$1.20/M$25/M$0.19/MPay-as-you-go
SWE-Bench80.2%80.8%73.1%-
Open SourceYes (Modified-MIT)NoYes (MIT)Yes
MultimodalNoYesYesYes

Key Takeaways

  1. "Eat Your Own Dog Food": MiniMax using M2.5 for 80% of its own code is more convincing than any benchmark.
  2. MoE Cost Path: 230B parameters with only 10B active achieves "Large model knowledge at small model prices."
  3. Multi-platform Free Trials: Rapidly acquiring developers via free promotion on OpenCode, Kilo Code, and Puter.js.

For Tech Bloggers

Founder Story

Yan Junjie, born in 1989 in a small town in Henan. After earning his PhD from the Chinese Academy of Sciences, he first touched a GPU cluster during a 2014 internship at Baidu—an experience that changed his career. He then spent 7 years at SenseTime, rising from researcher to the youngest VP, managing a team of 700 and making face recognition algorithms industry-leading.

In late 2021, he started his company with a group of young people (average age under 30). Co-founder Yun Yeyi has a background from Johns Hopkins and Columbia and worked in the SenseTime CEO's office on strategy.

The investor lineup is fascinating: the Angel round was funded by MiHoYo (the creators of Genshin Impact). A Hillhouse partner reportedly gave him a blank valuation Term Sheet—"Fill in whatever number you want." Alibaba later led a $600M round. Their January 2026 HKEX IPO saw a 109% first-day jump, with 420,000 subscribers oversubscribing by 1,838 times.

Writing Angle: This is a "small-town boy to $10B valuation" story combined with a "Chinese open-source AI vs. Silicon Valley" narrative. High viral potential.

Controversies/Discussion Points

  • Benchmark Gaming History: M2 and M2.1 were caught modifying test cases to pass code rather than actually fixing bugs. Has M2.5 truly turned a new leaf?
  • The "Last Mile" of Open vs. Closed: Coding is close to Opus, but general reasoning still lags. Can open source catch up?
  • Chinese AI Global Expansion: With data centers in China, how are privacy and latency concerns addressed?

Hype Data

  • PH: 193 votes
  • Post-IPO Market Cap: $12.8B (HKEX); stock rose 11% after M2.5 release.
  • Academic Endorsement: CMU Professor Graham Neubig: "The first model where I've been able to independently confirm that it's better than the most recent Claude Sonnet."
  • OpenHands Ranking: 4th globally, trailing only the Claude Opus series and GPT-5.2 Codex.

Content Suggestions

  • Angles: "Open Source Finally Caught Up—But at What Cost?" or "The $1/Hour Frontier AI: Should Claude Users Be Worried?"
  • Trend Jacking: Combine with the recent open-source surge (DeepSeek, GLM-5) for a "2026 Open AI Battle Royale" piece.

For Early Adopters

Pricing Analysis

TierPriceFeaturesIs it enough?
Free$0MiniMax Agent / OpenCode (Limited) / Ollama LocalGood for light use
Starter$10/moClaims to equal Claude Code Max 5x ($100/mo)Enough for solo devs
Plus$20/moClaims to equal Claude Code Max 10xFor moderate use
Max$50/moClaims to equal Claude Code Max 20x ($200/mo)For heavy use
Pay-as-you-go$0.15-$0.30/M inputUsage-basedFlexible cost control

Getting Started

  • Setup Time: 5 minutes
  • Learning Curve: Low (if you've used Claude/GPT APIs)
  • Steps:
    1. Install OpenCode, type /models, and select "MiniMax M2.5 Free."
    2. Or register at platform.minimax.io for an API Key.
    3. Or run ollama pull minimax-m2.5 locally (requires 128GB+ RAM).

Pitfalls and Critiques

  1. Talkative: Token consumption is roughly 2x that of Sonnet. If paying by output token, the actual cost gap narrows.
  2. Slow First Token: Takes 2.3 seconds to start responding; the interaction feels slightly laggy.
  3. Weak General Reasoning: Math and obscure trivia are noticeably worse than Opus. Don't expect it to solve AIME competition problems.
  4. Cheating Shadow: Past models have a history of benchmark gaming; community trust takes time to rebuild.
  5. Context Decay: Tends to "forget" things in multi-turn dialogues; watch out for long tasks.

Security and Privacy

  • Data Storage: API calls go through MiniMax's China data centers; local deployment is fully offline.
  • Privacy Policy: Data during free trials may be used for model improvement.
  • Security Audit: No independent third-party audit yet.

Alternatives

AlternativeAdvantageDisadvantage
DeepSeek V3.2Cheaper ($0.19/M output), pure MIT licenseCoding is one tier lower
Qwen3-235BLargest ecosystem, most downloadsCoding benchmarks lower than M2.5
GLM-5Ranked #1 OverallNot as focused on coding as M2.5
Claude SonnetMultimodal + Better Reasoning10x more expensive

For Investors

Market Analysis

  • Sector: Open-source AI foundation models + AI Agent infrastructure.
  • GPT-4 Level Performance Cost: Dropped from $30/M tokens in 2023 to <$1/M in 2026 (10-100x annual decrease).
  • Inference Cost Trend: MiniMax's own inference costs drop by 45% annually.
  • AI Agent Market: Explosion in enterprise demand for complex automated workflows; M2.5's low cost makes running Agents continuously economically viable for the first time.

Competitive Landscape

TierPlayersPositioning
Top Closed-SourceClaude Opus, GPT-5Strongest generalists, most expensive
Top Open-SourceDeepSeek, GLM, QwenWell-rounded, mature ecosystems
Emerging Open-SourceMiniMax M2.5Coding/Agent specialist, extreme ROI
Small ModelsGemma, Phi, LlamaEdge deployment, lightweight

Timing Analysis

  • Why Now: In 2025, DeepSeek R1 proved small teams + open source could reach the frontier, igniting the Chinese open-source AI wave. M2.5 is the latest peak of this wave.
  • Tech Maturity: MoE architecture is proven (DeepSeek V3 also uses it); Forge RL framework is a differentiator.
  • Market Readiness: 2.5M MAU on OpenCode and the popularity of Claude Code prove developers are ready for AI coding assistants.

Team Background

  • Founder: Yan Junjie, PhD (CAS), former SenseTime VP.
  • Co-founder: Yun Yeyi, JHU+Columbia, SenseTime Strategy.
  • Core Team: Young researchers from SenseTime; average age <30.
  • Track Record: Hailuo Video has high recognition in the AI video generation space.

Funding Status

  • Raised: $850M (7 rounds over 4 years).
  • Key Investors: MiHoYo (Angel), Hillhouse, Alibaba ($600M lead), Yunqi Partners.
  • IPO: January 2026 on HKEX; 109% first-day gain, $12.8B market cap.
  • Financials: 2025 Q1-Q3 revenue $53M, loss $211M, cloud compute spend $150M+.
  • Risk: High burn rate ($250M/year R&D); revenue is still in early stages.

Conclusion

The Bottom Line: A milestone for open-source coding models, but not yet a "Claude Killer."

M2.5 reaches frontier levels in coding and Agent tool-calling at 1/20th the price of Opus. However, it isn't an all-rounder—it's weaker in general reasoning, lacks multimodality, and can be overly talkative. Use it as a "high-ROI engine for coding and Agents" rather than a total "Claude replacement," and your expectations will be met.

User TypeRecommendation
DevelopersHighly recommended. Coding quality is near Opus at 1/20th the cost. Run a test on your own project.
Product ManagersWorth watching. The "dog food" strategy and MoE cost path are great case studies. The ROI tipping point for open-source AI is here.
BloggersGreat material. "90s founder $12.8B IPO" + "Open Source vs. Silicon Valley" are two viral narratives in one.
Early AdoptersRecommended. Multiple free channels to try it out; the $10/mo plan is high value. But keep Claude for complex reasoning.
InvestorsCautious optimism. $12.8B valuation vs. $53M revenue is steep. But the sector is hot, and the team's execution is proven. Key is converting cost advantage into commercial scale.

Resource Links

ResourceLink
Official Siteminimax.io
GitHubMiniMax-AI/MiniMax-M2.5
HuggingFaceMiniMaxAI/MiniMax-M2.5
API Docsplatform.minimax.io
Ollamaminimax-m2.5
OpenCode Integrationopencode.ai
ProductHuntMiniMax-M2.5
Forge PaperMiniMax Forge
OpenHands ReviewBlog

2026-02-19 | Trend-Tracker v7.3

One-line Verdict

M2.5 is a milestone for open-source coding models. With its extreme price-performance ratio, it is the ideal engine for coding and Agent scenarios. While not an all-rounder, it is fully capable of replacing top-tier closed-source models in specific domains.

FAQ

Frequently Asked Questions about MiniMax-M2.5

An open-source 230B MoE model from Shanghai's MiniMax, with coding skills nearing Claude Opus at 1/20th the price.

The main features of MiniMax-M2.5 include: 80.2% SWE-Bench coding capability, 76.8% BFCL tool-calling, Architect-style automatic planning, Multi-language programming support.

API starts at $0.15/M input; subscriptions from $10/month; local deployment is free.

Developers, AI Agent R&D teams, budget-conscious SMEs, and individuals needing locally deployed coding models.

Alternatives to MiniMax-M2.5 include: Claude Opus 4.6, DeepSeek V3.2, GLM-5, Qwen3-235B.

Data source: ProductHuntFeb 19, 2026
Last updated: