Skyvern MCP & Skills: Let AI Assistants Control the Browser Directly—RPA is About to Change
2026-03-04 | ProductHunt | Official Site | GitHub

Skyvern's main interface: A dark-themed design with a workflow editor on the left for drag-and-drop automation building (e.g., "Purchase Product"). On the right is an advanced settings panel for retries and error handling. It feels like a no-code automation console built specifically for developers.
30-Second Quick Judgment
What is it?: It uses AI + Computer Vision to control browsers, replacing traditional Selenium/Playwright scripts. You tell the AI in natural language to "fill out this form," and Skyvern does it—no CSS selectors required.
Is it worth watching?: Absolutely. This isn't just another RPA toy: it has ~20k GitHub Stars, YC S23 backing, a 6-person team that reached $900K in revenue by June 2024, and a recent Seed round of roughly $2.7M. Through the MCP protocol, it gives tools like Claude Code and Cursor the ability to actually "do things" on the web, which makes it a critical piece of the 2026 Agent ecosystem.
Three Questions for Me
Is this for me?
Target Users:
- Enterprises needing bulk web automation (invoice downloads, government forms, procurement).
- Developers using Claude Code/Cursor who need browser capabilities.
- Small teams who want RPA but can't afford Automation Anywhere ($750+/month).
- Agent developers who need to give their AI "eyes and hands."
Am I the target?: If you frequently repeat manual web tasks (scraping data, filling forms, logging into portals to download files), or if you're building an AI Agent that needs browser control—you are the target user.
When would I use it?:
- "Log into my supplier portal and download last month's invoices" -- Use this.
- "Search Indeed for remote Python jobs paying $150K+" -- Use this.
- "Fill out this multi-page government form with dozens of fields" -- Use this.
- "I just want to see what a webpage looks like" -- You don't need this; just use a browser.
Is it useful?
| Dimension | Benefit | Cost |
|---|---|---|
| Time | Saves 10-20 hours/week of repetitive browser tasks | 30-second MCP setup, low learning curve |
| Money | Replaces $750+/month enterprise RPA tools | Free tier available; Cloud version is credit-based |
| Effort | No more maintaining fragile XPath scripts; resilient to site changes | Need to understand MCP concepts and API Key config |
ROI Judgment: If you currently spend time writing Selenium scripts or manually clicking through websites, Skyvern is a net gain. It's open-source and free, and the Cloud version is far cheaper than traditional RPA. With a 30-second setup, the cost of trial is nearly zero.
Is it a "wow" experience?
The Highlights:
- 30-Second Config: One JSON command and Claude Code can control your browser. No Python environment or local server needed.
- Natural Language Control: Just say "Submit this," instead of writing `document.querySelector('#btn-submit-form-v2').click()`.
- Resilient to Changes: Traditional scripts break when a site updates. Skyvern uses vision to understand the page; if a button changes color or moves, it still finds it.
The "Aha!" Moment:
"Skyvern proved to be an invaluable tool for developing an automated job application MVP, offering robust browser automation capabilities." -- ProductHunt User Review
Real User Feedback:
"skyvern mcp looks solid for browser automation. i tried something similar with 49agents where i just wanted the agents to handle the boring stuff without me watching. the difference is skyvern is specifically for web flows" -- @49agents, Twitter
"Automates browser workflows using vision models" -- @tom_doerr, Twitter (51 likes)
For Indie Developers
Tech Stack
- Backend: Python 3.11+
- Browser Engine: Playwright (SDK adds AI capabilities on top)
- Database: PostgreSQL (for state management)
- AI/Models: Multi-model support—GPT-5/GPT-4.1/O3, Claude 4.5 Opus/Sonnet, Gemini 2.5/3.0, AWS Bedrock, and even local Ollama (supporting vision models like qwen3-vl).
- MCP Server: 33 tools across 6 categories.
Core Implementation
Skyvern's architecture is inspired by BabyAGI and AutoGPT but adds browser control. It's a multi-agent system:
- LLM (Cognitive Brain): Processes both visual screenshots and DOM text to build a complete understanding of the page.
- Computer Vision (Eyes): Doesn't just look at HTML; it "sees" the page like a human—identifying buttons, forms, and links even if the underlying code changes.
- Actor Agent + Validator Agent: One performs the action, the other verifies the result. If verification fails, it automatically corrects and retries.
In short: Traditional automation looks at code (selectors); Skyvern looks at the interface (screenshots). A site redesign is a disaster for traditional tools but just a "change of clothes" for Skyvern.
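The Actor/Validator pairing described above can be sketched as a small retry loop. This is only an illustration of the pattern, not Skyvern's actual code: `run_step`, `StepResult`, and the toy `make_flaky_actor` stand-in are hypothetical names, and a real actor would drive a browser rather than return strings.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    succeeded: bool
    attempts: int
    detail: str


def run_step(
    act: Callable[[str], str],
    validate: Callable[[str], bool],
    goal: str,
    max_retries: int = 3,
) -> StepResult:
    """Let the actor try, let the validator check, retry on failure."""
    feedback = ""
    outcome = ""
    for attempt in range(1, max_retries + 1):
        # The actor sees the goal plus what went wrong last time (if anything).
        outcome = act(goal + feedback)
        if validate(outcome):
            return StepResult(True, attempt, outcome)
        feedback = f" (previous attempt produced: {outcome})"
    return StepResult(False, max_retries, outcome)


def make_flaky_actor(fail_times: int) -> Callable[[str], str]:
    """Hypothetical actor that fails a fixed number of times, then succeeds."""
    calls = {"n": 0}

    def act(instruction: str) -> str:
        calls["n"] += 1
        return "form submitted" if calls["n"] > fail_times else "validation error shown"

    return act


def is_submitted(page_state: str) -> bool:
    """Hypothetical validator: checks the observed page state."""
    return page_state == "form submitted"


result = run_step(make_flaky_actor(2), is_submitted, "Submit the form")
print(result)
```

Keeping the validator separate from the actor means a step is never counted as done just because the action ran without raising an error; it has to be confirmed against the observed page state.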
Open Source Status
- Fully Open Source: The core logic is on GitHub under the AGPL-3.0 license; anti-bot measures are reserved for the managed Cloud.
- GitHub: ~20k Stars, 1.7k Forks, very active.
- Cloud Extra Features: Anti-detection, proxy networks, CAPTCHA solving, and parallel execution.
- One-Line Install: `pip install skyvern && skyvern quickstart`
Business Model
- Monetization: Free open-source core + Cloud version with monthly credit-based billing.
- Pricing: Free (Trial) / Hobby (Personal) / Pro (Production) / Enterprise (Unlimited).
- Old Pricing Reference: Cloud version was ~$0.10/step.
- Team: 6 people.
- Revenue: Reached $900K by June 2024.
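To put the old per-step price in context, here is a rough break-even calculation against the $750/month entry price quoted for Automation Anywhere. It uses only those two figures from this article; current credit-based pricing will differ, so treat it as a back-of-the-envelope sketch. The function name is illustrative.

```python
def break_even_steps(monthly_fee_cents: int, price_per_step_cents: int) -> int:
    """Number of steps per month at which per-step billing equals a flat fee."""
    # Work in integer cents to avoid floating-point rounding.
    return monthly_fee_cents // price_per_step_cents


# $750/month flat fee vs. the old ~$0.10/step Cloud price (both from the article).
steps = break_even_steps(monthly_fee_cents=75_000, price_per_step_cents=10)
print(f"Break-even at {steps} steps per month")
```

Below that volume, the old per-step pricing would have been cheaper than the flat enterprise fee; above it, the comparison flips.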
Big-Tech Risk
Risks exist but are manageable. Microsoft has Power Automate and Google has AI Agent capabilities, but:
- Skyvern takes the "AI + Vision" route, while big-tech RPA still leans on traditional selectors.
- The 20k Star open-source community is a strong moat.
- The MCP protocol is pushed by Anthropic, and Skyvern is an early mover in this ecosystem.
- A 6-person team hitting $900K revenue proves real PMF.
- Real threats may come from Perplexity Comet (free, consumer-grade) and Browser Use (also open-source).
For Product Managers
Pain Point Analysis
- Problem Solved: Traditional browser automation is incredibly fragile. Change a button color, move its position, or swap a class name, and the script dies. Maintenance costs often exceed development costs.
- How painful is it?: Very. The problem is both frequent and expensive: large enterprises spend hundreds of thousands maintaining RPA scripts, while SMEs often give up and do the work manually. Skyvern replaces selectors with vision models, which removes the main source of breakage.
User Persona
- Enterprise IT Teams: Automating internal tools (invoices, HR, procurement).
- Indie Developers: Building Agent products that need browser capabilities.
- AI Tool Users: Users of Claude Code/Cursor who want "web browsing" powers.
- SMEs: Those who can't afford Automation Anywhere but need automation.
Feature Breakdown
| Feature | Type | Description |
|---|---|---|
| MCP Server (33 tools) | Core | Lets AI assistants control the browser |
| Vision + LLM Page Understanding | Core | Doesn't rely on selectors; adapts to changes |
| Natural Language Workflow | Core | Describe tasks in plain English, no code needed |
| Cloud Browser | Core | Runs in the cloud with geo-proxy support |
| CAPTCHA Solving | Value-add | Built into the Cloud version |
| 2FA/TOTP Support | Value-add | Supports Bitwarden and other vault integrations |
| Observer Mode | Nice-to-have | Automatically generates workflows |
| Video Recording | Nice-to-have | Records the automation process |
Competitive Differentiation

This diagram shows Skyvern MCP's core positioning: connecting various AI Agents (Claude, Cursor, Codex, etc.) to become their "browser arm." It shows natural language instruction examples like "Pull data from Google Sheet and run insurance automation."
| vs | Skyvern | Automation Anywhere | Browser Use | Perplexity Comet |
|---|---|---|---|---|
| Core Diff | AI Vision + MCP | Traditional Selectors | Open Source LLM + Playwright | Consumer Browser Agent |
| Price | Free + Cloud Credits | $750/month start | Free (pay for LLM) | Free |
| Coding | No code needed | Low code | Requires scripting | No code needed |
| Maintenance | Auto-adapts | Breaks on site change | Manual updates needed | Auto-adapts |
| Enterprise | Yes (CAPTCHA/2FA) | Yes | No | No |
| Open Source | Yes (20k Stars) | No | Yes | No |
Key Takeaways
- MCP as a Distribution Strategy: Instead of just a standalone product, it's a "browser plugin for all AI tools." Being available in Claude Code and Cursor is smarter than just building a proprietary UI.
- Open Core + Cloud Value-add: Use open source to build trust and community, then monetize via hard-to-build features like anti-detection and CAPTCHA solving.
- 30-Second Onboarding: One line of JSON to run—drastically lowering the barrier to entry.
- Credit-based Billing: Moving from "per step" to monthly credits makes costs more predictable and natural for users.
For Tech Bloggers
Founder Story
This is a classic "third time's the charm" story:
Suchintan Singh (CEO) and Shuchang Zheng first built Ikonomos, an engineer onboarding tool. They made every classic mistake: didn't talk to users and argued over irrelevant features. Their first YC interview was a rejection because they couldn't answer "Why would anyone use this?"
After the rejection, they studied Paul Graham’s essays and Stanford’s startup classes. They built a second product, Wyvern (an ML ranking platform), rushed the application on the deadline day, and got into YC S23 with a demo that said "the product isn't built but we already have customers."
They eventually pivoted to Skyvern for browser automation. The day they hit #1 on Hacker News, they got 3,000 GitHub Stars, 71 meeting invites, and 39 Cloud waitlist signups overnight. Singh previously built ML platforms at Faire and Gopuff, helping generate over $100M in GMV. Zheng is a CMU grad and ex-Lyft engineer whose testing tools were used by 1,000+ engineers.
With just a 6-person team, they hit $900K in revenue by June 2024.
Controversies / Discussion Angles
- Can AI Agents really replace RPA? Skyvern says yes—by using vision instead of selectors. But traditional RPA giants (UiPath, AA) are also adding AI. The battle isn't over.
- Dependence on the MCP Protocol: MCP is an Anthropic-led standard that OpenAI and Google have since adopted. If it becomes the de facto standard, Skyvern's first-mover advantage as a "native MCP browser tool" is massive, but that advantage is tied to a protocol steered by a single vendor.
- Open Source vs. Commercialization: If the core logic is open, how do they prevent forks? Are anti-detection and CAPTCHA solving enough of a moat?
- $900K Revenue with 6 People: Incredible efficiency, but can they scale to meet the support demands of enterprise clients?
Hype Data
- PH Launch: 164 votes (moderate hype).
- GitHub: ~20k Stars (extremely high).
- HN: Hit #1 on the front page.
- Twitter: CEO's MCP launch tweet got ~4k views with positive community feedback.
- Recognition: Included in the "50+ Best MCP Servers of 2026" lists.
Content Suggestions
- The "Death of RPA" Angle: "Is RPA Dead? How this 6-person startup is using AI vision to redefine browser automation."
- The Ecosystem Angle: Capitalize on the MCP hype and the Claude Code/Cursor ecosystem.
- The Tutorial Angle: "Give Claude Code Internet Access in 30 Seconds—A Skyvern MCP Guide."
For Early Adopters
Pricing Analysis
| Tier | Price | Features | Is it enough? |
|---|---|---|---|
| Free | $0 | Evaluation, low concurrency, no anti-detect | Good for testing, not for production |
| Hobby | Credit-based | Real workflows, for individuals | Enough for personal projects |
| Pro | Credit-based | Parallel execution, for small teams | Production-ready |
| Enterprise | Custom | Unlimited, full features | Enterprise-ready |
| Self-Hosted | $0 | All core features, no anti-detect | Good for tech-savvy users |
Onboarding Guide
- Time to Value: 30 seconds (MCP mode), 5 minutes (Self-hosted).
- Learning Curve: Low.
- Steps (MCP Mode - Recommended):
  - Register at app.skyvern.com and get your API Key.
  - Run this command: `claude mcp add-json skyvern '{"type":"http","url":"https://mcp.skyvern.com/v1/mcp","headers":{"x-api-key":"YOUR_KEY"}}'`
  - Tell Claude Code: "Open Hacker News and give me the top 10 headlines." Done.
- Steps (Self-Hosted):
  - Run `pip install skyvern && skyvern quickstart`.
  - Choose Docker Compose deployment.
  - Configure your LLM API Key as prompted.
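If you would rather not paste the API key into the command by hand, a small helper can assemble the same `claude mcp add-json` command from an environment variable. This is a sketch under assumptions: the variable name `SKYVERN_API_KEY` and the function `build_mcp_command` are made up for illustration, while the JSON shape and URL are copied from the command in the MCP steps.

```python
import json
import os
import shlex


def build_mcp_command(
    api_key: str,
    name: str = "skyvern",
    url: str = "https://mcp.skyvern.com/v1/mcp",
) -> str:
    """Assemble the `claude mcp add-json` command with a shell-safe JSON payload."""
    config = {"type": "http", "url": url, "headers": {"x-api-key": api_key}}
    payload = json.dumps(config, separators=(",", ":"))  # compact, like the original
    return f"claude mcp add-json {shlex.quote(name)} {shlex.quote(payload)}"


# SKYVERN_API_KEY is an assumed variable name; the fallback is a placeholder only.
api_key = os.environ.get("SKYVERN_API_KEY", "YOUR_KEY")
print(build_mcp_command(api_key))
```

The printed command contains the real key, so run it directly rather than saving the output to a file or a shared log.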
Pitfalls and Complaints
- Python Versioning: Requires 3.11/3.12/3.13; older versions will fail. This is a common GitHub Issue.
- Auto-fill Handling: Handling inputs that behave like browser auto-fills is a known current weakness.
- Documentation: Community feedback suggests the docs aren't fully comprehensive yet; beginners might get stuck.
- Consumer Awareness: Only 1 review on Trustpilot, suggesting they haven't focused on consumer-level reputation building yet.
Security and Privacy
- Data Storage: Self-hosted data stays local; Cloud data is on Skyvern's servers.
- Authentication: Supports integrations with Bitwarden and HashiCorp Vault.
- MCP Security: Research shows 7-10% of open-source MCP servers have vulnerabilities. Use environment variables for credentials; never hard-code them.
- Audits: Open-source code is reviewable, but no third-party security audit reports are currently public.
Alternatives
| Alternative | Pros | Cons |
|---|---|---|
| Browser Use | Fully free/open, flexible | Requires Playwright scripting, no Cloud service |
| Perplexity Comet | Free, most polished consumer UX | Not open-source, not for enterprise custom use |
| n8n | Massive workflow ecosystem | Weak browser control capabilities |
| Browserbase | Great cloud browser infra | Still requires writing Selenium/Playwright code |
| Custom Playwright+LLM | Total control | High development and maintenance cost |
For Investors
Market Analysis
- RPA Market: $35.27B by 2026, $247.34B by 2035 (24.2% CAGR).
- Hyper-automation: $63.06B by 2025, $287.38B by 2035 (16.4% CAGR).
- AI+RPA Convergence: By 2026, 58% of enterprises will use RPA combined with AI/ML.
- Browser Agent Category: This is the "iPhone moment" for RPA, shifting from fragile scripts to AI-driven adaptive automation.
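The market projections above can be sanity-checked with a compound-growth calculation. The sketch below assumes annual compounding and takes the stated start and end years at face value (2026 to 2035 for RPA, 2025 to 2035 for hyper-automation); the function name is illustrative.

```python
def project(start_value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return start_value * (1 + cagr) ** years


# Figures in billions of USD, taken from the bullets above.
rpa_2035 = project(35.27, 0.242, 2035 - 2026)
hyper_2035 = project(63.06, 0.164, 2035 - 2025)

print(f"RPA 2035 projection: ${rpa_2035:.2f}B (cited: $247.34B)")
print(f"Hyper-automation 2035 projection: ${hyper_2035:.2f}B (cited: $287.38B)")
```

Both projections land within about 1% of the cited 2035 figures, so the quoted CAGRs are internally consistent with the start and end values.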
Competitive Landscape
| Tier | Players | Positioning |
|---|---|---|
| Giants (Traditional) | UiPath, Automation Anywhere | Enterprise-grade, $750+/month |
| Giants (New) | Microsoft Power Automate | Tied to the M365 ecosystem |
| Mid-tier (AI Native) | Skyvern, Browser Use | Open Source + AI Vision |
| Consumer | Perplexity Comet | Free browser agent |
| Infrastructure | Browserbase | Cloud browser infrastructure |
Timing Analysis
- Why Now?: The MCP protocol was launched by Anthropic in late 2024, with OpenAI and Google following in 2025. By 2026, it's the standard. Skyvern is perfectly positioned as an MCP-native tool.
- Tech Maturity: Multimodal LLMs (GPT-4V, Claude 3.5+) only became truly viable in 2024. Vision capabilities are only now stable enough for production browser automation.
- Market Readiness: Enterprise acceptance of AI automation is exploding in 2025-2026, moving the "Agentic Browser" from a geek toy to a productivity tool.
Team Background
- Suchintan Singh (CEO): Ex-Faire/Gopuff ML platform, helped drive $100M+ GMV.
- Shuchang Zheng: CMU grad, ex-Lyft (testing tools used by 1,000+ engineers).
- Kerem Yilmaz: Co-founder.
- Team Size: 6 people (extremely lean).
- Experience: Third-time founders with both failure experience and successful execution.
Funding Status
- YC S23 Alum.
- Total Funding: ~$2.7M - $3.43M (Seed round, Dec 2025).
- Investors: Y Combinator, Unpopular Ventures.
- Revenue: $900K as of June 2024 (with 6 people).
- Valuation: Undisclosed (Seed round, estimated $15M-$25M range).
Conclusion
Skyvern is the "browser arm" of the 2026 AI Agent ecosystem—it doesn't just exist as a standalone product; it gives every AI tool the ability to browse and act on the web.
| User Type | Recommendation |
|---|---|
| Developers | Highly Recommended. Open-source, 20k Stars, clear architecture. If you're building an Agent, MCP integration is a must. |
| Product Managers | Recommended. Study their MCP distribution strategy and their open-core/cloud-monetization model. |
| Bloggers | Recommended. Great mix of MCP hype, founder stories, and RPA disruption. |
| Early Adopters | Recommended. Free tier is great for testing, and 30-second setup is a low bar. Watch out for sparse docs. |
| Investors | Worth tracking. An AI disruptor in a $35B market with early PMF signals ($900K in revenue as of June 2024). Watch the growth trajectory post-Seed. |
2026-03-04 | Trend-Tracker v7.3