Vois: The Local AI Voice Workstation Trying to Kill ElevenLabs' Per-Character Billing
2026-03-06 | ProductHunt | Official Site
30-Second Quick Judgment
What is this?: A desktop TTS application that packs voice synthesis, voice cloning, multi-track editing, and mastering into a single app. It runs 100% locally with no internet required.
Is it worth your attention?: For most people, not yet. It has only 6 votes on ProductHunt and almost zero user discussion, making its quality hard to verify. However, its goal—replacing cloud-based per-character billing with local TTS—is a real trend for 2026. If you're stressed about your ElevenLabs bill, keep an eye on it, but don't rush in.
Three Key Questions
Is it for me?
- Target Users: Podcasters, audiobook authors, content creators—anyone who needs high-volume voice synthesis but hates per-character fees and cloud privacy risks.
- Am I the target?: If you spend over $30/month on ElevenLabs, or if your content involves sensitive info (legal, medical, corporate training), you are the target.
- When would I use it?:
  - Making a podcast without using your own voice → Use the 63 AI voices + multi-speaker editor.
  - Turning ebooks/PDFs into audiobooks → Import EPUB/PDF and generate directly.
  - Corporate training voiceovers → Runs locally, so scripts are never uploaded.
- When NOT to use: Occasional short video voiceovers (free open-source tools are enough).
Is it useful?
| Dimension | Benefit | Cost |
|---|---|---|
| Time | One app for TTS + Editing + Mastering; no jumping between tools | Learning curve for a new tool; UI quality unknown |
| Money | $9/mo unlimited vs. ElevenLabs per-character (heavy users save $50+/mo) | $9/mo subscription; quality might not match the price |
| Effort | No worrying about character limits or uploading/downloading | Requires decent hardware (Apple Silicon is best) |
ROI Judgment: If you are a heavy ElevenLabs user paying $22+/month, switching to Vois's $9/month plan would in theory save you at least $13/month, and more at higher tiers. But since its quality is unverified, try the free tier first.
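The ROI arithmetic can be sketched in a few lines. The $9/month Vois price and the $22/month ElevenLabs figure come from this article; the other cloud bills in the loop are hypothetical values inside the $5-$99+ range quoted here, not an official plan list.

```python
VOIS_MONTHLY_USD = 9.0  # Vois paid tier as quoted in this article (billed annually)


def monthly_saving(cloud_monthly_usd: float,
                   vois_monthly_usd: float = VOIS_MONTHLY_USD) -> float:
    """Monthly saving from switching; a negative result means Vois costs more."""
    return cloud_monthly_usd - vois_monthly_usd


def annual_saving(cloud_monthly_usd: float,
                  vois_monthly_usd: float = VOIS_MONTHLY_USD) -> float:
    """Same comparison over twelve months."""
    return 12 * monthly_saving(cloud_monthly_usd, vois_monthly_usd)


if __name__ == "__main__":
    # Hypothetical monthly cloud bills, chosen only to illustrate the range.
    for cloud_bill in (5.0, 22.0, 59.0, 99.0):
        print(f"${cloud_bill:>5.2f}/mo cloud bill -> "
              f"${monthly_saving(cloud_bill):+.2f}/mo, "
              f"${annual_saving(cloud_bill):+.2f}/yr")
```

The break-even point is simply a $9/month cloud bill: below it, the paid Vois tier costs more than the cloud plan it would replace.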
Is it enjoyable?
The "Aha!" Moment:
- Unlimited generation without the guilt: You don't have to worry about burning character credits every time you preview an edit.
- All-in-one workflow: Go from text to release-ready audio without leaving the app.
The "Wow" Factor: None yet from real users. It's too new; even on Twitter, it's mostly just the founder posting.
What the Founder says:
"Cloud voice AI charges you per character. Every edit, every preview, every revision costs money. And your scripts live on someone else's servers. I spent the last year building the alternative." — @praneybehl
For Indie Developers
Tech Stack
- Core Language: Rust (high performance, memory safe; a claimed 6x real-time speed on Apple Silicon)
- TTS Engines: Integrates 3 engines (specifics not disclosed, likely Kokoro/Piper or similar open-source models)
- Platforms: Desktop app for macOS/Windows
- Import: PDF, EPUB, DOCX, Web articles
- Export: WAV/MP3/FLAC with presets for Spotify/YouTube/Apple Podcasts/ACX
- Audio Processing: LUFS normalization, de-esser, EQ, limiter (Pro mastering)
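As a rough illustration of what LUFS normalization toward an export preset involves, the sketch below computes the gain that moves a track from a measured integrated loudness to a target. The targets are commonly cited platform values and are assumptions here; Vois's actual presets and internal processing are not documented. Measuring loudness itself (ITU-R BS.1770) is out of scope, so the measured value is taken as an input.

```python
# Commonly cited integrated-loudness targets. These are assumptions for
# illustration, not values taken from Vois.
ASSUMED_TARGETS_LUFS = {
    "spotify": -14.0,
    "youtube": -14.0,
    "apple_podcasts": -16.0,
}


def gain_db(measured_lufs: float, target_lufs: float) -> float:
    """Gain in dB needed to move the measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def normalize(samples: list[float], measured_lufs: float, preset: str) -> list[float]:
    """Scale every sample by the single gain that reaches the preset's target."""
    factor = db_to_linear(gain_db(measured_lufs, ASSUMED_TARGETS_LUFS[preset]))
    return [s * factor for s in samples]


if __name__ == "__main__":
    # A quiet narration track measured at -20 LUFS, exported for Spotify.
    print(gain_db(-20.0, ASSUMED_TARGETS_LUFS["spotify"]))  # gain in dB
    print(normalize([0.1, -0.25, 0.5], -20.0, "spotify"))
```

A positive gain can push peaks past full scale, which is why a limiter usually follows normalization in a mastering chain.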
Core Implementation
Vois's technical path appears to be wrapping multiple open-source TTS models in a native Rust desktop app, then layering audio editing and mastering on top. Running the inference layer in Rust is meant to deliver performance (a claimed 6x real-time on Apple Silicon) while avoiding Python dependency hell. It's essentially "Open Source Models + Commercial UI + Pro Audio Post-production."
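To make "6x real-time" concrete: at a real-time factor of 6, one minute of compute yields six minutes of audio. The sketch below estimates synthesis time from that ratio. The factor of 6 is the vendor's Apple Silicon claim, and the 10-hour audiobook is a hypothetical workload.

```python
def synthesis_minutes(audio_minutes: float, realtime_factor: float) -> float:
    """Estimated compute time to synthesize `audio_minutes` of speech.

    `realtime_factor` is audio duration divided by synthesis time,
    so 6.0 means the engine runs six times faster than playback.
    """
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_minutes / realtime_factor


if __name__ == "__main__":
    audiobook_minutes = 10 * 60  # hypothetical 10-hour audiobook
    print(synthesis_minutes(audiobook_minutes, 6.0))  # minutes of compute
```

On slower hardware the same formula applies with a smaller factor; a factor below 1.0 means synthesis takes longer than playback.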
Open Source Status
- Is it open source?: No, it's a closed-source commercial product.
- GitHub: No public repository.
- Closest Open Source Alternative: Voicebox (MIT, Tauri+Rust, Qwen3-TTS driven, very similar features).
- Build Difficulty: Medium-High. TTS inference alone isn't hard (models are available), but building a smooth desktop app with an editor, mastering, cloning, and multi-engine management takes effort. Estimated 2-3 person-months.
Business Model
- Monetization: Subscription
- Pricing: Free tier (all voices, all engines, no credit card) + $9/month (billed annually)
- Differentiation: No character limits, local execution, replaces TTS service + audio editor + mastering plugins.
Risk from Tech Giants
Medium. Apple is investing heavily in on-device synthesis, and Google and Microsoft have powerful APIs. However, the giants sell cloud-based pay-as-you-go services and are unlikely to ship an "unlimited local" desktop tool soon, because it would cannibalize that business. The real threat is the open-source community: projects like Voicebox, Kokoro, and Chatterbox already offer Vois's core features for free.
For Product Managers
Pain Point Analysis
- What it solves: The three pains of cloud TTS—per-character costs (unpredictable), script privacy (third-party uploads), and usage caps (creative restriction).
- How painful?: A real but moderate-frequency pain. ElevenLabs' $100M ARR proves the demand, but not everyone is sensitive to per-character costs. Heavy users (podcasts, audiobooks) feel it most.
User Persona
- Podcasters: Need multiple voices and bulk generation for episodes.
- Audiobook Authors: Long-form text; per-character billing is too expensive.
- Corporate Training: Data cannot be uploaded to third-party servers.
- Privacy-Sensitive Users: Medical, legal, and government sectors.
Feature Breakdown
| Feature | Type | Description |
|---|---|---|
| Local TTS Generation | Core | 63 voices, 23 languages, 3 engines |
| Voice Cloning | Core | Requires only 5-60 second samples |
| Multi-Speaker Editor | Core | Assign different roles to dialogue |
| Pro Mastering | Differentiator | LUFS/de-esser/EQ/limiter |
| Multi-track Timeline | Differentiator | DAW-level editing capabilities |
| Content Import | Nice-to-have | PDF/EPUB/DOCX/Web |
| Export Presets | Nice-to-have | Spotify/YouTube/Podcasts/ACX |
Competitor Comparison
| vs | Vois | ElevenLabs | Voicebox (Open Source) |
|---|---|---|---|
| Execution | Local | Cloud | Local |
| Pricing | $9/mo Unlimited | Per character, $5-$99+/mo | Free |
| Voice Cloning | 5-60s samples | Cloud upload | 3s samples |
| Mastering | Built-in Pro | None | None |
| Open Source | No | No | MIT |
| Barrier to Entry | Download & Use | Register & Use | Download & Use |
Key Takeaways
- "Unlimited" Pricing Psychology: $9/mo unlimited vs. per-character billing removes the anxiety of "spending money with every preview."
- Workflow Integration: TTS + Editing + Mastering in one place reduces tool switching.
- Export Presets: Targeting Spotify/YouTube/ACX standards saves users from looking up technical parameters.
For Tech Bloggers
Founder Story
- Founder: Praney Behl (@praneybehl)
- Background: 20 years of software engineering experience, turned solopreneur in 2025. Tech stack includes TypeScript/React/Web3/GCP/AWS. Previously built WorkflowOS, Togglez, and RecastUI.
- Motivation: He stated on Twitter, "I spent the last year building the alternative"—driven by dissatisfaction with cloud TTS costs and privacy.
- Sources: Twitter @praneybehl, LinkedIn, GitHub
Controversy / Discussion Angles
- Angle 1: Local vs. Cloud. In 2026, open-source TTS quality is nearing commercial levels (Chatterbox was preferred in 63.8% of blind-test comparisons against ElevenLabs). Is local execution the inevitable future?
- Angle 2: $9/mo vs. Free Open Source. With Voicebox (MIT licensed) offering similar features, will users pay $9/mo for the convenience of a polished UI?
- Angle 3: Indie Dev Courage vs. Reality. A product built over a year by one person got only 6 votes on ProductHunt. Does a failed launch mean a failed product?
Hype Data
- PH Ranking: 6 votes, virtually no hype.
- Twitter Discussion: Only 7 tweets from the founder, very low interaction (max 400 views).
- Search Trends: The brand name "Vois" conflicts with many products (getvois.com plugin, vois.fm podcasting, Vodafone VOIS), making SEO extremely difficult.
Content Suggestions
- Best Approach: Don't write about Vois in isolation. Include it in a "2026 Local TTS Showdown" alongside Voicebox, Kokoro, and Chatterbox.
- Trend Opportunity: Local TTS is a hot topic; Voicebox gained 911 likes and 58K views on Twitter recently.
For Early Adopters
Pricing Analysis
| Tier | Price | Features | Is it enough? |
|---|---|---|---|
| Free | $0 | All voices, all engines, no credit card | Enough for light use |
| Paid | $9/mo (Annual) | Unlimited generation + all pro features | Worth it for heavy use |
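Comparison: ElevenLabs' free tier is only ~10K characters/month. Vois's free tier includes all voices and engines with no credit card, which is a real advantage if it carries no character cap; the listing does not state its limits, and the table above attributes unlimited generation to the paid tier.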
Quick Start Guide
- Setup Time: 10-15 minutes
- Learning Curve: Medium (multi-track editor takes some time to master)
- Steps:
  - Visit vois.so to download the desktop app.
  - Select a voice and language.
  - Input text or import a document.
  - Generate → Edit → Master → Export.
Pitfalls & Complaints
- No Real User Feedback: The product just launched; stability and quality are unverified.
- Brand Confusion: Searching for "Vois" brings up unrelated products like Vodafone VOIS.
- Hardware Requirements: Claims 6x real-time on Apple Silicon, but performance on Intel Macs and Windows is unmentioned.
Security & Privacy
- Data Storage: 100% local, no uploads.
- Privacy Policy: The core selling point is "Nothing leaves your machine."
- Real Advantage: Helpful in GDPR/HIPAA-sensitive scenarios, because scripts and audio stay on the machine.
Alternatives
| Alternative | Advantage | Disadvantage |
|---|---|---|
| Voicebox | Free MIT Open Source, Tauri+Rust | Only one engine (Qwen3-TTS) |
| Kokoro-82M | Free, ultra-lightweight, runs on CPU | CLI only, no GUI editor |
| Chatterbox | Free MIT, beats ElevenLabs in tests | Requires Python, no integrated editor |
| ElevenLabs | Highest quality, most mature | Per-character fees, cloud-based |
For Investors
Market Analysis
- Market Size: Global TTS market approx. $5.3B in 2026.
- Growth: Estimates put CAGR anywhere from 10% to 23%; the market is expected to hit $7.6B by 2029.
- Drivers: AI/NLP progress, accessibility needs, e-Learning boom, and privacy-driven local deployment.
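As a sanity check on the market figures above, the sketch below derives the compound annual growth rate implied by moving from $5.3B to $7.6B over three years. It only re-computes the article's own numbers; the function name is illustrative.

```python
def implied_cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate implied by two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # $5.3B (2026) -> $7.6B (2029), both taken from the list above.
    rate = implied_cagr(5.3, 7.6, 3)
    print(f"Implied CAGR: {rate:.1%}")
```

The result comes out at roughly 12.8% per year, which sits near the low end of the quoted 10-23% range.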
Competitive Landscape
| Tier | Players | Positioning |
|---|---|---|
| Leaders | ElevenLabs ($100M ARR), Google TTS, Amazon Polly | Cloud API/Platform |
| Open Source | Voicebox, Kokoro, Chatterbox, Coqui | Free local solutions |
| New Entrants | Vois ($9/mo) | Paid local desktop tools |
Timing Analysis
- Why now?: Apple Silicon makes local inference viable; open-source TTS quality is nearing commercial grade; GDPR/HIPAA are driving local demand.
- Tech Maturity: TTS models are good enough (Kokoro-82M reportedly achieves MOS 4.4+). The bottleneck is now product experience, not the model.
- Market Readiness: User education is complete (thanks to ElevenLabs), but the "local-first" mindset is still forming.
Team & Funding
- Founder: Praney Behl, 20 years of engineering experience.
- Team: Likely a 1-person operation.
- Funding: No funding found; likely bootstrapped.
- Verdict: An indie project not currently seeking VC funding.
Conclusion
Vois has the right direction but awkward timing. Local TTS replacing cloud billing is a 2026 reality, but open-source projects like Voicebox already offer similar features for free. Vois's $9/month price point sits awkwardly between "Free Open Source" and "Industry Standard ElevenLabs." The weak ProductHunt launch suggests it hasn't found product-market fit (PMF) yet.
| User Type | Recommendation |
|---|---|
| Developers | ❌ Stick to open-source (Voicebox/Kokoro) for more flexibility and zero cost. |
| Product Managers | ✅ Study the "full-workflow + unlimited pricing" strategy, but don't copy the product. |
| Bloggers | ❌ Not worth a standalone post; include it in a local TTS comparison. |
| Early Adopters | ⚠️ Try the free tier, but don't rely on it—it's too new and lacks a community. |
| Investors | ❌ High risk: 1-person team, no funding, surrounded by open-source rivals. |
Resource Links
| Resource | Link |
|---|---|
| Official Site | https://vois.so/ |
| ProductHunt | https://www.producthunt.com/products/vois |
| Founder Twitter | https://twitter.com/praneybehl |
| Founder GitHub | https://github.com/praneybehl |
| Founder LinkedIn | https://www.linkedin.com/in/praney-behl-b9129313/ |
| Competitor: Voicebox | https://voicebox.sh/ |
| Competitor: Kokoro | https://github.com/hexgrad/kokoro |
2026-03-06 | Trend-Tracker v7.3