Kokori

Social audio apps

Transform text to speech with a powerful macOS app

💡 Kokori is a TTS app for macOS that transforms text to speech using a powerful local API server and desktop application. It features high-quality voices, precise speed control, and seamless menu bar integration. You can generate audio via the API for your own applications or use the desktop interface to create voice-overs for podcasts, TikTok, Instagram, and sound-bites.

"Kokori is like having a professional voice actor living inside your Mac's menu bar—no internet required and no hourly rate."

30-Second Verdict
What is it: Kokori is a free macOS app wrapping the open-source Kokoro-82M TTS model, offering local, unlimited text-to-speech conversion with menu bar access and a built-in API.
Worth attention: Yes, especially for those needing frequent TTS and wanting to avoid cloud-based costs.
Hype: 6/10 | Utility: 9/10 | Votes: 90

Kokori: The Ultimate Free Local TTS Tool for macOS

2026-01-30 | Official Website | ProductHunt


30-Second Quick Judgment

What is this?: Kokori wraps the open-source Kokoro-82M TTS model into a native macOS desktop app. It features one-click menu bar access and includes a local API server that developers can integrate directly into their workflows.

Is it worth your time?: If you frequently need TTS for podcasts, voiceovers, or audiobooks and are tired of ElevenLabs charging you by the character, this is a must-try. It's free, runs locally, and has no limits.

Comparison:

  • vs ElevenLabs: ElevenLabs costs ~$330 for 2M characters; Kokori is free and unlimited.
  • vs macOS Native Speech: Kokori sounds significantly better and offers 54 distinct voices.
  • vs Self-hosting Kokoro: Kokori requires zero configuration—just download and run.

Three Questions for Me

Is this for me?

Target Audience:

  1. Indie Developers: Need to integrate TTS features into their apps.
  2. Content Creators: Making podcasts or voiceovers for TikTok/Instagram.
  3. Audiobook Producers: Batch converting text to speech.
  4. Accessibility Users: Helping those with visual impairments read text.

Do you fit?: You are the target user if:

  • You spend dozens of dollars monthly on TTS APIs.
  • You want to add voiceovers to videos without recording them yourself.
  • You are developing an app that requires voice output.
  • You often need to turn long articles into audio for listening.

Common Scenarios:

  • Converting a blog post into a YouTube audio track.
  • Using the API interface for an app's voice prompts.
  • Generating a draft of a podcast script to hear how it flows.
  • Turning a long e-book into audio for your daily commute.

Is it useful?

Dimension | Benefit | Cost
Time | 5-minute setup, zero config | Generation speed is ~0.7x real-time
Money | Saves ElevenLabs fees ($330/2M chars) | Completely free
Effort | Runs locally; no quotas to manage | macOS only

ROI Assessment: If you're a Mac user needing TTS, there's no reason not to try it. Zero cost, zero risk, and 5 minutes to see results.

What's the "Wow" Factor?

The Highlights:

  • Truly Free: Unlimited generation, unlike other services that hit you with overage charges.
  • Menu Bar Access: Highlight text and convert it to speech instantly.
  • Built-in API: Developers can integrate it with just a few lines of code.

The "Aha!" Moment:

"A game-changer for converting e-book libraries into audiobooks." — Digital Publisher

User Feedback:

  • Positive: "Allowed us to generate clear and natural-sounding voiceovers in multiple languages, saving us both time and money." - Enterprise User
  • Positive: "It's just a 82M model but with amazing results." - GitHub User
  • Complaint: No voice cloning support; emotional expression is limited.


For Indie Developers

Tech Stack

  • Frontend: Native macOS app with menu bar integration.
  • Backend: Local REST API server (OpenAI-compatible format).
  • AI/Model: Kokoro-82M (82M parameters, StyleTTS 2 architecture).
  • G2P Library: misaki (phoneme conversion).
  • Infrastructure: Purely local, no cloud dependencies.

Core Implementation

Kokoro-82M is based on StyleTTS 2 and uses a decoder-only design with no diffusion stage and no separate encoder, which helps explain how it delivers large-model quality with only 82M parameters. It reached #1 on the TTS Arena leaderboard using a hybrid voice (a 50/50 mix of the Bella and Sarah voices).
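
One way to picture the hybrid voice is as a weighted average of two voice style vectors. The sketch below works under that assumption only: `blend_voices` and the toy four-value vectors are hypothetical illustrations, not code or data from Kokoro, whose real voice packs are far larger.

```python
from typing import List, Sequence


def blend_voices(a: Sequence[float], b: Sequence[float], weight_a: float = 0.5) -> List[float]:
    """Linearly interpolate two equal-length style vectors.

    weight_a=0.5 corresponds to a 50/50 mix of the two voices.
    """
    if len(a) != len(b):
        raise ValueError("voice vectors must have the same length")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")
    weight_b = 1.0 - weight_a
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Toy stand-ins for the two voices (hypothetical values, chosen for easy arithmetic).
bella = [0.25, -0.5, 1.0, 0.0]
sarah = [0.75, 0.0, 0.0, 0.5]

hybrid = blend_voices(bella, sarah)  # equal weighting of both voices
print(hybrid)
```

Changing `weight_a` shifts the blend toward one voice or the other; whether Kokori exposes such a control is not stated in the source.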

The local API server runs on localhost:8880. The interface is designed to be OpenAI-compatible, meaning you can switch your existing OpenAI TTS code to Kokori with almost zero changes.
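
As a rough illustration of what "OpenAI-compatible" could look like in practice, here is a minimal sketch using only the Python standard library. The endpoint path `/v1/audio/speech`, the model name `kokoro`, and the voice id `af_bella` are assumptions modeled on the OpenAI speech request format; the source only confirms the host and port, so check the app's own documentation before relying on them.

```python
import json
import urllib.request

# Assumptions (not confirmed by the source): the path mirrors OpenAI's
# speech endpoint, and the model and voice names are placeholders.
BASE_URL = "http://localhost:8880"
SPEECH_PATH = "/v1/audio/speech"


def build_speech_request(text: str, voice: str = "af_bella", speed: float = 1.0) -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI speech format."""
    payload = {
        "model": "kokoro",          # placeholder model name
        "input": text,
        "voice": voice,             # placeholder voice id
        "speed": speed,
        "response_format": "mp3",
    }
    return urllib.request.Request(
        BASE_URL + SPEECH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, out_path: str = "speech.mp3") -> str:
    """Send the request to the local server and save the returned audio bytes."""
    request = build_speech_request(text)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

With the app running, `synthesize("Hello from my Mac")` would write the audio to `speech.mp3`, provided the assumed path and field names match what the server accepts.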

Open Source Status

  • Model: Kokoro-82M is under the Apache 2.0 license (commercial use allowed).
  • App: The Kokori App itself is closed-source commercial packaging.
  • Similar Projects:

Build Difficulty: Medium (~1 person-week). Since the model and API are open-source, the main work lies in the macOS app packaging and menu bar integration.

Business Model

  • Monetization: Currently looks like a free lead magnet; paid features may be added later.
  • Pricing: Completely free, unlimited use.
  • Market Reference: Kokoro API market price is roughly $1 per million characters.
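
To put the quoted prices on the same scale, the short sketch below converts them to a per-million-character rate and estimates a monthly bill. The 500,000-character monthly volume is a made-up example, and the function names are illustrative.

```python
def cost_per_million(total_usd: float, total_chars: int) -> float:
    """Convert a bulk price into a rate per one million characters."""
    return total_usd * 1_000_000 / total_chars


def monthly_cost(chars_per_month: int, rate_per_million: float) -> float:
    """Estimate the monthly bill for a given character volume."""
    return chars_per_month * rate_per_million / 1_000_000


# Figures quoted in the text.
elevenlabs_rate = cost_per_million(330, 2_000_000)  # $330 for 2M characters
hosted_kokoro_rate = 1.0                            # roughly $1 per million characters

# Hypothetical workload: 500,000 characters per month.
volume = 500_000
print(f"ElevenLabs:    ${monthly_cost(volume, elevenlabs_rate):.2f}")
print(f"Hosted Kokoro: ${monthly_cost(volume, hosted_kokoro_rate):.2f}")
print("Kokori (local): $0.00")
```

On these figures the cloud price works out to $165 per million characters, against roughly $1 for a hosted Kokoro API and nothing for the local app.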

Big Tech Risks

Low. TTS is a mature market. While giants (Google, Amazon, Microsoft) have cloud services, local free TTS is a differentiated play. Apple might improve its native TTS in macOS/iOS, but it's unlikely to match this audio quality in the short term.


For Product Managers

Pain Point Analysis

What problem does it solve?:

  1. Cloud TTS is expensive (ElevenLabs $330/2M chars).
  2. High costs during dev/testing when tweaking APIs.
  3. Privacy-sensitive use cases that require local processing.

Severity: High frequency + Essential need. Creators and developers use this daily, often processing massive amounts of text.

User Persona

  • Primary Users: Indie developers, small startup teams.
  • Scenarios: Prototyping products, bulk content production.
  • Willingness to Pay: Willing to pay to save time, but strongly dislike per-character billing.

Feature Breakdown

Feature | Type | Description
Local TTS Generation | Core | Based on Kokoro-82M, 54 voices
REST API Server | Core | OpenAI-compatible for easy integration
Menu Bar Quick Actions | Core | Native macOS experience
Speed/Pitch Control | Core | Customize output effects
Local File Storage | Bonus | Automatically saves generated audio

Competitive Differentiation

Criterion | Kokori | ElevenLabs | macOS Native
Price | Free | $5-330/mo | Free
Quality | High (TTS Arena #1) | Highest | Average
Runs Locally | Yes | No | Yes
API Support | Yes | Yes | No
Voice Cloning | No | Yes | No

Key Takeaways

  1. "Download and Use" Philosophy: Zero-config is a killer feature compared to other TTS tools that require Python, dependencies, and environment setup.
  2. Menu Bar Entry: Matches macOS user habits and significantly lowers the friction of use.
  3. OpenAI-Compatible API: Zero migration cost is a brilliant design choice.

For Tech Bloggers

Founder Story

The Kokori App developer is anonymous. However, the underlying Kokoro-82M model was developed by hexgrad and trained by @rzvzn. The name "Kokoro" comes from Japanese, meaning "heart" or "soul."

Interestingly, both Kokoro and its G2P library, misaki, are named after characters from the Terminator series.

Timeline:

  • Dec 25, 2024: Kokoro v0.19 released (as a Christmas gift).
  • Jan 2, 2025: 10 voice packs + ONNX version released.
  • Jan 30, 2026: Kokori App launches on ProductHunt.

Controversy / Discussion Angles

  1. Small Model vs. Giants: How an 82M-parameter model beat the 467M-parameter XTTS v2 and the 1.2B-parameter MetaVoice.
  2. Open Source vs. Commercial Wrappers: Is it fair to charge for (or commercially package) an Apache-licensed open-source model?
  3. The Local AI Renaissance: Privacy and cost are driving a return to local models.

Hype Data

  • PH Ranking: 90 votes (moderate heat).
  • Base Model Popularity: Ranked #1 on TTS Arena (Open Source).
  • GitHub Activity: The main Kokoro repository is seeing continuous updates.

Content Suggestions

  • Angles: "The Free ElevenLabs Alternative," "The Power of Small AI Models."
  • Trending Topics: AI infrastructure costs, data privacy.

For Early Adopters

Pricing Analysis

Tier | Price | Features | Is it enough?
Full Version | Free | All features | Absolutely

There are no paid tiers; it is completely free for unlimited use.

Getting Started

Setup Time: 5 minutes
Learning Curve: Low

Steps:

  1. Visit kokori.app and download the DMG.
  2. Drag it to your Applications folder.
  3. Open it; an icon will appear in your menu bar.
  4. Enter text, choose a voice, and click generate.
  5. Developers: Access the API at localhost:8880.
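
Before wiring the API into a project, it can help to confirm that something is actually listening on the port. The following is a minimal sketch using a plain TCP connection attempt; it only checks that the port accepts connections, not that the process behind it is Kokori. The function name is illustrative.

```python
import socket


def is_server_listening(host: str = "localhost", port: int = 8880, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    if is_server_listening():
        print("Something is listening on localhost:8880")
    else:
        print("Nothing is listening on localhost:8880; is the app running?")
```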

Pitfalls and Complaints

  1. macOS Only: Windows/Linux users must self-deploy the open-source version.
  2. No Voice Cloning: Trained on less than 100 hours of data; cannot learn new voices.
  3. Limited Emotion: Laughter, anger, and sadness effects are mediocre.
  4. English-Centric: Supports 8 languages, but English quality is noticeably superior.

Security and Privacy

  • Data Storage: Entirely local; nothing is uploaded.
  • Privacy Policy: Zero data collection (runs offline).
  • Audit: No concerns; data never leaves the device.

Alternatives

Alternative | Advantage | Disadvantage
ElevenLabs | Best quality, supports cloning | Expensive, per-character billing
Self-hosted Kokoro | Total control | Requires technical expertise
Fish Audio | Affordable ($9.99/mo) | Cloud-dependent
macOS Native TTS | System integrated | Average audio quality

For Investors

Market Analysis

  • Market Size: $4B (2024) → $7.6B (2029), CAGR 13.7%.
  • Long-term Forecast: $34.5B (2035), CAGR 23.3%.
  • Growth Drivers: AI content production, accessibility needs, and multilingual globalization.
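
The near-term figures can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. A small sketch, with illustrative function names:

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Grow a value at a fixed annual rate for a number of years."""
    return start_value * (1 + rate) ** years


# Market size in billions of USD, as quoted in the text (2024 to 2029).
near_term = cagr(4.0, 7.6, 5)
print(f"Implied CAGR 2024-2029: {near_term:.1%}")
print(f"$4B grown at 13.7% for 5 years: ${project(4.0, 0.137, 5):.2f}B")
```

The 2024 to 2029 numbers are consistent with the stated 13.7%. The text does not give the base year or base value behind the 23.3% long-term rate, so that figure cannot be checked the same way.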

Competitive Landscape

Tier | Players | Positioning
Top Tier | Amazon Polly, Google TTS, Microsoft Azure | Cloud services, usage-based billing
Mid Tier | ElevenLabs, Play.ht, Murf.ai | High quality, subscription-based
New Entrants | Kokori, Fish Audio | Low cost / Localized solutions

Timing Analysis

Why now?:

  1. Model Efficiency Breakthrough: 82M parameters can now outperform 1B+ models, making local execution viable.
  2. Cost Consciousness: SaaS fatigue is real; users are pushing back against usage-based billing.
  3. Privacy Regulations: Local processing is becoming a hard requirement for many industries.

Tech Maturity: High; the Kokoro model is already proven on TTS Arena.
Market Readiness: High; there is clear demand for free, local alternatives.

Team Background

  • Kokori App: Developer anonymous.
  • Kokoro Model: Developed by hexgrad and trained by @rzvzn.

Funding Status

  • Funding: Undisclosed (likely an indie project).
  • Exit Path: Freemium conversion or acquisition by a larger creative suite.

Conclusion

Kokori is the best local TTS choice for macOS users—free, powerful, and zero-configuration.

User Type | Recommendation
Developers | Highly Recommended: Free API, OpenAI-compatible, low integration cost.
Product Managers | Recommended: Local AI is the trend; the business model is worth studying.
Bloggers | Great Content: The "small model beats big model" story generates traffic.
Early Adopters | Highly Recommended: Free, no-risk, 5-minute setup.
Investors | Watch: Large market, but the monetization path is currently unclear.

Resource Links

Resource | Link
Official Website | https://kokori.app/
ProductHunt | https://www.producthunt.com/products/kokori
Kokoro Model | https://huggingface.co/hexgrad/Kokoro-82M
GitHub (Model) | https://github.com/hexgrad/kokoro
iOS/macOS Open Source | https://github.com/mlalma/kokoro-ios

2026-01-31 | Trend-Tracker v7.3

FAQ

Frequently Asked Questions about Kokori

What is Kokori?
Kokori is a free macOS app wrapping the open-source Kokoro-82M TTS model, offering local, unlimited text-to-speech conversion with menu bar access and a built-in API.

What are the main features of Kokori?
The main features of Kokori include Local TTS Generation (Kokoro-82M, 54 voices) and a REST API Server (OpenAI-compatible).

How much does Kokori cost?
Completely free for unlimited use.

Who is Kokori for?
Indie developers, content creators, audiobook producers, and accessibility users needing TTS.

What are the alternatives to Kokori?
Alternatives to Kokori include ElevenLabs, macOS Native, Amazon Polly, Google TTS, and Microsoft Azure.

Data source: ProductHunt | Feb 2, 2026
Last updated: