Text to Speech
Convert text to natural-sounding speech in dozens of languages using browser speech synthesis. Adjust rate, pitch, and volume, preview voices, and download MP3 for videos, accessibility, and apps. Free and private: all processing happens in your browser.
This tool is coming soon. Check back later!
Text to speech has gone from robotic and monotone to genuinely useful in the last few years. Every modern browser ships with the Web Speech API, which gives you access to the same voices your operating system uses for accessibility — including the high-quality neural voices Apple, Google, and Microsoft have shipped. This generator surfaces that system directly so you can pick a voice, tune rate and pitch, and hear your text spoken aloud with one click.
The tool lists every voice available on your device, grouped by language. Install a new voice in your OS settings and it appears here too. That means high-quality English voices like Samantha (macOS) or Eva (Windows) are available alongside dozens of other languages — Spanish, French, Mandarin, Japanese, Arabic, and many more — without any cloud service, API key, or upload. Pick a voice, paste text, click play. You can adjust speaking rate (0.1x to 10x), pitch, and volume before or during playback.
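The grouping described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: `VoiceInfo` and `groupVoicesByLanguage` are hypothetical names, and the only assumption about the browser is that each voice has a `name` and a `lang` tag, as `speechSynthesis.getVoices()` entries do.

```typescript
// Minimal shape of the fields used here (assumed subset of a browser voice object).
interface VoiceInfo {
  name: string;
  lang: string; // language tag such as "en-US"
}

// Group voices by primary language subtag, so "en-US" and "en-GB" both land under "en".
function groupVoicesByLanguage(voices: VoiceInfo[]): Record<string, VoiceInfo[]> {
  const groups: Record<string, VoiceInfo[]> = {};
  for (const voice of voices) {
    // Split on "-" and "_" because some platforms may report tags like "en_US".
    const primary = voice.lang.split(/[-_]/)[0].toLowerCase();
    if (!groups[primary]) groups[primary] = [];
    groups[primary].push(voice);
  }
  return groups;
}

// In a browser this would typically be called as:
//   groupVoicesByLanguage(speechSynthesis.getVoices())
```

Grouping on the primary subtag keeps regional variants together; a finer split by full tag is an equally reasonable design.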
For workflow integration, the tool lets you export the audio as a file. Recording uses the MediaRecorder API to capture the speech output and produce a downloadable MP3 or WebM audio file — perfect for adding voice narration to video tutorials, building accessible web apps, prototyping audiobook workflows, or giving your eyes a rest while someone reads you long articles. All of this runs locally in your browser with no rate limits, no sign-up, and no text ever leaving your device.
Text to Speech — key features
System voices
Access every TTS voice installed on your operating system, grouped by language.
Rate, pitch, volume
Fine control over how the speech sounds without re-recording.
Live playback
Play, pause, and stop with minimal delay once the voice engine has loaded.
Audio download
Record the synthesized speech as MP3 or WebM for video narration and other projects.
Language grouping
Voices organized by language so finding a Spanish or Japanese speaker is quick.
Highlight current word
Shows which word is being spoken in real time, useful for reading along.
Privacy-first
All synthesis happens locally in your browser. Your text never leaves the device.
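The "Highlight current word" feature above can be approximated with an utterance's boundary events, which report a character offset into the spoken text. The sketch below is an assumption about how such highlighting might work, not this tool's implementation: `wordRangeAt` and `attachHighlighter` are hypothetical helpers, and some voices may not emit boundary events at all.

```typescript
// Return the [start, end) character range of the word containing charIndex.
function wordRangeAt(text: string, charIndex: number): [number, number] {
  let start = Math.max(0, Math.min(charIndex, text.length));
  // Walk left in case the offset lands in the middle of a word.
  while (start > 0 && !/\s/.test(text.charAt(start - 1))) start--;
  let end = start;
  // Walk right to the next whitespace or the end of the text.
  while (end < text.length && !/\s/.test(text.charAt(end))) end++;
  return [start, end];
}

// Wire the helper to an utterance-like object. `utterance` is typed as `any`
// so the sketch compiles without browser type definitions.
function attachHighlighter(
  utterance: any,
  text: string,
  onWord: (range: [number, number]) => void
): void {
  utterance.onboundary = (event: any) => {
    // Boundary events are named "word" or "sentence"; only words are highlighted.
    if (event.name === "word") onWord(wordRangeAt(text, event.charIndex));
  };
}
```

The `onWord` callback would then apply whatever styling the page uses for the active word.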
How to use Text to Speech
1. Paste your text
Drop the text you want spoken. Short passages and long articles both work.
2. Choose a voice
Browse voices by language. High-quality neural voices are marked when available.
3. Adjust rate and pitch
Slower rates help for learning; higher pitch can feel more animated.
4. Play
Click play and the text is spoken through your speakers or headphones.
5. Download (optional)
Record the output and download as an audio file for video narration or other use.
Common use cases for Text to Speech
Accessibility
Content creation
Learning
Development
Text to Speech — examples
English article (natural voice at default rate)
Input: paste article
Output: spoken audio with high-quality voice
Spanish learning (slow rate for practice)
Input: Spanish text, rate 0.7
Output: clear slow Spanish
Video narration (recorded for post)
Input: narration script + record
Output: MP3 ready for video editor
Proofreading (hear drafts aloud)
Input: blog post draft
Output: audio playback reveals awkward phrasing
Multilingual (switch voices mid-session)
Input: English then Japanese
Output: correct accent per passage
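The Spanish and multilingual examples above depend on choosing a voice whose language tag matches the passage. One way that selection might look is sketched below. `VoiceChoice` and `pickVoiceForLang` are hypothetical names, and the matching rule (exact tag first, then same language with any region) is an assumption rather than the tool's documented behavior.

```typescript
interface VoiceChoice {
  name: string;
  lang: string; // language tag such as "ja-JP"
}

// Prefer an exact tag match; otherwise fall back to the first voice
// that shares the primary language (for example "es-MX" for "es-ES").
function pickVoiceForLang(voices: VoiceChoice[], wanted: string): VoiceChoice | undefined {
  const normalize = (tag: string) => tag.replace(/_/g, "-").toLowerCase();
  const target = normalize(wanted);
  const primary = target.split("-")[0];
  let fallback: VoiceChoice | undefined;
  for (const voice of voices) {
    const tag = normalize(voice.lang);
    if (tag === target) return voice;
    if (!fallback && tag.split("-")[0] === primary) fallback = voice;
  }
  return fallback;
}
```

Returning `undefined` when nothing matches lets the caller keep the browser's default voice instead of guessing.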
Technical details
The Web Speech API's SpeechSynthesis interface provides text-to-speech functionality in browsers. It exposes voices from the operating system and any installed speech engines. You create a SpeechSynthesisUtterance with text, set properties like voice, rate (0.1-10), pitch (0-2), and volume (0-1), then call speechSynthesis.speak().
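The sequence described above (create an utterance, set its properties, call `speechSynthesis.speak()`) can be sketched like this. The helper names `normalizeOptions` and `speak` are illustrative assumptions, and the clamping simply enforces the ranges quoted in the paragraph. Browser globals are looked up through `globalThis` so the sketch also loads outside a browser, where `speak` returns `false`.

```typescript
interface SpeechOptions {
  rate?: number;
  pitch?: number;
  volume?: number;
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Apply the ranges quoted above: rate 0.1-10, pitch 0-2, volume 0-1.
// Missing values fall back to 1, the default for all three properties.
function normalizeOptions(opts: SpeechOptions): Required<SpeechOptions> {
  return {
    rate: clamp(opts.rate === undefined ? 1 : opts.rate, 0.1, 10),
    pitch: clamp(opts.pitch === undefined ? 1 : opts.pitch, 0, 2),
    volume: clamp(opts.volume === undefined ? 1 : opts.volume, 0, 1),
  };
}

// Speak the text if the Web Speech API is present; return whether it was queued.
function speak(text: string, opts: SpeechOptions = {}): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  const utterance = new g.SpeechSynthesisUtterance(text);
  const settings = normalizeOptions(opts);
  utterance.rate = settings.rate;
  utterance.pitch = settings.pitch;
  utterance.volume = settings.volume;
  g.speechSynthesis.speak(utterance);
  return true;
}
```

In a browser, `speak("Hello there", { rate: 0.9 })` would queue the utterance with a slightly slower rate.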
Voice availability differs by platform:
- macOS and iOS: ships dozens of high-quality Apple voices; Siri voices can be downloaded
- Windows: ships standard Microsoft voices; Edge Read Aloud uses Azure Cognitive Services neural voices online
- Android: uses Google TTS engine with multiple languages
- Chromebook and ChromeOS: Google TTS voices
- Linux: uses eSpeak or installed voice packages
High-quality voices (such as Samantha on macOS or Eva on Windows 10+) produce markedly more natural speech than older formant-synthesis voices. When neural or enhanced variants are available, they appear with a "network" or "high quality" label. Some platforms require downloading enhanced voices from system preferences; this tool surfaces whatever your system has installed.
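A "network" label like the one mentioned above can be derived from the `localService` flag that the Web Speech API exposes on each voice. The sketch below shows one way such a label might be computed; how this tool actually decides is not documented here, and `VoiceOrigin` and `networkVoiceNames` are hypothetical names.

```typescript
interface VoiceOrigin {
  name: string;
  localService: boolean; // true when the voice is synthesized on the device
}

// Collect the names of voices that rely on a remote speech service.
function networkVoiceNames(voices: VoiceOrigin[]): string[] {
  const names: string[] = [];
  for (const voice of voices) {
    if (!voice.localService) names.push(voice.name);
  }
  return names;
}
```

The same flag is a reasonable basis for warning users that text spoken with those voices may be sent to the vendor.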
For recording output, the tool uses MediaRecorder to capture the audio produced by SpeechSynthesis, routed through a silent audio element. This approach works in Chrome and Firefox but has inconsistent support in Safari. Exported files are WebM or MP3 depending on browser support, and are suitable for video editors, podcasts, and audio post-production.
Rate and pitch adjustments happen at synthesis time, so they are baked into the recording.
Language detection on the input text can auto-select a matching voice when ambiguity is low, which is usually the case when the text is clearly in one language.
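Because the export format depends on browser support, a recorder usually has to probe which container it can write before it starts. The sketch below covers only that format check, using the real `MediaRecorder.isTypeSupported()` static method; the candidate list and its order are assumptions, the helper names are hypothetical, and how the speech audio reaches the recorder is not shown.

```typescript
// Return the first candidate MIME type that the support check accepts.
function pickRecordingMimeType(
  candidates: string[],
  isSupported: (mime: string) => boolean
): string | undefined {
  for (const mime of candidates) {
    if (isSupported(mime)) return mime;
  }
  return undefined;
}

// Assumed preference order; adjust to taste.
const PREFERRED_MIME_TYPES = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
  "audio/mpeg",
];

// Browser-facing wrapper. Returns undefined where MediaRecorder is unavailable.
function browserRecordingMimeType(): string | undefined {
  const recorder = (globalThis as any).MediaRecorder;
  if (!recorder || typeof recorder.isTypeSupported !== "function") return undefined;
  return pickRecordingMimeType(PREFERRED_MIME_TYPES, (mime) => recorder.isTypeSupported(mime));
}
```

Passing the support check in as a function keeps the selection logic testable without a browser.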
Common problems and solutions
⚠Voice availability varies
Your OS determines which voices you can use. Users on different systems hear different defaults.
⚠Browser differences
Safari's speech synthesis has known bugs around pause and voice selection. Chrome and Firefox are more reliable.
⚠Recording constraints
Recording browser TTS is tricky. Some combinations of browser and OS do not permit clean MP3 capture.
⚠Rate extremes
Rates below 0.5 or above 2 sound glitchy with many voices. Stick to 0.8-1.3 for natural output.
⚠Language detection
If text contains mixed languages, the chosen voice may mispronounce words outside its language. Break into separate runs for each language.
⚠Latency on first use
Some systems lazily load voices. First play after reload may pause briefly while the voice engine initializes.
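The lazy loading described in the last item is commonly handled by waiting for the `voiceschanged` event before reading the voice list. The following is a minimal sketch under that assumption; `SynthLike`, `LoadedVoice`, and `whenVoicesReady` are hypothetical names, and `SynthLike` only mirrors the two members of `speechSynthesis` that the code touches.

```typescript
interface LoadedVoice {
  name: string;
  lang: string;
}

// The subset of speechSynthesis used below.
interface SynthLike {
  getVoices(): LoadedVoice[];
  onvoiceschanged: (() => void) | null;
}

// Call onReady once with a non-empty voice list, either immediately
// or after the engine reports that its voices have changed.
function whenVoicesReady(synth: SynthLike, onReady: (voices: LoadedVoice[]) => void): void {
  const available = synth.getVoices();
  if (available.length > 0) {
    onReady(available);
    return;
  }
  synth.onvoiceschanged = () => {
    const loaded = synth.getVoices();
    if (loaded.length === 0) return; // still nothing, keep waiting
    synth.onvoiceschanged = null; // fire only once
    onReady(loaded);
  };
}

// In a browser: whenVoicesReady((globalThis as any).speechSynthesis, renderVoiceList)
// where renderVoiceList is whatever function fills the voice picker.
```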
Text to Speech — comparisons and alternatives
Cloud TTS services like Amazon Polly and Google WaveNet produce very natural voices but cost money, need API keys, and send your text to their servers. Desktop apps like Balabolka work offline but require installation. This browser-based tool uses the voices already on your machine, adds rate/pitch controls, language grouping, and recording support, all with zero setup. Perfect for quick narration, accessibility checks, and any TTS need where your OS voices are good enough (which for most modern devices is very good).
Frequently asked questions about Text to Speech
▶Is this free?
Completely. It uses your OS voices, which are free on all major platforms.
▶Can I use this commercially?
The synthesized audio is generally usable for any purpose, but check your OS vendor's license for voices. System-installed voices typically allow commercial use; enhanced download voices sometimes restrict it.
▶Why is the voice quality different on my colleague's machine?
Different OS, different voices. Install enhanced voices from your OS speech settings to match.
▶Does this work offline?
Yes for system voices. Cloud neural voices (some Edge Read Aloud voices) require network.
▶Can I record the audio?
Yes in Chrome and Firefox using MediaRecorder. Safari has limited support.
▶How long can text be?
Most engines handle many thousands of characters. Very long text can get chunked; break into paragraphs for reliability.
▶Can I change how names are pronounced?
SSML (Speech Synthesis Markup Language) supports pronunciation hints, but SpeechSynthesisUtterance has limited SSML support in browsers. A common workaround is to respell the name phonetically in the text you submit.
▶Does my text get sent anywhere?
No for local system voices. Some platforms use cloud neural voices that send text to the vendor; the tool flags these.
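For the question above about long text, splitting the input at sentence boundaries before speaking is one common way to keep each utterance short. The sketch below is an assumption about how that could be done, not this tool's chunking logic: `chunkText` is a hypothetical helper and the 200-character default is arbitrary.

```typescript
// Split text into chunks of at most maxLen characters, breaking only between sentences.
// A single sentence longer than maxLen is kept whole as its own chunk.
function chunkText(text: string, maxLen: number = 200): string[] {
  // Each match is a run of text up to and including its closing punctuation.
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) || [];
  const chunks: string[] = [];
  let current = "";
  for (const raw of sentences) {
    const sentence = raw.trim();
    if (sentence.length === 0) continue;
    if (current.length > 0 && current.length + 1 + sentence.length > maxLen) {
      chunks.push(current);
      current = sentence;
    } else {
      current = current.length > 0 ? current + " " + sentence : sentence;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each chunk can then be spoken as its own utterance; `speechSynthesis.speak()` queues utterances, so they play back in order.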