🔊

Text to Speech

Convert text to natural-sounding speech in dozens of languages using browser speech synthesis. Adjust rate, pitch, and volume, preview voices, and download MP3 or WebM audio for videos, accessibility, and apps. Free and private: all processing happens in your browser.

Audio is generated locally by your browser and operating system voices. Nothing is uploaded. Available voices vary by device and OS.

Text to speech has gone from robotic and monotone to genuinely useful in the last few years. Every modern browser ships with the Web Speech API, which gives you access to the same voices your operating system uses for accessibility — including the high-quality neural voices Apple, Google, and Microsoft have shipped. This generator surfaces that system directly so you can pick a voice, tune rate and pitch, and hear your text spoken aloud with one click.

The tool lists every voice available on your device, grouped by language. Install a new voice in your OS settings and it appears here too. High-quality English voices such as Samantha (macOS) or Eva (Windows) sit alongside voices for dozens of other languages, including Spanish, French, Mandarin, Japanese, and Arabic, with no cloud service, API key, or upload. Pick a voice, paste text, click play. You can adjust speaking rate (0.1x to 10x), pitch, and volume before or during playback; because the Web Speech API reads these values when an utterance is queued, a change made mid-playback may only be heard once speech restarts.
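The grouping described above can be sketched in a few lines. This is a minimal illustration rather than the tool's actual code: `VoiceInfo` and `groupVoicesByLanguage` are hypothetical names, and the sketch assumes each voice exposes a BCP 47 style `lang` tag, as browser `SpeechSynthesisVoice` objects do.

```typescript
// Minimal shape of the fields used here (hypothetical name);
// browser SpeechSynthesisVoice objects carry both fields.
interface VoiceInfo {
  name: string;
  lang: string; // language tag such as "en-US" or "ja-JP"
}

// Group voices by primary language subtag, so "en-US" and "en-GB" share "en".
function groupVoicesByLanguage(voices: VoiceInfo[]): Map<string, VoiceInfo[]> {
  const groups = new Map<string, VoiceInfo[]>();
  for (const voice of voices) {
    // Split on "-" and "_" in case a platform reports tags like "en_US".
    const primary = voice.lang.split(/[-_]/)[0].toLowerCase();
    const bucket = groups.get(primary) ?? [];
    bucket.push(voice);
    groups.set(primary, bucket);
  }
  return groups;
}

// In a browser: groupVoicesByLanguage(window.speechSynthesis.getVoices())
```

Grouping on the primary subtag keeps regional variants together; a real picker would probably still show the full tag next to each voice name.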

For workflow integration, the tool lets you export the audio as a file. Recording uses the MediaRecorder API to capture the speech output and produce a downloadable MP3 or WebM audio file — perfect for adding voice narration to video tutorials, building accessible web apps, prototyping audiobook workflows, or giving your eyes a rest while someone reads you long articles. All of this runs locally in your browser with no rate limits, no sign-up, and no text ever leaving your device.

Features at a glance

System voices

Access every TTS voice installed on your operating system, grouped by language.

Rate, pitch, volume

Fine control over how the speech sounds without re-recording.

Live playback

Play, pause, and stop instantly without latency.

Audio download

Record the synthesized speech as MP3 or WebM for video narration and other projects.

Language grouping

Voices organized by language so finding a Spanish or Japanese speaker is quick.

Highlight current word

Shows which word is being spoken in real time, useful for reading along.

Privacy-first

All synthesis happens locally in your browser. Your text never leaves the device.
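The "Highlight current word" feature listed above can be built on the `boundary` event that `SpeechSynthesisUtterance` fires, which reports a `charIndex` into the spoken text. The following is a sketch under assumptions: `wordAtIndex` is a hypothetical helper, words are treated as whitespace-delimited, and some voices may not fire boundary events at all.

```typescript
// Given the text and a charIndex from a boundary event, return the span of
// the whitespace-delimited word that starts at that index.
function wordAtIndex(
  text: string,
  charIndex: number
): { start: number; end: number; word: string } {
  const start = Math.max(0, Math.min(charIndex, text.length));
  let end = start;
  while (end < text.length && !/\s/.test(text[end])) end++;
  return { start, end, word: text.slice(start, end) };
}

// Browser usage (not executed here):
// utterance.onboundary = (event) => {
//   if (event.name === "word") {
//     const span = wordAtIndex(utterance.text, event.charIndex);
//     // wrap text.slice(span.start, span.end) in a highlighted element
//   }
// };
```

Trailing punctuation stays attached to the word in this sketch, which is usually acceptable for a read-along highlight.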

How to use Text to Speech

1. Paste your text
Drop in the text you want spoken. Short passages and long articles both work.

2. Choose a voice
Browse voices by language. High-quality neural voices are marked when available.

3. Adjust rate and pitch
Slower rates help for learning; higher pitch can feel more animated.

4. Play
Click play and the text is spoken through your speakers or headphones.

5. Download (optional)
Record the output and download it as an audio file for video narration or other use.

Common use cases for Text to Speech

Accessibility

Content creation

Learning

Development

Text to Speech in practice

English article: natural voice at default rate
Input: paste article
Output: spoken audio with high-quality voice

Spanish learning: slow rate for practice
Input: Spanish text, rate 0.7
Output: clear slow Spanish

Video narration: recorded for post
Input: narration script + record
Output: MP3 ready for video editor

Proofreading: hear drafts aloud
Input: blog post draft
Output: audio playback reveals awkward phrasing

Multilingual: switch voices mid-session
Input: English then Japanese
Output: correct accent per passage

Technical details

The Web Speech API's SpeechSynthesis interface provides text-to-speech functionality in browsers. It exposes voices from the operating system and any installed speech engines. You create a SpeechSynthesisUtterance with text, set properties like voice, rate (0.1-10), pitch (0-2), and volume (0-1), then call speechSynthesis.speak().
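A minimal sketch of that flow, clamping each setting to the ranges quoted above before speaking. `SpeechSettings`, `normalizeSettings`, and `speakText` are hypothetical names chosen for illustration; the browser globals are looked up at runtime so the helper simply returns `false` outside a browser.

```typescript
interface SpeechSettings {
  rate: number;   // 0.1 to 10
  pitch: number;  // 0 to 2
  volume: number; // 0 to 1
}

const clampTo = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

// Fill in defaults of 1 and clamp every value into its allowed range.
function normalizeSettings(s: Partial<SpeechSettings>): SpeechSettings {
  return {
    rate: clampTo(s.rate ?? 1, 0.1, 10),
    pitch: clampTo(s.pitch ?? 1, 0, 2),
    volume: clampTo(s.volume ?? 1, 0, 1),
  };
}

// Speak the text if the Web Speech API is present; returns whether it was queued.
function speakText(text: string, settings: Partial<SpeechSettings> = {}): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  const utterance = new g.SpeechSynthesisUtterance(text);
  Object.assign(utterance, normalizeSettings(settings));
  g.speechSynthesis.speak(utterance);
  return true;
}
```

For example, `speakText("Hola, ¿qué tal?", { rate: 0.7 })` would queue the phrase at a slower rate with default pitch and volume.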

Voice availability differs by platform:
- macOS and iOS: ship with dozens of high-quality Apple voices; Siri voices can be downloaded
- Windows: ships with standard Microsoft voices; Edge Read Aloud uses Azure Cognitive Services neural voices online
- Android: uses the Google TTS engine with multiple languages
- ChromeOS (Chromebooks): uses Google TTS voices
- Linux: uses eSpeak or installed voice packages

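Neural voices produce markedly more natural speech than older formant-synthesis voices. Samantha on macOS and Eva on Windows 10+ are commonly cited examples of the better system voices, though whether a particular voice is neural depends on the OS version and the variant installed. When available, higher-quality voices appear with a "network" or "high quality" label. Some platforms require downloading enhanced voices from system preferences; this tool surfaces whatever your system has installed.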
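One signal a page can use for a "network" label is the `localService` flag on each `SpeechSynthesisVoice`, which is `false` for voices served by a remote synthesizer. The sketch below only partitions voices on that flag; `VoiceLocality` and `splitByLocality` are hypothetical names, and whether this tool labels voices this way is an assumption.

```typescript
// Only the fields this sketch needs (hypothetical name).
interface VoiceLocality {
  name: string;
  localService: boolean; // false when the voice relies on a remote service
}

// Separate on-device voices from voices that need a network round trip.
function splitByLocality<T extends VoiceLocality>(
  voices: T[]
): { local: T[]; remote: T[] } {
  return {
    local: voices.filter((v) => v.localService),
    remote: voices.filter((v) => !v.localService),
  };
}

// In a browser: splitByLocality(window.speechSynthesis.getVoices())
```

The flag says where synthesis happens, not how good the voice sounds, so a quality label would need a separate heuristic.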

For recording output, the tool uses MediaRecorder to capture audio from a silent audio element piped through SpeechSynthesis. This approach works in Chrome and Firefox but has inconsistent support in Safari. Exported files are WebM or MP3 depending on browser support, suitable for video editors, podcasts, and audio post-production. MediaRecorder itself typically writes WebM, Ogg, or MP4 containers rather than MP3, so MP3 export may involve an extra encoding step. Rate and pitch adjustments happen at synthesis time, so they are baked into the recording.

When the input is clearly written in a single language, language detection can auto-select a matching voice.
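Because the container a recording ends up in depends on the browser, a recorder usually probes for a supported MIME type before starting. This sketch covers only that format choice, not how a capturable audio stream is obtained. `pickRecordingType` and the candidate list are illustrative assumptions; the support check is passed in so the logic can run outside a browser.

```typescript
// Candidate types in order of preference (an illustrative list).
const RECORDING_CANDIDATES: string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/ogg;codecs=opus",
  "audio/mp4",
  "audio/mpeg",
];

// Return the first candidate the environment reports as supported.
function pickRecordingType(
  isSupported: (mimeType: string) => boolean,
  candidates: string[] = RECORDING_CANDIDATES
): string | undefined {
  return candidates.find((type) => isSupported(type));
}

// Browser usage (not executed here):
// const mimeType = pickRecordingType((t) => MediaRecorder.isTypeSupported(t));
// const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
```

If nothing matches, leaving `mimeType` unset lets the browser fall back to its own default format.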

Troubleshooting

⚠Voice availability varies

Your OS determines which voices you can use. Users on different systems hear different defaults.

⚠Browser differences

Safari's speech synthesis has known bugs around pause and voice selection. Chrome and Firefox are more reliable.

⚠Recording constraints

Recording browser TTS is tricky. Some combinations of browser and OS do not permit clean MP3 capture.

⚠Rate extremes

Rates below 0.5 or above 2 sound glitchy with many voices. Stick to 0.8-1.3 for natural output.

⚠Language detection

If text contains mixed languages, the chosen voice may mispronounce words outside its language. Break into separate runs for each language.
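Breaking mixed text into separate runs can be partly automated. The sketch below is a crude heuristic, not the tool's language detection: it only separates Japanese kana and CJK ideographs from everything else, and it cannot tell Japanese from Chinese. `splitByScript` and `ScriptRun` are hypothetical names.

```typescript
type ScriptRun = { script: "cjk" | "other"; text: string };

// Hiragana, Katakana and CJK Unified Ideographs (all in the BMP).
const CJK_CHAR = /[\u3040-\u30ff\u4e00-\u9fff]/;
// Latin letters, including common accented ones.
const LATIN_CHAR = /[A-Za-z\u00c0-\u024f]/;

// Split text into consecutive runs so each run can be spoken with its own voice.
function splitByScript(text: string): ScriptRun[] {
  const runs: ScriptRun[] = [];
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    const isCjk = CJK_CHAR.test(ch);
    // Spaces, digits and punctuation are neutral and stay with the current run.
    const neutral = !isCjk && !LATIN_CHAR.test(ch);
    const script = isCjk ? "cjk" : "other";
    const last = runs[runs.length - 1];
    if (last && (neutral || last.script === script)) {
      last.text += ch;
    } else {
      runs.push({ script, text: ch });
    }
  }
  return runs;
}
```

Each run could then be queued as its own utterance, with a Japanese voice assigned to the `cjk` runs and the user's chosen voice for the rest.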

⚠Latency on first use

Some systems lazily load voices. First play after reload may pause briefly while the voice engine initializes.
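Lazy loading also means `getVoices()` can return an empty list on the first call. A common pattern is to wait for the `voiceschanged` event with a timeout as a fallback. The sketch below assumes that pattern; `SynthLike` and `loadVoices` are hypothetical names, and the interface is narrowed so the function can be exercised with a fake object.

```typescript
type BasicVoice = { name: string; lang: string };

// The subset of window.speechSynthesis this sketch relies on.
interface SynthLike {
  getVoices(): BasicVoice[];
  addEventListener(type: "voiceschanged", listener: () => void): void;
  removeEventListener(type: "voiceschanged", listener: () => void): void;
}

// Resolve with the voice list as soon as it is non-empty, or after timeoutMs
// with whatever is available (possibly an empty list).
function loadVoices(synth: SynthLike, timeoutMs = 2000): Promise<BasicVoice[]> {
  return new Promise((resolve) => {
    const ready = synth.getVoices();
    if (ready.length > 0) {
      resolve(ready);
      return;
    }
    const finish = () => {
      clearTimeout(timer);
      synth.removeEventListener("voiceschanged", finish);
      resolve(synth.getVoices());
    };
    // The timeout covers engines that never fire the event.
    const timer = setTimeout(finish, timeoutMs);
    synth.addEventListener("voiceschanged", finish);
  });
}

// In a browser: loadVoices(window.speechSynthesis).then((voices) => { ... })
```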

Text to Speech — comparisons and alternatives

Cloud TTS services like Amazon Polly and Google WaveNet produce very natural voices but cost money, need API keys, and send your text to their servers. Desktop apps like Balabolka work offline but require installation. This browser-based tool uses the voices already on your machine and adds rate/pitch controls, language grouping, and recording support, all with zero setup. It suits quick narration, accessibility checks, and any TTS need where your OS voices are good enough, which on most modern devices they are.

Questions and answers

▶Is this free?

Completely. It uses your OS voices, which are free on all major platforms.

▶Can I use this commercially?

The synthesized audio is generally usable for any purpose, but check your OS vendor's license for voices. System-installed voices typically allow commercial use; enhanced download voices sometimes restrict it.

▶Why is the voice quality different on my colleague's machine?

Different OS, different voices. Install enhanced voices from your OS speech settings to match.

▶Does this work offline?

Yes for system voices. Cloud neural voices (some Edge Read Aloud voices) require a network connection.

▶Can I record the audio?

Yes in Chrome and Firefox using MediaRecorder. Safari has limited support.

▶How long can text be?

Most engines handle many thousands of characters. Very long text can get chunked; break into paragraphs for reliability.
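One way to break long text up before speaking is to pack whole sentences into chunks under a size limit. This is a sketch under assumptions: `splitSentences` and `chunkText` are hypothetical helpers, the 200-character default is arbitrary, and a "sentence" is any run of words ending in `.`, `!` or `?`, so abbreviations such as "Dr." count as sentence ends.

```typescript
// Split text into sentences, collapsing runs of whitespace into single spaces.
function splitSentences(text: string): string[] {
  const sentences: string[] = [];
  let words: string[] = [];
  for (const word of text.split(/\s+/)) {
    if (!word) continue;
    words.push(word);
    // A word ending in . ! or ? (optionally followed by quotes or brackets) closes a sentence.
    if (/[.!?]["')\]]*$/.test(word)) {
      sentences.push(words.join(" "));
      words = [];
    }
  }
  if (words.length > 0) sentences.push(words.join(" "));
  return sentences;
}

// Pack sentences into chunks of at most maxLength characters.
// A single sentence longer than maxLength is kept whole as its own chunk.
function chunkText(text: string, maxLength = 200): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const sentence of splitSentences(text)) {
    if (current && current.length + 1 + sentence.length > maxLength) {
      chunks.push(current);
      current = sentence;
    } else {
      current = current ? `${current} ${sentence}` : sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk can then be queued as its own utterance, which also gives natural points to pause or resume.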

▶Can I change how names are pronounced?

SSML (Speech Synthesis Markup Language) supports pronunciation hints, but SpeechSynthesisUtterance has limited SSML support in browsers. A common workaround is to respell the name phonetically in the text you submit.
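A simple workaround is to substitute phonetic respellings before the text is spoken. The sketch below assumes a user-supplied map from written form to spoken form; `applyRespellings` is a hypothetical helper, and the example respellings are illustrative only. Matching uses `\b`, so it works best for names written in ASCII letters.

```typescript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(source: string): string {
  return source.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace each whole-word occurrence of a written form with its respelling.
function applyRespellings(
  text: string,
  respellings: Record<string, string>
): string {
  let result = text;
  for (const written of Object.keys(respellings)) {
    const pattern = new RegExp(`\\b${escapeRegExp(written)}\\b`, "g");
    // A function replacer avoids "$" in the respelling being treated specially.
    result = result.replace(pattern, () => respellings[written]);
  }
  return result;
}

// Example: speak the respelled text, display the original.
// applyRespellings("Siobhan met Nguyen.", { Siobhan: "Shivawn", Nguyen: "Win" })
```

Keeping the original text for display and only feeding the respelled version to the synthesizer avoids showing odd spellings to the reader.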

▶Does my text get sent anywhere?

No for local system voices. Some platforms use cloud neural voices that send text to the vendor; the tool flags these.

