📧

Extract Emails from Text

Extract every email address from pasted text with validation, deduplication, and clean export to CSV or a line-separated list. Free and private: all processing happens in your browser.

The Extract Emails tool pulls every email address out of any pasted text: documents, log files, web pages, email threads, HTML source, code files, or anything else you drop in. Almost everyone needs this eventually: a CSV arrives with emails buried in a notes column, a log file contains user addresses you need to audit, an exported contact list needs deduplication, or a support ticket thread has addresses scattered through the message bodies. Scanning and copying by hand is tedious and error-prone. The tool does it in milliseconds with a practical regex pattern that covers the vast majority of real-world addresses.

Output is clean, deduplicated, and optionally validated. You get a newline-separated list of unique email addresses, sorted if you want, with duplicates removed automatically. Optional validation applies stricter checks to catch typos and malformed addresses. Export formats include plain list, CSV, JSON array, and a SQL INSERT statement ready for database loading. All processing is local to your browser, so addresses never touch a server, which matters when the source text is internal or regulated. Remember that extracting emails to send unsolicited messages is restricted or illegal in many jurisdictions under laws such as GDPR, CAN-SPAM, and CASL. Use this tool for legitimate purposes: deduping lists you own, auditing logs, or extracting contacts from your own exports.

What the Extract Emails from Text tool can do

RFC-practical pattern

Uses a regex that matches 99%+ of real email addresses with few false positives on ordinary text.

Automatic deduplication

Unique addresses only, with case-insensitive matching so Admin@example.com and admin@example.com become one entry.

Optional validation

Verifies that the TLD is a real top-level domain and that the domain structure is well-formed, going beyond the regex match alone.

Multiple output formats

Plain list, CSV, JSON array, or SQL INSERT — pick the format your downstream tool expects.

Surrounding text stripped

Removes angle brackets, labels ("Name" <email>), and punctuation around addresses automatically.

Count and stats

Shows how many addresses were found, how many unique, and how many duplicates were removed.

Client-side only

Emails never leave your browser — safe for internal logs, customer lists, and confidential documents.

Fast on large text

Extracts from multi-megabyte documents in milliseconds.

How to use the Extract Emails from Text tool

1. Paste text: Drop any text containing email addresses (log file, CSV, email thread, or document) into the input.
2. Extract: Click extract and the tool finds every email address, removes duplicates, and shows the count.
3. Optional validation: Enable validation to filter out malformed or suspicious addresses before exporting.
4. Pick output format: Choose plain list, CSV, JSON, or SQL INSERT depending on where the extracted emails are going.
5. Copy or download: Copy to clipboard or download as a .txt, .csv, or .json file for import into your target system.

Common use cases for the Extract Emails from Text tool

Data processing

→Contact list extraction: Pull email addresses from a pasted CSV’s notes column into a clean list for import.
→Log file audit: Extract user emails from application logs to check which users were affected by an incident.
→Support thread mining: Pull every email address mentioned across a long support thread for contact follow-up.

Team operations

→Duplicate contact detection: Dedupe exported contacts from multiple systems (CRM, HelpDesk, marketing) before loading into one master list.
→Employee directory cleanup: Extract emails from free-text fields in HR exports to normalize them into a proper database column.
→Meeting attendee parsing: Pull attendee emails from calendar invite text for record-keeping.

Development and security

→Test data preparation: Extract real email patterns from production samples to build realistic test fixtures (after masking or synthesizing).
→Security log analysis: Find email addresses mentioned in login failure logs to identify targeted accounts.
→Privacy audit: Scan database dumps or code repos for exposed email addresses that should not be there.

Worked examples

From CSV notes

Extracting emails embedded in free-form notes.

Input

id,notes
1,"Contact John at john@example.com for approval"
2,"Email ana@company.org and team@company.org"

Output

john@example.com
ana@company.org
team@company.org

From email thread

Extracting from forwarded email headers.

Input

From: "Ana K" <ana@company.com>
To: Support <support@tooleras.com>, Ben <ben@company.com>
Subject: Help

Output

ana@company.com
support@tooleras.com
ben@company.com

Deduplication

Same address with different case treated as one.

Input

contact admin@site.com or ADMIN@site.com or Admin@site.com

Output

admin@site.com (3 occurrences, 1 unique)

With validation

Filtering out malformed addresses.

Input

good@example.com, bad@, also-bad@.com, fine@domain.org

Output

good@example.com
fine@domain.org
(bad@ and also-bad@.com filtered out)

SQL export

Ready-to-run INSERT statement.

Input

Extracted emails: a@x.com, b@x.com

Output

INSERT INTO contacts (email) VALUES
('a@x.com'),
('b@x.com');
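
For readers who want to reproduce this export in a script, here is a minimal Python sketch. It is not the tool's own code: the function name `to_sql_insert` is hypothetical, and the `contacts` table and `email` column are simply taken from the example above. Single quotes are doubled defensively, although the default pattern cannot match an address containing one.

```python
def to_sql_insert(addresses, table="contacts", column="email"):
    """Build one multi-row INSERT statement from a list of addresses.

    The table and column names are assumptions copied from the worked example.
    """
    if not addresses:
        return ""
    # Double any single quote so each value stays a valid SQL string literal.
    rows = ",\n".join("('" + a.replace("'", "''") + "')" for a in addresses)
    return f"INSERT INTO {table} ({column}) VALUES\n{rows};"


print(to_sql_insert(["a@x.com", "b@x.com"]))
```

When loading from application code rather than pasting into a SQL console, parameterized queries are usually a safer choice than building the statement as a string.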

Under the hood

Matching email addresses with a regex is notoriously tricky because RFC 5322 allows very permissive syntax (quoted local parts with almost any character, comments, domain literals), and later standards add internationalized addresses on top. A fully compliant matcher runs to hundreds of lines; a practical matcher covers 99%+ of real addresses with a short pattern.

This tool uses a practical pattern: [A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,24}

Local part: alphanumeric plus dot, underscore, percent, plus, hyphen. Does not allow quoted local parts (rare) or unusual characters.

Separator: @ symbol.

Domain: alphanumeric with dots and hyphens. Does not allow IP-literal domains or Unicode domain names; internationalized domains must appear in punycode form.

TLD: 2 to 24 letters. Long TLDs like .museum and .photography fit; anything longer than 24 characters would fail (rare in practice).

This pattern catches virtually every email you will encounter in real text, with few false positives on ordinary text. For internationalized email (Unicode local parts, IDN domains), enable the strict-Unicode mode, which uses \p{L} Unicode property escapes.
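
To see the default pattern at work outside the browser, the following Python sketch applies it with the standard `re` module. This is an illustration only, not the tool's implementation; the name `extract_emails` is hypothetical, and only the ASCII pattern is covered because Python's built-in `re` does not support `\p{L}`.

```python
import re

# The practical pattern quoted above, compiled once for reuse.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,24}")


def extract_emails(text: str) -> list[str]:
    """Return every match in order of appearance, duplicates included."""
    return EMAIL_RE.findall(text)


sample = 'From: "Ana K" <ana@company.com>\nPlease write to john@example.com.'
print(extract_emails(sample))
```

Because quotes, angle brackets, and whitespace are not in either character class, display names and brackets fall away on their own. A sentence-ending period is also dropped, since the match must end in a dot followed by letters.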

Validation beyond regex: the tool optionally checks that the TLD is in the IANA published list (no fake TLDs), the domain has at least one dot, and the local part has a reasonable length (at most 64 characters, the limit set by RFC 5321). It cannot check whether the address is deliverable without sending mail. That requires SMTP verification, which the tool does not do.
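
The structural checks described above might look like the following Python sketch. It rests on stated assumptions: `KNOWN_TLDS` is a tiny hypothetical stand-in for the full IANA list, `is_plausible` is an illustrative name, and the empty-label check is one reasonable reading of "well-formed domain" rather than the tool's confirmed rule.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,24}")

# Hypothetical sample only; a real check would load the full IANA TLD list.
KNOWN_TLDS = {"com", "org", "net", "io"}


def is_plausible(address: str) -> bool:
    """Structural validation only; says nothing about deliverability."""
    if not EMAIL_RE.fullmatch(address):
        return False
    local, _, domain = address.rpartition("@")
    if len(local) > 64:  # local part may be at most 64 characters
        return False
    labels = domain.split(".")
    if any(label == "" for label in labels):  # rejects "a@x..com"
        return False
    return labels[-1].lower() in KNOWN_TLDS


candidates = ["good@example.com", "bad@", "also-bad@.com", "fine@domain.org"]
print([c for c in candidates if is_plausible(c)])
```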

Deduplication: extracted addresses are normalized (lowercased) and deduped via Set. Case is preserved in output by default (first occurrence wins); enable "force lowercase output" if your downstream system requires it.
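
A minimal Python sketch of this first-occurrence-wins behavior follows. The function name `dedupe` and the `force_lowercase` parameter are hypothetical names that mirror the option described above, not the tool's actual API.

```python
def dedupe(addresses, force_lowercase=False):
    """Drop case-insensitive duplicates, keeping the first spelling seen."""
    seen = set()
    unique = []
    for addr in addresses:
        key = addr.lower()  # comparison key only
        if key in seen:
            continue
        seen.add(key)
        unique.append(key if force_lowercase else addr)
    return unique


found = ["admin@site.com", "ADMIN@site.com", "Admin@site.com"]
unique = dedupe(found)
print(unique, f"({len(found)} occurrences, {len(unique)} unique)")
```

Keeping a separate comparison key is what lets the output preserve the original casing while still treating the three spellings as one address.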

Surrounding context removal: emails often appear with angle brackets (<john@example.com>), labels ("John Doe" <john@example.com>), or punctuation (comma-separated lists). The tool extracts just the address part, removing decoration.
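
When the input is known to be header-style text, a script can use a different technique from the regex: Python's standard `email.utils.getaddresses` parses display names and angle brackets directly. The sketch below is an alternative for that narrow case, not a description of how the tool works, and the header values are taken from the email-thread example earlier on this page.

```python
from email.utils import getaddresses

# Header values as they would appear after "From:" and "To:".
header_values = [
    '"Ana K" <ana@company.com>',
    "Support <support@tooleras.com>, Ben <ben@company.com>",
]

# getaddresses returns (display_name, address) pairs; keep only the address.
addresses = [addr for _name, addr in getaddresses(header_values) if addr]
print(addresses)
```

This only suits well-formed header fields. For free-form text such as notes or log lines, the regex approach remains the practical choice.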

Performance: regex extraction on multi-megabyte text runs in milliseconds. Works on logs, exports, and documents of any typical size.

Common problems and solutions

⚠Internationalized emails missed

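The default regex covers ASCII emails only. For Cyrillic, Chinese, or other non-ASCII local parts, enable Unicode-aware mode. Internationalized domains (IDN) usually appear in punycode (xn--) form in logs. The default regex handles punycode labels under an ASCII TLD (for example xn--mnchen-3ya.de), but a punycode TLD such as xn--p1ai contains digits and hyphens, which the letters-only TLD part of the pattern does not accept, so such addresses are matched only partially or not at all.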

⚠False positives from partial matches

Any run of allowed characters followed by @, more allowed characters, a dot, and letters can match things that are not really emails (user@host.domain strings in SSH logs, for instance). Enable validation to filter implausible TLDs, or check domain DNS separately.

⚠Obfuscated emails skipped

"john [at] example [dot] com" is not matched because it is not in standard format. Preprocess such obfuscations manually or with a custom regex before extraction.

⚠Quoted emails with unusual chars

RFC 5322 allows "very.(),:;<>[]\".VERY.\"very\@\\ \"very\".unusual"@strange.example.com as a valid email. Practical extractors miss these. If you need full RFC compliance, use a dedicated library.

⚠Role-based addresses treated as regular

info@company.com and sales@company.com are extracted like any other address. Some of these may be role-based aliases rather than individual recipients — treat accordingly downstream.

⚠Privacy and legal concerns

Extracting emails from third-party sources for unsolicited messaging violates GDPR, CAN-SPAM, CASL, and similar laws. Only use this tool on data you have legal basis to process — your own exports, internal systems, or documents you own.

⚠Extraction from HTML

In HTML, email addresses are often wrapped in <a href="mailto:..."> tags. The pattern catches the address inside the href, but it may also match other strings containing @ in attributes or asset names. Verify the output when extracting from HTML source.
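
If a script needs only the addresses that really sit in mailto links, one option is to parse the HTML instead of running the regex over raw markup. The Python sketch below uses the standard `html.parser` module. `MailtoCollector` and `mailto_addresses` are hypothetical names, and this is an alternative approach, not the tool's behavior.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class MailtoCollector(HTMLParser):
    """Collect addresses from <a href="mailto:..."> links only."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Drop the scheme and any ?subject=... query string,
                # then split comma-separated recipients.
                recipients = value[len("mailto:"):].split("?", 1)[0]
                self.addresses.extend(
                    unquote(r).strip() for r in recipients.split(",") if r.strip()
                )


def mailto_addresses(html: str) -> list[str]:
    collector = MailtoCollector()
    collector.feed(html)
    collector.close()
    return collector.addresses


page = '<a href="mailto:ana@company.com?subject=Hi">Ana</a> <img src="logo@2x.png">'
print(mailto_addresses(page))
```

The `logo@2x.png` file name in the sample is the kind of string the plain regex would report as an address, which is why parsing the links first can give cleaner output on HTML.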

Alternatives and comparisons

Compared to writing a custom regex in code, this tool gives an instant interactive workflow with validation and deduplication built in. For automation in a data pipeline, write code; for interactive extraction from pasted text, this tool is faster.

Compared to dedicated contact management tools, this tool is the extraction step before import. Run it to clean and dedupe a list, then import into Mailchimp, HubSpot, or whatever you use.

Compared to Unix grep with an email regex, this tool has a visual UI, automatic deduplication, and multiple output formats. grep is better for scripting and large-file piping; this tool is better for interactive work.

Questions and answers

▶How do I extract emails from a large document?

Paste the full document into the input field and click extract. Every email address is found and deduped automatically. Works on plain text, HTML source, log files, and CSV exports up to several megabytes.

▶Are duplicate emails removed automatically?

Yes. The tool deduplicates with case-insensitive matching (Admin@example.com and admin@example.com count as one). The count shows both total occurrences and unique addresses so you can verify.

▶Does the tool validate that emails are deliverable?

No — that requires SMTP verification which the tool does not do. It validates structure (local part @ domain.tld format, reasonable TLD) but cannot confirm the address exists or accepts mail. For deliverability, use a dedicated email verification service.

▶Can I export to CSV or JSON?

Yes. After extraction, pick your output format: plain newline-separated list, CSV with one address per row, JSON array, or SQL INSERT statement. Each format is ready to paste into the target system.
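
For comparison, the plain, CSV, and JSON formats are easy to produce in a script with Python's standard library. This is a sketch under assumptions: the function names are hypothetical, and the `email` header row in the CSV is a guess, since this page does not say whether the tool's CSV includes a header.

```python
import csv
import io
import json


def to_plain(addresses):
    """One address per line."""
    return "\n".join(addresses)


def to_csv(addresses, header="email"):
    """Single-column CSV; the header name is an assumption."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([header])
    writer.writerows([a] for a in addresses)
    return buf.getvalue()


def to_json(addresses):
    """JSON array of strings."""
    return json.dumps(addresses, indent=2)


emails = ["a@x.com", "b@x.com"]
print(to_plain(emails))
print(to_csv(emails))
print(to_json(emails))
```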

▶Is it safe for confidential data?

Yes. All processing runs in your browser — the text and extracted emails never leave your machine. No network requests are made for the extraction itself. Safe for internal logs, customer lists, and regulated data, subject to your organization’s policies.

▶What about GDPR and email extraction?

GDPR (EU) and similar laws (CCPA, CAN-SPAM) restrict what you can do with extracted email addresses. Extracting from third-party sources for unsolicited contact is generally prohibited. Use this tool only for data you have legal basis to process — your own customer lists, internal logs, or content you own.

▶Can the tool handle emails with labels like Name <email>?

Yes. The extraction pulls out just the address part, stripping the display name and angle brackets. So "John Doe" <john@example.com> becomes john@example.com in the output.

▶Does it find obfuscated emails like name (at) domain (dot) com?

No — obfuscated emails are not in standard format, and the regex does not match them. If you expect obfuscated emails, pre-process the text with a find-and-replace to convert "(at)" to @ and "(dot)" to . before extracting.
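
The find-and-replace step can also be scripted. The Python sketch below rewrites bracketed or parenthesized "at" and "dot" tokens before extraction. The name `deobfuscate` is hypothetical, and only these two obfuscation styles are covered.

```python
import re

# Match "[at]", "(at)", "[dot]", "(dot)" with optional surrounding spaces.
AT_RE = re.compile(r"\s*[\[\(]\s*at\s*[\]\)]\s*", re.IGNORECASE)
DOT_RE = re.compile(r"\s*[\[\(]\s*dot\s*[\]\)]\s*", re.IGNORECASE)


def deobfuscate(text: str) -> str:
    """Rewrite obfuscated separators into @ and . before extraction."""
    return DOT_RE.sub(".", AT_RE.sub("@", text))


print(deobfuscate("john [at] example [dot] com"))
print(deobfuscate("name (at) domain (dot) com"))
```

The replacement is blind to context, so ordinary prose such as "see you (at) noon" is rewritten too. Review the preprocessed text, or run it only on fields where obfuscated addresses are expected.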

Additional resources

RFC 5322 (Email format) — Internet Message Format standard defining email address syntax.
IANA TLD list — Official authoritative list of top-level domains, used for email TLD validation.
Regular expressions for email — Reference collection of email-matching regex patterns in various languages.
GDPR and email lists — Official EU GDPR text, relevant when processing personal data like email addresses.
MDN Regular Expressions — JavaScript regex guide for building custom extraction patterns.

Related tools

All Text Tools

📱

Extract Phone Numbers

Extract phone numbers from pasted text in US, UK, EU, and international formats, deduplicate, and export to CSV, JSON, or plain list.

🔗

Extract URLs from Text

Extract every URL from pasted text with deduplication, validation, and export to CSV, JSON, or newline-separated list.

🔎

Find and Replace

Find and replace text with regex support, case sensitivity, whole-word matching, and preview of all changes before applying.

📏

Line Counter

Count lines in text with separate totals for blank lines, non-blank lines, words, characters, and paragraphs for detailed statistics.

🎯

Regex Tester

Test and debug regular expressions with live matching and explanation

🧹

Remove Duplicate Lines

Remove duplicate lines from text with case-sensitive or case-insensitive matching, preserving original order or sorting the result.
