Extract contacts from text or HTML
Pull every email, phone, and social profile URL out of a text or HTML blob.
/v1/extract/contacts
You supply the text or HTML; we extract every email address, phone number, and social-profile URL we can find. The endpoint auto-detects HTML and strips it before extraction (scripts/styles decomposed, hrefs preserved). Phones are normalized and shape-checked (7 to 15 digits, optional E.164 prefix). Socials are matched against a curated platform list: LinkedIn, Twitter/X, GitHub, Instagram, Facebook, YouTube, TikTok, Mastodon, Bluesky, Threads, Reddit. No network call: pure extraction.
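To make the phone shape check concrete, here is a minimal Python sketch of what "normalized and shape-checked" could look like on the client side. It is not the service's implementation: the function name `normalize_phone`, the `digit_count_ok` key, and the exact regular expression are assumptions chosen to mirror the 7 to 15 digit rule and the `is_e164_shape` field shown in the response.

```python
import re


def normalize_phone(raw: str) -> dict:
    """Strip formatting from a phone string and report its shape.

    Hypothetical helper: mirrors the documented rule (7 to 15 digits,
    optional leading "+"), not the service's actual code.
    """
    has_plus = raw.strip().startswith("+")
    digits = re.sub(r"\D", "", raw)  # drop everything except digits
    normalized = ("+" if has_plus else "") + digits
    return {
        "raw": raw,
        "normalized": normalized,
        # Shape check only: says nothing about whether the number is dialable.
        "digit_count_ok": 7 <= len(digits) <= 15,
        # Assumed meaning of "E.164 shape": "+", a non-zero first digit,
        # and 7 to 15 digits in total.
        "is_e164_shape": bool(re.fullmatch(r"\+[1-9]\d{6,14}", normalized)),
    }


result = normalize_phone("+1 (415) 555-0142")
print(result["normalized"], result["is_e164_shape"])
```

A number without the leading "+" can still pass the digit-count check while failing the E.164 shape test, which is why the sketch reports the two separately.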
Email syntax matching (>99% of real-world addresses), phone shape-and-digit-count validation, socials from 11 curated platforms.
Does not validate that emails are deliverable (use /v1/parse/email or /v1/verify/email). Does not validate that phone numbers are dialable in a given country (that needs libphonenumber plus a country hint). Does not mine JSON-LD; use /v1/extract/structured for that.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | yes | - | Text or HTML blob. Auto-detected; no flag needed. Body limit 1 MB. |
| include_phones | boolean | no | true | Set false to skip phone extraction. |
| include_socials | boolean | no | true | Set false to skip social-URL extraction. |
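The parameters above map directly onto a JSON request body. The following Python sketch builds that body and checks the size limit before sending. The helper name `build_contacts_payload` is hypothetical, and reading "1 MB" as 1,000,000 bytes is an assumption, since the table does not say whether the limit is decimal or binary.

```python
import json

# Assumption: "1 MB" is read as 10**6 bytes; the table does not specify.
MAX_BODY_BYTES = 1_000_000


def build_contacts_payload(text: str,
                           include_phones: bool = True,
                           include_socials: bool = True) -> bytes:
    """Serialize the request body for /v1/extract/contacts (hypothetical helper)."""
    payload = {"text": text}
    # Both flags default to true on the server, so send them only when disabled.
    if not include_phones:
        payload["include_phones"] = False
    if not include_socials:
        payload["include_socials"] = False
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(f"body is {len(body)} bytes, over the 1 MB limit")
    return body


body = build_contacts_payload("Call +1 (415) 555-0142", include_socials=False)
print(body.decode("utf-8"))
```

Checking the size locally avoids a round trip for inputs the endpoint would reject anyway.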
Request
curl -X POST https://api.qcrawl.com/v1/extract/contacts \
  -H "Authorization: Bearer osk_..." \
  -d '{"text": "Reach us at [email protected] or +1 (415) 555-0142. We are on https://twitter.com/example and https://linkedin.com/in/jane-doe."}'
Response
{
  "status": "success",
  "emails": [
    {"address": "[email protected]", "is_noreply": false}
  ],
  "phones": [
    {"raw": "+1 (415) 555-0142", "normalized": "+14155550142", "is_e164_shape": true}
  ],
  "socials": {
    "twitter": [{"handle": "example", "url": "https://twitter.com/example"}],
    "linkedin": [{"handle": "jane-doe", "url": "https://linkedin.com/in/jane-doe"}]
  },
  "counts": {"emails": 1, "phones": 1, "socials": 2}
}
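A response of this shape can be reduced to flat lists with a few lines of Python. This is a sketch under assumptions: `summarize_contacts` is a hypothetical helper, the address `jane@example.com` is a placeholder because the sample address is obscured in the source, and filtering on `is_noreply` and `is_e164_shape` is one possible policy rather than a recommendation from the API.

```python
import json


def summarize_contacts(response: dict) -> dict:
    """Flatten an extract/contacts response into plain lists (hypothetical helper)."""
    if response.get("status") != "success":
        raise ValueError(f"unexpected status: {response.get('status')!r}")
    # Skip addresses flagged as no-reply.
    emails = [e["address"] for e in response.get("emails", [])
              if not e.get("is_noreply", False)]
    # Keep only phones whose normalized form has an E.164 shape.
    phones = [p["normalized"] for p in response.get("phones", [])
              if p.get("is_e164_shape", False)]
    # "socials" is keyed by platform; collect every URL across platforms.
    social_urls = [item["url"]
                   for items in response.get("socials", {}).values()
                   for item in items]
    return {"emails": emails, "phones": phones, "social_urls": social_urls}


# Sample mirrors the response above; the email address is a placeholder.
sample = json.loads("""
{
  "status": "success",
  "emails": [{"address": "jane@example.com", "is_noreply": false}],
  "phones": [{"raw": "+1 (415) 555-0142", "normalized": "+14155550142", "is_e164_shape": true}],
  "socials": {
    "twitter": [{"handle": "example", "url": "https://twitter.com/example"}],
    "linkedin": [{"handle": "jane-doe", "url": "https://linkedin.com/in/jane-doe"}]
  },
  "counts": {"emails": 1, "phones": 1, "socials": 2}
}
""")
summary = summarize_contacts(sample)
print(summary)
```

In the sample, `counts.socials` equals the total number of profile URLs across all platforms, not the number of platforms.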