Toolkit for extracting email addresses from HTML content and remote websites.
pnpm add email-scrape

import {
scrapeEmailsFromWebsite,
scrapeEmailFromWebsite,
extractEmails,
} from "email-scrape";
// Extract emails from a string
const emails = extractEmails("Contact us at hello@example.com");
// Fetch a webpage and return a ranked list of emails
// Automatically checks contact/about pages for more emails
const ranked = await scrapeEmailsFromWebsite("https://example.com");
// Skip contact page discovery for faster scraping
const main = await scrapeEmailsFromWebsite("https://example.com", {
followContactPages: false,
});
// Convenience helper returning the single highest-ranked email
const top = await scrapeEmailFromWebsite("https://example.com");

- Smart email validation: Rejects malformed emails and text that looks like an email but isn't properly formatted
- Contact page discovery: Automatically finds and scrapes /contact, /about, and similar pages for additional email addresses
- Ranked results: Returns emails sorted by source quality (mailto links ranked highest, then structured data, then plain text)
- Keyword boosting: Emails containing keywords like "support", "contact", "info" get higher rankings
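The validation, ranking, and keyword-boosting behaviour listed above can be pictured with a small self-contained sketch. This is not the library's implementation: the `Candidate` type, the `SOURCE_WEIGHT` values, the `KEYWORD_BOOST` size, the asset-extension filter, and the `rankEmails` helper are all hypothetical names and numbers chosen for illustration. Only the ordering (mailto, then structured data, then plain text) and the example keywords come from the feature list.

```typescript
// Hypothetical sketch: names, weights, and patterns below are assumptions,
// not the email-scrape internals.
type Source = "mailto" | "structured" | "text";

interface Candidate {
  email: string;
  source: Source;
}

// Assumed base weights that follow the documented order:
// mailto links > structured data > plain text.
const SOURCE_WEIGHT: Record<Source, number> = { mailto: 30, structured: 20, text: 10 };

// Keywords taken from the feature list; the boost size is an assumption.
const BOOST_KEYWORDS = ["support", "contact", "info"];
const KEYWORD_BOOST = 5;

// Deliberately simple shape check; real validation is likely stricter.
const EMAIL_SHAPE = /^[a-z0-9._%+-]+@[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}$/i;

// Strings such as "logo@2x.png" look like emails but are asset file names.
const ASSET_SUFFIX = /\.(png|jpe?g|gif|svg|webp)$/i;

function isPlausibleEmail(value: string): boolean {
  return EMAIL_SHAPE.test(value) && !value.includes("..") && !ASSET_SUFFIX.test(value);
}

function score(candidate: Candidate): number {
  const localPart = candidate.email.slice(0, candidate.email.indexOf("@"));
  const boosted = BOOST_KEYWORDS.some((keyword) => localPart.includes(keyword));
  return SOURCE_WEIGHT[candidate.source] + (boosted ? KEYWORD_BOOST : 0);
}

// Drops implausible candidates, keeps the best score per address,
// and returns addresses sorted from highest to lowest score.
function rankEmails(candidates: Candidate[]): string[] {
  const best = new Map<string, number>();
  for (const candidate of candidates) {
    const email = candidate.email.toLowerCase();
    if (!isPlausibleEmail(email)) continue;
    const value = score({ email, source: candidate.source });
    best.set(email, Math.max(best.get(email) ?? 0, value));
  }
  return Array.from(best.entries())
    .sort((a, b) => b[1] - a[1])
    .map((entry) => entry[0]);
}

const sampleRanking = rankEmails([
  { email: "jane@example.com", source: "text" },
  { email: "info@example.com", source: "text" },
  { email: "sales@example.com", source: "mailto" },
  { email: "logo@2x.png", source: "text" },
  { email: "not-an-email@", source: "mailto" },
]);
```

With these assumed weights, a mailto address outranks a keyword-boosted plain-text address, which in turn outranks an unboosted plain-text one. The two malformed candidates are dropped before scoring.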
scrapeEmailsFromWebsite(url, options)
- fetch: custom fetch implementation (defaults to the global fetch).
- signal: abort signal to cancel the request.
- userAgent: override the default user-agent string.
- headers: additional headers to merge with defaults.
- followContactPages: if true (default), automatically discovers and scrapes contact/about pages for additional emails.
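As a rough illustration of how these options might be combined, the sketch below builds an options object that can be cancelled from the caller's side. The `ScrapeOptions` interface is a local mirror of the documented option names with assumed types (the `fetch` option is left out to keep the sketch self-contained), and `buildOptions` and the user-agent string are hypothetical.

```typescript
// Local mirror of the documented options; the field types are assumptions.
interface ScrapeOptions {
  signal?: AbortSignal;
  userAgent?: string;
  headers?: Record<string, string>;
  followContactPages?: boolean;
}

// Hypothetical helper: ties the request to a caller-owned AbortController
// and skips contact page discovery for a faster, single-page scrape.
function buildOptions(controller: AbortController): ScrapeOptions {
  return {
    signal: controller.signal,
    userAgent: "my-crawler/1.0", // illustrative value
    headers: { "Accept-Language": "en" },
    followContactPages: false,
  };
}

const controller = new AbortController();
const options = buildOptions(controller);

// Intended use (requires the package to be installed):
//   const ranked = await scrapeEmailsFromWebsite("https://example.com", options);
// Calling controller.abort() should then cancel the in-flight request.
```

Keeping the controller outside the helper lets the caller decide when to cancel, for example from a timeout or a user action.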
pnpm clean # remove dist/coverage artifacts
pnpm lint # run Biome linting
pnpm format # format code with Biome
pnpm check # lint + format + auto-fix
pnpm test # run unit tests
pnpm test:integration # run integration tests (hits live websites)
pnpm test:all # run all tests
pnpm changeset # create a changeset for version bump
pnpm release # publish using changesets

The project uses Changesets for version management and npm provenance for secure, transparent publishing.
- One-time setup (if you haven't already):
  - Go to npmjs.com → Account Settings → Access Tokens
  - Create a new Automation token (granular access token with publish permission)
  - In your GitHub repo: Settings → Secrets and variables → Actions → New repository secret
  - Name it NPM_TOKEN and paste your token
  - The workflow now uses this with npm provenance for secure publishing
- To publish a new version:
  pnpm changeset # Describe your changes and choose semver bump (patch/minor/major)
  git add .changeset/*
  git commit -m "Add changeset for new feature"
  git push
- The CI workflow automatically:
  - Detects the changeset
  - Bumps the version in package.json
  - Publishes to npm with cryptographic provenance
  - Pushes version commits and tags back to the repo

pnpm changeset version # Bump version
pnpm install # Update lockfile
pnpm test # Run tests
pnpm release # Publish to npm

pnpm install
pnpm lint
pnpm test