Commit 0024c82

Sponsors/new (unclecode#1637)
1 parent f68e753 commit 0024c82

2 files changed: 62 additions, 1 deletion

README.md

Lines changed: 1 addition & 1 deletion
@@ -1034,7 +1034,7 @@ Our enterprise sponsors and technology partners help scale Crawl4AI to power pro

 | Company | About | Sponsorship Tier |
 |------|------|----------------------------|
-| <a href="https://app.scrapeless.com/passport/register?utm_source=official&utm_term=crawl4ai" target="_blank"><picture><source width="250" media="(prefers-color-scheme: dark)" srcset="https://gist.githubusercontent.com/aravindkarnam/0d275b942705604263e5c32d2db27bc1/raw/Scrapeless-light-logo.svg"><source width="250" media="(prefers-color-scheme: light)" srcset="https://gist.githubusercontent.com/aravindkarnam/22d0525cc0f3021bf19ebf6e11a69ccd/raw/Scrapeless-dark-logo.svg"><img alt="Scrapeless" src="https://gist.githubusercontent.com/aravindkarnam/22d0525cc0f3021bf19ebf6e11a69ccd/raw/Scrapeless-dark-logo.svg"></picture></a> | Scrapeless is the best full-stack web scraping toolkit offering Scraping API, Scraping Browser, Web Unlocker, Captcha Solver, and Proxies, designed to handle all your data collection needs. | 🥈 Silver |
+| <a href="https://app.scrapeless.com/passport/register?utm_source=official&utm_term=crawl4ai" target="_blank"><picture><source width="250" media="(prefers-color-scheme: dark)" srcset="https://gist.githubusercontent.com/aravindkarnam/0d275b942705604263e5c32d2db27bc1/raw/Scrapeless-light-logo.svg"><source width="250" media="(prefers-color-scheme: light)" srcset="https://gist.githubusercontent.com/aravindkarnam/22d0525cc0f3021bf19ebf6e11a69ccd/raw/Scrapeless-dark-logo.svg"><img alt="Scrapeless" src="https://gist.githubusercontent.com/aravindkarnam/22d0525cc0f3021bf19ebf6e11a69ccd/raw/Scrapeless-dark-logo.svg"></picture></a> | Scrapeless provides production-grade infrastructure for Crawling, Automation, and AI Agents, offering Scraping Browser, 4 Proxy Types and Universal Scraping API. | 🥈 Silver |
 | <a href="https://dashboard.capsolver.com/passport/register?inviteCode=ESVSECTX5Q23" target="_blank"><picture><source width="120" media="(prefers-color-scheme: dark)" srcset="https://docs.crawl4ai.com/uploads/sponsors/20251013045338_72a71fa4ee4d2f40.png"><source width="120" media="(prefers-color-scheme: light)" srcset="https://www.capsolver.com/assets/images/logo-text.png"><img alt="Capsolver" src="https://www.capsolver.com/assets/images/logo-text.png"></picture></a> | AI-powered Captcha solving service. Supports all major Captcha types, including reCAPTCHA, Cloudflare, and more | 🥉 Bronze |
 | <a href="https://kipo.ai" target="_blank"><img src="https://docs.crawl4ai.com/uploads/sponsors/20251013045751_2d54f57f117c651e.png" alt="DataSync" width="120"/></a> | Helps engineers and buyers find, compare, and source electronic & industrial parts in seconds, with specs, pricing, lead times & alternatives.| 🥇 Gold |
 | <a href="https://www.kidocode.com/" target="_blank"><img src="https://docs.crawl4ai.com/uploads/sponsors/20251013045045_bb8dace3f0440d65.svg" alt="Kidocode" width="120"/><p align="center">KidoCode</p></a> | Kidocode is a hybrid technology and entrepreneurship school for kids aged 5–18, offering both online and on-campus education. | 🥇 Gold |
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+import json
+import asyncio
+from urllib.parse import quote, urlencode
+from crawl4ai import CrawlerRunConfig, BrowserConfig, AsyncWebCrawler
+
+# Scrapeless provides a free anti-detection fingerprint browser client and cloud browsers:
+# https://www.scrapeless.com/en/blog/scrapeless-nstbrowser-strategic-integration
+
+async def main():
+    # customize browser fingerprint
+    fingerprint = {
+        "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.1.2.3 Safari/537.36",
+        "platform": "Windows",
+        "screen": {
+            "width": 1280, "height": 1024
+        },
+        "localization": {
+            "languages": ["zh-HK", "en-US", "en"], "timezone": "Asia/Hong_Kong",
+        }
+    }
+
+    fingerprint_json = json.dumps(fingerprint)
+    encoded_fingerprint = quote(fingerprint_json)
+
+    scrapeless_params = {
+        "token": "your token",
+        "sessionTTL": 1000,
+        "sessionName": "Demo",
+        "fingerprint": encoded_fingerprint,
+        # Sets the target country/region for the proxy, sending requests via an IP address from that region. You can specify a country code (e.g., US for the United States, GB for the United Kingdom, ANY for any country). See country codes for all supported options.
+        # "proxyCountry": "ANY",
+        # create profile on scrapeless
+        # "profileId": "your profileId",
+        # For more usage details, please refer to https://docs.scrapeless.com/en/scraping-browser/quickstart/getting-started
+    }
+    query_string = urlencode(scrapeless_params)
+    scrapeless_connection_url = f"wss://browser.scrapeless.com/api/v2/browser?{query_string}"
+    async with AsyncWebCrawler(
+        config=BrowserConfig(
+            headless=False,
+            browser_mode="cdp",
+            cdp_url=scrapeless_connection_url,
+        )
+    ) as crawler:
+        result = await crawler.arun(
+            url="https://www.scrapeless.com/en",
+            config=CrawlerRunConfig(
+                wait_for="css:.content",
+                scan_full_page=True,
+            ),
+        )
+        print("-" * 20)
+        print(f'Status Code: {result.status_code}')
+        print("-" * 20)
+        print(f'Title: {result.metadata["title"]}')
+        print(f'Description: {result.metadata["description"]}')
+        print("-" * 20)
+
+if __name__ == "__main__":
+    asyncio.run(main())
+
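The added example percent-encodes the fingerprint with `quote` and then passes it through `urlencode`, so the value ends up encoded twice in the connection URL. The commit does not say whether the service requires that, so the following is only a minimal standard-library sketch of what the URL looks like and how the value round-trips. The helper names `build_connection_url` and `read_fingerprint`, and the `SCRAPELESS_TOKEN` environment variable, are hypothetical and not part of the commit or of any library.

```python
import json
import os
from urllib.parse import parse_qs, quote, unquote, urlencode, urlsplit

# Endpoint copied from the example in this commit.
BASE_URL = "wss://browser.scrapeless.com/api/v2/browser"


def build_connection_url(token: str, fingerprint: dict, **extra) -> str:
    """Hypothetical helper: build the URL the same way the example does."""
    params = {
        "token": token,
        "fingerprint": quote(json.dumps(fingerprint)),  # first encoding
        **extra,
    }
    return f"{BASE_URL}?{urlencode(params)}"  # urlencode encodes it again


def read_fingerprint(url: str) -> dict:
    """Hypothetical helper: reverse both encoding steps."""
    query = parse_qs(urlsplit(url).query)  # undoes urlencode
    return json.loads(unquote(query["fingerprint"][0]))  # undoes quote


fingerprint = {"platform": "Windows", "screen": {"width": 1280, "height": 1024}}

# Assumption: the token is kept in an environment variable named
# SCRAPELESS_TOKEN; the fallback is the placeholder used in the example.
token = os.environ.get("SCRAPELESS_TOKEN", "your token")

url = build_connection_url(token, fingerprint, sessionTTL=1000, sessionName="Demo")
print(url)
print(read_fingerprint(url) == fingerprint)
```

Reading the token from the environment keeps credentials out of the committed file. The round-trip check makes the double encoding visible: the opening `{` of the JSON appears in the URL as `%257B` rather than `%7B`.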

0 commit comments
