Commit d6b6d11
docs: update browser and crawler run config documentation to match async_configs.py implementation
Updated browser-crawler-config.md and parameters.md to ensure complete
accuracy with the actual BrowserConfig and CrawlerRunConfig implementations.
Changes:
- Removed non-existent parameters from documentation:
* enable_rate_limiting, rate_limit_config (never implemented)
* memory_threshold_percent, check_interval, max_session_permit (internal to AsyncDispatcher)
* display_mode (doesn't exist)
- Added missing BrowserConfig parameters (14 total):
* browser_mode, use_managed_browser, cdp_url, debugging_port, host
* viewport, chrome_channel, channel
* accept_downloads, downloads_path, storage_state, sleep_on_close
* user_agent_mode, user_agent_generator_config, enable_stealth
- Added missing CrawlerRunConfig parameters (29 total):
* chunking_strategy, keep_attrs, parser_type, scraping_strategy
* proxy_config, proxy_rotation_strategy
* locale, timezone_id, geolocation, fetch_ssl_certificate
* shared_data, wait_for_timeout
* c4a_script, max_scroll_steps
* exclude_all_images, table_score_threshold, table_extraction
* exclude_internal_links, score_links
* capture_network_requests, capture_console_messages
* method, stream, url, user_agent, user_agent_mode, user_agent_generator_config
* deep_crawl_strategy, link_preview_config, url_matcher, match_mode, experimental
- Marked deprecated cache parameters (bypass_cache, disable_cache, no_cache_read, no_cache_write)
- Reorganized parameters into logical sections (Content Processing, Browser Location & Identity,
Caching & Session, Page Navigation & Timing, Page Interaction, Media Handling, Link/Domain
Handling, Debug & Logging, Connection & HTTP, Virtual Scroll, URL Matching, Advanced Features)
- Ensured all parameter descriptions match source code docstrings
- Added proper default values from __init__ signatures
1 parent 89cc29f commit d6b6d11
2 files changed, +191 -86 lines changed
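The commit message describes removing documented parameters that do not exist in code, adding ones that were missing, and taking default values from the __init__ signatures. A check of that kind could be sketched roughly as below, using only the standard library. ExampleConfig, its parameters, and the documented dict are hypothetical stand-ins for illustration; they are not the real BrowserConfig or CrawlerRunConfig, and this is not the tooling the commit author used.

```python
import inspect


class ExampleConfig:
    """Hypothetical stand-in for a config class; not the real BrowserConfig."""

    def __init__(self, headless=True, viewport_width=1080, cdp_url=None):
        self.headless = headless
        self.viewport_width = viewport_width
        self.cdp_url = cdp_url


def signature_defaults(cls):
    """Return {parameter name: default} for cls.__init__, skipping self, *args, **kwargs."""
    params = inspect.signature(cls.__init__).parameters
    return {
        name: p.default
        for name, p in params.items()
        if name != "self" and p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
    }


def compare_docs(cls, documented):
    """Compare a documented {name: default} mapping against the actual signature."""
    actual = signature_defaults(cls)
    shared = set(actual) & set(documented)
    return {
        # Parameters in the code that the docs never mention.
        "missing_from_docs": sorted(set(actual) - set(documented)),
        # Parameters the docs describe that the code does not accept.
        "not_in_code": sorted(set(documented) - set(actual)),
        # Parameters present in both, but with a different documented default.
        "wrong_default": sorted(n for n in shared if actual[n] != documented[n]),
    }


# Hypothetical documented parameters, e.g. as parsed from a Markdown table.
documented = {"headless": True, "viewport_width": 1280, "display_mode": "detailed"}
report = compare_docs(ExampleConfig, documented)
print(report)
```

Running the same comparison against the real config classes would require importing them from the project and parsing the documented names and defaults out of the Markdown files, which this sketch leaves out.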