Website crawler

Crawl public URLs and sitemaps to keep knowledge in sync with your site.

Use Dashboard → Crawler (/crawl) to start website crawl jobs that fetch content and ingest it into your knowledge base.

What it does

Visits URLs or sitemap entries you configure
Extracts readable text for chunking and embedding
Updates or adds chunks tied to your org’s knowledge store
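
The three steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not WisebotAI's actual implementation: the function names (sitemap_urls, readable_text, chunk_words), the skipped tags, and the word-based chunk size and overlap are all assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str) -> list[str]:
    """Return the <loc> values listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping tags that rarely hold page content."""

    SKIP = {"script", "style", "nav", "footer"}  # assumed set, adjust as needed

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def readable_text(html: str) -> str:
    """Extract readable text from an HTML page."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_words(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word windows of `size` that share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    chunks: list[str] = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
        start += size - overlap
    return chunks
```

Each chunk would then be embedded and stored; that step depends on the embedding model and vector store in use, so it is left out here.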

When to use it

Marketing sites that change frequently
Public documentation you want the agent to mirror
Large sites where manual PDF export is impractical

Runtime & limits

Crawls may run asynchronously and are subject to rate limits on both the target site and WisebotAI.
Robots.txt rules and paywalls may block pages; verify the result in the preview or logs, if your deployment exposes them.
Very large crawls can take 15–30+ minutes; schedule updates for off-peak hours.
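
To see ahead of time which pages a site's robots.txt would block, you can check URLs locally with Python's standard urllib.robotparser module. This is a sketch under assumptions: the user agent string "example-crawler" is a placeholder (the user agent WisebotAI's crawler sends is not stated here), and the helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def build_rules(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file into a rule set."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def allowed_urls(rules: RobotFileParser, urls: list[str],
                 user_agent: str = "example-crawler") -> list[str]:
    """Keep only the URLs that the rules permit for this user agent."""
    # "example-crawler" is a placeholder, not a documented WisebotAI value.
    return [url for url in urls if rules.can_fetch(user_agent, url)]


robots_txt = """User-agent: *
Disallow: /private/
Crawl-delay: 2
"""
rules = build_rules(robots_txt)
candidates = [
    "https://example.com/docs/intro",
    "https://example.com/private/report",
]
crawlable = allowed_urls(rules, candidates)
# Seconds the site asks crawlers to wait between requests, or None if unset.
delay = rules.crawl_delay("example-crawler")
```

URLs filtered out here are likely to be missing from the knowledge base after a crawl, so they are good candidates for manual upload.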

Best practices

Start with important URLs or a sitemap section before full-site crawls.
Re-run after major site updates.
Combine with manual uploads for content that is not reachable on the web, such as internal PDFs.
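
One way to start with a sitemap section rather than the full site is to filter the sitemap's URLs by path prefix before submitting them. The sketch below assumes you already have the URL list; in_section and select_section are hypothetical helper names, not part of any WisebotAI API.

```python
from urllib.parse import urlsplit


def in_section(url: str, prefix: str) -> bool:
    """Return True if the URL's path is the section root or lies beneath it."""
    path = urlsplit(url).path or "/"
    section = "/" + prefix.strip("/")
    if section == "/":
        return True  # the root section contains every path
    return path == section or path.startswith(section + "/")


def select_section(urls: list[str], prefix: str) -> list[str]:
    """Keep only the URLs that belong to one section of the site."""
    return [url for url in urls if in_section(url, prefix)]


urls = [
    "https://example.com/docs/intro",
    "https://example.com/docs-old/intro",
    "https://example.com/blog/launch",
    "https://example.com/docs",
]
docs_only = select_section(urls, "/docs/")
```

Matching on whole path segments keeps /docs-old out of the /docs section, which a plain string prefix check would not.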

