Data source best practices

Get better results from your bot with these tips.

Structuring large websites

For sites with 100 or more pages, consider splitting the content into multiple data sources:

  • By section: Docs, Blog, Product pages as separate data sources
  • By update frequency: Static pages vs frequently updated content
  • By importance: Core pages synced daily, archives synced weekly

This lets you:

  • Sync only what changed (saves storage tokens)
  • Use different CSS selectors per section
  • Troubleshoot issues more easily
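
If you already have a list of your site's URLs, a short script can suggest how to split them by section. The sketch below groups URLs by their first path segment; the function name `group_by_section`, the `pages` fallback group, and the example.com URLs are all illustrative assumptions, not part of the product.

```python
from collections import defaultdict
from urllib.parse import urlparse

def group_by_section(urls, default="pages"):
    """Group URLs by their first path segment (e.g. /docs/..., /blog/...)."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # Top-level pages (one segment or none) go into the default group.
        section = segments[0] if len(segments) > 1 else default
        groups[section].append(url)
    return dict(groups)

# Hypothetical URLs, for illustration only.
urls = [
    "https://example.com/docs/getting-started",
    "https://example.com/docs/api",
    "https://example.com/blog/launch",
    "https://example.com/pricing",
]
sections = group_by_section(urls)
for name, members in sections.items():
    print(name, len(members))
```

Each resulting group is a candidate for its own data source, which you can then give its own selector and sync schedule.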

CSS selectors for common platforms

The right CSS selector captures your main content and excludes headers, footers, and navigation.

Platform          Recommended selector
WordPress         article or .entry-content
Webflow           .main-content or main
Squarespace       .main-content
Shopify           main or .product-description
GitBook           .markdown-body
Notion (public)   .notion-page-content
Generic           main or article

💡 Tip: Inspect your page in browser DevTools to find the best selector. Test on one page before syncing your whole site.
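
To see roughly what a selector would keep, you can also try it on a saved copy of one page. The following is a minimal sketch using only Python's standard library. It assumes well-formed HTML and handles just two selector forms, a tag name (`main`) or a single class (`.entry-content`); the `SelectorText` and `extract_text` names are made up for this example, and the product's own extraction may behave differently.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

class SelectorText(HTMLParser):
    """Collect text inside the first element matching a simple selector."""

    def __init__(self, selector):
        super().__init__()
        self.selector = selector
        self.depth = 0      # nesting depth inside the matched element
        self.done = False   # True once the matched element has closed
        self.chunks = []

    def _matches(self, tag, attrs):
        if self.selector.startswith("."):
            classes = (dict(attrs).get("class") or "").split()
            return self.selector[1:] in classes
        return tag == self.selector

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth or self._matches(tag, attrs):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())

def extract_text(html, selector):
    parser = SelectorText(selector)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

# Hypothetical page, for illustration only.
sample = """
<html><body>
<header><nav>Home | Pricing</nav></header>
<main><h1>Refund policy</h1><p>Refunds take<br>5 days.</p></main>
<footer>Copyright</footer>
</body></html>
"""
print(extract_text(sample, "main"))
```

If the printed text still contains menu items or footer text, the selector is too broad; if it is empty, the selector does not match anything on that page.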

Crawl vs Sitemap vs Manual URLs

Method        Best for
Crawl         Discovering all pages on a site (up to 600)
Sitemap       Large sites with existing sitemaps
Manual URLs   Specific pages only, or when crawl misses pages
Bulk URLs     Adding a known list of pages quickly
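
Before choosing a method, it can help to check how many pages your sitemap actually lists. The sketch below reads the `<loc>` entries from a standard sitemap document (sitemaps.org 0.9 namespace) using Python's standard library. The `sitemap_urls` name and the sample XML are assumptions for illustration, and sitemap index files that point to other sitemaps are not handled.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the <loc> values of a sitemap document, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text]

# Hypothetical sitemap, for illustration only.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs/setup</loc><lastmod>2024-01-15</lastmod></url>
</urlset>"""

urls = sitemap_urls(sample)
print(len(urls), "pages listed")
```

If the count is well above the crawl limit noted above, the sitemap method is likely the better fit; the resulting list can also be pasted in as bulk URLs.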

Sync frequency recommendations

Content type              Recommended schedule
News/blog (daily posts)   Daily
Documentation             Daily or weekly
Marketing pages           Weekly
Product pages             Daily (if prices/stock change)
Static content            Weekly or manual

Reducing storage token usage

  • Use CSS selectors to exclude unnecessary content
  • Split large data sources so you only re-sync what changed
  • Remove pages you don't need from the data source
  • Use auto-sync with diffs (only changed content is re-synced)
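
To illustrate why diff-based syncing saves tokens, here is a minimal sketch of one common way to detect changed pages: compare a hash of each page's text with the hash stored at the last sync. This is not a description of how the product implements diffs; `content_hash`, `changed_pages`, and the sample pages are hypothetical.

```python
import hashlib

def content_hash(text):
    # Collapse whitespace so formatting-only changes don't count as edits.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def changed_pages(previous_hashes, current_pages):
    """Return URLs whose content is new or differs from the last sync."""
    return [url for url, text in current_pages.items()
            if previous_hashes.get(url) != content_hash(text)]

# Hypothetical state from the last sync and the pages fetched now.
previous = {"https://example.com/pricing": content_hash("Plans start at $10")}
current = {
    "https://example.com/pricing": "Plans   start at $10\n",   # whitespace only
    "https://example.com/blog/launch": "We launched today",    # new page
}
to_sync = changed_pages(previous, current)
print(to_sync)
```

Only the pages in `to_sync` would need to be processed again; unchanged pages are skipped.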

Notion tips

  • Add top-level pages only - child pages are included automatically
  • Don't add the same page multiple times
  • If permissions change, disconnect and reconnect Notion in account settings

YouTube tips

  • Transcripts must be enabled on videos
  • Add playlists to bulk-add related videos
  • Auto-generated captions work, but manual captions are more accurate