seo-ai-crawlers

Audit AI crawler access and citability for a page — confirm retrieval/citation bots (OAI-SearchBot, Claude-SearchBot, PerplexityBot) are allowed and the Googlebot vs Google-Extended split is correct, classify training vs search/retrieval vs user-fetch user-agents, check the page is server-rendered enough for non-JS AI crawlers, validate llms.txt / llms-full.txt (also covers M21), and generate a choice-gated robots.txt preset. Module M14. Feeds the AI Visibility score.
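As a rough illustration of the classification and access checks described above, here is a minimal Python sketch. It is not the skill's actual implementation: the names `AI_AGENT_BUCKETS`, `classify_agent`, and `blocked_citation_bots` are hypothetical, and the agent lists are only the starting set named in this listing. It relies on the standard-library `urllib.robotparser`, which applies the first matching rule in a group, so its verdicts can differ from crawlers that use longest-match precedence.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical starting set taken from this listing; not exhaustive.
AI_AGENT_BUCKETS = {
    "training": ["GPTBot", "ClaudeBot", "Google-Extended", "Applebot-Extended", "CCBot"],
    "search": ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"],
    "user_fetch": ["ChatGPT-User", "Claude-User", "Perplexity-User"],
}


def classify_agent(user_agent: str) -> str:
    """Return the bucket for a robots.txt user-agent token, matched case-insensitively."""
    needle = user_agent.strip().lower()
    for bucket, agents in AI_AGENT_BUCKETS.items():
        if needle in (agent.lower() for agent in agents):
            return bucket
    return "unknown"


def blocked_citation_bots(robots_txt: str, page_url: str) -> list[str]:
    """List the search/retrieval bots that robots.txt does not allow to fetch page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        agent
        for agent in AI_AGENT_BUCKETS["search"]
        if not parser.can_fetch(agent, page_url)
    ]


if __name__ == "__main__":
    # A broad wildcard block that only carves out one retrieval bot.
    sample = "User-agent: OAI-SearchBot\nAllow: /\n\nUser-agent: *\nDisallow: /\n"
    print(classify_agent("gptbot"))
    print(blocked_citation_bots(sample, "https://example.com/post"))
```

With the sample above, the wildcard `Disallow: /` catches every retrieval bot that lacks its own group, which is the kind of accidental block the citation-access audit is meant to surface.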
Hainrixz/claude-seo-ai · ★ 14 · AI & Automation · score 81

Install: claude install-skill Hainrixz/claude-seo-ai

# seo-ai-crawlers (M14)

Controls whether AI search engines can crawl and cite the page, and whether they can read it without JS. The training-vs-search-vs-fetch distinction is everything. Reference: `references/ai-crawlers.md`.

## Audits

Working from the PageSnapshot (`rendered_dom` if present, else `raw_html`) plus the site `robots.txt`:

1. **Citation access**: are retrieval/citation bots (`OAI-SearchBot`, `Claude-SearchBot`, `PerplexityBot`, `Bingbot`) actually allowed (not caught by a broad `Disallow: /` or a wildcard block)? Confirm `Googlebot` is not blocked and the `Googlebot` (search) vs `Google-Extended` (Gemini training control) split is correct.
2. **User-agent classification**: bucket every AI agent in `robots.txt` into **training** (GPTBot, ClaudeBot, Google-Extended, Applebot-Extended, CCBot), **search/retrieval** (OAI-SearchBot, Claude-SearchBot, PerplexityBot), and **user-triggered fetch** (ChatGPT-User, Claude-User, Perplexity-User). Match user-agents case-insensitively; treat the table in `references/ai-crawlers.md` as a starting set, not exhaustive.
3. **Renderability for non-JS crawlers**: pull the M4 (seo-crawl-render) render result, since most AI crawlers do not execute JS. If primary content only appears in `rendered_dom` and is absent from `raw_html`, flag it as invisible to AI retrieval.
4. **llms.txt / llms-full.txt** (also covers M21): presence at the site root, valid Markdown structure (H1 title, summary blockquote, sectioned link lists), and that li