fetch-sitemap

Solid

Extract URLs from an XML sitemap with optional regex filtering

AI & Automation · 47 stars · 4 forks · Updated 4 days ago · MIT

Quality Score: 90/100

Stars (20%): 56
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Fetch Sitemap URLs

Extract URLs from an XML sitemap with optional regex filtering.

## Arguments

- `$0`: URL (required, must start with `http://` or `https://`)
  - If the URL ends with `.xml`, use it directly as the sitemap URL (backward compatible)
  - Otherwise, run the auto-discovery logic below
- `$1`: an extended regex pattern for filtering (optional)

If `$0` is empty, display the usage below and stop:

```
Usage: /fetch-sitemap <url> [pattern]

Examples:
  /fetch-sitemap https://kotlinlang.org/docs
  /fetch-sitemap https://example.com/sitemap.xml
  /fetch-sitemap https://example.com docs
  /fetch-sitemap https://example.com/sitemap.xml 'skills|hooks'
```

If `$0` does not start with `http://` or `https://`, inform the user that a valid URL is required and stop.

## Sitemap Auto-Discovery

When the URL does **not** end with `.xml`, automatically discover the sitemap by probing the following locations **one at a time, stopping as soon as one produces output** (do NOT run probes in parallel):

**Probes 1–2** (fetch and extract in a single curl):

1. `{url}/sitemap.xml`: path-specific (e.g., `https://kotlinlang.org/docs/sitemap.xml`)
2. `{origin}/sitemap.xml`: site root (e.g., `https://kotlinlang.org/sitemap.xml`), where `{origin}` is the scheme + host of the URL

```bash
curl -sfL --compressed --connect-timeout 5 --max-time 10 <probe-url> | grep -oE '<loc>[^<]+</loc>' | sed 's/<loc>//;s/<\/loc>//'
```

If the output is non-empty, the sitemap is found **and the URLs...
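The argument rules and the first two probes described above can be sketched in Python. This is an illustration only, not the skill's own implementation: the function names `probe_candidates` and `extract_locs` are hypothetical, the sketch covers only the two probe locations visible in the excerpt, and Python's `re` syntax stands in for the extended regex that `grep -E` would use, so some patterns may behave differently.

```python
import re
from typing import List, Optional
from urllib.parse import urlsplit

# Same idea as the grep/sed pipeline: capture the text inside <loc>...</loc>.
LOC_RE = re.compile(r"<loc>([^<]+)</loc>")


def probe_candidates(url: str) -> List[str]:
    """Return the sitemap URLs to try, in order (hypothetical helper)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("a valid http:// or https:// URL is required")
    if url.endswith(".xml"):
        # A URL ending in .xml is used directly as the sitemap URL.
        return [url]
    parts = urlsplit(url)
    origin = f"{parts.scheme}://{parts.netloc}"  # scheme + host
    candidates = [
        url.rstrip("/") + "/sitemap.xml",  # probe 1: path-specific
        origin + "/sitemap.xml",           # probe 2: site root
    ]
    # Drop duplicates while keeping order (the URL may already be the origin).
    return list(dict.fromkeys(candidates))


def extract_locs(xml_text: str, pattern: Optional[str] = None) -> List[str]:
    """Extract <loc> values and optionally keep only those matching pattern."""
    urls = LOC_RE.findall(xml_text)
    if pattern:
        rx = re.compile(pattern)
        urls = [u for u in urls if rx.search(u)]
    return urls


if __name__ == "__main__":
    # Made-up sitemap body, for illustration only.
    sample = (
        "<urlset>"
        "<url><loc>https://example.com/docs/skills</loc></url>"
        "<url><loc>https://example.com/blog</loc></url>"
        "</urlset>"
    )
    print(probe_candidates("https://kotlinlang.org/docs"))
    print(extract_locs(sample, "skills|hooks"))
```

A caller would fetch each candidate in turn and stop at the first one for which `extract_locs` returns a non-empty list, which mirrors the "one at a time, stop at first output" rule.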

Details

Author
LeeJuOh
Repository
LeeJuOh/claude-code-zero
Created
4 months ago
Last Updated
4 days ago
Language
Python
License
MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Listed

seo-sitemap

Pull a domain's XML sitemap (and sitemap-of-sitemaps), then compare against the most recent SE Ranking website audit. Surfaces (a) sitemap entries the crawler couldn't find (orphans from the sitemap), (b) audit pages missing from the sitemap (probably an oversight), (c) sitemap entries that are now 404, (d) lastmod inconsistencies. Use when the user asks for "sitemap analysis", "check my sitemap", "sitemap vs audit", "missing pages", "orphan pages", or "sitemap health".

57 Updated 1 week ago
seranking
AI & Automation Listed

seo-sitemap

Pull a domain's XML sitemap (and sitemap-of-sitemaps), then compare against what's actually crawled/indexed (GSC indexed pages + a DataForSEO On-Page fetch loop + GSC sitemap ingestion). Surfaces (a) sitemap entries the crawler couldn't find (orphans from the sitemap), (b) crawled/indexed pages missing from the sitemap (probably an oversight), (c) sitemap entries that are now 404, (d) lastmod inconsistencies. Use when the user asks for "sitemap analysis", "check my sitemap", "sitemap vs audit", "missing pages", "orphan pages", or "sitemap health".

0 Updated 2 days ago
amirjahfar1
Web & Frontend Listed

sitemapkit

Discover and extract sitemaps from any website using SitemapKit. Use this skill whenever the user wants to find pages on a website, get a list of URLs from a domain, audit a site's structure, crawl a sitemap, check what pages exist on a site, or do anything involving sitemaps or site URL discovery — even if they don't explicitly say "sitemap". Requires the sitemapkit MCP server configured with a valid SITEMAPKIT_API_KEY.

353 Updated today
aiskillstore