hunt-llm-ai

Solid

Hunt LLM/AI feature bugs — prompt injection, indirect injection, exfiltration via tool-use/markdown, ASCII smuggling, agentic AI security (OWASP Agentic Apps 2026, ASI01-ASI10). Patterns: direct injection ('ignore previous instructions'), indirect injection via documents/web pages/email the model reads, ASCII smuggling (Unicode Tags block U+E0000-U+E007F, invisible to humans, decoded by the model), tool-use exfiltration (model has fetch/browse tool, attacker injects OOB URL, model exfils chat history/secrets), markdown-image zero-click exfil, system-prompt extraction, IDOR-via-AI (cross-tenant data). Targets: chatbots, RAG, summarizers, agentic copilots, MCP tools. Detection: any LLM-backed endpoint, doc upload triggering AI processing, autonomous agent with tools. Validate: OOB/Collaborator callback for exfil, verbatim-reproducible system-prompt leak (run twice), verifiable cross-tenant leak or RCE. Confabulation is NOT a finding. Use when hunting AI features, chatbots, RAG, agentic systems, MCP.
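
The ASCII smuggling pattern named in the description can be illustrated with a minimal sketch. It assumes the usual mapping in which each printable ASCII character is shifted by 0xE0000 into the Unicode Tags block; the helper names `hide`, `reveal`, and `strip_tags` are hypothetical and do not come from the skill itself.

```python
# Illustrative sketch only: helper names are hypothetical, not part of the skill.
TAG_BASE = 0xE0000                         # start of the Unicode Tags block
TAG_START, TAG_END = 0xE0000, 0xE007F      # full block range from the description


def hide(payload: str) -> str:
    """Shift printable ASCII into the Tags block so it renders as nothing."""
    return "".join(
        chr(TAG_BASE + ord(ch)) for ch in payload if 0x20 <= ord(ch) <= 0x7E
    )


def reveal(text: str) -> str:
    """Recover any tag-encoded printable ASCII hidden inside text."""
    return "".join(
        chr(ord(ch) - TAG_BASE)
        for ch in text
        if TAG_BASE + 0x20 <= ord(ch) <= TAG_BASE + 0x7E
    )


def strip_tags(text: str) -> str:
    """Drop every code point in the Tags block (a simple input filter)."""
    return "".join(ch for ch in text if not TAG_START <= ord(ch) <= TAG_END)


carrier = "Hello" + hide("ignore previous instructions")
print(reveal(carrier))      # the hidden instruction
print(strip_tags(carrier))  # the visible text only
```

Running `reveal` over uploaded documents or pasted input is one way to check whether a payload is present before attributing model behavior to it.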

AI & Automation 3,176 stars 485 forks Updated 4 days ago NOASSERTION


Quality Score: 86/100

Stars 20%: 100
Recency 20%: 100
Frontmatter 20%
Documentation 15%: 100
Issue Health 10%
License 10%: 100
Description 5%: 100

Skill Content

## 11. LLM / AI FEATURES

LLM bugs are only worth reporting when they cross a trust boundary you can **prove**: an OOB callback, a verbatim-reproducible secret, a cross-tenant record, or code execution. A model "saying something bad once" is confabulation, not a vulnerability. Read the False-Positive Gate before claiming anything.

> **Naming note (was wrong in v1):** the model-level list is **OWASP Top 10 for LLM Applications 2025** (LLM01 Prompt Injection, LLM07 System Prompt Leakage, LLM08 Vector/Embedding Weaknesses). The agent-level list is **OWASP Top 10 for Agentic Applications (2026)** from the **Agentic Security Initiative (ASI)**, codes ASI01–ASI10. Do not write "OWASP ASI 2026" as if it were one document; cite the correct list per finding.

---

## False-Positive Gate (Read First)

LLMs are non-deterministic. The single biggest source of bogus LLM reports is **confabulation**: the model inventing a plausible "system prompt" or "other user's data" that is not real. Apply every check below before writing a word.

1. **Run-twice rule (verbatim reproducibility).** Send the identical extraction prompt in two fresh sessions (clear cookies/conversation). A real system-prompt leak reproduces **token-for-token**. If the two outputs differ in wording, structure, or detail, it is confabulation; discard it.
2. **Anchor to a known-secret.** Don't ask "what is your system prompt"; ask the model to echo a string only the real prompt would contain (a tool name, an internal URL...
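
The run-twice rule from the False-Positive Gate can be sketched as a small comparison helper. This is a minimal illustration, not the skill's own tooling: the function names are hypothetical, the two strings are assumed to be outputs captured from two fresh sessions, and normalizing line endings and outer whitespace before comparing is an assumption about what still counts as "token-for-token".

```python
import difflib


def normalize(text: str) -> str:
    # Assumption: line-ending and leading/trailing whitespace differences
    # are transport noise, not evidence of confabulation.
    return text.replace("\r\n", "\n").strip()


def run_twice_verdict(first: str, second: str) -> dict:
    """Compare two extraction outputs captured in separate fresh sessions."""
    a, b = normalize(first), normalize(second)
    similarity = difflib.SequenceMatcher(None, a, b).ratio()
    return {
        # Only an exact, non-empty match passes the run-twice rule.
        "verbatim": a == b and bool(a),
        # Reported for triage notes only; a high ratio is NOT a pass.
        "similarity": round(similarity, 3),
    }


session_1 = "You are HelpBot.\r\n"
session_2 = "You are HelpBot."
print(run_twice_verdict(session_1, session_2))
```

The similarity ratio is kept only as a note for the report. Under the rule as stated, any difference in wording, structure, or detail means the output is discarded.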

Details

Author: elementalsouls
Repository: elementalsouls/Claude-BugHunter
Created: 2 months ago
Last Updated: 4 days ago
Language: Python
License: NOASSERTION

Integrates with

Anthropic · AI Model Context Protocol · AI

Bundled in these plugins

claude-bughunter

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

ai--llm-security

LLM and AI application security testing — prompt injection, jailbreak resistance, OWASP LLM Top 10 (2025), RAG and agent/tool-use security, model supply chain, and AI red teaming for authorized assessments

204 Updated 1 week ago

Masriyan

AI & Automation Solid

adversarial-prompt-testing

Test LLM applications for prompt injection, jailbreak, data exfiltration, and indirect injection attacks — attack taxonomy, test harness design, automated red-team probes, defense patterns, and evaluation rubrics. Use when asked about "prompt injection", "jailbreak", "LLM red team", "adversarial prompts", "indirect injection", "exfiltration via LLM", "test AI security", "LLM attack surface", "OWASP LLM Top 10", "system prompt leak", "prompt leaking", or "AI safety testing". Do NOT use for: traditional app security — see red-team-check or security-review. Do NOT use for: model alignment — focus is on app layer.

2 Updated today

yanacuti1121

AI & Automation Listed

red-team-llm-app

Use this to adversarially test an LLM/agent app before attackers do - prompt injection, jailbreaks, data exfiltration, tool misuse, and unsafe output. Trigger on "red team my LLM", "test for prompt injection", "is my agent secure", "jailbreak testing", "security review of my AI app", especially before shipping anything customer-facing or with tools/data access. Test systematically against the known attack classes, not ad-hoc.

26 Updated today

ContextJet-ai