prompt-injection-detector
SolidPrompt injection detection and prevention for secure LLM applications
Install
Quality Score: 94/100
Skill Content
Details
- Author
- a5c-ai
- Repository
- a5c-ai/babysitter
- Created
- 4 months ago
- Last Updated
- today
- Language
- JavaScript
- License
- MIT
Similar Skills
Semantically similar based on skill content — not just same category
detecting-ai-model-prompt-injection-attacks
Detects prompt injection attacks targeting LLM-based applications using a multi-layered defense combining regex pattern matching for known attack signatures, heuristic scoring for structural anomalies, and transformer-based classification with DeBERTa models. The detector analyzes user inputs before they reach the LLM, flagging direct injections (system prompt overrides, role-play escapes, instruction hijacking) and indirect injections (encoded payloads, multi-language obfuscation, delimiter-based escapes). Based on the OWASP LLM Top 10 (LLM01:2025 Prompt Injection) and Simon Willison's prompt injection taxonomy. Activates for requests involving prompt injection detection, LLM input sanitization, AI security scanning, or prompt attack classification.
adversarial-prompt-testing
Test LLM applications for prompt injection, jailbreak, data exfiltration, and indirect injection attacks — attack taxonomy, test harness design, automated red-team probes, defense patterns, and evaluation rubrics. Use when asked about "prompt injection", "jailbreak", "LLM red team", "adversarial prompts", "indirect injection", "exfiltration via LLM", "test AI security", "LLM attack surface", "OWASP LLM Top 10", "system prompt leak", "prompt leaking", or "AI safety testing". Do NOT use for: traditional app security — see red-team-check or security-review. Do NOT use for: model alignment — focus is on app layer.
prompt-injection-test
Run an OWASP LLM01 injection corpus against the system prompt + tool surface and report which payloads succeeded
ai-llm-safety
This skill should be used when designing, planning, implementing, or reviewing any system that involves LLM agents, tool use, prompt construction, or agentic workflows, or when the user asks to "add guardrails", "prevent prompt injection", "sanitize LLM output" — enforces prompt injection defense, tool safety, and context integrity
prompt-integrity
Verifies the structure, safety, and guardrails of system prompts generated within the project to prevent unintended behavior, hallucination, and prompt injection.