prompt-injection-detector

Solid

Prompt injection detection and prevention for secure LLM applications

AI & Automation 1,160 stars 71 forks Updated today MIT

Install

View on GitHub

Quality Score: 94/100

Stars 20%
100
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
54
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# Prompt Injection Detector Skill ## Capabilities - Detect prompt injection attempts - Implement input sanitization - Configure detection classifiers - Design defense layers - Implement canary token detection - Create injection logging and alerting ## Target Processes - prompt-injection-defense - tool-safety-validation ## Implementation Details ### Detection Methods 1. **Pattern Matching**: Known injection patterns 2. **ML Classifiers**: Trained injection detectors 3. **Canary Tokens**: Detect instruction override 4. **LLM-Based**: Use LLM to detect manipulation 5. **Perplexity Analysis**: Unusual input patterns ### Defense Strategies - Input preprocessing - Prompt structure design - Output validation - Sandboxed execution - Multi-layer defense ### Configuration Options - Detection threshold - Pattern rules - Classifier model - Action policies - Alerting settings ### Best Practices - Defense in depth - Regular pattern updates - Monitor false positives - Test with red-team inputs ### Dependencies - rebuff (optional) - transformers - Custom classifiers

Details

Author
a5c-ai
Repository
a5c-ai/babysitter
Created
4 months ago
Last Updated
today
Language
JavaScript
License
MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

detecting-ai-model-prompt-injection-attacks

Detects prompt injection attacks targeting LLM-based applications using a multi-layered defense combining regex pattern matching for known attack signatures, heuristic scoring for structural anomalies, and transformer-based classification with DeBERTa models. The detector analyzes user inputs before they reach the LLM, flagging direct injections (system prompt overrides, role-play escapes, instruction hijacking) and indirect injections (encoded payloads, multi-language obfuscation, delimiter-based escapes). Based on the OWASP LLM Top 10 (LLM01:2025 Prompt Injection) and Simon Willison's prompt injection taxonomy. Activates for requests involving prompt injection detection, LLM input sanitization, AI security scanning, or prompt attack classification.

13,115 Updated today
mukul975
AI & Automation Listed

adversarial-prompt-testing

Test LLM applications for prompt injection, jailbreak, data exfiltration, and indirect injection attacks — attack taxonomy, test harness design, automated red-team probes, defense patterns, and evaluation rubrics. Use when asked about "prompt injection", "jailbreak", "LLM red team", "adversarial prompts", "indirect injection", "exfiltration via LLM", "test AI security", "LLM attack surface", "OWASP LLM Top 10", "system prompt leak", "prompt leaking", or "AI safety testing". Do NOT use for: traditional app security — see red-team-check or security-review. Do NOT use for: model alignment — focus is on app layer.

3 Updated today
phamlongh230-lgtm
AI & Automation Listed

prompt-injection-test

Run an OWASP LLM01 injection corpus against the system prompt + tool surface and report which payloads succeeded

2 Updated today
bakw00ds
AI & Automation Listed

ai-llm-safety

This skill should be used when designing, planning, implementing, or reviewing any system that involves LLM agents, tool use, prompt construction, or agentic workflows, or when the user asks to "add guardrails", "prevent prompt injection", "sanitize LLM output" — enforces prompt injection defense, tool safety, and context integrity

5 Updated today
alo-exp
AI & Automation Listed

prompt-integrity

Verifies the structure, safety, and guardrails of system prompts generated within the project to prevent unintended behavior, hallucination, and prompt injection.

0 Updated today
Gladisintelligible706