constitutional-ai-prompts
Constitutional AI and safety guardrail prompts for aligned LLM behavior
Quality Score: 94/100
Details
- Author: a5c-ai
- Repository: a5c-ai/babysitter
- Created: 4 months ago
- Last Updated: today
- Language: JavaScript
- License: MIT
Similar Skills
Semantically similar based on skill content — not just same category
constitutional-ai
Anthropic's method for training harmless AI through self-improvement. Two-phase approach - supervised learning with self-critique/revision, then RLAIF (RL from AI Feedback). Use for safety alignment, reducing harmful outputs without human labels. Powers Claude's safety system.
constitutional-reasoning
Self-critique and Constitutional AI reasoning skill. Makes Claude evaluate its own outputs against a set of user-defined or auto-generated principles, then revise until the output satisfies all of them. Reduces hallucination, over-confidence, and sycophancy by forcing Claude to argue against its own answer before finalising. Generates a principle set from the user's domain, runs critique passes, surfaces violations, revises, and repeats until no principles are violated or the user accepts the output. Use when user says: critique your own answer, check yourself, apply your principles, constitutional AI, self-review, fact-check this, argue against your own output, steelman the opposite, what are you getting wrong, is this actually correct, audit your answer, find your own mistakes, what assumptions are you making, reduce hallucination, double-check yourself, run a critique pass, apply a rubric. Do NOT activate for: creative work where principles would suppress quality, requests that explicitly want a single con
ai-prompt-engineering-safety-review
Comprehensive AI prompt engineering safety review and improvement prompt. Analyzes prompts for safety, bias, security vulnerabilities, and effectiveness while providing detailed improvement recommendations with extensive frameworks, testing methodologies, and educational content.
ai-llm-safety
Use this skill when designing, planning, implementing, or reviewing any system that involves LLM agents, tool use, prompt construction, or agentic workflows, or when the user asks to "add guardrails", "prevent prompt injection", or "sanitize LLM output". It enforces prompt injection defense, tool safety, and context integrity.