constitutional-ai
Anthropic's method for training harmless AI through self-improvement. It is a two-phase approach: supervised learning on self-critiqued and revised responses, then RLAIF (RL from AI Feedback). Use it for safety alignment and for reducing harmful outputs without human labels. Powers Claude's safety system.
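The two phases described above can be sketched roughly as follows. This is a minimal illustration of the data-generation steps only, not the skill's or Anthropic's actual implementation: the `Model` callable, the prompt wording, and the function names `critique_and_revise` and `ai_preference` are all assumptions made for the example, and no training step is shown.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: prompt text in, completion text out.
Model = Callable[[str], str]


def critique_and_revise(prompt: str, model: Model, principles: List[str]) -> Tuple[str, str]:
    """Phase 1 data generation: draft, then critique and revise once per principle.

    Returns (prompt, final_revision), a pair that could later serve as a
    supervised fine-tuning example.
    """
    response = model(prompt)
    for principle in principles:
        # Ask the model to critique its own response (prompt wording is illustrative).
        critique = model(
            f"Critique the response below against this principle: {principle}\n"
            f"Prompt: {prompt}\nResponse: {response}"
        )
        # Ask the model to rewrite the response in light of that critique.
        response = model(
            f"Revise the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {response}"
        )
    return prompt, response


def ai_preference(prompt: str, a: str, b: str, model: Model, principle: str) -> int:
    """Phase 2 data generation: the model, not a human, labels which response is better.

    Returns 0 if the model prefers response A, otherwise 1.
    """
    verdict = model(
        f"Principle: {principle}\nPrompt: {prompt}\n"
        f"(A) {a}\n(B) {b}\n"
        "Which response better follows the principle? Answer A or B."
    )
    return 0 if verdict.strip().upper().startswith("A") else 1
```

In a full pipeline, the pairs from `critique_and_revise` would feed a supervised fine-tuning run, and the labels from `ai_preference` would train a preference model used as the reward signal for RL. Both steps are outside the scope of this sketch.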
Quality Score: 99/100
Details
- Author: davila7
- Repository: davila7/claude-code-templates
- Created: 11 months ago
- Last Updated: today
- Language: Python
- License: MIT
Similar Skills
Semantically similar based on skill content — not just same category
constitutional-ai-prompts
Constitutional AI and safety guardrail prompts for aligned LLM behavior
constitutional-reasoning
Self-critique and Constitutional AI reasoning skill. Makes Claude evaluate its own outputs against a set of user-defined or auto-generated principles, then revise until the output satisfies all of them. Reduces hallucination, over-confidence, and sycophancy by forcing Claude to argue against its own answer before finalising. Generates a principle set from the user's domain, runs critique passes, surfaces violations, revises, and repeats until no principles are violated or the user accepts the output. Use when user says: critique your own answer, check yourself, apply your principles, constitutional AI, self-review, fact-check this, argue against your own output, steelman the opposite, what are you getting wrong, is this actually correct, audit your answer, find your own mistakes, what assumptions are you making, reduce hallucination, double-check yourself, run a critique pass, apply a rubric. Do NOT activate for: creative work where principles would suppress quality, requests that explicitly want a single con
ai-constitution
Interviews the operator to produce a project-identity CONSTITUTION.md (Mission / Stakeholders / Vocabulary / Prohibitions / Compliance gates / Anti-goals / Boundaries / Escalation / Language / Lifecycle phase). Trigger for 'set up the constitution', 'define project identity', 'who is this project for', 'what does this project never do', 'amend the constitution'. Not for AI-behaviour rules — those live in CANONICAL.md / AGENTS.md. Not for spec governance; use /ai-governance instead.
ai-safety-guardrails
Design safety experiences for AI products - content moderation UX, bias detection surfaces, harm prevention patterns, and responsible AI interfaces. Use when: AI safety UX, content moderation, responsible AI, AI bias UX, harm prevention, content filtering UX, AI refusal design, safety disclaimers.