constitutional-ai
CommunitySafety alignment via AI self-critique and feedback.
Authorovachiever
Version1.0.0
Installs0
System Documentation
What problem does it solve?
Constitutional AI trains models to be harmless through self-critique and AI feedback, reducing harmful outputs without human labels by using a constitution of guiding principles.
Core Features & Use Cases
- Two-phase safety alignment: supervised learning with self-critique and RL from AI feedback (RLAIF)
- Chain-of-thought critique and revision for transparent reasoning
- Potentially reduces need for human labeling in safety-critical domains
- Applicable to policy, risk assessment, and responsible AI development
Quick Start
Define a constitution of principles, generate critiques, revise outputs, and optionally train using TRL or PPO-based methods.
Dependency Matrix
Required Modules
transformerstorchtrl
Components
Standard package💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: constitutional-ai Download link: https://github.com/ovachiever/droid-tings/archive/main.zip#constitutional-ai Please download this .zip file, extract it, and install it in the .claude/skills/ directory.