ar0cket1

User

Online RL for Hermes Agent — self-improving LoRA adapters from human feedback using MIS-PO

1 indexed · 0 Featured · 13 stars · avg score 84

Indexed Skills (1)

The bio shown is the top-scored skill's repo description, used as a fallback; real GitHub bios will arrive in a future update.