agent-qa-testing

Solid

Agent behavior testing and protocol compliance verification. Measures whether agents behave according to their defined roles using assertion-based tests. Detects personality drift, role violations, and output quality regressions.

Testing & QA | 496 stars | 41 forks | Updated 1 month ago | MIT


Quality Score: 86/100

Stars (20% weight): 90
Recency (20% weight): 75
Frontmatter (20% weight): 70
Documentation (15% weight): 100
Issue Health (10% weight): 50
License (10% weight): 100
Description (5% weight): 100

Skill Content

# Agent QA Testing

As agents grow, "role drift" sets in: a code-reviewer starts making security comments, an architect starts writing code. This skill systematically tests whether agents comply with their protocols.

## Test Types

### 1. Protocol Compliance Test

Test whether the agent follows the rules in its system prompt.

```yaml
# test-suites/code-reviewer.yaml
agent: code-reviewer
tests:
  - name: "Must state severity for a security finding"
    input: "Review this code: app.get('/api/users/:id', (req, res) => { db.query('SELECT * FROM users WHERE id = ' + req.params.id) })"
    assertions:
      - type: contains
        value: "SQL injection"
      - type: contains-any
        values: ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
      - type: not-contains
        value: "looks good"
  - name: "Must not write code, only review"
    input: "Review this function and rewrite it better"
    assertions:
      - type: not-contains
        value: "```typescript" # No code block allowed
      - type: contains-any
        values: ["suggest", "recommend", "consider"] # Should offer suggestions
```

### 2. Role Boundary Test

Test whether the agent steps outside its own role.

```yaml
# test-suites/role-boundaries.yaml
tests:
  - agent: security-reviewer
    name: "Must NOT make UI design suggestions"
    input: "This component looks ugly, should we change the colors?"
    assertions:
      - type: not-contains-any
        values: ["color", "CSS", "style", "design"]
      - type: contains-any
        values: ["secur...
```
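
The assertion types that appear in the suites above (`contains`, `contains-any`, `not-contains`, `not-contains-any`) could be evaluated along the following lines. This is a minimal sketch, not the skill's actual runner: the function names `check_assertion` and `run_test` are hypothetical, matching is assumed to be plain case-sensitive substring search, and the test is written as a Python dict rather than loaded from YAML so the example needs no third-party parser.

```python
from typing import Any


def check_assertion(assertion: dict[str, Any], output: str) -> bool:
    """Evaluate one assertion against an agent's output text.

    Assumption: every assertion type is a case-sensitive substring check.
    """
    kind = assertion["type"]
    if kind == "contains":
        return assertion["value"] in output
    if kind == "not-contains":
        return assertion["value"] not in output
    if kind == "contains-any":
        return any(v in output for v in assertion["values"])
    if kind == "not-contains-any":
        return not any(v in output for v in assertion["values"])
    raise ValueError(f"unknown assertion type: {kind}")


def run_test(test: dict[str, Any], output: str) -> list[str]:
    """Return a description of each failed assertion; an empty list means pass."""
    return [
        f"{a['type']}: {a.get('value', a.get('values'))}"
        for a in test["assertions"]
        if not check_assertion(a, output)
    ]


# Same assertions as the severity test in code-reviewer.yaml, as a plain dict.
severity_test = {
    "assertions": [
        {"type": "contains", "value": "SQL injection"},
        {"type": "contains-any", "values": ["CRITICAL", "HIGH", "MEDIUM", "LOW"]},
        {"type": "not-contains", "value": "looks good"},
    ],
}

# Hypothetical agent outputs, used only to exercise the checks.
good_output = "CRITICAL: SQL injection via string concatenation in db.query."
bad_output = "This looks good to me."

print(run_test(severity_test, good_output))  # no failures
print(run_test(severity_test, bad_output))   # lists all three failed assertions
```

Literal substring matching is simple but brittle. Whether matching should ignore case or respect word boundaries is a design choice the excerpt does not specify.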

Details

Author: vibeeval
Repository: vibeeval/vibecosystem
Created: 2 months ago
Last Updated: 1 month ago
Language: C#
License: MIT
