Parity: Auto-evals for harness changes
Catch AI behavior changes before they ship
Artificial Intelligence
Developer Tools
GitHub
Parity helps agent teams verify that prompt and harness changes actually change behavior. It monitors PRs for behavior-defining changes, identifies what changed, checks existing eval coverage, and generates targeted probe evals to test whether the new behavior shows up and where it stops holding. Built for teams that want something faster and more reliable than manual spot checks and vibe testing.
Votes: 1