ai-eval-ci

Community

Run AI agent and LLM evaluations in CI/CD pipelines — automated quality gates that fail the build when AI output quality drops. Use when someone asks to "test my AI agent", "add evals to CI", "catch prompt regressions", "compare models", "evaluate LLM output quality", "set up AI quality gates", or "benchmark my agent before deploying". Covers eval frameworks (Cobalt, Promptfoo, Braintrust), LLM-as-judge scoring, threshold-based assertions, and GitHub Actions integration.

Install

skillpm install ai-eval-ci

Format score

100/100

Spec

v1.0

Installs

Author

@TerminalSkills

Published

April 1, 2026