Evaluate and score agent behavior against a golden reference. Use this skill whenever the user wants to run evaluation, check pass/fail status, understand metric scores, compare sessions for regressions, validate agent behavior, or score a trace from a file or a live session. Trigger on phrases like "eval this trace", "check my agent output", "did my agent do the right thing", "compare runs", "did my agent regress", "score session X", "evaluate against golden", "run evals". Works with both local trace files and live streaming sessions.

---

Evaluate agent behavior and explain what the scores mean.

## Determine the input type

First, figure out what to evaluate:

- **Trace file(s)**: user mentions a `.json` or `.jsonl` file path → use `evaluate_traces`
- **Sessions vs golden**: user has multiple live sessions and wants regression testing → use `evaluate_sessions`
- **Single live session**: user wants to score one session against a golden eval set → guide them to use `evaluate_sessions` with one session as golden

## Evaluating trace files

1. Get the file path(s). Check the extension: `.jsonl` → `trace_format: "otlp-json"` | `.json` → `"jaeger-json"` (default)
2. Ask if they have a golden eval set JSON. For `tool_trajectory_avg_score` (the default metric), an eval set is required: it provides the expected tool call sequence to compare against. If they don't have one yet, explain this and suggest starting with `hallucinations_v1`, or ask if they want to create a golden set from a reference run first.
3. Call `evaluate_traces` with the file(s), format, and eval set.
4. Present results as a score table (see Score interpretation below) and explain failures.

## Evaluating sessions (regression testing)

This workflow requires the server to be running with the `--dev` flag (which enables WebSocket and session streaming). Plain `agentevals serve` will not have sessions. If you get a connection error from any tool below, tell the user:

```bash
uv run agentevals serve --dev
```
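For the trace-file workflow, the extension check in step 1 can be sketched in Python. This is a minimal illustration, not part of the tool: the helper name `guess_trace_format` and the mapping table are hypothetical, and falling back to `"jaeger-json"` for unrecognized extensions is an assumption based only on that format being described as the default.

```python
from pathlib import Path

# Extension-to-format mapping taken from step 1 of the trace-file workflow.
FORMAT_BY_EXTENSION = {
    ".jsonl": "otlp-json",
    ".json": "jaeger-json",
}

# Assumption: unknown extensions fall back to the documented default format.
DEFAULT_TRACE_FORMAT = "jaeger-json"


def guess_trace_format(path: str) -> str:
    """Return a trace_format value for a trace file path (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    return FORMAT_BY_EXTENSION.get(suffix, DEFAULT_TRACE_FORMAT)
```

The returned string would then be passed as the `trace_format` argument when calling `evaluate_traces`.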
React DevTools CLI for AI agents. Use when the user asks you to debug a React or React Native app at runtime, inspect component props/state/hooks, diagnose render performance, profile re-renders, find slow components, or understand why something re-renders. Triggers include "why does this re-render", "inspect the component", "what props does X have", "profile the app", "find slow components", "debug the UI", "check component state", "the app feels slow", or any React runtime debugging task.
Backfill missing ADRs from git history and documentation
CLI to deploy and manage applications, add-ons, and configurations on Clever Cloud PaaS. Use when the user needs to deploy apps, view logs, manage environment variables, configure domains, or interact with Clever Cloud services.
> **This skill has moved!**
Batch rename Home Assistant entities to follow a consistent naming convention. Discovers entities, proposes renames, executes via HA API, and updates all YAML/TypeScript references automatically. Trigger phrases: "rename entities", "fix entity names", "standardize entity IDs", "entity rename", "clean up names".

---

# Entity Rename Skill

Rename Home Assistant entities to follow a consistent `domain.{room}_{descriptor}` convention. Updates all YAML automations, scripts, dashboard code, and .storage/ configs automatically.

See `references/naming-convention.md` for the full naming convention.
See `docs/solutions/tooling/entity-rename-lessons.md` for safety rules from production use.

## Step 0: Prerequisites
- 📁 rules/
- 📄 AGENTS.md
- 📄 README.md
- 📄 SKILL.md
React composition patterns that scale. Use when refactoring components with boolean prop proliferation, building flexible component libraries, or designing reusable APIs. Triggers on tasks involving compound components, render props, context providers, or component architecture. Includes React 19 API changes.
When the user wants to track follower growth, understand what drives new followers, or analyze audience development. Also use when the user mentions 'follower growth,' 'followers,' 'audience growth,' 'gaining followers,' 'losing followers,' 'who follows me,' or 'grow my audience.' Uses BlackTwist follower data when available. For post-level metrics, see performance-analyzer-sms. For content patterns, see content-pattern-analyzer-sms.
- 📁 references/
- 📄 metadata.json
- 📄 SKILL.md