AI Agent Skills Search and Discovery Platform

[Chart: Daily Featured Skills Count, 04/05 to 04/11]

♾️ Free & Open Source 🛡️ Secure & Worry-Free

guanyang

from GitHub Data & AI

📁 references/
📁 scripts/
📄 SKILL.md

evaluation bias llm-as-judge

advanced-evaluation

This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.

⬇0 ❤573 10 days ago · Uploaded Detail →

agentscope-ai

from GitHub Data & AI

📄 SKILL.md

queries model evaluation

auto-arena

Automatically evaluate and compare multiple AI models or agents without pre-existing test data. Generates test queries from a task description, collects responses from all target endpoints, auto-generates evaluation rubrics, runs pairwise comparisons via a judge model, and produces win-rate rankings with reports and charts. Supports checkpoint resume, incremental endpoint addition, and judge model hot-swap. Use when the user asks to compare, benchmark, or rank multiple models or agents on a custom task, or run an arena-style evaluation.

---

# Auto Arena Skill

End-to-end automated model comparison using the OpenJudge `AutoArenaPipeline`:

1. **Generate queries**: LLM creates diverse test queries from task description
2. **Collect responses**: query all target endpoints concurrently
3. **Generate rubrics**: LLM produces evaluation criteria from task + sample queries
4. **Pairwise evaluation**: judge model compares every model pair (with position-bias swap)
5. **Analyze & rank**: compute win rates, win matrix, and rankings
6. **Report & charts**: Markdown report + win-rate bar chart + optional matrix heatmap

## Prerequisites

```bash
# Install OpenJudge
pip install py-openjudge

# Extra dependency for auto_arena (chart generation)
pip install matplotlib
```

## Gather from user before running

| Info | Required? | Notes |
|------|-----------|-------|
| Task description | Yes | What the models/agents should do (set in config YAML) |
| Target endpoints | Yes | At least 2 OpenAI-compatible endpoints to compare |
| Judge endpoint | Yes | Strong model for pairwise evaluation (e.g. `gpt-4`, `qwen-max`) |
| API keys | Yes | Env vars: `OPENAI_API_KEY`, `DASHSCOPE_API_KEY`, etc. |
| Number of queries | No | Default: `20` |
| Seed queries | No | Example queries to guide generation style |
| System prompts | No | Per-endpoint system prompts |
| Output directory | No | Default: `./evaluation_results` |
| Report language | No | `"zh"` (default) or `"en"` |

## Quick start

### CLI

⬇0 ❤509 10 days ago · Uploaded Detail →

poemswe

from GitHub Content & Multimedia

📄 SKILL.md

critically evaluation arguments

analyze

Critically analyze content, claims, or arguments with rigorous evaluation.

⬇0 ❤59 9 days ago · Uploaded Detail →

mlflow

from GitHub Tools & Productivity

📁 assets/
📁 references/
📁 scripts/
📄 SKILL.md

automation data evaluation

agent-evaluation

Use this when you need to EVALUATE OR IMPROVE or OPTIMIZE an existing LLM agent's output quality - including improving tool selection accuracy, answer quality, reducing costs, or fixing issues where the agent gives wrong/incomplete responses. Evaluates agents systematically using MLflow evaluation with datasets, scorers, and tracing. IMPORTANT - Always also load the instrumenting-with-mlflow-tracing skill before starting any work. Covers end-to-end evaluation workflow or individual components (tracing setup, dataset creation, scorer definition, evaluation execution).

⬇0 ❤20 10 days ago · Uploaded Detail →

akshansh

from GitHub Development & Coding

📄 SKILL.md

recommendations prioritized evaluation

ade-audit

Run a full Build + Style + Move + Write evaluation on a page: score each framework and produce a combined report out of 200 with prioritized recommendations across all four.

⬇0 ❤7 10 days ago · Uploaded Detail →

UKGovernmentBEIS

from GitHub Research & Analysis

📄 SKILL.md

data dataframes evaluation

inspect-ai

Analyze Inspect AI evaluation logs, understand EvalLog structure, extract samples, events, and scoring data using dataframes

⬇0 ❤7 12 days ago · Uploaded Detail →

Skill File Structure Sample (Reference)

skill-sample/
├─ SKILL.md              ⭐ Required: skill entry doc (purpose / usage / examples / deps)
├─ manifest.sample.json  ⭐ Recommended: machine-readable metadata (index / validation / autofill)
├─ LICENSE.sample        ⭐ Recommended: license & scope (open source / restriction / commercial)
├─ scripts/
│  └─ example-run.py     ✅ Runnable example script for quick verification
├─ assets/
│  ├─ example-formatting-guide.md  🧩 Output conventions: layout / structure / style
│  └─ example-template.tex         🧩 Templates: quickly generate standardized output
└─ references/           🧩 Knowledge base: methods / guides / best practices
   ├─ example-ref-structure.md     🧩 Structure reference
   ├─ example-ref-analysis.md      🧩 Analysis reference
   └─ example-ref-visuals.md       🧩 Visual reference
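
The sample layout above can be checked mechanically. The following is a minimal sketch, not a SkillWink tool: the function name `check_skill_layout` is hypothetical, and the sketch assumes only what the tree shows, namely that SKILL.md is required and that scripts/, assets/, and references/ are optional folders.

```python
from pathlib import Path

# Taken from the sample tree: SKILL.md is the only entry marked as required.
REQUIRED_FILES = ["SKILL.md"]
# Folders shown in the sample tree; treated here as optional.
OPTIONAL_DIRS = ["scripts", "assets", "references"]


def check_skill_layout(skill_dir):
    """Return (missing_required_files, optional_dirs_present) for a skill folder."""
    root = Path(skill_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    present = [name for name in OPTIONAL_DIRS if (root / name).is_dir()]
    return missing, present
```

An empty first list means the folder has the required entry file; the second list only reports which optional folders exist.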

More on the Agent Skills specification (Anthropic docs): https://agentskills.io/home

SKILL.md Requirements

├─ ⭐ Required: YAML Frontmatter (must be at top)
│  ├─ ⭐ name                 : unique skill name, follow naming convention
│  └─ ⭐ description          : include trigger keywords for matching
│
├─ ✅ Optional: Frontmatter extension fields
│  ├─ ✅ license              : license identifier
│  ├─ ✅ compatibility        : runtime constraints when needed
│  ├─ ✅ metadata             : key-value fields (author/version/source_url...)
│  └─ 🧩 allowed-tools        : tool whitelist (experimental)
│
└─ ✅ Recommended: Markdown body (progressive disclosure)
   ├─ ✅ Overview / Purpose
   ├─ ✅ When to use
   ├─ ✅ Step-by-step
   ├─ ✅ Inputs / Outputs
   ├─ ✅ Examples
   ├─ 🧩 Files & References
   ├─ 🧩 Edge cases
   ├─ 🧩 Troubleshooting
   └─ 🧩 Safety notes
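
The two required frontmatter fields above (`name` and `description`) can be verified with a few lines of code. This is a rough sketch under stated assumptions: `read_frontmatter` and `missing_required` are hypothetical helper names, and the parser handles only flat top-level `key: value` lines. Nested values such as those under `metadata` are ignored, so a real loader would likely use a full YAML library.

```python
import re

REQUIRED_FIELDS = ("name", "description")


def read_frontmatter(text):
    """Parse top-level `key: value` pairs from a leading '---' block.

    Returns None when the frontmatter is absent or never closed.
    Indented (nested) lines are skipped on purpose.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # frontmatter must be at the very top
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        match = re.match(r"^([A-Za-z][\w-]*):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return None  # closing '---' was never found


def missing_required(text):
    """List the required fields that are absent or empty."""
    fields = read_frontmatter(text)
    if fields is None:
        return list(REQUIRED_FIELDS)
    return [key for key in REQUIRED_FIELDS if not fields.get(key)]
```

An empty list from `missing_required` means both required fields are present; it says nothing about whether the name follows the naming convention.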

Why SkillWink?

Skill files are scattered across GitHub and communities, difficult to search, and hard to evaluate. SkillWink organizes open-source skills into a searchable, filterable library you can directly download and use.

We provide keyword search, version updates, multi-metric ranking (downloads / likes / comments / updates), and open SKILL.md standards. You can also discuss usage and improvements on skill detail pages.


Quick Start:

Import or download a skill (.zip or .skill), then place it in the local skills directory for your tool:

~/.claude/skills/ (Claude Code)

~/.codex/skills/ (Codex CLI)

One SKILL.md can be reused across tools.
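
The placement step above can be scripted. The sketch below is illustrative only: `install_skill` is a hypothetical helper, and it assumes that a .skill file is an ordinary zip archive, which this page does not confirm. If the archive already wraps everything in a single top-level folder, that folder is used as the skill directory so the result is not nested one level too deep.

```python
import zipfile
from pathlib import Path


def install_skill(archive, skills_root):
    """Extract a skill archive under skills_root and return the skill folder.

    Assumption: the archive (.zip or .skill) is a plain zip file.
    """
    archive = Path(archive)
    skills_root = Path(skills_root).expanduser()
    with zipfile.ZipFile(archive) as zf:
        files = [n for n in zf.namelist() if not n.endswith("/")]
        tops = {n.split("/", 1)[0] for n in files}
        # True when every file sits inside one shared top-level folder.
        wrapped = len(tops) == 1 and all("/" in n for n in files)
        target = skills_root if wrapped else skills_root / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        zf.extractall(target)
    return skills_root / (tops.pop() if wrapped else archive.stem)
```

For example, `install_skill("my-skill.zip", "~/.claude/skills")` would place the files under `~/.claude/skills/my-skill/` when the archive has SKILL.md at its top level.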

FAQ

Everything you need to know: what skills are, how they work, how to find/import them, and how to contribute.

1. What are Agent Skills?

A skill is a reusable capability package, usually including SKILL.md (purpose/IO/how-to) and optional scripts/templates/examples.

Think of it as a plugin playbook + resource bundle for AI assistants/toolchains.

2. How do Skills work?

Skills use progressive disclosure: the agent loads brief metadata first, loads the full docs only when needed, and then executes by following that guidance.

This keeps agents lightweight while preserving enough context for complex tasks.
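
Progressive disclosure as described above can be sketched as a lazy loader. This is an assumption-laden illustration, not how any particular agent implements it: the class name `LazySkill` and its methods are hypothetical. It reads only the frontmatter when constructed and defers reading the full SKILL.md until `body()` is first called.

```python
from pathlib import Path


class LazySkill:
    """Hold a skill's brief metadata; load the full document on demand."""

    def __init__(self, skill_md):
        self.path = Path(skill_md)
        self._body = None  # full text, filled in lazily
        self.summary = self._read_frontmatter()

    def _read_frontmatter(self):
        """Read lines only up to the closing '---' of the frontmatter."""
        lines = []
        with self.path.open(encoding="utf-8") as fh:
            if fh.readline().strip() != "---":
                return ""  # no frontmatter at the top
            for line in fh:
                if line.strip() == "---":
                    break
                lines.append(line.rstrip("\n"))
        return "\n".join(lines)

    def body(self):
        """Return the full document, reading it from disk once and caching it."""
        if self._body is None:
            self._body = self.path.read_text(encoding="utf-8")
        return self._body
```

An agent could keep many `LazySkill` objects in context via their `summary` strings and call `body()` only for the skill it decides to run.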

3. How can I quickly find the right skill?

Use these three together:

Semantic search: describe your goal in natural language.
Multi-filtering: category/tag/author/language/license.
Sort by downloads/likes/comments/updated to find higher-quality skills.

4. Which import methods are supported?

Upload archive: .zip / .skill (recommended)
Upload skills folder
Import from GitHub repository

Note: for every import method, the file size should not exceed 10 MB.

5. How do I use skills in Claude / Codex?

Typical paths (may vary by local setup):

Claude Code: ~/.claude/skills/
Codex CLI: ~/.codex/skills/

One SKILL.md can usually be reused across tools.

6. Can one skill be shared across tools?

Yes. Most skills are standardized docs + assets, so they can be reused where format is supported.

Example: retrieval + writing + automation scripts as one workflow.

7. Are these skills safe to use?

Some skills come from public GitHub repositories and some are uploaded by SkillWink creators. Always review code before installing and own your security decisions.

8. Why does it not work after import?

Most common reasons:

Wrong folder path or nested one level too deep
Invalid/incomplete SKILL.md fields or format
Dependencies missing (Python/Node/CLI)
Tool has not reloaded skills yet
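
The first two causes above (wrong path or extra nesting, and a malformed SKILL.md) can be detected with a small check. The sketch below is hypothetical: `diagnose` is not a SkillWink or Claude Code command, and it does not cover missing dependencies or tool reloads.

```python
from pathlib import Path


def diagnose(skills_root, skill_name):
    """Return a list of likely problems for skills_root/skill_name."""
    skill_dir = Path(skills_root).expanduser() / skill_name
    if not skill_dir.is_dir():
        return [f"folder not found: {skill_dir}"]
    entry = skill_dir / "SKILL.md"
    if entry.is_file():
        # Frontmatter is required to be at the very top of the file.
        if not entry.read_text(encoding="utf-8").startswith("---"):
            return ["SKILL.md has no YAML frontmatter at the top"]
        return []
    # Look exactly one level down for a misplaced entry file.
    nested = sorted(skill_dir.glob("*/SKILL.md"))
    if nested:
        return [f"SKILL.md is nested one level too deep: {nested[0].parent.name}/"]
    return ["SKILL.md is missing"]
```

An empty list means neither of these two problems was found, so the remaining causes in the list are worth checking next.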

9. Does SkillWink include duplicate or low-quality skills?

We try to avoid that. Use ranking + comments to surface better skills:

Duplicate skills: compare their differences (speed/stability/focus)
Low-quality skills: we clean these up regularly