Use Tandem Browser's MCP server or HTTP API (both available to local and remote agents) to inspect, browse, and interact with the user's shared browser safely. Prefer targeted tabs and sessions, use snapshot refs before raw DOM or JS, verify action completion explicitly, and leave durable handoffs instead of retrying blindly.
Navigate and modify the surf-cli codebase: a Chrome extension plus native host for AI browser automation. Use for surf-cli code work, architecture questions, and implementing browser control, CDP, accessibility, or network features.
Create, list, and delete scheduled cron jobs
Provides knowledge about the acpx CLI for agent-to-agent communication. Use when the user asks about acpx commands, the ACP protocol, agent sessions, prompt queueing, or scriptable agent workflows.
Build Amazon Machine Images (AMIs) with Packer using the amazon-ebs builder. Use when creating custom AMIs for EC2 instances.
- 📄 EXAMPLES.md
- 📄 LICENSE.txt
- 📄 REFERENCE.md
Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from websites, take screenshots, fill forms, click buttons, or interact with web applications. Supports remote Browserbase sessions with automatic CAPTCHA solving, anti-bot stealth mode, and residential proxies — ideal for scraping protected websites, bypassing bot detection, and interacting with JavaScript-heavy pages.
Configure Altimate platform credentials for datamate and API access
Use this skill when building, reviewing, or refactoring Go code that must follow strict design discipline — ATDD/TDD workflow, explicit dependency injection, package-boundary discipline, and structured code review. Complements CLAUDE.md by focusing on process and design judgment rather than version-specific Go features.
Automatically evaluate and compare multiple AI models or agents without pre-existing test data. Generates test queries from a task description, collects responses from all target endpoints, auto-generates evaluation rubrics, runs pairwise comparisons via a judge model, and produces win-rate rankings with reports and charts. Supports checkpoint resume, incremental endpoint addition, and judge model hot-swap. Use when the user asks to compare, benchmark, or rank multiple models or agents on a custom task, or run an arena-style evaluation.
---

# Auto Arena Skill

End-to-end automated model comparison using the OpenJudge `AutoArenaPipeline`:

1. **Generate queries**: LLM creates diverse test queries from task description
2. **Collect responses**: query all target endpoints concurrently
3. **Generate rubrics**: LLM produces evaluation criteria from task + sample queries
4. **Pairwise evaluation**: judge model compares every model pair (with position-bias swap)
5. **Analyze & rank**: compute win rates, win matrix, and rankings
6. **Report & charts**: Markdown report + win-rate bar chart + optional matrix heatmap

## Prerequisites

```bash
# Install OpenJudge
pip install py-openjudge

# Extra dependency for auto_arena (chart generation)
pip install matplotlib
```

## Gather from user before running

| Info | Required? | Notes |
|------|-----------|-------|
| Task description | Yes | What the models/agents should do (set in config YAML) |
| Target endpoints | Yes | At least 2 OpenAI-compatible endpoints to compare |
| Judge endpoint | Yes | Strong model for pairwise evaluation (e.g. `gpt-4`, `qwen-max`) |
| API keys | Yes | Env vars: `OPENAI_API_KEY`, `DASHSCOPE_API_KEY`, etc. |
| Number of queries | No | Default: `20` |
| Seed queries | No | Example queries to guide generation style |
| System prompts | No | Per-endpoint system prompts |
| Output directory | No | Default: `./evaluation_results` |
| Report language | No | `"zh"` (default) or `"en"` |

## Quick start

### CLI

`
Create grouped detection narratives that tie individual rules into coherent threat stories. Covers Splunk Analytic Stories, Elastic detection rule groups, and Sentinel analytics grouping.
- 📁 api/
- 📁 commands/
- 📄 __init__.py
- 📄 builder_fee.py
- 📄 config.py
Autonomous Hyperliquid trading — 14 strategies (MM, momentum, arbitrage, LLM) with APEX multi-slot orchestrator, REFLECT performance review, DSL trailing stops, and builder fee revenue collection.