llm-agents · Featured

llm-evaluation

5.2k stars · Updated 2025-12-28

View the full skill on GitHub

Compatible with: claude, codex

Description

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking.

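How to use

Get the SKILL.md file from the GitHub repository
Copy the file to your project root or to the .cursor/rules directory
Restart your AI assistant or editor to apply the new skill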
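
The copy step above can be scripted. The following is a minimal sketch, assuming SKILL.md has already been downloaded to a local path; the function name `install_skill` and the `use_cursor_rules` flag are hypothetical names chosen for illustration and are not part of the skill itself.

```python
import shutil
from pathlib import Path


def install_skill(skill_file: Path, project_root: Path,
                  use_cursor_rules: bool = False) -> Path:
    """Copy a downloaded SKILL.md into the project root or .cursor/rules.

    Hypothetical helper: the skill page only says to copy the file,
    it does not ship an installer.
    """
    if use_cursor_rules:
        target_dir = project_root / ".cursor" / "rules"
    else:
        target_dir = project_root
    # Create the target directory if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / skill_file.name
    shutil.copy2(skill_file, destination)
    return destination
```

For example, `install_skill(Path("downloads/SKILL.md"), Path("."), use_cursor_rules=True)` would place the file under `.cursor/rules/`. The assistant or editor still needs to be restarted afterwards.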

Full skill description

name

llm-evaluation

description

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
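
To make "automated metrics" concrete, here is a small sketch of two common ones for question-answering outputs: exact match and token-level F1. It is an illustration only, not the skill's own implementation; the function names and the whitespace tokenization are assumptions.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> bool:
    """True when the two strings match after trimming and lowercasing."""
    return prediction.strip().lower() == reference.strip().lower()


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 using simple whitespace tokenization (an assumption)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Scores like these are cheap to compute over a test set, which is why they are usually paired with human feedback and benchmark runs rather than used alone.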

Tags

Related skills

agent-identifier

configured-agent

command-name

claude-opus-4-5-migration

PPTX creation, editing, and analysis