Instruction-Following Evaluation for Large Language Models Paper • 2311.07911 • Published Nov 14, 2023 • 22
vectara/hallucination_evaluation_model Text Classification • Updated Oct 20, 2025 • 85.1k • 349
A Survey on Evaluation of Large Language Models Paper • 2307.03109 • Published Jul 6, 2023 • 43
Open Medical-LLM Leaderboard 🥇 • Explore and submit models for benchmarking • 434
La Leaderboard 🌸 • Evaluate open LLMs in the languages of LATAM and Spain. • 75