Polish LLM Benchmarks

Comprehensive evaluation platforms for Polish language models

Polish Language Benchmarks

Open PL LLM Leaderboard [Polish, SpeakLeash]

Comprehensive leaderboard for Polish language models

Polish MT-Bench [Polish, SpeakLeash]

Multi-turn conversation benchmark for Polish

Polish EQ-Bench [Polish, SpeakLeash]

Emotional intelligence benchmark for Polish models

CPTUB Leaderboard [Polish, SpeakLeash]

Complex Polish Text Understanding Benchmark

Polish Medical Leaderboard [Polish, SpeakLeash]

Medical domain benchmark for Polish language models

Polish Linguistic and Cultural Competency Benchmark (PLCC) [Polish]

Evaluates linguistic and cultural understanding in Polish

LLMzSzŁ (LLMs Behind the School Desk) [Polish]

Educational benchmark for Polish language models

Polish Cultural Vision Benchmark [Polish, SpeakLeash]

Vision-language model benchmark for Polish cultural understanding

International Benchmarks (Bielik Evaluated)

European LLM Leaderboard

Multilingual evaluation of language models across European languages

EuroEval

European multilingual model evaluation platform

Open LLM Leaderboard

Original comprehensive LLM evaluation leaderboard

Open LLM Leaderboard v2

Updated version of the Open LLM Leaderboard

MixEval

Mixed evaluation benchmark for language models

Berkeley Function-Calling Leaderboard

Evaluates the function-calling capabilities of LLMs

FLORES200 Translation Benchmark

Large-scale multilingual translation evaluation

BenCzechMark

Czech language model benchmark suite

Portuguese Benchmark (Open PT LLM Leaderboard)

Portuguese language model evaluation platform