ZeroBench
ZeroBench is a benchmark designed to evaluate the performance of large language models (LLMs) on tasks requiring zero-shot learning. Zero-shot learning refers to a model's ability to perform a task it has not been explicitly trained on, relying instead on its general understanding of language and concepts. This benchmark aims to provide a standardized way to measure this capability across different LLMs, facilitating comparison and tracking progress in the field.
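The distinction between zero-shot and few-shot evaluation comes down to how the prompt is constructed: a zero-shot prompt contains only the task instruction and the input, with no solved demonstrations. A minimal sketch of this difference (the function names and prompt layout here are illustrative assumptions, not ZeroBench's actual format):

```python
def build_zero_shot_prompt(instruction: str, input_text: str) -> str:
    # Zero-shot: the model sees only the instruction and the input,
    # with no worked examples to imitate.
    return f"{instruction}\n\nInput: {input_text}\nAnswer:"


def build_few_shot_prompt(instruction: str,
                          examples: list[tuple[str, str]],
                          input_text: str) -> str:
    # Few-shot (for contrast): solved demonstrations are prepended,
    # so the model can pattern-match on them.
    demos = "\n\n".join(f"Input: {x}\nAnswer: {y}" for x, y in examples)
    return f"{instruction}\n\n{demos}\n\nInput: {input_text}\nAnswer:"


if __name__ == "__main__":
    instruction = "Classify the sentiment of the sentence as positive or negative."
    zs = build_zero_shot_prompt(instruction, "I loved this film.")
    fs = build_few_shot_prompt(
        instruction,
        [("The plot was dull.", "negative")],
        "I loved this film.",
    )
    print(zs)
    print(fs)
```

In a zero-shot benchmark, only the first style of prompt is used, so the model's score reflects generalization rather than in-context imitation.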
The benchmark typically consists of a diverse set of tasks spanning various natural language processing domains, such as text classification, question answering, and reasoning. Each task is presented to the model with an instruction alone, without labeled examples or task-specific fine-tuning.
By using a zero-shot setting, ZeroBench helps researchers understand how well LLMs can generalize their learned knowledge to tasks they have never been trained on, offering a clearer picture of a model's underlying language understanding than benchmarks that permit fine-tuning or in-context examples.