TotoEval: A Dataset for Evaluating LLM Reasoning

TotoEval is the dataset used in the accompanying paper. It is publicly available for research purposes.

This website may be updated with additional documentation, benchmarks, and leaderboards.