OpenDataArena-Tool Data Scorer Documentation

The data scorer of OpenDataArena-Tool assesses datasets for OpenDataArena along multiple dimensions, using a series of automated evaluation and processing methods.

Installation

conda create -n oda python=3.10
conda activate oda
git clone https://github.com/OpenDataArena/OpenDataArena-Tool.git
cd OpenDataArena-Tool/data_scorer
pip install -r requirements.txt
pip install flash_attn==2.7.4.post1 --no-build-isolation
# Optional: to calculate the fail rate, run the following commands, which install the lighteval package
cd model_based/fail_rate
pip install -e ".[dev]"
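
Once the commands above have finished, it can help to confirm that the packages are visible in the active `oda` environment. The following is a minimal sketch and is not part of OpenDataArena-Tool: the helper name `installed_versions` is hypothetical, and the distribution names `flash_attn` and `lighteval` are taken from the installation commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the packages themselves.
    """
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # lighteval is only expected if the optional fail-rate step was run.
    for name, ver in installed_versions(["flash_attn", "lighteval"]).items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

If `flash_attn` is reported as installed, its version should match the `2.7.4.post1` pin used above. A missing `lighteval` only matters if you plan to calculate the fail rate.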

Data Evaluation

The data scorer of OpenDataArena-Tool integrates a range of data processing and scoring techniques, organized into the following three core modules.