Only a small sample of the data is included. More training data are needed for proper fine-tuning. I recommend looking at /src/Separate_.py in particular, as it was the best-performing modeling approach (see the paper cited below).
Maeda, H. (2024). Field-testing multiple-choice questions with AI examinees: English grammar items. Educational and Psychological Measurement. https://doi.org/10.1177/00131644241281053
@article{maeda2024field,
author = {Hotaka Maeda},
title = {Field-Testing Multiple-Choice Questions With AI Examinees: English Grammar Items},
journal = {Educational and Psychological Measurement},
year = {2024},
doi = {10.1177/00131644241281053},
URL = {https://doi.org/10.1177/00131644241281053},
eprint = {https://doi.org/10.1177/00131644241281053}
}