Instruction Clarification Requests in the CoDraw Dataset

This is the code repository accompanying the following publication:

Instruction Clarification Requests in Multimodal Collaborative Dialogue Games: Tasks, and an Analysis of the CoDraw Dataset. (Madureira, B. & Schlangen, D., EACL 2023).

We implement neural network baseline models to identify when an instruction clarification request (iCR) should be made and when it has been made in the CoDraw dialogue game.

Description

The directories are:

checkpoints/: contains the trained model checkpoints at the best validation epoch.
codrawmodels/: contains one script from the CoDraw authors with minor adaptations to make it work in our setting.
data/: where all downloaded data should live; it also contains the scripts to preprocess the data and to generate the incremental images and the embeddings.
env/: contains the files to reconstruct the conda environment (see below).
icr/: the code of our implementation.
notebooks/: Jupyter notebooks used for corpus analysis, trivial baselines and evaluation.
outputs/: the generated outputs of the experiments.

Dependencies

The directory env/ contains the files that can be used to recreate the conda environment. Running

sh create_env.sh

should create it by running the same installation steps, one by one, as we did. If it does not work, this directory also contains the .yml files and a spec file auto-generated by comet.ml.

Data

Check data/README.md for the details on how to download the necessary datasets. Our annotation is available at OSF: https://osf.io/gcjhz/. You can download it manually or clone via the osfclient.
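For the osfclient route, a minimal Python sketch follows. It assumes osfclient is installed (for example via pip) and exposes the `osf` command. The target directory `data/osf_annotation` is a placeholder, not a path this repository prescribes; data/README.md remains the reference for the expected layout.

```python
import shutil
import subprocess

OSF_PROJECT_ID = "gcjhz"  # taken from https://osf.io/gcjhz/


def osf_clone_command(project_id, target_dir):
    """Build the osfclient command that clones a public OSF project."""
    return ["osf", "-p", project_id, "clone", target_dir]


def clone_annotation(target_dir="data/osf_annotation"):
    """Clone the annotation project; target_dir is a placeholder path."""
    if shutil.which("osf") is None:
        raise RuntimeError("osfclient not found; install it before cloning")
    # check=True raises CalledProcessError if the download fails.
    subprocess.run(osf_clone_command(OSF_PROJECT_ID, target_dir), check=True)
```

Calling clone_annotation() runs the download; the command builder is kept separate so the exact command can be inspected first.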

Replicating the results

After setting up the data directory, you can replicate the results by regenerating the pretrained embeddings and then calling the experiments script. search.py was used for hyperparameter search.

conda activate codraw_pl
sh setup.sh
sh experiments.sh

General usage

main.py can be used to run other experiments. It accepts different hyperparameters via the CLI. Check the arguments in icr/config.py for details.

python3 main.py
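The general pattern of overriding hyperparameters from the command line can be sketched as follows. This is an illustration only: it assumes an argparse-style interface, and the flag names `--lr`, `--batch_size` and `--n_epochs` and their defaults are hypothetical, not taken from icr/config.py, which remains the authoritative list of arguments.

```python
import argparse


def build_parser():
    """Hypothetical parser; the real arguments are defined in icr/config.py."""
    parser = argparse.ArgumentParser(description="Illustrative hyperparameter CLI")
    # All three flags below are made-up examples, not the repository's flags.
    parser.add_argument("--lr", type=float, default=1e-3, help="learning rate")
    parser.add_argument("--batch_size", type=int, default=32, help="batch size")
    parser.add_argument("--n_epochs", type=int, default=10, help="training epochs")
    return parser


def parse_hyperparameters(argv=None):
    """Return the parsed hyperparameters as a plain dict."""
    return vars(build_parser().parse_args(argv))
```

Called as parse_hyperparameters(["--lr", "0.01"]), the sketch overrides one value and keeps the remaining defaults.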

Testing

To run unit tests, run:

python3 -m unittest discover -v
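By default, unittest discovery collects files matching test*.py and runs every method of a unittest.TestCase subclass whose name starts with test. The sketch below shows the shape of such a test; the helper clip_to_unit and the test class are hypothetical and not part of this repository.

```python
import unittest


def clip_to_unit(value):
    """Hypothetical helper under test: clamp a number to the range [0, 1]."""
    return max(0.0, min(1.0, value))


class TestClipToUnit(unittest.TestCase):
    def test_inside_range_is_unchanged(self):
        self.assertEqual(clip_to_unit(0.25), 0.25)

    def test_outside_range_is_clamped(self):
        self.assertEqual(clip_to_unit(-3), 0.0)
        self.assertEqual(clip_to_unit(7), 1.0)


def run_suite():
    """Load and run the test case programmatically, returning the result."""
    suite = unittest.TestLoader().loadTestsFromTestCase(TestClipToUnit)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Saved in a file whose name starts with test, a class like this would be picked up by the discovery command without further configuration.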

Credits

We thank Philipp Sadler for generating the step-by-step CoDraw scenes. The code under data/IncrementalCoDrawImages is his work.

We thank the developers of all the open libraries we use (PyTorch, PyTorch Lightning, h5py, comet.ml, torchmetrics, pandas, matplotlib, scikit-learn, scipy, seaborn, sentence-transformers, wordcloud).

This work is based on the CoDraw dataset and AbstractScenes (see Data section above).

License

Our source code is licensed under the MIT License.

Citation

If you use our work, please cite:

@inproceedings{madureira-schlangen-2023-instruction,
    title = "Instruction Clarification Requests in Multimodal Collaborative Dialogue Games: Tasks, and an Analysis of the {C}o{D}raw Dataset",
    author = "Madureira, Brielen  and
      Schlangen, David",
    booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics",
    month = may,
    year = "2023",
    address = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.eacl-main.169",
    pages = "2303--2319",
}
