Repository for our TOD Journal Paper. This code covers training the models used in the paper's experiments, running reranking tasks, and running retrieval tasks.
The code for training models and running reranking tasks is in bin/execute. Please check the bash files there.
The code for the retrieval task and for measuring retrieval speed is in retriever. In addition, the execution scripts are in retriever/execute. Please check the bash files there.
The code has been tested with:
pytorch==1.8.1
transformers==4.2.1
datasets==1.1.3
beir
sentence-transformers
pyserini
To use the retriever, you additionally need:
torch_scatter==2.0.6
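As a quick sanity check, the pinned versions above can be compared with what is installed. The following is a minimal sketch, not part of this repository: the helper name find_mismatches is hypothetical, and it assumes that the pip distribution name for pytorch is torch and that a local build suffix such as +cu111 should be ignored.

```python
from importlib import metadata

# Pinned versions listed above (assumption: "torch" is the pip name for pytorch).
TESTED = {"torch": "1.8.1", "transformers": "4.2.1", "datasets": "1.1.3"}


def find_mismatches(tested, get_version=metadata.version):
    """Return {package: (tested_version, installed_version_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in tested.items():
        try:
            # Drop a local build suffix such as "+cu111" before comparing.
            found = get_version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(TESTED).items():
        print(f"{name}: tested with {wanted}, found {found}")
```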
For installation, please execute the following.
pip install ./
We use the BEIR datasets. Please place each dataset under the root_dir you set. Please see how to prepare the datasets here.
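Before running anything, it can help to confirm that a dataset directory has the usual BEIR layout (corpus.jsonl, queries.jsonl, and a qrels directory with one TSV file per split). The sketch below is an illustration only: the helper name missing_beir_files is hypothetical, and the choice of the test split as default is an assumption.

```python
from pathlib import Path


def missing_beir_files(dataset_dir, split="test"):
    """List the standard BEIR files that are absent from dataset_dir."""
    dataset_dir = Path(dataset_dir)
    expected = [
        dataset_dir / "corpus.jsonl",            # one JSON document per line
        dataset_dir / "queries.jsonl",           # one JSON query per line
        dataset_dir / "qrels" / f"{split}.tsv",  # relevance judgments for the split
    ]
    return [p.relative_to(dataset_dir).as_posix() for p in expected if not p.is_file()]
```

An empty list means the three files were found; otherwise the returned relative paths show what still needs to be downloaded or moved into place.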
First, convert the BEIR-format files to pyserini format using the following command.
$ python util/beir2pyserini.py --in_dir /path/to/BEIR/datasets/root --out_dir /path/to/each/dataset/bm25_index --sep_title
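To illustrate what such a conversion involves, here is a minimal sketch. It is not the code of util/beir2pyserini.py. It relies on two known formats: BEIR corpus records carry _id, title, and text, while pyserini's JsonCollection expects id and contents. The function names are hypothetical, and the behavior attached to sep_title (keeping the title in its own field instead of prepending it to contents) is an assumption about what --sep_title might mean.

```python
import json


def beir_to_pyserini(doc, sep_title=True):
    """Map one BEIR corpus record to a pyserini JsonCollection record."""
    title = doc.get("title", "")
    text = doc.get("text", "")
    if sep_title:
        # Assumption: the title is stored as a separate field.
        return {"id": doc["_id"], "title": title, "contents": text}
    # Otherwise prepend the title to the body text.
    return {"id": doc["_id"], "contents": f"{title} {text}".strip()}


def convert_file(in_path, out_path, sep_title=True):
    """Convert a BEIR corpus.jsonl file line by line into pyserini JSONL."""
    with open(in_path, encoding="utf-8") as fin, \
            open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            if line.strip():  # skip blank lines
                record = beir_to_pyserini(json.loads(line), sep_title)
                fout.write(json.dumps(record) + "\n")
```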
Next, create the BM25 index using pyserini. The script util/index_pyserini.sh does this; execute it as follows.
$ bash util/index_pyserini.sh /path/to/each/dataset/bm25_index
Note that pyserini requires Anserini, a Java library.
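For orientation, the sketch below builds the kind of indexing command that pyserini documented for JSON collections in its older releases. It is an assumption that util/index_pyserini.sh does something similar; its contents are not shown here. The function name and the input and index directories are hypothetical, and newer pyserini releases use the module pyserini.index.lucene with double-dash flags instead.

```python
import subprocess
import sys


def build_index_command(collection_dir, index_dir, threads=1):
    """Build the argv list for indexing a JSONL collection with pyserini."""
    return [
        sys.executable, "-m", "pyserini.index",
        "-collection", "JsonCollection",
        "-generator", "DefaultLuceneDocumentGenerator",
        "-threads", str(threads),
        "-input", str(collection_dir),   # directory holding the converted JSONL files
        "-index", str(index_dir),        # output directory for the Lucene index
        "-storePositions", "-storeDocvectors", "-storeRaw",
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python this_script.py <collection_dir> <index_dir>
    subprocess.run(build_index_command(sys.argv[1], sys.argv[2]), check=True)
```

Keeping command construction separate from execution makes it easy to print and inspect the command before running it against a large corpus.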
See the README at ./bin.
See the README at ./retriever.
See the README at ./analysis.