(For now, please comment out the calls to cuvsRMMPoolMemoryResourceEnable
in both the CAGRA and Bruteforce build index methods in the C wrapper.)
git clone git@github.com:rapidsai/cuvs.git \
&& cd cuvs \
&& git checkout branch-25.02 \
&& ./build.sh libcuvs java
git clone git@github.com:SearchScale/lucene.git \
&& cd lucene \
&& git checkout cuvs-integration-main \
&& ./gradlew compileJava mavenToLocal
Download the Wikipedia dataset (5M vectors x 2048 dimensions), the queries (100 x 2048 dimensions), and the ground truth (100 x 64 top-k):
wget https://accounts.searchscale.com/datasets/wikipedia/ground_truth_100x64.csv \
&& wget https://accounts.searchscale.com/datasets/wikipedia/queries_100.csv.mapdb \
&& wget https://accounts.searchscale.com/datasets/wikipedia/wiki_dump_5Mx2048D.csv.gz.mapdb
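Since the dataset files are large, it can help to confirm that all three arrived before starting a benchmark. The following is a minimal sketch, not part of the repository: check_datasets is a hypothetical helper name, and it only tests that each file exists and is non-empty, not that its contents are intact.

```shell
# Hypothetical helper (not part of the repository): report any expected
# dataset file that is missing or empty.
# Usage: check_datasets [download_dir]   (defaults to the current directory)
check_datasets() {
  dir="${1:-.}"
  missing=0
  for f in ground_truth_100x64.csv \
           queries_100.csv.mapdb \
           wiki_dump_5Mx2048D.csv.gz.mapdb; do
    # -s is true only when the file exists and has a size greater than zero.
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $f"
      missing=1
    fi
  done
  # Non-zero exit status when at least one file is missing or empty.
  return "$missing"
}
```

For example, running check_datasets . && echo "all dataset files present" from the download directory prints the confirmation only when all three files are there.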
Steps:
- Add your benchmark job configuration to the jobs.json file.
- Run ./benchmarks.sh jobs.json
- If saveResultsOnDisk is set to true (in jobs.json), you can find your benchmark results in the results folder. For each successful benchmark run, two files are created: ${benchmark_id}__benchmark_results_${timestamp}.json and ${benchmark_id}__neighbors_${timestamp}.csv
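To pick out the files belonging to one benchmark from the results folder, the naming pattern above can be matched with a shell glob. This is a minimal sketch under a few assumptions: list_results is a hypothetical helper name, the benchmark id "my_benchmark" in the usage note is a placeholder, and the timestamp part is matched with a wildcard because its exact format is not specified here.

```shell
# Hypothetical helper (not part of the repository): list the result files
# written for one benchmark id.
# Usage: list_results <benchmark_id> [results_dir]   (defaults to ./results)
list_results() {
  id="$1"
  dir="${2:-results}"
  for f in "$dir/${id}__benchmark_results_"*.json \
           "$dir/${id}__neighbors_"*.csv; do
    # An unmatched glob is left as a literal string, so print only real files.
    [ -e "$f" ] && echo "$f"
  done
  return 0
}
```

For example, list_results my_benchmark prints one line per matching .json and .csv file, and prints nothing if the benchmark has not produced results yet or saveResultsOnDisk was not enabled.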