Unable to replicate downstream depth result on KITTI #227

Open · zshn25 opened this issue Sep 22, 2023 · 4 comments
Labels: question (Further information is requested)

zshn25 commented Sep 22, 2023

I evaluated the pretrained DINOv2 models with different decoder heads on the KITTI Eigen split in order to replicate the paper's numbers, and found the results to be much worse.

Here's what I did: I loaded the models as shown in the notebook, loaded the appropriate KITTI weights, and checked on an example KITTI image, which looked good. I then modified this evaluation script so that it neither converts disparity to depth nor scales the output, and ran the numbers.
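
For context, the backbones themselves can be pulled from torch.hub (a minimal sketch; the depth decoder head is built on top of this following the notebook, which is not reproduced here):

```python
import torch

# DINOv2 backbones are published as torch.hub entry points
# (dinov2_vits14 / dinov2_vitb14 / dinov2_vitl14 / dinov2_vitg14).
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()
```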

(attached image: example KITTI input)
| model     | abs_rel | sq_rel | rmse  | rmse_log | a1    | a2    | a3    |
|-----------|---------|--------|-------|----------|-------|-------|-------|
| small+dpt | 0.378   | 2.788  | 7.372 | 0.336    | 0.218 | 0.866 | 0.983 |
| base+dpt  | 0.392   | 3.085  | 7.963 | 0.345    | 0.179 | 0.852 | 0.986 |
| large+dpt | 0.276   | 1.938  | 6.378 | 0.267    | 0.536 | 0.927 | 0.991 |
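
These columns are the standard Eigen depth metrics; for reference, they are usually computed along these lines (a minimal NumPy sketch, not the exact evaluation script used here; the valid-pixel mask and the 1e-3 to 80 m clamping range are assumptions):

```python
import numpy as np

def compute_depth_metrics(pred, gt, min_depth=1e-3, max_depth=80.0):
    """Standard Eigen-style depth metrics (abs_rel, sq_rel, rmse, rmse_log, a1-a3)."""
    # Only evaluate pixels with valid ground truth.
    mask = (gt > min_depth) & (gt < max_depth)
    pred, gt = pred[mask], gt[mask]
    pred = np.clip(pred, min_depth, max_depth)

    # Threshold accuracies: fraction of pixels within 1.25^k of the ground truth.
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()

    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return dict(abs_rel=abs_rel, sq_rel=sq_rel, rmse=rmse,
                rmse_log=rmse_log, a1=a1, a2=a2, a3=a3)
```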

The paper reports RMSE values of 2.34, 2.23, and 2.14 for the small, base, and large models with DPT, respectively. Am I missing something?

zshn25 commented Sep 22, 2023

I realized I was missing the `input * 255` scaling followed by the normalization transform. Now I'm able to replicate the paper's results.
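
For anyone hitting the same issue, the transform amounts to something like the following (a sketch based on the depth notebook; the exact normalization statistics, ImageNet mean/std scaled to the 0-255 range, are an assumption):

```python
import torchvision.transforms as transforms

# ToTensor() produces floats in [0, 1]; the depth head expects inputs scaled
# to [0, 255] and normalized with ImageNet statistics in that same range.
transform = transforms.Compose([
    transforms.ToTensor(),
    lambda x: 255.0 * x[:3],  # drop any alpha channel, rescale to [0, 255]
    transforms.Normalize(
        mean=(123.675, 116.28, 103.53),
        std=(58.395, 57.12, 57.375),
    ),
])
```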

zshn25 closed this as completed Sep 22, 2023
zshn25 commented Sep 22, 2023

I was too quick in my previous comment; I wasn't actually able to replicate the results. I now get:

| model     | abs_rel | sq_rel | rmse  | rmse_log | a1    | a2    | a3    |
|-----------|---------|--------|-------|----------|-------|-------|-------|
| small+dpt | 0.309   | 2.213  | 6.970 | 0.303    | 0.403 | 0.910 | 0.983 |

zshn25 reopened this Sep 22, 2023
TheoMoutakanni commented

Hello @zshn25,
We are using this repository to evaluate our models:
https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/depth/datasets/kitti.py

You can look at the `pre_eval` and `evaluate` functions that are used in https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/depth/apis/test.py#L213C25-L213C41

There are some subtleties that can change the results by a lot. Did you look at the range of the predictions and the ground truth just before computing the metrics, to be sure that they are approximately the same? (Between 0 and 80 in the case of KITTI, if I remember correctly.) Plotting a histogram of both may help you understand which scalings need to be removed or applied.
We were using the Eigen crop, by the way.
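
A minimal matplotlib sketch of that range/histogram check (the variable names and the 80 m cap are illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np

def compare_ranges(pred, gt, max_depth=80.0):
    """Overlay histograms of predicted and ground-truth depth on valid pixels."""
    valid = gt > 0  # KITTI ground truth is sparse; zeros mean "no measurement"
    print(f"pred: min={pred[valid].min():.2f}, max={pred[valid].max():.2f}")
    print(f"gt:   min={gt[valid].min():.2f}, max={gt[valid].max():.2f}")
    bins = np.linspace(0, max_depth, 80)
    plt.hist(pred[valid].ravel(), bins=bins, alpha=0.5, label="prediction")
    plt.hist(gt[valid].ravel(), bins=bins, alpha=0.5, label="ground truth")
    plt.xlabel("depth (m)")
    plt.legend()
    plt.show()
```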

Feel free to continue the discussion; I will stay available.

zshn25 commented Sep 22, 2023

Thanks for the reply. I've checked the predictions; they are in a similar range to the GT (0.1-80). The eval code I used even scales the predictions to match the GT's range, and still the results were very different. I also use the Eigen crop. I will now try the eval library you mentioned and report back.
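
For what it's worth, this kind of range alignment is typically done per image via median scaling in monocular depth evaluation (a sketch; whether the script in question uses exactly this is an assumption):

```python
import numpy as np

def median_scale(pred, gt):
    """Rescale the prediction so its median matches the ground truth's,
    computed over valid (non-zero) ground-truth pixels only."""
    valid = gt > 0
    return pred * (np.median(gt[valid]) / np.median(pred[valid]))
```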
