ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation

Zhu, Ruijie; Wang, Chuxin; Song, Ziyang; Liu, Li; Zhang, Tianzhu; Zhang, Yongdong

Computer Science > Computer Vision and Pattern Recognition

arXiv:2407.08187 (cs)

[Submitted on 11 Jul 2024]

Abstract: Estimating depth from a single image is a challenging visual task. Compared to relative depth estimation, metric depth estimation attracts more attention due to its practical physical significance and critical applications in real-life scenarios. However, existing metric depth estimation methods are typically trained on specific datasets with similar scenes, and struggle to generalize across scenes with significant scale variations. To address this challenge, we propose a novel monocular depth estimation method called ScaleDepth. Our method decomposes metric depth into scene scale and relative depth, and predicts them through a semantic-aware scale prediction (SASP) module and an adaptive relative depth estimation (ARDE) module, respectively. The proposed ScaleDepth has several merits. First, the SASP module can implicitly combine structural and semantic features of the images to predict precise scene scales. Second, the ARDE module can adaptively estimate the relative depth distribution of each image within a normalized depth space. Third, our method achieves metric depth estimation for both indoor and outdoor scenes in a unified framework, without setting the depth range or fine-tuning the model. Extensive experiments demonstrate that our method attains state-of-the-art performance across indoor, outdoor, unconstrained, and unseen scenes. Project page: this https URL
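
The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes the simplest reading, in which metric depth is the product of a per-image scene scale and a relative depth map normalized to [0, 1]; the abstract does not state the exact formula, so the multiplicative form, the function name `compose_metric_depth`, and the example scales of 10 and 80 metres are all illustrative assumptions rather than the authors' implementation.

```python
def compose_metric_depth(relative_depth, scene_scale):
    """Scale a normalized relative depth map to metric units.

    Assumption (not stated in the abstract): metric = scale * relative,
    with relative depth values in [0, 1] and scene_scale in metres.
    """
    if scene_scale <= 0:
        raise ValueError("scene_scale must be positive")
    for row in relative_depth:
        for value in row:
            if not 0.0 <= value <= 1.0:
                raise ValueError("relative depth must lie in [0, 1]")
    return [[scene_scale * value for value in row] for row in relative_depth]


# The same normalized map gives different metric depths under different scales.
relative = [[0.0, 0.25], [0.5, 1.0]]
indoor = compose_metric_depth(relative, 10.0)   # hypothetical indoor scale
outdoor = compose_metric_depth(relative, 80.0)  # hypothetical outdoor scale
```

Under this reading, a single relative depth estimator can serve both indoor and outdoor scenes, because only the predicted scale changes between them.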

Comments:	14 pages, 11 figures, 13 tables
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2407.08187 [cs.CV]
	(or arXiv:2407.08187v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.08187

Submission history

From: Ruijie Zhu
[v1] Thu, 11 Jul 2024 05:11:56 UTC (12,644 KB)
