Monocular Depth Estimation with Augmented Ordinal Depth Relationships

Yuanzhouhan Cao, Tianqi Zhao, Ke Xian, Chunhua Shen, Zhiguo Cao, Shugong Xu

Research output: Contribution to journal › Article › peer-review

35 Citations (Scopus)

Abstract

Most existing algorithms for depth estimation from single monocular images need large quantities of metric ground-truth depths for supervised learning. We show that relative depth is an informative cue for metric depth estimation and can be easily obtained from the vast quantity of available stereo videos. Acquiring metric depths from stereo videos is sometimes impracticable due to the absence of camera parameters. In this paper, we propose to improve the performance of metric depth estimation with relative depths collected from stereo movie videos using an existing stereo matching algorithm. We introduce a new 'relative depth in stereo' (RDIS) dataset densely labeled with relative depths. We first pretrain a ResNet model on our RDIS dataset, and then finetune it on RGB-D datasets with metric ground-truth depths. During finetuning, we formulate depth estimation as a classification task. This reformulation enables us to obtain the confidence of a depth prediction in the form of a probability distribution. With this confidence, we propose an information gain loss that makes use of predictions close to the ground truth. We evaluate our approach on both indoor and outdoor benchmark RGB-D datasets and achieve state-of-the-art performance.
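The abstract describes casting depth regression as classification over discrete depth bins, so that the softmax output acts as a confidence distribution, and an information gain loss that rewards predictions close to the ground-truth bin. The sketch below illustrates one plausible reading of that idea; the bin range (`d_min`, `d_max`, `n_bins`), the log-spaced binning, and the Gaussian soft target are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def depth_to_bin(depth, d_min=0.5, d_max=10.0, n_bins=80):
    """Discretize a metric depth value into one of n_bins classes.

    Bins are spaced uniformly in log-depth, a common choice when
    casting depth estimation as classification (the paper's exact
    binning scheme is not stated in the abstract).
    """
    log_d = np.clip(np.log(depth), np.log(d_min), np.log(d_max))
    frac = (log_d - np.log(d_min)) / (np.log(d_max) - np.log(d_min))
    return int(round(frac * (n_bins - 1)))

def information_gain_loss(probs, gt_bin, sigma=2.0):
    """Cross-entropy against a soft target that spreads probability
    mass over bins near the ground-truth bin, so predictions close
    to the ground truth are still rewarded. This is one way to
    realize an 'information gain' style loss; the paper's actual
    definition may differ.
    """
    bins = np.arange(len(probs))
    # Gaussian weights centred on the ground-truth bin.
    target = np.exp(-((bins - gt_bin) ** 2) / (2.0 * sigma ** 2))
    target /= target.sum()
    return float(-np.sum(target * np.log(probs + 1e-12)))

# A prediction peaked one bin from the ground truth incurs a lower
# loss than one peaked twenty bins away.
gt = depth_to_bin(3.0)
near = np.full(80, 1e-4); near[gt + 1] = 1.0; near /= near.sum()
far = np.full(80, 1e-4); far[gt + 20] = 1.0; far /= far.sum()
assert information_gain_loss(near, gt) < information_gain_loss(far, gt)
```

Compared with plain one-hot cross-entropy, the soft target makes the loss decrease smoothly as the predicted bin approaches the true one, which is the benefit the abstract attributes to exploiting prediction confidence.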

Original language: English
Article number: 8764412
Pages (from-to): 2674-2682
Number of pages: 9
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 30
Issue number: 8
DOIs
Publication status: Published - Aug 2020
Externally published: Yes

Keywords

  • Deep network
  • Depth estimation
  • Ordinal relationship
  • RGB-D dataset

