TY - JOUR
T1 - DSOSplat
T2 - Monocular 3D Gaussian SLAM with Direct Tracking
AU - Zhou, Yi
AU - Guo, Zhetao
AU - Li, Dong
AU - Guan, Runwei
AU - Ren, Yuxiang
AU - Wang, Hongyu
AU - Li, Mingrui
N1 - Publisher Copyright: © 2001-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Simultaneous Localization and Mapping (SLAM) is a key technology in robot navigation, augmented reality, and autonomous driving. However, existing dense SLAM methods are constrained by their reliance on external depth observers and by high computational costs, which limits their use in human-computer interaction, particularly in AR and VR. To address these challenges, we propose DSOSplat, a monocular SLAM framework based on 3D Gaussian Splatting. We generate dense depth maps with absolute scale consistency using a self-calibrated adaptive multiview stereo (SC-AMVS) algorithm. The accuracy and robustness of depth estimation are significantly improved through dynamic weighted fusion, local constraints, and a scale calibration factor. Our visual odometry module leverages composite depth maps and a keyframe selection strategy to further enhance tracking and reconstruction performance. We also propose a depth smoothing regularization (DSR) method that optimizes local gradients and global consistency, thereby improving the geometric expressiveness of Gaussian Splatting and the quality of scene reconstruction. Experimental results demonstrate that DSOSplat achieves efficient localization and high-accuracy scene reconstruction in dynamic and complex environments, offering new possibilities for the development of monocular SLAM. Evaluations in real-world scenarios show that the algorithm also performs well there.
KW - 3D Gaussian Splatting
KW - Scene Reconstruction
KW - Simultaneous Localization and Mapping (SLAM)
UR - http://www.scopus.com/inward/record.url?scp=105004799515&partnerID=8YFLogxK
U2 - 10.1109/JSEN.2025.3566547
DO - 10.1109/JSEN.2025.3566547
M3 - Article
AN - SCOPUS:105004799515
SN - 1530-437X
JO - IEEE Sensors Journal
JF - IEEE Sensors Journal
ER -