Context-based local-global fusion network for 3D point cloud classification and segmentation

Junwei Wu; Mingjie Sun; Chenru Jiang; Jiejie Liu; Jeremy Smith; Quan Zhang

doi:10.1016/j.eswa.2024.124023

Junwei Wu, Mingjie Sun, Chenru Jiang, Jiejie Liu, Jeremy Smith, Quan Zhang^*

^*Corresponding author for this work

Department of Mechatronics and Robotics

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)

Abstract

3D point clouds have gained much research attention because of their ability to represent the spatial information of real-world environments in a detailed manner. Despite recent progress in point cloud processing with deep neural networks, most methods either implement sophisticated local feature aggregation or imitate 2D convolution operations within the range of K nearest neighbors, which provides limited local context information. These methods may struggle to distinguish between similar geometric shapes within the local region of K nearest neighbors, such as doors and walls. To address this issue, we propose a novel local–global fusion network that captures the diverse local geometric shapes with global structure information. The proposed local–global fusion network comprises two main modules. Firstly, we have developed an effective approach for local context learning using incremental dilated KNN (IDKNN) as the neighbor selecting mechanism to enlarge the receptive field and incorporate more reliable points for local geometric shape learning. Secondly, a three-direction region-wise spatial attention (TRSA) algorithm has been developed to explore the global contextual dependencies. For global context learning, we first split the entire 3D space into regions with equal numbers of points, and then intra-region context features are extracted to learn the inter-region relations from three orthogonal directions, taking global structural knowledge into account. By fusing the local context information and global contextual dependencies, we establish an end-to-end framework, the Local-Global Fusion Network (LGFNet). Extensive experimental results on several benchmark datasets demonstrate that our approach can achieve state-of-the-art (SOTA) performance on point cloud classification, part segmentation, and indoor semantic segmentation. In addition, TRSA and IDKNN can be easily used in a plug-and-play fashion with various existing SOTA networks to substantially improve their performance. Our code is available at https://github.com/jasonwjw/IDKNN
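
The two ideas named in the abstract, dilated neighbor selection and splitting the space into regions with equal numbers of points, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `dilated_knn` and `equal_count_regions`, the fixed `dilation` parameter, and the slab-style split along a single axis are all assumptions made for illustration. The abstract does not specify how IDKNN increments the dilation or how TRSA forms its regions.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def dilated_knn(points: List[Point], query: int, k: int, dilation: int) -> List[int]:
    """Pick up to k neighbors of points[query] by keeping every `dilation`-th
    entry among its k * dilation nearest points (hypothetical helper)."""
    if k < 1 or dilation < 1:
        raise ValueError("k and dilation must be positive")
    # Rank all other points by Euclidean distance to the query point.
    order = sorted(
        (i for i in range(len(points)) if i != query),
        key=lambda i: math.dist(points[i], points[query]),
    )
    # A larger dilation widens the receptive field without raising k.
    # Fewer than k indices come back if the cloud is too small.
    return order[: k * dilation : dilation]


def equal_count_regions(points: List[Point], axis: int, n_regions: int) -> List[List[int]]:
    """Split point indices into n_regions slabs along one axis so each slab
    holds the same number of points, up to a remainder of one (assumed scheme)."""
    if n_regions < 1:
        raise ValueError("n_regions must be positive")
    order = sorted(range(len(points)), key=lambda i: points[i][axis])
    base, extra = divmod(len(order), n_regions)
    regions, start = [], 0
    for r in range(n_regions):
        size = base + (1 if r < extra else 0)
        regions.append(order[start : start + size])
        start += size
    return regions


# Seven points on the x-axis, spaced one unit apart.
cloud = [(float(i), 0.0, 0.0) for i in range(7)]
print(dilated_knn(cloud, query=0, k=3, dilation=1))    # [1, 2, 3]
print(dilated_knn(cloud, query=0, k=3, dilation=2))    # [1, 3, 5]
print(equal_count_regions(cloud, axis=0, n_regions=3)) # [[0, 1, 2], [3, 4], [5, 6]]
```

With `dilation=2` the same three neighbors span twice the distance, which is the receptive-field effect the abstract attributes to dilated neighbor selection. Calling `equal_count_regions` once per axis (0, 1, 2) would give one partition for each of the three orthogonal directions.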

Original language	English
Article number	124023
Journal	Expert Systems with Applications
Volume	251
DOIs	https://doi.org/10.1016/j.eswa.2024.124023
Publication status	Published - 1 Oct 2024

Keywords

Context learning
Global attention
Local-global fusion
Point cloud

Access to Document

10.1016/j.eswa.2024.124023

Cite this

@article{f196c33d6d114a349434001d48321963,

title = "Context-based local-global fusion network for 3D point cloud classification and segmentation",

abstract = "3D point clouds have gained much research attention because of their ability to represent the spatial information of real-world environments in a detailed manner. Despite recent progress in point cloud processing with deep neural networks, most methods either implement sophisticated local feature aggregation or imitate 2D convolution operations within the range of K nearest neighbors, which provides limited local context information. These methods may struggle to distinguish between similar geometric shapes within the local region of K nearest neighbors, such as doors and walls. To address this issue, we propose a novel local–global fusion network that captures the diverse local geometric shapes with global structure information. The proposed local–global fusion network comprises two main modules. Firstly, we have developed an effective approach for local context learning using incremental dilated KNN (IDKNN) as the neighbor selecting mechanism to enlarge the receptive field and incorporate more reliable points for local geometric shape learning. Secondly, a three-direction region-wise spatial attention (TRSA) algorithm has been developed to explore the global contextual dependencies. For global context learning, we first split the entire 3D space into regions with equal numbers of points, and then intra-region context features are extracted to learn the inter-region relations from three orthogonal directions, taking global structural knowledge into account. By fusing the local context information and global contextual dependencies, we establish an end-to-end framework, the Local-Global Fusion Network (LGFNet). Extensive experimental results on several benchmark datasets demonstrate that our approach can achieve state-of-the-art (SOTA) performance on point cloud classification, part segmentation, and indoor semantic segmentation. In addition, TRSA and IDKNN can be easily used in a plug-and-play fashion with various existing SOTA networks to substantially improve their performance. Our code is available at https://github.com/jasonwjw/IDKNN",

keywords = "Context learning, Global attention, Local-global fusion, Point cloud",

author = "Junwei Wu and Mingjie Sun and Chenru Jiang and Jiejie Liu and Jeremy Smith and Quan Zhang",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier Ltd",

year = "2024",

month = oct,

day = "1",

doi = "10.1016/j.eswa.2024.124023",

language = "English",

volume = "251",

journal = "Expert Systems with Applications",

issn = "0957-4174",

publisher = "Elsevier",

}

TY - JOUR

T1 - Context-based local-global fusion network for 3D point cloud classification and segmentation

AU - Wu, Junwei

AU - Sun, Mingjie

AU - Jiang, Chenru

AU - Liu, Jiejie

AU - Smith, Jeremy

AU - Zhang, Quan

PY - 2024/10/1

Y1 - 2024/10/1

N2 - 3D point clouds have gained much research attention because of their ability to represent the spatial information of real-world environments in a detailed manner. Despite recent progress in point cloud processing with deep neural networks, most methods either implement sophisticated local feature aggregation or imitate 2D convolution operations within the range of K nearest neighbors, which provides limited local context information. These methods may struggle to distinguish between similar geometric shapes within the local region of K nearest neighbors, such as doors and walls. To address this issue, we propose a novel local–global fusion network that captures the diverse local geometric shapes with global structure information. The proposed local–global fusion network comprises two main modules. Firstly, we have developed an effective approach for local context learning using incremental dilated KNN (IDKNN) as the neighbor selecting mechanism to enlarge the receptive field and incorporate more reliable points for local geometric shape learning. Secondly, a three-direction region-wise spatial attention (TRSA) algorithm has been developed to explore the global contextual dependencies. For global context learning, we first split the entire 3D space into regions with equal numbers of points, and then intra-region context features are extracted to learn the inter-region relations from three orthogonal directions, taking global structural knowledge into account. By fusing the local context information and global contextual dependencies, we establish an end-to-end framework, the Local-Global Fusion Network (LGFNet). Extensive experimental results on several benchmark datasets demonstrate that our approach can achieve state-of-the-art (SOTA) performance on point cloud classification, part segmentation, and indoor semantic segmentation. In addition, TRSA and IDKNN can be easily used in a plug-and-play fashion with various existing SOTA networks to substantially improve their performance. Our code is available at https://github.com/jasonwjw/IDKNN

KW - Context learning

KW - Global attention

KW - Local-global fusion

KW - Point cloud

UR - http://www.scopus.com/inward/record.url?scp=85191610480&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2024.124023

DO - 10.1016/j.eswa.2024.124023

M3 - Article

AN - SCOPUS:85191610480

SN - 0957-4174

VL - 251

JO - Expert Systems with Applications

JF - Expert Systems with Applications

M1 - 124023

ER -
