CNN-G: convolutional neural network combined with graph for image segmentation with theoretical analysis

Yi Lu; Yaran Chen; Dongbin Zhao; Bao Liu; Zhichao Lai; Jianxin Chen

doi:10.1109/TCDS.2020.2998497

Yi Lu, Yaran Chen, Dongbin Zhao*, Bao Liu*, Zhichao Lai, Jianxin Chen

*Corresponding author for this work

Department of Intelligent Science

Research output: Contribution to journal › Article › peer-review

66 Citations (Scopus)

Abstract

Deep convolutional neural networks (CNNs), although highly successful in semantic segmentation, are inadequate for extracting overall structure information because they represent images as data in Euclidean space. To address this inadequacy, a new model in the graph domain is proposed that transforms semantic segmentation into graph node classification. In this model, the image is represented by a graph whose nodes are initialized by the feature map obtained by a CNN and whose edges reflect the relationships between the nodes. The node relationships taken into consideration include distance-based ones and semantic ones, calculated with the Gauss kernel function and an attention mechanism, respectively. A graph neural network is also introduced in this model to classify the graph nodes; it can expand the receptive field without losing location information and combines structure with feature extraction. Most importantly, it is theoretically concluded that the proposed graph model plays the same role as a Laplace regularization term in image segmentation, a conclusion supported by multiple comparative experiments that show the effectiveness of the model in image semantic segmentation. The learned attention is visualized as a heatmap to validate the structure learning ability of our model. The results of these experiments show the importance of structural information in image segmentation. Hence, an approach that combines deep learning with graph structural information is provided in both theory and method.
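The graph construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`build_adjacency`, `propagate`, `laplacian_smoothness`), the scaled dot-product form of the attention, the way the two edge types are summed, and the single propagation step are all assumptions made for illustration. The last function evaluates the standard graph Laplacian smoothness term trace(X^T L X) with L = D - A, the kind of quantity a Laplace regularization term penalizes.

```python
import numpy as np


def build_adjacency(features, height, width, sigma=1.0):
    """Build two edge-weight matrices for a grid of CNN feature vectors.

    features: (N, C) array with N = height * width, one row per graph node.
    Returns (a_dist, a_sem), both of shape (N, N).
    """
    # Node coordinates on the feature-map grid, in row-major order.
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)

    # Distance-based relationship: Gaussian kernel on squared grid distance.
    sq_dist = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    a_dist = np.exp(-sq_dist / (2.0 * sigma ** 2))

    # Semantic relationship: scaled dot-product attention with a row-wise
    # softmax (an assumed form; the abstract only says "attention mechanism").
    scores = features @ features.T / np.sqrt(features.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    a_sem = np.exp(scores)
    a_sem = a_sem / a_sem.sum(axis=1, keepdims=True)
    return a_dist, a_sem


def propagate(features, adjacency, weight):
    """One illustrative propagation step: neighbour averaging, linear map, ReLU."""
    norm = adjacency / adjacency.sum(axis=1, keepdims=True)
    return np.maximum(norm @ features @ weight, 0.0)


def laplacian_smoothness(signal, adjacency):
    """Return trace(X^T L X) with L = D - A, after symmetrising A."""
    sym = 0.5 * (adjacency + adjacency.T)
    lap = np.diag(sym.sum(axis=1)) - sym
    return float(np.trace(signal.T @ lap @ signal))


# Toy usage: a 4x4 feature map with 8 channels and 3 classes (random data).
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))
a_dist, a_sem = build_adjacency(feats, height=4, width=4)
combined = a_dist + a_sem  # summing the two edge types is an assumption
node_scores = propagate(feats, combined, rng.normal(size=(8, 3)))
node_labels = node_scores.argmax(axis=1)  # one class label per graph node
penalty = laplacian_smoothness(node_scores, a_dist)
```

For a symmetric adjacency matrix the smoothness term equals half the sum over node pairs of the edge weight times the squared difference of the node signals, so it is zero for a constant signal and grows when strongly connected nodes receive different values.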

Original language	English
Article number	9103557
Pages (from-to)	631-644
Number of pages	14
Journal	IEEE Transactions on Cognitive and Developmental Systems
Volume	13
Issue number	3
DOIs	https://doi.org/10.1109/TCDS.2020.2998497
Publication status	Published - Sept 2021

Keywords

Graph neural network (GNN)
image segmentation
self-attention
structure pattern learning

Access to Document

10.1109/TCDS.2020.2998497

Cite this

@article{6e396a3cfbf449fa95e5134f38d61a18,
title = "CNN-G: convolutional neural network combined with graph for image segmentation with theoretical analysis",

abstract = "Deep convolutional neural network (CNN), although recognized to be considerably successful in its application to semantic segmentation, is inadequate for extracting the overall structure information, for its representing images with the data in the Euclidean space. To improve this inadequacy, a new model in the graph domain that transforms semantic segmentation into graph node classification is proposed for semantic segmentation. In this model, the image is represented by a graph, with its nodes initialized by the feature map obtained by a CNN, and its edges reflecting the relationships of the nodes. The node relationships that are taken into consideration include distance-based ones and semantic ones, respectively, calculated with the Gauss kernel function and attention mechanism. The graph neural network is also introduced in this model for the classification of graph nodes, which can expand the receptive field without the loss of location information and combine the structure with the feature extraction. Most importantly, it is theoretically concluded that the proposed graph model takes the same role as a Laplace regularization term in image segmentation, which has been proven by multiple comparative experiments that show the effectiveness of the model in image semantic segmentation. The learned attention is visualized by the heatmap to validate the structure learning ability of our model. The results of these experiments show the importance of structural information in image segmentation. Hence, an idea of deep learning combined with graph structural information is provided in theory and method.",

keywords = "Graph neural network (GNN), image segmentation, self-attention, structure pattern learning",
author = "Yi Lu and Yaran Chen and Dongbin Zhao and Bao Liu and Zhichao Lai and Jianxin Chen",
note = "Publisher Copyright: {\textcopyright} 2016 IEEE.",
year = "2021",
month = sep,
doi = "10.1109/TCDS.2020.2998497",
language = "English",
volume = "13",
pages = "631--644",
journal = "IEEE Transactions on Cognitive and Developmental Systems",
issn = "2379-8920",
number = "3",
}

TY  - JOUR
T1  - CNN-G: convolutional neural network combined with graph for image segmentation with theoretical analysis
AU  - Lu, Yi
AU  - Chen, Yaran
AU  - Zhao, Dongbin
AU  - Liu, Bao
AU  - Lai, Zhichao
AU  - Chen, Jianxin
PY  - 2021/9
Y1  - 2021/9
N2  - Deep convolutional neural network (CNN), although recognized to be considerably successful in its application to semantic segmentation, is inadequate for extracting the overall structure information, for its representing images with the data in the Euclidean space. To improve this inadequacy, a new model in the graph domain that transforms semantic segmentation into graph node classification is proposed for semantic segmentation. In this model, the image is represented by a graph, with its nodes initialized by the feature map obtained by a CNN, and its edges reflecting the relationships of the nodes. The node relationships that are taken into consideration include distance-based ones and semantic ones, respectively, calculated with the Gauss kernel function and attention mechanism. The graph neural network is also introduced in this model for the classification of graph nodes, which can expand the receptive field without the loss of location information and combine the structure with the feature extraction. Most importantly, it is theoretically concluded that the proposed graph model takes the same role as a Laplace regularization term in image segmentation, which has been proven by multiple comparative experiments that show the effectiveness of the model in image semantic segmentation. The learned attention is visualized by the heatmap to validate the structure learning ability of our model. The results of these experiments show the importance of structural information in image segmentation. Hence, an idea of deep learning combined with graph structural information is provided in theory and method.
AB  - Deep convolutional neural network (CNN), although recognized to be considerably successful in its application to semantic segmentation, is inadequate for extracting the overall structure information, for its representing images with the data in the Euclidean space. To improve this inadequacy, a new model in the graph domain that transforms semantic segmentation into graph node classification is proposed for semantic segmentation. In this model, the image is represented by a graph, with its nodes initialized by the feature map obtained by a CNN, and its edges reflecting the relationships of the nodes. The node relationships that are taken into consideration include distance-based ones and semantic ones, respectively, calculated with the Gauss kernel function and attention mechanism. The graph neural network is also introduced in this model for the classification of graph nodes, which can expand the receptive field without the loss of location information and combine the structure with the feature extraction. Most importantly, it is theoretically concluded that the proposed graph model takes the same role as a Laplace regularization term in image segmentation, which has been proven by multiple comparative experiments that show the effectiveness of the model in image semantic segmentation. The learned attention is visualized by the heatmap to validate the structure learning ability of our model. The results of these experiments show the importance of structural information in image segmentation. Hence, an idea of deep learning combined with graph structural information is provided in theory and method.
KW  - Graph neural network (GNN)
KW  - image segmentation
KW  - self-attention
KW  - structure pattern learning
UR  - http://www.scopus.com/inward/record.url?scp=85114751779&partnerID=8YFLogxK
U2  - 10.1109/TCDS.2020.2998497
DO  - 10.1109/TCDS.2020.2998497
M3  - Article
AN  - SCOPUS:85114751779
SN  - 2379-8920
VL  - 13
SP  - 631
EP  - 644
JO  - IEEE Transactions on Cognitive and Developmental Systems
JF  - IEEE Transactions on Cognitive and Developmental Systems
IS  - 3
M1  - 9103557
ER  - 
