DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation

Feilong Tang, Zhongxing Xu, Qiming Huang, Jinfeng Wang, Xianxu Hou, Jionglong Su^*, Jingxin Liu

doi:10.1007/978-981-99-8469-5_27

^*Corresponding author for this work

School of AI and Advanced Computing

Xi'an Jiaotong-Liverpool University

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

30 Citations (Scopus)

Abstract

Transformer-based models have been widely demonstrated to succeed in computer vision tasks by modeling long-range dependencies and capturing global representations. However, they are often dominated by features of large patterns, leading to the loss of local details (e.g., boundaries and small objects) that are critical in medical image segmentation. To alleviate this problem, we propose a Dual-Aggregation Transformer Network called DuAT, which is characterized by two designs: the Global-to-Local Spatial Aggregation (GLSA) and Selective Boundary Aggregation (SBA) modules. The GLSA module aggregates and represents both global and local spatial features, which are beneficial for locating large and small objects, respectively. The SBA module aggregates boundary characteristics from low-level features and semantic information from high-level features to better preserve boundary details and locate re-calibrated objects. Extensive experiments on six benchmark datasets demonstrate that the proposed model outperforms state-of-the-art methods in segmenting skin lesion images and polyps in colonoscopy images. In addition, our approach is more robust than existing methods in challenging situations such as small object segmentation and ambiguous object boundaries. The project is available at https://github.com/Barrett-python/DuAT.
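The abstract describes the SBA module only at a high level: it combines a fine low-level map (boundary cues) with a coarse high-level map (semantic cues). The toy sketch below illustrates that general idea with a simple gated fusion in plain Python. It is not the authors' implementation; the function names (`upsample_nearest`, `gated_fuse`), the sigmoid gating rule, and the nearest-neighbour upsampling are all assumptions made for illustration. The actual module is in the project repository linked above.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a soft gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def gated_fuse(low, high):
    """Fuse a fine low-level map with a coarse high-level map.

    Hypothetical rule (an assumption, not taken from the paper): each map
    is re-weighted by a sigmoid gate computed from the other map, and the
    two re-weighted maps are summed. Assumes square-compatible shapes where
    the size of `low` is an integer multiple of the size of `high`.
    """
    factor = len(low) // len(high)
    high_up = upsample_nearest(high, factor)
    fused = []
    for low_row, high_row in zip(low, high_up):
        fused.append(
            [l * sigmoid(h) + h * sigmoid(l) for l, h in zip(low_row, high_row)]
        )
    return fused


# Usage: a 4x4 low-level map fused with a 2x2 high-level map.
low = [[0.0] * 4 for _ in range(4)]
high = [[2.0, 2.0], [2.0, 2.0]]
fused = gated_fuse(low, high)
print(len(fused), len(fused[0]))
```

The mutual gating is one simple way to let each feature stream decide how much of the other to pass through; a learned version would replace the fixed sigmoid gates with convolutional layers.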

Original language	English
Title of host publication	Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings
Editors	Qingshan Liu, Hanzi Wang, Rongrong Ji, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	343-356
Number of pages	14
ISBN (Print)	9789819984688
DOIs	https://doi.org/10.1007/978-981-99-8469-5_27
Publication status	Published - 2024
Event	6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023 - Xiamen, China (13 Oct 2023 → 15 Oct 2023)

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	14429 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023
Country/Territory	China
City	Xiamen
Period	13/10/23 → 15/10/23

Keywords

Dual decoder
Polyp segmentation
Vision Transformers

Cite this

Tang, F., Xu, Z., Huang, Q., Wang, J., Hou, X., Su, J., & Liu, J. (2024). DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation. In Q. Liu, H. Wang, R. Ji, Z. Ma, W. Zheng, H. Zha, X. Chen, & L. Wang (Eds.), Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings (pp. 343-356). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14429 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-99-8469-5_27

Tang, Feilong ; Xu, Zhongxing ; Huang, Qiming et al. / DuAT : Dual-Aggregation Transformer Network for Medical Image Segmentation. Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings. editor / Qingshan Liu ; Hanzi Wang ; Rongrong Ji ; Zhanyu Ma ; Weishi Zheng ; Hongbin Zha ; Xilin Chen ; Liang Wang. Springer Science and Business Media Deutschland GmbH, 2024. pp. 343-356 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{13c61f4860004b3d8bf08cf945a1d9c8,

title = "DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation",

keywords = "Dual decoder, Polyp segmentation, Vision Transformers",

author = "Feilong Tang and Zhongxing Xu and Qiming Huang and Jinfeng Wang and Xianxu Hou and Jionglong Su and Jingxin Liu",

note = "Publisher Copyright: {\textcopyright} 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.; 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023 ; Conference date: 13-10-2023 Through 15-10-2023",

year = "2024",

doi = "10.1007/978-981-99-8469-5_27",

language = "English",

isbn = "9789819984688",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "343--356",

editor = "Qingshan Liu and Hanzi Wang and Rongrong Ji and Zhanyu Ma and Weishi Zheng and Hongbin Zha and Xilin Chen and Liang Wang",

booktitle = "Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings",

}

Tang, F, Xu, Z, Huang, Q, Wang, J, Hou, X, Su, J & Liu, J 2024, DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation. in Q Liu, H Wang, R Ji, Z Ma, W Zheng, H Zha, X Chen & L Wang (eds), Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14429 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 343-356, 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023, Xiamen, China, 13/10/23. https://doi.org/10.1007/978-981-99-8469-5_27

DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation. / Tang, Feilong; Xu, Zhongxing; Huang, Qiming et al.
Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings. ed. / Qingshan Liu; Hanzi Wang; Rongrong Ji; Zhanyu Ma; Weishi Zheng; Hongbin Zha; Xilin Chen; Liang Wang. Springer Science and Business Media Deutschland GmbH, 2024. p. 343-356 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14429 LNCS).

TY - GEN

T1 - DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation

T2 - 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023

AU - Tang, Feilong

AU - Xu, Zhongxing

AU - Huang, Qiming

AU - Wang, Jinfeng

AU - Hou, Xianxu

AU - Su, Jionglong

AU - Liu, Jingxin

PY - 2024

Y1 - 2024

KW - Dual decoder

KW - Polyp segmentation

KW - Vision Transformers

UR - http://www.scopus.com/inward/record.url?scp=85180801055&partnerID=8YFLogxK

U2 - 10.1007/978-981-99-8469-5_27

DO - 10.1007/978-981-99-8469-5_27

M3 - Conference Proceeding

AN - SCOPUS:85180801055

SN - 9789819984688

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 343

EP - 356

BT - Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings

A2 - Liu, Qingshan

A2 - Wang, Hanzi

A2 - Ji, Rongrong

A2 - Ma, Zhanyu

A2 - Zheng, Weishi

A2 - Zha, Hongbin

A2 - Chen, Xilin

A2 - Wang, Liang

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 13 October 2023 through 15 October 2023

ER -

Tang F, Xu Z, Huang Q, Wang J, Hou X, Su J et al. DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation. In Liu Q, Wang H, Ji R, Ma Z, Zheng W, Zha H, Chen X, Wang L, editors, Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings. Springer Science and Business Media Deutschland GmbH. 2024. p. 343-356. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-981-99-8469-5_27