HAL: Hybrid active learning for efficient labeling in medical domain

Xing Wu; Cheng Chen; Mingyu Zhong; Jianjia Wang

doi:10.1016/j.neucom.2020.10.115

Xing Wu*, Cheng Chen, Mingyu Zhong, Jianjia Wang

*Corresponding author for this work

Shanghai University

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract

The success of deep convolutional neural networks in computer vision tasks relies mainly on massive labeled training data. However, in the field of medical images, it is difficult to construct large labeled datasets because labeling medical images is time-consuming, labor-intensive, and demands medical expertise. To meet this challenge, we propose HAL, a hybrid active learning framework for efficient labeling in the medical domain, which integrates active learning into deep learning to reduce the cost of manual labeling while taking advantage of deep neural networks. HAL uses a hybrid sampling strategy that considers sample diversity and prediction loss simultaneously. The effectiveness and efficiency of HAL are validated on three medical image datasets. The experimental results show that HAL outperforms several state-of-the-art active learning methods. On the Hyper-Kvasir dataset, with only 10% of the labels, HAL achieves 95% of the performance of the deep learning method trained on the entire dataset. The quantitative and qualitative analyses show that HAL can greatly reduce the number of labels needed to train a deep neural network and that it remains robust for efficient labeling even with imbalanced data distributions.
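
The abstract does not say how sample diversity and prediction loss are combined, so the following is only an illustrative sketch and not the authors' algorithm. It assumes each unlabeled sample has a feature vector and a predicted loss, and it greedily picks the samples that maximize a weighted sum of the two criteria. The function name `hybrid_select`, the weight `alpha`, and the nearest-neighbor distance used as the diversity term are all assumptions made for illustration.

```python
import numpy as np


def hybrid_select(features, predicted_loss, labeled_idx, budget, alpha=0.5):
    """Greedily choose up to `budget` unlabeled samples to annotate.

    Hypothetical scoring rule (an assumption, not taken from the paper):
        score = alpha * diversity + (1 - alpha) * normalized predicted loss
    where diversity is the distance to the nearest labeled or already
    chosen sample, rescaled to [0, 1].
    """
    features = np.asarray(features, dtype=float)
    loss = np.asarray(predicted_loss, dtype=float)
    n = features.shape[0]

    covered = np.zeros(n, dtype=bool)
    covered[list(labeled_idx)] = True

    # Min-max normalize the loss so both terms share the range [0, 1].
    span = loss.max() - loss.min()
    loss = (loss - loss.min()) / span if span > 0 else np.zeros(n)

    # Distance from every sample to its nearest covered sample.
    if covered.any():
        diffs = features[:, None, :] - features[covered][None, :, :]
        min_dist = np.linalg.norm(diffs, axis=2).min(axis=1)
    else:
        min_dist = np.ones(n)  # no labeled pool yet: all equally "far"

    chosen = []
    for _ in range(min(budget, int((~covered).sum()))):
        scale = min_dist.max()
        diversity = min_dist / scale if scale > 0 else np.zeros(n)
        score = alpha * diversity + (1 - alpha) * loss
        score[covered] = -np.inf  # never pick a covered sample again
        pick = int(np.argmax(score))
        chosen.append(pick)
        covered[pick] = True
        # The new pick now covers its neighborhood.
        new_dist = np.linalg.norm(features - features[pick], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return chosen
```

With `alpha=0` the sketch reduces to pure loss-based sampling, and with `alpha=1` it reduces to a greedy farthest-point (diversity-only) selection, which makes the trade-off between the two criteria easy to inspect.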

Original language	English
Pages (from-to)	563-572
Number of pages	10
Journal	Neurocomputing
Volume	456
DOIs	https://doi.org/10.1016/j.neucom.2020.10.115
Publication status	Published - 7 Oct 2021
Externally published	Yes

Keywords

Active learning
Computer-aided diagnosis
Prediction loss
Sample diversity
Transfer learning

Cite this

@article{db45a740b431459ea558958a69141e77,
  title = "HAL: Hybrid active learning for efficient labeling in medical domain",
  abstract = "The success of deep convolutional neural networks in computer vision tasks relies mainly on massive labeled training data. However, in the field of medical images, it is difficult to construct large labeled datasets because labeling medical images is time-consuming, labor-intensive, and demands medical expertise. To meet this challenge, we propose HAL, a hybrid active learning framework for efficient labeling in the medical domain, which integrates active learning into deep learning to reduce the cost of manual labeling while taking advantage of deep neural networks. HAL uses a hybrid sampling strategy that considers sample diversity and prediction loss simultaneously. The effectiveness and efficiency of HAL are validated on three medical image datasets. The experimental results show that HAL outperforms several state-of-the-art active learning methods. On the Hyper-Kvasir dataset, with only 10% of the labels, HAL achieves 95% of the performance of the deep learning method trained on the entire dataset. The quantitative and qualitative analyses show that HAL can greatly reduce the number of labels needed to train a deep neural network and that it remains robust for efficient labeling even with imbalanced data distributions.",
  keywords = "Active learning, Computer-aided diagnosis, Prediction loss, Sample diversity, Transfer learning",
  author = "Xing Wu and Cheng Chen and Mingyu Zhong and Jianjia Wang",
  note = "Publisher Copyright: {\textcopyright} 2021 Elsevier B.V.",
  year = "2021",
  month = oct,
  day = "7",
  doi = "10.1016/j.neucom.2020.10.115",
  language = "English",
  volume = "456",
  pages = "563--572",
  journal = "Neurocomputing",
  issn = "0925-2312",
}

TY  - JOUR
T1  - HAL: Hybrid active learning for efficient labeling in medical domain
AU  - Wu, Xing
AU  - Chen, Cheng
AU  - Zhong, Mingyu
AU  - Wang, Jianjia
PY  - 2021/10/7
Y1  - 2021/10/7
AB  - The success of deep convolutional neural networks in computer vision tasks relies mainly on massive labeled training data. However, in the field of medical images, it is difficult to construct large labeled datasets because labeling medical images is time-consuming, labor-intensive, and demands medical expertise. To meet this challenge, we propose HAL, a hybrid active learning framework for efficient labeling in the medical domain, which integrates active learning into deep learning to reduce the cost of manual labeling while taking advantage of deep neural networks. HAL uses a hybrid sampling strategy that considers sample diversity and prediction loss simultaneously. The effectiveness and efficiency of HAL are validated on three medical image datasets. The experimental results show that HAL outperforms several state-of-the-art active learning methods. On the Hyper-Kvasir dataset, with only 10% of the labels, HAL achieves 95% of the performance of the deep learning method trained on the entire dataset. The quantitative and qualitative analyses show that HAL can greatly reduce the number of labels needed to train a deep neural network and that it remains robust for efficient labeling even with imbalanced data distributions.
KW  - Active learning
KW  - Computer-aided diagnosis
KW  - Prediction loss
KW  - Sample diversity
KW  - Transfer learning
UR  - http://www.scopus.com/inward/record.url?scp=85105547193&partnerID=8YFLogxK
U2  - 10.1016/j.neucom.2020.10.115
DO  - 10.1016/j.neucom.2020.10.115
M3  - Article
AN  - SCOPUS:85105547193
SN  - 0925-2312
VL  - 456
SP  - 563
EP  - 572
JO  - Neurocomputing
JF  - Neurocomputing
ER  - 
