TY - GEN
T1 - Automatic Mixed Precision and Distributed Data-Parallel Training for Satellite Image Classification using CNN
AU - Nuwara, Yohanes
AU - Kitt, Wong W.
AU - Juwono, Filbert H.
AU - Ollivierre, Gregory
N1 - Publisher Copyright:
© 2023 SPIE.
PY - 2023
Y1 - 2023
N2 - Deep learning models for computer vision in remote sensing, such as the Convolutional Neural Network (CNN), have benefited from acceleration through the use of multiple CPUs and GPUs. There are several ways to make the training stage more efficient, such as processing different image mini-batches simultaneously across multiple cores with replicated copies of the model, known as Distributed Data Parallelization (DDP), and computing the parameters in lower-precision floating-point formats, known as Automatic Mixed Precision (AMP). We investigate the impact of the DDP and AMP training modes on the overall utilization and memory consumption of the CPU and GPU, as well as on the accuracy of a CNN model. The study is performed on the EuroSAT dataset, a Sentinel-2-based benchmark satellite image dataset for land-cover classification. We compare training on a single CPU, with DDP, and with both DDP and AMP, over 100 epochs using the ResNet-18 architecture. The hardware used is an Intel Xeon Silver 4116 CPU with 24 cores and an NVIDIA V100 GPU. We find that although CPU parallelization with DDP takes less time to train on the images, it can consume 50 MB more memory than using only a single CPU. The combination of DDP and AMP can free up to 160 MB of memory and reduce computation time by 20 seconds. The test accuracy is slightly higher for DDP and DDP-AMP, at 90.61% and 90.77% respectively, than without DDP and AMP, at 89.84%. Hence, training with Distributed Data Parallelization (DDP) and Automatic Mixed Precision (AMP) offers benefits in terms of lower GPU memory consumption, faster training execution time, faster convergence towards solutions, and higher accuracy.
AB - Deep learning models for computer vision in remote sensing, such as the Convolutional Neural Network (CNN), have benefited from acceleration through the use of multiple CPUs and GPUs. There are several ways to make the training stage more efficient, such as processing different image mini-batches simultaneously across multiple cores with replicated copies of the model, known as Distributed Data Parallelization (DDP), and computing the parameters in lower-precision floating-point formats, known as Automatic Mixed Precision (AMP). We investigate the impact of the DDP and AMP training modes on the overall utilization and memory consumption of the CPU and GPU, as well as on the accuracy of a CNN model. The study is performed on the EuroSAT dataset, a Sentinel-2-based benchmark satellite image dataset for land-cover classification. We compare training on a single CPU, with DDP, and with both DDP and AMP, over 100 epochs using the ResNet-18 architecture. The hardware used is an Intel Xeon Silver 4116 CPU with 24 cores and an NVIDIA V100 GPU. We find that although CPU parallelization with DDP takes less time to train on the images, it can consume 50 MB more memory than using only a single CPU. The combination of DDP and AMP can free up to 160 MB of memory and reduce computation time by 20 seconds. The test accuracy is slightly higher for DDP and DDP-AMP, at 90.61% and 90.77% respectively, than without DDP and AMP, at 89.84%. Hence, training with Distributed Data Parallelization (DDP) and Automatic Mixed Precision (AMP) offers benefits in terms of lower GPU memory consumption, faster training execution time, faster convergence towards solutions, and higher accuracy.
KW - Automatic Mixed Precision
KW - Convolutional Neural Network
KW - Distributed Data-Parallel
KW - Graphics Processing Unit
KW - Remote Sensing
UR - http://www.scopus.com/inward/record.url?scp=85172881005&partnerID=8YFLogxK
U2 - 10.1117/12.2679828
DO - 10.1117/12.2679828
M3 - Conference Proceeding
AN - SCOPUS:85172881005
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Fifteenth International Conference on Machine Vision, ICMV 2022
A2 - Osten, Wolfgang
A2 - Nikolaev, Dmitry
A2 - Zhou, Jianhong
PB - SPIE
T2 - 15th International Conference on Machine Vision, ICMV 2022
Y2 - 18 November 2022 through 20 November 2022
ER -