TY - GEN
T1 - How many labeled license plates are needed?
AU - Wu, Changhao
AU - Xu, Shugong
AU - Song, Guocong
AU - Zhang, Shunqing
N1 - Publisher Copyright: © Springer Nature Switzerland AG 2018.
PY - 2018
Y1 - 2018
AB - Training a good deep learning model often requires a lot of annotated data. As a large amount of labeled data is typically difficult to collect and even more difficult to annotate, data augmentation and data generation are widely used in the process of training deep neural networks. However, there is no clear common understanding on how much labeled data is needed to get satisfactory performance. In this paper, we try to address such a question using vehicle license plate character recognition as an example application. We apply computer graphic scripts and Generative Adversarial Networks to generate and augment a large number of annotated, synthesized license plate images with realistic colors, fonts, and character composition from a small number of real, manually labeled license plate images. Generated and augmented data are mixed and used as training data for the license plate recognition network modified from DenseNet. The experimental results show that the model trained from the generated mixed training data has good generalization ability, and the proposed approach achieves a new state-of-the-art accuracy on Dataset-1 and AOLP, even with a very limited number of original real license plates. In addition, the accuracy improvement caused by data generation becomes more significant when the number of labeled images is reduced. Data augmentation also plays a more significant role when the number of labeled images is increased.
KW - Data augmentation
KW - GANs
KW - License plate recognition
UR - http://www.scopus.com/inward/record.url?scp=85057168918&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-03341-5_28
DO - 10.1007/978-3-030-03341-5_28
M3 - Conference Proceeding
AN - SCOPUS:85057168918
SN - 9783030033408
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 334
EP - 346
BT - Pattern Recognition and Computer Vision - First Chinese Conference, PRCV 2018, Proceedings
A2 - Chen, Xilin
A2 - Lai, Jian-Huang
A2 - Zheng, Nanning
A2 - Liu, Cheng-Lin
A2 - Tan, Tieniu
A2 - Zhou, Jie
A2 - Zha, Hongbin
PB - Springer Verlag
T2 - 1st Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2018
Y2 - 23 November 2018 through 26 November 2018
ER -