EID-GAN: Generative Adversarial Nets for Extremely Imbalanced Data Augmentation

Wei Li, Jinlin Chen, Jiannong Cao, Chao Ma, Jia Wang, Xiaohui Cui, Ping Chen

Research output: Contribution to journal › Article › peer-review

23 Citations (Scopus)

Abstract

Imbalanced data causes deep neural networks to output biased results, and the problem becomes more serious with extremely imbalanced data in which the outliers are tiny (the ratio of the outlier size to the image size is around 0.05%). Many data augmentation models have been proposed to supplement imbalanced data and alleviate biased results. However, the existing augmentation models cannot synthesize tiny outliers, which makes the generated data unavailable. In this paper, we propose a new augmentation model named Extremely Imbalanced Data Augmentation Generative Adversarial Nets (EID-GAN) to address the extremely imbalanced data augmentation problem. First, we design a new penalty function that subtracts the outliers from the cropped region of the generated instance to guide the generator to learn the features of outliers. We then combine the output value of the penalty function with the generator loss to jointly update the generator's parameters with back-propagation. Second, we propose a new evaluation approach that adopts two outlier detectors with k-fold cross-validation to assess the availability of generated instances. We conduct extensive experiments to demonstrate the significant performance improvement of EID-GAN on two extremely imbalanced datasets, the industrial Piston and Fabric datasets, and one general imbalanced dataset, the public DAGM dataset. The experimental results show that our EID-GAN outperforms the SOTA augmentation models on different imbalanced datasets.
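The penalty idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the L2 norm, the crop coordinates, the `weight` factor, and all function names below are assumptions chosen for illustration, and a real model would compute this on tensors inside a deep learning framework so that gradients flow back to the generator.

```python
import math

def crop(image, top, left, height, width):
    """Return the height x width window of `image` starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def outlier_penalty(generated, outlier_patch, top, left):
    """Assumed penalty: L2 norm of (cropped generated region - real outlier patch)."""
    height, width = len(outlier_patch), len(outlier_patch[0])
    region = crop(generated, top, left, height, width)
    squared = sum(
        (g - o) ** 2
        for g_row, o_row in zip(region, outlier_patch)
        for g, o in zip(g_row, o_row)
    )
    return math.sqrt(squared)

def generator_objective(adversarial_loss, penalty, weight=1.0):
    """Combine the generator loss with the penalty (hypothetical weighting)."""
    return adversarial_loss + weight * penalty

# Toy data: a blank 64x64 "generated" image and a 2x2 outlier patch.
generated = [[0.0] * 64 for _ in range(64)]
outlier_patch = [[1.0, 1.0], [1.0, 1.0]]

penalty = outlier_penalty(generated, outlier_patch, top=10, left=20)
total = generator_objective(adversarial_loss=0.5, penalty=penalty, weight=0.1)
```

In this toy setup the penalty shrinks to zero as the cropped region of the generated image approaches the outlier patch, which is the sense in which such a term could steer a generator toward reproducing tiny outliers.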

Original language: English
Pages (from-to): 1-10
Number of pages: 10
Journal: IEEE Transactions on Industrial Informatics
Volume: 19
Issue number: 3
Publication status: Accepted/In press - 2022

Keywords

  • Data models
  • Detectors
  • Extremely imbalanced data augmentation
  • Fabrics
  • GAN
  • Generated data evaluation
  • Generators
  • Norm penalty function
  • Pistons
  • Prototypes
  • Training
