An iterative guided active learning-based approach for real-time sampling in smart manufacturing systems

Abdelrahman Farrag, Nieqing Cao, Mohammed Khalil Ghali, Daehan Won, Yu Jin*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Smart manufacturing systems increasingly rely on data-driven decision-making, where timely and accurate sampling of process data is critical for predictive modeling and quality control. Conventional sampling approaches often demand large labeled datasets, incur high computational costs, and fail to capture representative instances. This paper introduces BURGAL (Batch-Uncertainty-Representativeness Guided Active Learning), a novel framework that iteratively selects both informative and representative data instances in batch mode. BURGAL reduces the amount of labeled data required while maintaining predictive accuracy and computational efficiency by integrating uncertainty-based sampling with a representativeness criterion. The framework is validated on two distinct datasets: (1) a high-volume Surface Mount Technology (SMT) dataset collected from a real-time production line, and (2) the SECOM semiconductor dataset, characterized by high dimensionality and severe class imbalance. In the SMT case study, BURGAL achieved real-time sampling by selecting 50% of a 3-million-instance dataset within 247 seconds, while preserving the population distribution (Cohen’s) and enabling predictive models with mean absolute errors ranging from 22% at a 10% sampling ratio to 10% at a 50% sampling ratio for identifying defective printed circuit board (PCB) regions. In the SECOM study, BURGAL was adapted as a feature selection strategy, integrated with a generative adversarial network (GAN) for minority-class augmentation and a Random Forest classifier. This pipeline achieved an average precision of 0.91 and a recall of 0.83 under 10-fold Leave-One-Out Cross-Validation, outperforming existing baseline methods in both predictive performance and sampling efficiency.
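
The abstract describes batch selection that weighs uncertainty against representativeness. The sketch below illustrates that general idea for a single selection round only; it is not the BURGAL algorithm, whose scoring and iteration details are not given here. It assumes prediction entropy as the uncertainty measure, closeness to the pool centroid as a stand-in for representativeness, and a hypothetical weight `alpha`; the function names are likewise illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def minmax(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def select_batch(probs, features, batch_size, alpha=0.5):
    """Return indices of the batch_size highest-scoring pool instances.

    alpha is an assumed trade-off weight: 1.0 uses uncertainty only,
    0.0 uses representativeness only.
    """
    n_dims = len(features[0])
    centroid = [
        sum(row[d] for row in features) / len(features) for d in range(n_dims)
    ]
    # Uncertainty: higher entropy means the model is less sure.
    uncertainty = minmax([entropy(p) for p in probs])
    # Representativeness (assumed proxy): closer to the centroid scores higher.
    closeness = minmax([-math.dist(row, centroid) for row in features])
    scores = [
        alpha * u + (1 - alpha) * r for u, r in zip(uncertainty, closeness)
    ]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Toy pool: instance 0 is both uncertain and central, so it is picked first.
pool_probs = [[0.5, 0.5], [0.99, 0.01], [0.5, 0.5]]
pool_features = [[0.0, 0.0], [0.0, 0.0], [10.0, 10.0]]
first_pick = select_batch(pool_probs, pool_features, batch_size=1)
```

In an iterative setting, a loop would label the selected batch, retrain the model, and call `select_batch` again on the remaining pool; that loop is omitted here because the abstract does not specify its stopping rule.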

Original language: English
Journal: International Journal of Advanced Manufacturing Technology
Publication status: Accepted/In press - 2025

Keywords

  • Active learning
  • Feature selection
  • Machine learning
  • Sampling
  • Smart manufacturing
