Mining top-k high average-utility itemsets based on breadth-first search

Xuan Liu, Genlang Chen, Fangyu Wu*, Shiting Wen, Wanli Zuo

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

High average-utility itemset mining is a subfield of data mining that has extensive practical applications. However, it is difficult for users to choose a suitable minimum threshold because they cannot accurately predict how many patterns will be mined at a given threshold. To address this issue, top-k high average-utility itemset mining has been proposed, where k is the number of high average-utility itemsets to be mined. In this paper, we design an effective algorithm (named ETAUIM) for finding top-k high average-utility itemsets. ETAUIM employs a breadth-first search strategy to explore the search space efficiently, and it uses a tighter upper bound in place of the average-utility upper bound to limit the search space. Additionally, ETAUIM removes irrelevant items during the mining process and uses an early abandoning strategy to terminate unnecessary join operations in advance. To evaluate the proposed algorithm, extensive experiments were conducted on six sparse datasets and two dense datasets, with four state-of-the-art algorithms used for comparison. The experimental results show that ETAUIM has excellent performance and scalability. Moreover, ETAUIM always performs better on sparse datasets.
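
For orientation, the problem setting can be illustrated with a minimal sketch. This is not ETAUIM: the abstract does not specify its tighter upper bound, item-removal step, or early abandoning strategy, so none of them appear here. The sketch assumes the standard definitions from the high average-utility literature, where the average utility of an itemset in a transaction is the sum of its item utilities divided by the itemset length, and where the average-utility upper bound (auub) is the sum of the maximum item utilities of the supporting transactions. It searches level by level (breadth-first), keeps the k best itemsets in a min-heap, and uses the k-th best value as a rising threshold. The function name `top_k_average_utility` and the sample data are hypothetical.

```python
import heapq
from itertools import combinations


def top_k_average_utility(transactions, k):
    """Level-wise top-k search. `transactions` is a list of dicts
    mapping item -> utility of that item in the transaction."""
    heap = []        # min-heap of (average utility, itemset), at most k entries
    threshold = 0.0  # k-th best average utility found so far

    def evaluate(itemset):
        # Returns (support, average utility, auub) over supporting transactions.
        support, au, auub = 0, 0.0, 0.0
        for t in transactions:
            if all(item in t for item in itemset):
                support += 1
                au += sum(t[item] for item in itemset) / len(itemset)
                auub += max(t.values())
        return support, au, auub

    items = sorted({item for t in transactions for item in t})
    level = [(item,) for item in items]
    while level:
        bounds = []
        for itemset in level:
            support, au, auub = evaluate(itemset)
            if support == 0:
                continue  # never occurs, so no superset occurs either
            if len(heap) < k:
                heapq.heappush(heap, (au, itemset))
            elif au > heap[0][0]:
                heapq.heapreplace(heap, (au, itemset))
            if len(heap) == k:
                threshold = heap[0][0]
            bounds.append((itemset, auub))
        # auub never increases when an itemset grows, so an itemset whose
        # bound is below the threshold cannot have a top-k superset.
        kept = [itemset for itemset, auub in bounds if auub >= threshold]
        # Join itemsets that share all but their last item (lists stay sorted).
        level = [a + (b[-1],) for a, b in combinations(kept, 2)
                 if a[:-1] == b[:-1]]
    return sorted(heap, reverse=True)


# Hypothetical toy database: three transactions with per-item utilities.
example = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3},
]
print(top_k_average_utility(example, 3))
# [(15.0, ('a',)), (11.0, ('a', 'c')), (10.0, ('c',))]
```

Because the threshold only rises as better itemsets are found, pruning against its current value is safe. A tighter bound than auub, as the abstract describes for ETAUIM, would discard more candidates at each level.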

Original language: English
Pages (from-to): 29319-29337
Number of pages: 19
Journal: Applied Intelligence
Volume: 53
Issue number: 23
Publication status: Published - Dec 2023

Keywords

  • Breadth-first search
  • Data mining
  • High average-utility itemset
  • Top-k high average-utility itemsets
