Anomaly Detection by Using Streaming K-Means and Batch K-Means

Zhuo Wang, Yanghui Zhou, Gangmin Li

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

33 Citations (Scopus)

Abstract

This paper introduces the K-Means algorithm as a new technique for anomaly detection. Data analysis is widely applied in industry and plays an important role there. However, conventional data analysis methods cannot process large-scale data in a reasonable time and waste considerable computing resources. By contrast, batch processing and stream processing can process data within short time intervals, and stream processing in particular can process data in real time. The paper compares Batch K-Means with Streaming K-Means in terms of distance, cost value, and cluster distribution. It also discusses how to find an optimized K value for the Batch K-Means and Streaming K-Means models, analyzes the attributes of the two processing approaches, and identifies their limitations. Finally, it discusses the limitations of the experiment and future improvements to the clustering technique.
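The comparison summarized in the abstract can be illustrated with a minimal sketch. This record does not describe the authors' implementation, so everything below is an assumption: plain NumPy, synthetic two-cluster data, a streaming update without any decay factor, and a 99th-percentile distance threshold for flagging anomalies. All function names are hypothetical.

```python
import numpy as np

def nearest(X, centers):
    """Return (index of nearest centre, squared distance to it) per row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return idx, d2[np.arange(len(X)), idx]

def batch_kmeans(X, k, iters=20, seed=0):
    """Lloyd's algorithm over the full data set (batch K-Means)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        idx, _ = nearest(X, centers)
        for j in range(k):
            members = X[idx == j]
            if len(members):  # keep the old centre if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def streaming_update(centers, counts, batch):
    """Fold one mini-batch into running centres (assumed: no forgetting)."""
    idx, _ = nearest(batch, centers)
    for j in range(len(centers)):
        members = batch[idx == j]
        if len(members):
            total = counts[j] + len(members)
            centers[j] = (centers[j] * counts[j] + members.sum(axis=0)) / total
            counts[j] = total
    return centers, counts

def cost(X, centers):
    """Cost value: sum of squared distances to the nearest centre."""
    return nearest(X, centers)[1].sum()

# Synthetic data (an assumption, not the paper's data set).
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(200, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.5, size=(200, 2))
X = np.vstack([blob_a, blob_b])
rng.shuffle(X)  # shuffle rows in place so the stream mixes both blobs
outlier = np.array([[20.0, -20.0]])

# Batch model: costs for several K values, e.g. for an elbow-style comparison.
costs = {k: cost(X, batch_kmeans(X, k)) for k in range(1, 6)}
batch_centers = batch_kmeans(X, k=2)

# Streaming model: seed with the first two points, then consume mini-batches.
stream_centers = X[:2].astype(float)
counts = np.ones(2)
for start in range(2, len(X), 50):
    stream_centers, counts = streaming_update(
        stream_centers, counts, X[start:start + 50]
    )

# Anomaly score: squared distance to the nearest batch centre.
_, d2 = nearest(np.vstack([X, outlier]), batch_centers)
threshold = np.percentile(d2[:-1], 99)  # assumed cut-off
is_anomaly = d2 > threshold

print("cost per K:", costs)
print("batch cost:", cost(X, batch_centers))
print("stream cost:", cost(X, stream_centers))
print("outlier flagged:", bool(is_anomaly[-1]))
```

Comparing the two printed cost values, and the spread of `counts` across clusters, gives a rough analogue of the cost and cluster-distribution factors the abstract mentions. The streaming result depends on arrival order and on the seed points, which is one reason it can differ from the batch result.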

Original language: English
Title of host publication: 2020 5th IEEE International Conference on Big Data Analytics, ICBDA 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 11-17
Number of pages: 7
ISBN (Electronic): 9781728141114
Publication status: Published - May 2020
Event: 5th IEEE International Conference on Big Data Analytics, ICBDA 2020 - Xiamen, China
Duration: 8 May 2020 to 11 May 2020

Publication series

Name: 2020 5th IEEE International Conference on Big Data Analytics, ICBDA 2020

Conference

Conference: 5th IEEE International Conference on Big Data Analytics, ICBDA 2020
Country/Territory: China
City: Xiamen
Period: 8/05/20 to 11/05/20

Keywords

  • big data
  • cluster distribution
  • k-means clustering
  • optimized K-value
  • streaming k-means clustering
