Data Convexity and Parameter Independent Clustering for Biomedical Datasets

Md Anisur Rahman, Li Minn Ang, Kah Phooi Seng*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)


In machine learning, the nature of the dataset itself, such as the convexity of its data point sets, affects which clustering algorithm will perform well. This brief paper first examines how data convexity influences clustering performance on biomedical datasets. It then addresses the main challenges of two well-known families of clustering techniques: centroid-based and density-based clustering. These techniques typically require the user to supply a set of parameters before the algorithms can cluster well and find the optimal number of clusters. Two parameter-independent clustering techniques that use unique neighborhood sets (UNSs) are introduced: Parameter Independent Convex Centroid-based Clustering (ConvexClust) for convex-dominated datasets and Parameter Independent Non-Convex Density-based Clustering (NonConvexClust) for nonconvex-dominated datasets. The ConvexClust and NonConvexClust algorithms are extensively evaluated on real-world biomedical datasets, and their performance is compared with that of other clustering algorithms using evaluation criteria such as SSE, entropy and purity. The results show that the proposed parameter-independent clustering techniques perform well, and that most of the biomedical datasets in the experiments tend towards convex-dominated data point sets.
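The three evaluation criteria named in the abstract can be illustrated with a small sketch. It assumes the common textbook definitions: SSE read as the sum of squared Euclidean distances from each point to its cluster centroid, purity as the fraction of points in the majority class of their cluster, and entropy as the size-weighted class entropy per cluster. The paper's exact formulations may differ, and the function names and toy data below are hypothetical.

```python
import math
from collections import Counter


def sse(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


def _class_counts(labels, classes):
    """Count ground-truth classes inside each predicted cluster."""
    by_cluster = {}
    for c, y in zip(labels, classes):
        by_cluster.setdefault(c, Counter())[y] += 1
    return by_cluster


def purity(labels, classes):
    """Fraction of points that fall in the majority class of their cluster (higher is better)."""
    counts = _class_counts(labels, classes)
    return sum(max(cnt.values()) for cnt in counts.values()) / len(labels)


def entropy(labels, classes):
    """Size-weighted average of per-cluster class entropy in bits (lower is better)."""
    n = len(labels)
    total = 0.0
    for cnt in _class_counts(labels, classes).values():
        size = sum(cnt.values())
        h = -sum((k / size) * math.log2(k / size) for k in cnt.values())
        total += (size / n) * h
    return total


# Toy data (hypothetical): four 2-D points, two predicted clusters, known classes.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
labels = [0, 0, 1, 1]
classes = ["a", "a", "b", "a"]

print(sse(points, labels), purity(labels, classes), entropy(labels, classes))
```

SSE needs only the points and the cluster assignment, whereas purity and entropy also need ground-truth class labels, so the latter two apply only to labelled datasets.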

Original language: English
Article number: 9024144
Pages (from-to): 765-772
Number of pages: 8
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Issue number: 2
Publication status: Published - 1 Mar 2021
Externally published: Yes


  • Biomedical
  • centroid-based clustering
  • convexity
  • density-based clustering
  • unique neighborhood set

