Details

Publication Date

Fri Aug 01 2025

Journal Name

Journal of Engineering

Volume

31

Issue Number

8

Regularized K-Means Clustering via Fully Corrective Frank-Wolfe Optimization

Clustering

Regularized k-means

Fully corrective Frank-Wolfe

Ahmed Yacoub Yousif

Basad Al-Sarray

Clustering high-dimensional data remains challenging because traditional k-means is sensitive to noise, outliers, and high dimensionality, often leading to unstable performance. This research presents a robust clustering method that combines the Fully Corrective Frank-Wolfe (FCFW) algorithm with a k-means objective regularized by the Frobenius norm. The Frobenius norm regularization produces more stable clusters, prevents overfitting, and promotes cluster compactness. The proposed method uses probabilistic cluster assignments, so each data point can belong to multiple clusters with different membership levels, which supports clusters with overlapping boundaries. The Kruskal-Wallis test serves as a feature selection method to identify the most informative genes, which then guide the clustering toward important features in high-dimensional datasets. The FCFW-regularized k-means outperforms traditional k-means in all experiments performed on synthetic and real gene expression datasets. On a breast cancer gene expression dataset (GSE10797), it achieved an accuracy of 89.39%, compared to 58% for traditional k-means. Moreover, it surpassed a recent deep subspace clustering method (scPEDSSC) in Adjusted Rand Index (ARI) by 0.083 on the Goolam single-cell dataset (0.968 vs. 0.885) and by 0.072 on the Deng dataset (0.801 vs. 0.729). Overall, the proposed approach attained the highest ARI and Normalized Mutual Information (NMI) scores across five benchmark datasets. These results confirm that the FCFW-regularized k-means yields more accurate and stable clustering results, demonstrating robust performance on high-dimensional data.
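To make the idea of probabilistic assignments under a Frobenius-norm penalty concrete, here is a minimal sketch of a fully corrective Frank-Wolfe update for a single data point with the centroids held fixed. The abstract does not state the exact objective, so the form used here is an assumption: minimize sum_k u_k * d_k + lam * sum_k u_k^2 over the probability simplex, where d_k is the squared distance to centroid k and lam > 0 is a hypothetical regularization weight. The function names `project_simplex` and `fcfw_memberships` are illustrative and do not come from the paper.

```python
def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    s = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for k, s_k in enumerate(s, start=1):
        cumsum += s_k
        t = (cumsum - 1.0) / k
        if s_k - t > 0:  # holds for a prefix of k; keep the last threshold
            theta = t
    return [max(x - theta, 0.0) for x in v]


def fcfw_memberships(sq_dists, lam, tol=1e-9):
    """Soft memberships for one point under the ASSUMED objective
    f(u) = sum_k u_k * d_k + lam * sum_k u_k**2, u in the simplex, lam > 0.
    """
    K = len(sq_dists)
    start = min(range(K), key=lambda k: sq_dists[k])
    u = [0.0] * K
    u[start] = 1.0          # start at the vertex of the nearest centroid
    active = [start]        # vertices (clusters) collected so far
    for _ in range(K + 1):  # the active set can grow at most K times
        grad = [sq_dists[k] + 2.0 * lam * u[k] for k in range(K)]
        # Frank-Wolfe vertex: the unit vector with the smallest gradient entry.
        j = min(range(K), key=lambda k: grad[k])
        # Frank-Wolfe duality gap; near zero means u is optimal.
        gap = sum(g * x for g, x in zip(grad, u)) - grad[j]
        if gap <= tol:
            break
        if j not in active:
            active.append(j)
        # Fully corrective step: minimize f exactly over the convex hull of
        # the active vertices. For this quadratic, that is the projection of
        # -d / (2 * lam), restricted to the active clusters, onto the simplex.
        target = [-sq_dists[k] / (2.0 * lam) for k in active]
        weights = project_simplex(target)
        u = [0.0] * K
        for k, w in zip(active, weights):
            u[k] = w
    return u


soft = fcfw_memberships([0.1, 0.2, 5.0], lam=1.0)
hard = fcfw_memberships([0.1, 0.2, 5.0], lam=0.01)
```

Under these assumptions, `soft` is approximately [0.525, 0.475, 0.0], so the point is shared between the two nearby clusters, while `hard` is [1.0, 0.0, 0.0]. A larger `lam` spreads membership across nearby clusters and a smaller one recovers the hard assignment of traditional k-means. A full method would alternate this step with a centroid update, which is omitted here.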
