An Improved Density Peak Clustering Algorithm Based on Chebyshev Inequality and Differential Privacy
Hua Chen, Yuan Zhou, Kehui Mei, Nan Wang, Mengdi Tang, Guangxing Cai- Fluid Flow and Transfer Processes
- Computer Science Applications
- Process Chemistry and Technology
- General Engineering
- Instrumentation
- General Materials Science
This study aims to improve the quality of the clustering results of the density peak clustering (DPC) algorithm and address the privacy protection problem in the clustering analysis process. To achieve this, a DPC algorithm based on Chebyshev inequality and differential privacy (DP-CDPC) is proposed. Firstly, the distance matrix is calculated using cosine distance instead of Euclidean distance when dealing with high-dimensional datasets, and the truncation distance is automatically calculated using the dichotomy method. Secondly, to solve the difficulty in selecting suitable clustering centers in the DPC algorithm, statistical constraints are constructed from the perspective of the decision graph using Chebyshev inequality, and the selection of clustering centers is achieved by adjusting the constraint parameters. Finally, to address the privacy leakage problem in the cluster analysis, the Laplace mechanism is applied to introduce noise to the local density in the process of cluster analysis, enabling the privacy protection of the algorithm. The experimental results demonstrate that the DP-CDPC algorithm can effectively select the clustering centers, improve the quality of clustering results, and provide good privacy protection performance.