Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Parallel Clustering Algorithm for the k-medoids Problem in High-dimensional Space for Large-scale Datasets. / Vandanov, Sergey; Plyasunov, Aleksandr; Ushakov, Anton.
Proceedings - 2023 19th International Asian School-Seminar on Optimization Problems of Complex Systems, OPCS 2023. Institute of Electrical and Electronics Engineers Inc., 2023. p. 119-124 (Proceedings - 2023 19th International Asian School-Seminar on Optimization Problems of Complex Systems, OPCS 2023).
TY - GEN
T1 - Parallel Clustering Algorithm for the k-medoids Problem in High-dimensional Space for Large-scale Datasets
AU - Vandanov, Sergey
AU - Plyasunov, Aleksandr
AU - Ushakov, Anton
N1 - The study of the second author was carried out within the framework of the state contract of the Sobolev Institute of Mathematics (project FWNF-2022-0019). The research of the third author was funded by the Ministry of Education and Science of the Russian Federation No. 121041300065-9. Publication for correction.
PY - 2023
Y1 - 2023
AB - We present a robust, parallel primal-dual heuristic algorithm for the k-medoids clustering problem, a widely utilized method in data mining and machine learning. Our approach surpasses current algorithms by effectively addressing their limitations, such as time-consuming distance matrix calculations, inefficient nearest-neighbor searches, and difficulties in handling large-scale datasets. To overcome these challenges, we employ an efficient parallel implementation, combined with a pioneering subgradient search algorithm. We evaluate our algorithm on the BIRCH and Stanford Dog datasets and demonstrate its superiority over existing k-medoids clustering algorithms in terms of solution quality and run time. Additionally, we introduce a novel vectorization technique that enables our algorithm to handle various types of data, such as images, text, and point data. Overall, our work contributes to the field of data mining and machine learning by providing an efficient and effective solution for the k-medoids clustering problem. The proposed algorithm offers improved performance and versatility, making it a valuable tool for a wide range of applications.
KW - Lagrangian relaxation
KW - clustering
KW - facility location
KW - k-medoids
KW - machine learning
KW - p-median
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85175470838&origin=inward&txGid=cfcb79323bf771c75ad0414650c230f6
UR - https://www.mendeley.com/catalogue/5f1f923e-80cf-31f2-89ea-8da97cd6bde6/
U2 - 10.1109/OPCS59592.2023.10275752
DO - 10.1109/OPCS59592.2023.10275752
M3 - Conference contribution
SN - 9798350331134
T3 - Proceedings - 2023 19th International Asian School-Seminar on Optimization Problems of Complex Systems, OPCS 2023
SP - 119
EP - 124
BT - Proceedings - 2023 19th International Asian School-Seminar on Optimization Problems of Complex Systems, OPCS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
ER -
ID: 59187926