Publication year: [2022]
Paper information |
|
Paper title (Korean) |
[Vol.17, No.1] An Efficient Bucket Clustering to Improve Performance of K-means |
|
Author |
Su-Young Han |
|
Abstract |
In this paper, we present an efficient clustering technique that finds clusters with a single scan of a data set. Clustering is a data mining technique that is useful for group analysis of data, such as customer data analysis and voter analysis, and for building target marketing strategies. The most commonly used method, k-means, searches iteratively for the optimal clusters and is used to find high-quality clusters. By applying data compression, the proposed method runs faster than the existing k-means while aiming to find clusters of the same quality. We evaluated the algorithm's performance experimentally using real data from the KDD data mining competition. The proposed method produces clusters of almost the same quality as the existing k-means method and reduces the number of data points processed through bucket compression, making it possible to cluster large amounts of data more efficiently. Although the value of data and the utility of analysis are being maximized in today's industrial environment, the complexity and performance of analysis algorithms remain a persistent problem for large-scale data. The proposed method can be expected to be a useful analysis algorithm even on the latest data platforms, such as big data platforms. |
|
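The abstract does not spell out how buckets are built, so the following is only a rough sketch of the general idea (compress the data into buckets, then cluster the bucket summaries), not the paper's algorithm. It assumes a uniform grid as the bucketing scheme, and the names `bucket_compress`, `weighted_kmeans`, and `bins` are hypothetical. Each bucket keeps the mean and count of its points, and a weighted k-means then runs over the bucket means only. Here the feature ranges are computed up front for simplicity; a strict single-scan version would need them to be known in advance.

```python
import numpy as np

def bucket_compress(X, bins=8):
    """Map each point to a uniform grid cell (assumed bucketing scheme)
    and return the per-bucket mean and the per-bucket point count."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)          # avoid division by zero
    cells = np.minimum(((X - lo) / span * bins).astype(int), bins - 1)
    _, inverse, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)                   # 1-D across NumPy versions
    sums = np.zeros((counts.size, X.shape[1]))
    np.add.at(sums, inverse, X)                     # accumulate per-bucket sums
    return sums / counts[:, None], counts

def weighted_kmeans(P, w, k, iters=50, seed=0):
    """Lloyd's k-means over bucket means P, weighted by bucket counts w."""
    rng = np.random.default_rng(seed)
    centers = P[rng.choice(len(P), size=k, replace=False, p=w / w.sum())]
    for _ in range(iters):
        dist = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            mask = labels == j
            if mask.any():                          # keep old center if empty
                new_centers[j] = np.average(P[mask], axis=0, weights=w[mask])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Synthetic example: two well-separated blobs of 500 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (500, 2)), rng.normal(5, 0.3, (500, 2))])
P, w = bucket_compress(X, bins=8)
centers, labels = weighted_kmeans(P, w, k=2)
```

Because each bucket stores its exact mean and count, the weighted mean of the buckets assigned to a cluster equals the mean of the original points in those buckets. The iterative step therefore touches at most `bins ** d` summaries (for `d` features) instead of every data point, at the cost of assigning all points in a bucket to the same cluster.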