Abstract: The K-means clustering algorithm is one of the most widely used methods in cluster analysis. However, it is well known to be sensitive to its initial starting conditions (the initial clustering and the order in which instances are presented). For a more detailed discussion of initialization methods, see [1]. Another weakness of the K-means algorithm is that the number of clusters, K, must be supplied by the user as a parameter. This paper presents a new validity measure for K-means clustering, together with an optimized K-means algorithm, that allows the optimal number of clusters to be determined automatically.
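The general idea of choosing K with a validity measure can be illustrated with a minimal sketch. It is not the index proposed in this paper, whose definition does not appear in the abstract. As a stand-in it assumes a simple intra-to-inter ratio (mean squared distance of each point to its nearest center, divided by the smallest squared distance between two centers), where smaller values indicate more compact and better separated clusters. To sidestep the sensitivity to starting conditions, the sketch also assumes a deterministic farthest-first initialization. All function names (`farthest_first`, `kmeans`, `validity`, `choose_k`) are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centers):
    """Index of the center closest to p (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))


def farthest_first(points, k):
    """Assumed initialization: start from the first point, then repeatedly
    add the point whose distance to its closest chosen center is largest."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd-style K-means; returns the final list of centers."""
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        updated = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers


def validity(points, centers):
    """Stand-in index (an assumption, not the paper's measure):
    intra-cluster compactness divided by minimum inter-center separation."""
    intra = sum(sq_dist(p, centers[nearest(p, centers)]) for p in points) / len(points)
    inter = min(
        sq_dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]
    )
    return intra / inter if inter > 0 else float("inf")


def choose_k(points, k_min=2, k_max=10):
    """Run K-means for each candidate K and return the K with the smallest
    validity value, together with all scores."""
    scores = {k: validity(points, kmeans(points, k)) for k in range(k_min, k_max + 1)}
    return min(scores, key=scores.get), scores
```

Calling `choose_k(points, 2, 5)` on a list of coordinate tuples returns the selected K and a dictionary of scores per candidate K. Points must be tuples, and `k_max` must not exceed the number of distinct points. Because a ratio of this kind tends to keep shrinking as K approaches the number of points, the candidate range should be bounded well below that.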
[1] Usama M. Fayyad, Cory A. Reina, Paul S. Bradley, Initialization of Iterative Refinement Clustering Algorithms [C]. Proc. 4th International Conf. on Knowledge Discovery & Data Mining, 1998.
[2] Pena J M, J. A. Lozano, and P. Larranaga, An Empirical Comparison of Four Initialization Methods for the K-Means Algorithm [J]. Pattern Recognition Letters, 1999, 20: 1027-1040.
[3] Pal N R and J. C. Bezdek, On Cluster Validity for the Fuzzy c-Means Model [J]. IEEE Transactions on Fuzzy Systems, 1995, 3: 370-390.
[4] Rezaee M R, B P F Lelieveldt and J. H. C. Reiber, A New Cluster Validity Index for Fuzzy c-Means [J]. Pattern Recognition Letters, 1998, 19: 237-246.
[5] Ray S and R H Turi, Determination of Number of Clusters in K-Means Clustering and Application in Colour Image Segmentation [C]. ICAPRDT'99, Calcutta, India, 27-29 December, 1999.
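Basic information:
DOI: 10.16191/j.cnki.hbkx.2003.04.003
Chinese Library Classification (CLC) number: O151.21
Citation:
[1] Li Shuanghu (李双虎), Wang Tiehong (王铁洪). A new validity index for determining the number of clusters in the K-means clustering algorithm [J]. Journal of the Hebei Academy of Sciences (河北省科学院学报), 2003(04): 199-202. DOI: 10.16191/j.cnki.hbkx.2003.04.003.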
