* Some methods attempt to speed up each k-means step using the triangle inequality.<ref name="phillips2" /><ref name="elkan2" /><ref name="hamerly22" /><ref>{{Cite journal |last=Drake |first=Jonathan |date=2012 |title=Accelerated ''k''-means with adaptive distance bounds |url=http://opt.kyb.tuebingen.mpg.de/papers/opt2012_paper_13.pdf |journal=The 5th NIPS Workshop on Optimization for Machine Learning, OPT2012 }}</ref><ref name="hamerly32" />
* Escape from local optima by swapping points between clusters.<ref name="hartigan19792" />
* The [[球形k均值聚类 Spherical k-means clustering|Spherical k-means clustering]] algorithm is suited to textual data (a minimal sketch appears after this list).<ref>{{Cite journal |last1=Dhillon |first1=I. S. |last2=Modha |first2=D. M. |year=2001 |title=Concept decompositions for large sparse text data using clustering |journal=Machine Learning |volume=42 |issue=1 |pages=143&ndash;175 |doi=10.1023/a:1007612920971 |doi-access=free }}</ref>
* Hierarchical variants such as [[二分k均值 Bisecting k-means|Bisecting k-means]],<ref>{{cite journal | last1 = Steinbach | first1 = M. | last2 = Karypis | first2 = G. | last3 = Kumar | first3 = V. | year = 2000 | title = A comparison of document clustering techniques | journal = KDD Workshop on Text Mining | volume = 400 | issue = 1 | pages = 525–526 }}</ref> [[X均值聚类  X-means clustering|X-means clustering]]<ref>Pelleg, D.; & Moore, A. W. (2000, June). "[http://cs.uef.fi/~zhao/Courses/Clustering2012/Xmeans.pdf X-means: Extending ''k''-means with Efficient Estimation of the Number of Clusters]". In ''ICML'', Vol. 1</ref> and [[G均值聚类 G-means clustering|G-means clustering]]<ref>{{cite journal | last1 = Hamerly | first1 = Greg | last2 = Elkan | first2 = Charles | year = 2004 | title = | journal = Advances in Neural Information Processing Systems | volume = 16 | page = 281 }}</ref> repeatedly split clusters to build a hierarchy, and can also try to determine the optimal number of clusters in a dataset automatically.
* Internal cluster evaluation measures such as the cluster silhouette can help determine the number of clusters.
* [[Minkowski加权k均值 Minkowski weighted k-means|Minkowski weighted k-means]]: automatically calculates cluster-specific feature weights, supporting the intuitive idea that a feature may have different degrees of relevance in different clusters.<ref>{{Cite journal |last1=Amorim |first1=R. C. |last2=Mirkin |first2=B. |year=2012 |title=Minkowski Metric, Feature Weighting and Anomalous Cluster Initialisation in ''k''-Means Clustering |journal=Pattern Recognition |volume=45 |issue=3 |pages=1061&ndash;1075 |doi=10.1016/j.patcog.2011.08.012 }}</ref> These weights can also be used to re-scale a given data set, increasing the likelihood that a cluster validity index will be optimized at the expected number of clusters.<ref>{{Cite journal |last1=Amorim |first1=R. C. |last2=Hennig |first2=C. |year=2015 |title=Recovering the number of clusters in data sets with noise features using feature rescaling factors |journal=Information Sciences |volume=324 |pages=126&ndash;145 |arxiv=1602.06989 |doi=10.1016/j.ins.2015.06.039 }}</ref>
* Mini-batch k-means: uses "mini batch" samples for data sets that do not fit into memory (see the sketch after this list).<ref>{{Cite conference |last=Sculley |first=David |date=2010 |title=Web-scale ''k''-means clustering |url=http://dl.acm.org/citation.cfm?id=1772862 |publisher=ACM |pages=1177–1178 |accessdate=2016-12-21 |booktitle=Proceedings of the 19th international conference on World Wide Web }}</ref>
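A minimal sketch of the mini-batch update idea, assuming dense NumPy arrays, squared Euclidean distance and a fixed batch size; the function name <code>mini_batch_kmeans</code> and its parameters are illustrative and not taken from the cited paper.

<syntaxhighlight lang="python">
import numpy as np

def mini_batch_kmeans(X, k, batch_size=100, n_iters=100, seed=0):
    """Rough sketch of mini-batch k-means style updates (not a reference implementation)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)  # random initial centres
    counts = np.zeros(k)                                                  # points seen per centre
    for _ in range(n_iters):
        batch = X[rng.choice(len(X), size=batch_size, replace=False)]
        # Assign every batch point to its nearest centre (squared Euclidean distance).
        dists = ((batch[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each selected centre towards the point with a per-centre step size of 1/count,
        # so frequently updated centres take smaller steps.
        for x, j in zip(batch, labels):
            counts[j] += 1
            centers[j] += (x - centers[j]) / counts[j]
    return centers
</syntaxhighlight>

Because each iteration only touches <code>batch_size</code> points, the data set itself never needs to be held in memory all at once; a real implementation would stream the batches from disk.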
      
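A similarly hedged sketch of the spherical k-means idea referenced above: rows are normalised to unit length and assignment uses cosine similarity instead of Euclidean distance. It assumes dense, non-zero NumPy rows (real text data would typically be sparse TF-IDF vectors), and the names are again illustrative.

<syntaxhighlight lang="python">
import numpy as np

def spherical_kmeans(X, k, n_iters=20, seed=0):
    """Rough sketch of spherical k-means: cluster directions on the unit sphere."""
    rng = np.random.default_rng(seed)
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)       # project rows onto the unit sphere
    centers = Xn[rng.choice(len(Xn), size=k, replace=False)].copy()
    for _ in range(n_iters):
        labels = (Xn @ centers.T).argmax(axis=1)             # cosine similarity = dot product here
        for j in range(k):
            members = Xn[labels == j]
            if len(members):                                 # keep the old centre if a cluster empties
                mean_dir = members.mean(axis=0)
                centers[j] = mean_dir / np.linalg.norm(mean_dir)  # re-normalise the mean direction
    labels = (Xn @ centers.T).argmax(axis=1)                 # final assignment with updated centres
    return centers, labels
</syntaxhighlight>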
===Hartigan-Wong method===