更改

层次聚类 (查看源代码)

2020年10月7日 (三) 14:07的版本

大小无更改、 2020年10月7日 (三) 14:07

无编辑摘要

第43行：第43行：

The standard algorithm for hierarchical agglomerative clustering (HAC) has a time complexity of <math>\mathcal{O}(n^3)</math> and requires <math>\mathcal{O}(n^2)</math> memory, which makes it too slow for even medium data sets. However, for some special cases, optimal efficient agglomerative methods (of complexity <math>\mathcal{O}(n^2)</math>) are known: SLINK for single-linkage and CLINK for complete-linkage clustering. With a heap the runtime of the general case can be reduced to <math>\mathcal{O}(n^2 \log n)</math> at the cost of further increasing the memory requirements. In many cases, the memory overheads of this approach are too large to make it practically usable.

−

'''<font color="#ff8000"> 层次凝聚聚类Hierarchical agglomerative clustering</font>'''(HAC)的标准算法的时间复杂度为 < math > mathical { o }(n ^ 3) </math > ，并且需要 < math > mathcal { o }(n ^ 2) </math > 占用内存，这使得它对于中等数据集来说效率太低了。然而，对于某些特殊情况，已知的最佳有效凝聚方法(复杂度 < math > mathcal { o }(n ^ 2) </math >)是: 单连锁的 SLINK < ! ——粗体 wp: r # pla-- > 和完全连锁的 CLINK。对于数据群而言，一般情况下的运行时可以缩减为 < math > mathcal { o }(n ^ 2 log n) </math > ~~，代价是进一步增加内存需求。在许多情况下，这种方法的内存开销太大，并不实用。~~

+

'''<font color="#ff8000"> 层次凝聚聚类Hierarchical agglomerative clustering</font>'''(HAC)的标准算法的时间复杂度为 < math > mathical { o }(n ^ 3) </math > ，并且需要 < math > mathcal { o }(n ^ 2) </math > 占用内存，这使得它对于中等数据集来说效率太低了。然而，对于某些特殊情况，已知的最佳有效凝聚方法(复杂度 < math > mathcal { o }(n ^ 2) </math >)是: 单连锁的 SLINK < ! ——粗体 wp: r # pla-- > 和完全连锁的 CLINK。对于数据群而言，一般情况下的运行时可以缩减为 < math > mathcal { o }(n ^ 2 log n) </math > ，代价是进一步增加内存需求。在多数情况下，这种方法的内存开销太大，并不实用。

CecileLi

526

个编辑

更改

层次聚类 (查看源代码)

2020年10月7日 (三) 14:07的版本

导航菜单

搜索