Changes

4 bytes added, 14:25, 7 October 2020 (Wednesday)
No edit summary
Line 67:
In order to decide which clusters should be combined (for agglomerative), or where a cluster should be split (for divisive), a measure of dissimilarity between sets of observations is required. In most methods of hierarchical clustering, this is achieved by use of an appropriate metric (a measure of distance between pairs of observations), and a linkage criterion which specifies the dissimilarity of sets as a function of the pairwise distances of observations in the sets.
 
   −
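To decide which clusters should be combined (for agglomerative), or which clusters should be split (for divisive), a measure of dissimilarity between sets of observations is required. In most hierarchical clustering methods, this is achieved by using an appropriate metric (a measure of distance between pairs of observations) and a linkage criterion, which specifies the dissimilarity of sets as a function of the pairwise distances of observations in the sets.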
+
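To decide which clusters should be combined (for agglomerative), or which clusters should be separated (for divisive), a measure of dissimilarity between sets of observations is required. In most hierarchical clustering methods, this is achieved by using an appropriate metric (a measure of distance between pairs of observations) and a linkage criterion, which specifies the dissimilarity of sets as a function of the pairwise distances of observations in the sets.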
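
As a rough illustration of how a linkage criterion turns pairwise distances into a dissimilarity between sets, the Python sketch below implements three commonly used criteria (single, complete, and average linkage). These three criteria, the helper names, and the sample points are illustrative assumptions; the passage above does not name a specific criterion.

```python
import math
from itertools import product


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pairwise_distances(set_a, set_b, metric):
    """All distances between one observation in set_a and one in set_b."""
    return [metric(a, b) for a, b in product(set_a, set_b)]


def single_linkage(set_a, set_b, metric=euclidean):
    """Dissimilarity of two sets = smallest pairwise distance."""
    return min(pairwise_distances(set_a, set_b, metric))


def complete_linkage(set_a, set_b, metric=euclidean):
    """Dissimilarity of two sets = largest pairwise distance."""
    return max(pairwise_distances(set_a, set_b, metric))


def average_linkage(set_a, set_b, metric=euclidean):
    """Dissimilarity of two sets = mean of all pairwise distances."""
    distances = pairwise_distances(set_a, set_b, metric)
    return sum(distances) / len(distances)


# Hypothetical sample clusters, chosen only for illustration.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))
print(complete_linkage(cluster_a, cluster_b))
print(average_linkage(cluster_a, cluster_b))
```

Because the metric is passed in as a parameter, the same linkage functions can be reused with any other pairwise distance.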
Line 81:
The choice of an appropriate metric will influence the shape of the clusters, as some elements may be close to one another according to one distance and farther away according to another. For example, in a 2-dimensional space, the distance between the point (1,0) and the origin (0,0) is always 1 according to the usual norms, but the distance between the point (1,1) and the origin (0,0) can be 2 under Manhattan distance, <math>\scriptstyle\sqrt{2}</math> under Euclidean distance, or 1 under maximum distance.
 
   −
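Choosing an appropriate metric will influence the shape of the clusters, because some elements may be close to one another according to one distance and farther away according to another. For example, in a two-dimensional space, the distance between the point (1,0) and the origin (0,0) is always 1 under the usual norms, but the distance between the point (1,1) and the origin (0,0) can be 2 under Manhattan distance, <math>\scriptstyle\sqrt{2}</math> under Euclidean distance, or 1 under maximum distance.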
+
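The choice of metric will influence the shape of the clusters, because some elements may be close to one another under one distance and far apart under another. For example, in a two-dimensional space, the distance between the point (1,0) and the origin (0,0) is always 1 under the usual norms, but the distance between the point (1,1) and the origin (0,0) can be 2 under Manhattan distance, <math>\scriptstyle\sqrt{2}</math> under Euclidean distance, or 1 under maximum distance.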
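
The worked example in this passage can be checked with a short Python sketch. The function names are illustrative assumptions; "maximum distance" is implemented here as the largest coordinate-wise difference (often called Chebyshev distance).

```python
import math


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def maximum(p, q):
    """Largest absolute coordinate difference (Chebyshev distance)."""
    return max(abs(a - b) for a, b in zip(p, q))


origin = (0, 0)
for point in [(1, 0), (1, 1)]:
    print(
        point,
        manhattan(point, origin),
        euclidean(point, origin),
        maximum(point, origin),
    )
```

For (1,0) all three metrics agree, while for (1,1) they give three different values, which is why the choice of metric can change which observations are treated as near neighbours.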
Line 440:
* The increment of some cluster descriptor (i.e., a quantity defined for measuring the quality of a cluster) after merging two clusters.<ref>Zhang, et al. "Agglomerative clustering via maximum incremental path integral." Pattern Recognition (2013).</ref><ref>Zhao, and Tang. "Cyclizing clusters via zeta function of a graph." Advances in Neural Information Processing Systems. 2008.</ref><ref>Ma, et al. "Segmentation of multivariate mixed data via lossy data coding and compression." IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(9) (2007): 1546-1562.</ref>
 
  −
      
== Discussion ==
 