“聚类系数”的版本间的差异

2020年4月8日 (三) 14:40的版本

图论中，聚集系数用于衡量节点聚集的程度。有证据表明，大多数现实世界的网络中，特别是在社交网络中，节点倾向于创建相对紧密联系的群体; 这种可能性往往大于在两个节点之间随机建立关系的平均概率(Holland 和 Leinhardt，1971; ^[1] Watts 和 Strogatz，1998^[2] )。

全局聚集系数

全局集聚系数以节点三重性（triplets）为基础。一个三元组是由两个(开三元组)或三个(闭三元组)无向联系相连接的三个节点。因此，一个三角形图包括三个闭合的三联体，每个节点中心为一个(注意，这意味着三角形中的三联体来自节点的重叠选择)。全局集聚系数是三联体总数除以闭合三联体(或3 x 三角形)的个数。 Luce和Perry(1949年)第一次尝试对其进行测量。 ^[3]这个度量概括了整个网络(全局)中的集群，并可应用于无向和有向网络(通常称为传递性，见 Wasserman 和 Faust，1994，第243页。^[4] 全局集聚系数的定义是:

[math]\displaystyle{ C = \frac{\mbox{number of closed triplets}}{\mbox{number of all triplets (open and closed)}} }[/math].

闭合三联体的数目在文献中也被称为3×三角形，所以:

[math]\displaystyle{ C = \frac{3\times (\mbox{number of triangles)}}{\mbox{number of all triplets}} }[/math].

Opsahl 和 Panzarasa (2009)提出了加权网络的一般化，^[5]和一种双模式网络(包括二进制和加权)。

局部聚集系数

图中某顶点(节点)的局部集聚系数量化了其邻居节点相互聚集构成团(完全图)的程度。 Duncan J. Watts 和 Steven Strogatz 在1998年引入该测量方法来确定一个图是否构成小世界网络。

一个图 G=(V,E)，形式上由一组节点和节点之间的连边组成。边连接节点。一个节点[math]\displaystyle{ v_{i} }[/math] 的邻域[math]\displaystyle{ N_{i} }[/math]被定义为其紧邻的节点，具体如下:

[math]\displaystyle{ N_{i}=\{v_{j}:e_{ij}\in E\lor e_{ji}\in E\}. }[/math]

我们将 [math]\displaystyle{ k_{i} }[/math]定义为节点的个数，|[math]\displaystyle{ N_{i} }[/math]| 为邻域，[math]\displaystyle{ N_{i} }[/math]为邻域节点数目。

一个节点[math]\displaystyle{ v_{i} }[/math]的局部集聚系数[math]\displaystyle{ C_{i} }[/math]由邻域内节点之间的连边除以它们之间可能存在的连边数量的比例给出。对于有向图，[math]\displaystyle{ e_{ij} }[/math]不同于[math]\displaystyle{ e_{ji} }[/math] ，因此对于每个邻域[math]\displaystyle{ N_{i} }[/math] ，邻域内的节点之间可能存在[math]\displaystyle{ k_{i}(k_{i}-1) }[/math]链([math]\displaystyle{ k_{i} }[/math]是一个顶点的邻域数)。因此，有向图的局部集聚系数为^[2]

[math]\displaystyle{ C_i = \frac{|\{e_{jk}: v_j,v_k \in N_i, e_{jk} \in E\}|}{k_i(k_i-1)}. }[/math]

无向图的[math]\displaystyle{ e_{ij} }[/math]和[math]\displaystyle{ e_{ji} }[/math]被认为具有相同的性质。因此，如果一个顶点有[math]\displaystyle{ v_{i} }[/math]邻域， [math]\displaystyle{ k_i }[/math]边可以存在于邻域内的顶点之间。因此，无向图的局部集聚系数可以定义为

[math]\displaystyle{ C_i = \frac{2|\{e_{jk}: v_j,v_k \in N_i, e_{jk} \in E\}|}{k_i(k_i-1)}. }[/math]

设[math]\displaystyle{ \lambda_G{(v)} }[/math]是无向图[math]\displaystyle{ G }[/math]上的三角形个数[math]\displaystyle{ v \in V(G) }[/math]。也就是说，[math]\displaystyle{ \lambda_G{(v)} }[/math]是 [math]\displaystyle{ G }[/math]的3条边和3个节点的子图的个数，其中一个是[math]\displaystyle{ v }[/math]。设[math]\displaystyle{ \tau_G{(v)} }[/math]是[math]\displaystyle{ v \in V(G) }[/math]上的三倍数。即，[math]\displaystyle{ \tau_G{(v)} }[/math]具有2条边和3个顶点的子图(不一定是诱导的)的个数，其中一条边与两条边相关。那么我们也可以将集聚系数定义为

[math]\displaystyle{ C_{i} = \frac{\lambda_G{(v)}}{\tau_G{(v)}}. }[/math]

易证明前面的两个定义是相同的，因为

[math]\displaystyle{ \tau_G(v) = C({k_i},2) = \frac{1}{2}k_i(k_i-1). }[/math]

如果所连接的每个邻居也连接到邻近的每个其他节点[math]\displaystyle{ v_{i} }[/math]，则这些测量值为1; 如果没有任何节点[math]\displaystyle{ v_{i} }[/math]连接到所连接的任何其他节点[math]\displaystyle{ v_{i} }[/math]，则这些测量值为0。

网络整体聚类水平

作为全局聚类集聚系数的代替，Watts 和 Strogatz ^[2]将所有顶点局部聚类系数的平均值作为网络整体聚类水平[math]\displaystyle{ n }[/math] :^[6]

[math]\displaystyle{ \bar{C} = \frac{1}{n}\sum_{i=1}^{n} C_i. }[/math]

值得注意的是，该度量将更多权重放在低度节点上，而传递性比率将更多权重放在高度节点上。事实上，每个局部聚类分数由[math]\displaystyle{ k_{i}(k_{i}-1) }[/math]的加权平均数模型与全局聚类集聚系数模型是相同的。如果图有一个小的平均最短路径长度与网络中节点数量的自然对数[math]\displaystyle{ \ ln{{N}} }[/math]一起延展^[7]，那么这个图被认为是小世界，。例如，随机图是小世界图，而格子图不是，无标度图通常被认为是超小世界图，因为它们的平均最短路径长度随[math]\displaystyle{ \ln{\ln{N}} }[/math]延展。 Barrat 等人提出了广义的加权网络(2004) ，^[8]，Latapy 等人(2008)^[9]和 Opsahl (2009).<>重新定义了二部图(也称为双模网络)

Fagiolo (2007)^[10] , Clemente 和Grassi (2018)^[11] 提出了另一种加权有向图网络的推广概念。默认情况下，这个公式不适用于具有孤立顶点的图; 参见 Kaiser (2008)^[12]和 Barmpoutis 等^[13] . 具有最大可能平均集聚系数的网络被发现具有模块结构，且不同节点之间存在尽可能小的平均距离。^[13]

参考链接

↑ P. W. Holland and S. Leinhardt (1971). "Transitivity in structural models of small groups". Comparative Group Studies. 2 (2): 107–124. doi:10.1177/104649647100200201.
↑ ^2.0 ^2.1 ^2.2 D. J. Watts and Steven Strogatz (June 1998). "Collective dynamics of 'small-world' networks". Nature. 393 (6684): 440–442. Bibcode:1998Natur.393..440W. doi:10.1038/30918. PMID 9623998.
↑ R. D. Luce and A. D. Perry (1949). "A method of matrix analysis of group structure". Psychometrika. 14 (1): 95–116. doi:10.1007/BF02289146. PMID 18152948.
↑ Stanley Wasserman, Katherine Faust, 1994. Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press.
↑ Tore Opsahl and Pietro Panzarasa (2009). "Clustering in Weighted Networks". Social Networks. 31 (2): 155–163. doi:10.1016/j.socnet.2009.02.002.
↑ Kemper, Andreas (2009). Valuation of Network Effects in Software Markets: A Complex Networks Approach. Springer. p. 142. ISBN 9783790823660. https://books.google.com/books?id=isdza0IiXbUC&pg=PA142.
↑ http://networksciencebook.com/4#ultra-small
↑ Barrat, A.; Barthelemy, M.; Pastor-Satorras, R.; Vespignani, A. (2004). "The architecture of complex weighted networks". Proceedings of the National Academy of Sciences. 101 (11): 3747–3752. arXiv:cond-mat/0311416. Bibcode:2004PNAS..101.3747B. doi:10.1073/pnas.0400087101. PMC 374315. PMID 15007165.
↑ Latapy, M.; Magnien, C.; Del Vecchio, N. (2008). "Basic Notions for the Analysis of Large Two-mode Networks". Social Networks. 30 (1): 31–48. doi:10.1016/j.socnet.2007.04.006.
↑ Fagiolo, G. (2007). "Clustering in complex directed networks". Physical Review E. 76 (2 Pt 2): 026107. CiteSeerX 10.1.1.262.1006. doi:10.1103/PhysRevE.76.026107. PMID 17930104.
↑ Clemente, G.P.; Grassi, R. (2018). "Directed clustering in weighted networks: A new perspective". Chaos, Solitons & Fractals. 107: 26–38. arXiv:1706.07322. Bibcode:2018CSF...107...26C. doi:10.1016/j.chaos.2017.12.007.
↑ Kaiser, Marcus (2008). "Mean clustering coefficients: the role of isolated nodes and leafs on clustering measures for small-world networks". New Journal of Physics. 10 (8): 083042. arXiv:0802.2512. Bibcode:2008NJPh...10h3042K. doi:10.1088/1367-2630/10/8/083042.
↑ ^13.0 ^13.1 Barmpoutis, D.; Murray, R. M. (2010). "Networks with the Smallest Average Distance and the Largest Average Clustering". arXiv:1007.4031 [q-bio.MN].

本中文词条由Gravity PHY用户参与编译，刘佩佩用户审校，欢迎在讨论页面留言。

本词条内容源自wikipedia及公开资料，遵守 CC3.0协议。

[HollandLeinhardt1971-1] P. W. Holland and S. Leinhardt (1971). "Transitivity in structural models of small groups". Comparative Group Studies. 2 (2): 107–124. doi:10.1177/104649647100200201.

[WattsStrogatz1998-2] 2.0 ^2.1 ^2.2 D. J. Watts and Steven Strogatz (June 1998). "Collective dynamics of 'small-world' networks". Nature. 393 (6684): 440–442. Bibcode:1998Natur.393..440W. doi:10.1038/30918. PMID 9623998.

[3] R. D. Luce and A. D. Perry (1949). "A method of matrix analysis of group structure". Psychometrika. 14 (1): 95–116. doi:10.1007/BF02289146. PMID 18152948.

[4] Stanley Wasserman, Katherine Faust, 1994. Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press.

[5] Tore Opsahl and Pietro Panzarasa (2009). "Clustering in Weighted Networks". Social Networks. 31 (2): 155–163. doi:10.1016/j.socnet.2009.02.002.

[6] Kemper, Andreas (2009). Valuation of Network Effects in Software Markets: A Complex Networks Approach. Springer. p. 142. ISBN 9783790823660. https://books.google.com/books?id=isdza0IiXbUC&pg=PA142.

[7] ttp://networksciencebook.com/4#ultra-small

[8] Barrat, A.; Barthelemy, M.; Pastor-Satorras, R.; Vespignani, A. (2004). "The architecture of complex weighted networks". Proceedings of the National Academy of Sciences. 101 (11): 3747–3752. arXiv:cond-mat/0311416. Bibcode:2004PNAS..101.3747B. doi:10.1073/pnas.0400087101. PMC 374315. PMID 15007165.

[9] Latapy, M.; Magnien, C.; Del Vecchio, N. (2008). "Basic Notions for the Analysis of Large Two-mode Networks". Social Networks. 30 (1): 31–48. doi:10.1016/j.socnet.2007.04.006.

[10] Fagiolo, G. (2007). "Clustering in complex directed networks". Physical Review E. 76 (2 Pt 2): 026107. CiteSeerX 10.1.1.262.1006. doi:10.1103/PhysRevE.76.026107. PMID 17930104.

[11] Clemente, G.P.; Grassi, R. (2018). "Directed clustering in weighted networks: A new perspective". Chaos, Solitons & Fractals. 107: 26–38. arXiv:1706.07322. Bibcode:2018CSF...107...26C. doi:10.1016/j.chaos.2017.12.007.

[12] Kaiser, Marcus (2008). "Mean clustering coefficients: the role of isolated nodes and leafs on clustering measures for small-world networks". New Journal of Physics. 10 (8): 083042. arXiv:0802.2512. Bibcode:2008NJPh...10h3042K. doi:10.1088/1367-2630/10/8/083042.

[BarmpoutisMurray2010-13] 13.0 ^13.1 Barmpoutis, D.; Murray, R. M. (2010). "Networks with the Smallest Average Distance and the Largest Average Clustering". arXiv:1007.4031 [q-bio.MN].

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

@@ 第26行： / 第26行： @@
 全局集聚系数以节点三重性（triplets）为基础。 一个三元组是由两个(开三元组)或三个(闭三元组)无向联系相连接的三个节点。 因此，一个三角形图包括三个闭合的三联体，每个节点中心为一个(注意，这意味着三角形中的三联体来自节点的重叠选择)。 全局集聚系数是三联体总数除以闭合三联体(或3 x 三角形)的个数。 Luce和Perry(1949年)第一次尝试对其进行测量。 <ref>{{Cite journal
-  | author = [[R. D. Luce]] and [[A. D. Perry]]
+  | author = R. D. Luce and A. D. Perry
   | title = A method of matrix analysis of group structure
   | year = 1949
@@ 第45行： / 第45行： @@
 Opsahl 和 Panzarasa (2009)提出了加权网络的一般化，<ref>{{Cite journal
-  | author = [[Tore Opsahl]] and [[Pietro Panzarasa]]
+  | author = Tore Opsahl and Pietro Panzarasa
   | title = Clustering in Weighted Networks
   | url = http://toreopsahl.com/2009/04/03/article-clustering-in-weighted-networks/