“LFR算法”的版本间的差异

2020年11月22日 (日) 23:23的版本

兰奇基内蒂-福图纳托-拉迪奇基准测试 Lancichinetti–Fortunato–Radicchi benchmark （LFR）是一种生成基准网络(baseline network)（类似于真实世界网络的人工网络）的算法。他们有一个预先已知的社区，用于比较不同的社区检测方法。^[1]与其他方法相比，基准测试的优点在于它解释了节点度分布(node degree distribution)和社区规模分布(Community size distribution)的异质性。^[2]

算法

节点度(node degree)和社区规模(Community size)按幂律分布，但指数不同。基准测试假设节点度和社区规模都具有不同指数的幂律分布，分别为[math]\displaystyle{ \gamma }[/math]和[math]\displaystyle{ \beta }[/math]。[math]\displaystyle{ N }[/math]是节点的数量，平均度为[math]\displaystyle{ \langle k \rangle }[/math]。混合参数[math]\displaystyle{ \mu }[/math]是一个节点的相邻节点的平均比例，这些相邻节点不属于基准节点所属的任何社区。这个参数控制着社区之间的边缘比例。^[3]

生成基准网络的步骤如下：

步骤1：生成一个网络，其节点遵循指数为[math]\displaystyle{ \gamma }[/math]的幂律分布，并选择分布的极值[math]\displaystyle{ k_{\min} }[/math]和[math]\displaystyle{ k_{\max} }[/math]来获得期望平均度[math]\displaystyle{ \langle k\rangle }[/math]。

步骤2：每个节点的[math]\displaystyle{ (1 - \mu) }[/math]链接部分与同一社区的节点相同，而[math]\displaystyle{ \mu }[/math]部分与其他节点相同。

步骤3：根据指数为[math]\displaystyle{ \beta }[/math]的幂律分布生成社区规模。所有规模大小的和必须等于[math]\displaystyle{ N }[/math]。最小和最大的[math]\displaystyle{ s_{\min} }[/math][math]\displaystyle{ s_{\max} }[/math]必须满足社区的定义，这样每个非孤立的节点至少存在于一个群落中：[math]\displaystyle{ s_{\min} \gt k_{\min} }[/math] ； [math]\displaystyle{ s_{\max} \gt k_{\max} }[/math]

步骤4：最初，没有为任何社区分配任何节点。然后，每个节点被随机分配到一个社区。只要社区内相邻节点的数量不超过社区规模，就会向社区添加一个新节点，否则就不会添加。在接下来的迭代中，无归属的节点被随机分配给某个社区。如果该社区是完备的，即规模已经用尽，必须随机选择社区中的一个节点并断开其链接。当所有社区都完备且所有节点都至少属于一个社区时停止迭代。

步骤5：对节点重新布线，保持相同的节点度，但只影响内部和外部链接，使得每个节点在社区外的链接数量约等于混合参数[math]\displaystyle{ \mu }[/math]。^[2]

调试

考虑社区的一个不重叠分割。每次迭代中随机选择的节点的社区遵循一个[math]\displaystyle{ p(C) }[/math]分布，这个分布表示随机选择的节点来自社区[math]\displaystyle{ C }[/math]的概率。考虑同一个网络的一个分割，这个分割由一些社区搜索算法预测得出，并且具有[math]\displaystyle{ p(C_2) }[/math]分布。基准分割具有[math]\displaystyle{ p(C_1) }[/math]分布。

联合分布为[math]\displaystyle{ p(C_1, C_2) }[/math]。这两个分割的相似性可以通过归一化互信息得到。

[math]\displaystyle{ I_n = \frac{\sum_{C_1,C_2} p(C_1,C_2) \log_2 \frac{p(C_1,C_2)}{p(C_1)p(C_2)} }{\frac 1 2 H(\{p(C_1)\}) + \frac 1 2 H(\{p(C_2)\})} }[/math]

如果[math]\displaystyle{ I_n=1 }[/math]基准和检测到的分割相同，且[math]\displaystyle{ I_n=0 }[/math]，那么它们彼此独立。^[4]

参考文献

↑ Hua-Wei Shen (2013). "Community Structure of Complex Networks". Springer Science & Business Media. 11–12.
↑ ^2.0 ^2.1 A. Lancichinetti, S. Fortunato, and F. Radicchi.(2008) Benchmark graphs for testing community detection algorithms. Physical Review E, 78. 模板:ArXiv
↑ Twan van Laarhoven and Elena Marchiori (2013). "Network community detection with edge classifiers trained on LFR graphs". https://www.cs.ru.nl/~elenam/paper-learning-community.pdf
↑ Barabasi, A.-L. (2014). "Network Science". Chapter 9: Communities.

本中文词条粲兰翻译，由黄秋莉审校，薄荷编辑欢迎在讨论页面留言。

本词条内容翻译自 wikipedia.org，遵守 CC3.0协议。

[1] Hua-Wei Shen (2013). "Community Structure of Complex Networks". Springer Science & Business Media. 11–12.

[original-2] 2.0 ^2.1 A. Lancichinetti, S. Fortunato, and F. Radicchi.(2008) Benchmark graphs for testing community detection algorithms. Physical Review E, 78. 模板:ArXiv

[3] Twan van Laarhoven and Elena Marchiori (2013). "Network community detection with edge classifiers trained on LFR graphs". https://www.cs.ru.nl/~elenam/paper-learning-community.pdf

[4] Barabasi, A.-L. (2014). "Network Science". Chapter 9: Communities.

[1]

[2]

[3]

[4]

@@ 第3行： / 第3行： @@
 |description=致命弱点,希腊神话，特洛伊战争
 }}
-'''<font color="#ff8000">兰奇基内蒂-福图纳托-拉迪奇基准测试 Lancichinetti–Fortunato–Radicchi benchmark （LFR）</font>'''是一种生成'''基准网络 baseline network'''（类似于真实世界网络的人工网络）的算法。他们有一个预先已知的社区，用于比较不同的社区检测方法。<ref>Hua-Wei Shen (2013). "Community Structure of Complex Networks". Springer Science & Business Media. 11–12.</ref>与其他方法相比，基准测试的优点在于它解释了'''节点度分布'''和'''社区规模分布'''的[[异质性]]。<ref name="original">A. Lancichinetti, S. Fortunato, and F. Radicchi.(2008) Benchmark graphs for testing community detection algorithms. Physical Review E, 78. {{ArXiv|0805.4770}}</ref>
+'''<font color="#ff8000">兰奇基内蒂-福图纳托-拉迪奇基准测试 Lancichinetti–Fortunato–Radicchi benchmark （LFR）</font>'''是一种生成'''<font color="#ff8000">基准网络(baseline network)</font>'''（类似于真实世界网络的人工网络）的算法。他们有一个预先已知的社区，用于比较不同的社区检测方法。<ref>Hua-Wei Shen (2013). "Community Structure of Complex Networks". Springer Science & Business Media. 11–12.</ref>与其他方法相比，基准测试的优点在于它解释了'''<font color="#ff8000">节点度分布(node degree distribution)</font>'''和'''<font color="#ff8000">社区规模分布(Community size distribution)</font>'''的[[异质性]]。<ref name="original">A. Lancichinetti, S. Fortunato, and F. Radicchi.(2008) Benchmark graphs for testing community detection algorithms. Physical Review E, 78. {{ArXiv|0805.4770}}</ref>
 ==算法==
-'''节点度''' 和''' 社区规模 '''按[[幂律分布]]，但指数不同。基准测试假设节点度和社区规模都具有不同指数的幂律分布，分别为<math>\gamma</math>和<math>\beta</math>。<math>N</math>是节点的数量，平均度为<math>\langle k \rangle</math>。混合参数<math>\mu</math>是一个节点的相邻节点的平均比例，这些相邻节点不属于基准节点所属的任何社区。这个参数控制着社区之间的边缘比例。<ref>Twan van Laarhoven and Elena Marchiori (2013). "Network community detection with edge classifiers trained on LFR graphs". https://www.cs.ru.nl/~elenam/paper-learning-community.pdf</ref>
+'''<font color="#ff8000">节点度(node degree)</font>'''和'''<font color="#ff8000">社区规模(Community size)</font>'''按[[幂律分布]]，但指数不同。基准测试假设节点度和社区规模都具有不同指数的幂律分布，分别为<math>\gamma</math>和<math>\beta</math>。<math>N</math>是节点的数量，平均度为<math>\langle k \rangle</math>。混合参数<math>\mu</math>是一个节点的相邻节点的平均比例，这些相邻节点不属于基准节点所属的任何社区。这个参数控制着社区之间的边缘比例。<ref>Twan van Laarhoven and Elena Marchiori (2013). "Network community detection with edge classifiers trained on LFR graphs". https://www.cs.ru.nl/~elenam/paper-learning-community.pdf</ref>

“LFR算法”的版本间的差异

2020年11月22日 (日) 23:23的版本

算法

调试

参考文献

导航菜单

搜索