LFR算法

兰奇基内蒂-福图纳托-拉迪奇基准测试 Lancichinetti–Fortunato–Radicchi benchmark （LFR）是一种生成基准网络 baseline network（类似于真实世界网络的人工网络）的算法。他们有一个预先已知的社区，用于比较不同的社区检测方法。^[1]与其他方法相比，基准测试的优点在于它解释了节点度分布 node degree distribution和社区规模分布 Community size distribution的异质性。^[2]

算法

节点度 node degree和社区规模 Community size按幂律分布，但指数不同。基准测试假设节点度和社区规模都具有不同指数的幂律分布，分别为[math]\displaystyle{ \gamma }[/math]和[math]\displaystyle{ \beta }[/math]。[math]\displaystyle{ N }[/math]是节点的数量，平均度为[math]\displaystyle{ \langle k \rangle }[/math]。混合参数[math]\displaystyle{ \mu }[/math]是一个节点的相邻节点的平均比例，这些相邻节点不属于基准节点所属的任何社区。这个参数控制着社区之间的边缘比例。^[3]

生成基准网络的步骤如下：

步骤1：生成一个网络，其节点遵循指数为[math]\displaystyle{ \gamma }[/math]的幂律分布，并选择分布的极值[math]\displaystyle{ k_{\min} }[/math]和[math]\displaystyle{ k_{\max} }[/math]来获得期望平均度[math]\displaystyle{ \langle k\rangle }[/math]。

步骤2：每个节点的[math]\displaystyle{ (1 - \mu) }[/math]链接部分与同一社区的节点相同，而[math]\displaystyle{ \mu }[/math]部分与其他节点相同。

步骤3：根据指数为[math]\displaystyle{ \beta }[/math]的幂律分布生成社区规模。所有规模大小的和必须等于[math]\displaystyle{ N }[/math]。最小和最大的[math]\displaystyle{ s_{\min} }[/math][math]\displaystyle{ s_{\max} }[/math]必须满足社区的定义，这样每个非孤立的节点至少存在于一个群落中：[math]\displaystyle{ s_{\min} \gt k_{\min} }[/math] ； [math]\displaystyle{ s_{\max} \gt k_{\max} }[/math]

步骤4：最初，没有为任何社区分配任何节点。然后，每个节点被随机分配到一个社区。只要社区内相邻节点的数量不超过社区规模，就会向社区添加一个新节点，否则就不会添加。在接下来的迭代中，无归属的节点被随机分配给某个社区。如果该社区是完备的，即规模已经用尽，必须随机选择社区中的一个节点并断开其链接。当所有社区都完备且所有节点都至少属于一个社区时停止迭代。

步骤5：对节点重新布线，保持相同的节点度，但只影响内部和外部链接，使得每个节点在社区外的链接数量约等于混合参数[math]\displaystyle{ \mu }[/math]。^[2]

调试

考虑社区的一个不重叠分割。每次迭代中随机选择的节点的社区遵循一个[math]\displaystyle{ p(C) }[/math]分布，这个分布表示随机选择的节点来自社区[math]\displaystyle{ C }[/math]的概率。考虑同一个网络的一个分割，这个分割由一些社区搜索算法预测得出，并且具有[math]\displaystyle{ p(C_2) }[/math]分布。基准分割具有[math]\displaystyle{ p(C_1) }[/math]分布。

联合分布为[math]\displaystyle{ p(C_1, C_2) }[/math]。这两个分割的相似性可以通过归一化互信息得到。

[math]\displaystyle{ I_n = \frac{\sum_{C_1,C_2} p(C_1,C_2) \log_2 \frac{p(C_1,C_2)}{p(C_1)p(C_2)} }{\frac 1 2 H(\{p(C_1)\}) + \frac 1 2 H(\{p(C_2)\})} }[/math]

如果[math]\displaystyle{ I_n=1 }[/math]基准和检测到的分割相同，且[math]\displaystyle{ I_n=0 }[/math]，那么它们彼此独立。^[4]

参考文献

↑ Hua-Wei Shen (2013). "Community Structure of Complex Networks". Springer Science & Business Media. 11–12.
↑ ^2.0 ^2.1 A. Lancichinetti, S. Fortunato, and F. Radicchi.(2008) Benchmark graphs for testing community detection algorithms. Physical Review E, 78. 模板:ArXiv
↑ Twan van Laarhoven and Elena Marchiori (2013). "Network community detection with edge classifiers trained on LFR graphs". https://www.cs.ru.nl/~elenam/paper-learning-community.pdf
↑ Barabasi, A.-L. (2014). "Network Science". Chapter 9: Communities.

本词条内容翻译自 wikipedia.org，遵守 CC3.0协议。

[1] Hua-Wei Shen (2013). "Community Structure of Complex Networks". Springer Science & Business Media. 11–12.

[original-2] 2.0 ^2.1 A. Lancichinetti, S. Fortunato, and F. Radicchi.(2008) Benchmark graphs for testing community detection algorithms. Physical Review E, 78. 模板:ArXiv

[3] Twan van Laarhoven and Elena Marchiori (2013). "Network community detection with edge classifiers trained on LFR graphs". https://www.cs.ru.nl/~elenam/paper-learning-community.pdf

[4] Barabasi, A.-L. (2014). "Network Science". Chapter 9: Communities.

[1]

[2]

[3]

[4]

LFR算法

目录

算法

调试

参考文献

编者推荐

利用强化学习寻找复杂系统的关键节点

书籍领读：随机网络

复杂系统中的幂律分布

导航菜单

搜索