更改

流网络根据流距离嵌入空间 (查看源代码)

2020年10月16日 (五) 19:43的版本

添加14,831字节、 2020年10月16日 (五) 19:43

创建页面，内容为“我们的'''目的是根据流网络的流距离（[http://wiki.swarma.net/index.php/%E5%9F%BA%E4%BA%8E%E9%A9%AC%E5%B0%94%E5%8F%AF%E5%A4%AB%E9%93%BE%E7%9A%84…”

我们的'''目的是根据[[流网络]]的[[流距离]]（[http://wiki.swarma.net/index.php/%E5%9F%BA%E4%BA%8E%E9%A9%AC%E5%B0%94%E5%8F%AF%E5%A4%AB%E9%93%BE%E7%9A%84%E6%B5%81%E7%BD%91%E7%BB%9C%E5%88%86%E6%9E%90 平均首达时间矩阵L]）矩阵，将整个网络嵌入到一个欧氏空间里面去，使得任意两点的欧氏距离尽量靠近他们的流距离.<ref name="ENA">{{cite journal | last = Peiteng Shi | first = Xiaohan Huang | date = 2015 | title = A Geometric Representation of Collective Attention Flows | journal = PLOS ONE}}</ref><ref name="Big">{{cite journal | last = Shavitt Y | first = Tankel T | date = 2004 | title = Big-bang simulation for embedding network distances in euclidean space | journal = IEEE/ACM Transactions on Networking (TON)}}</ref>
=原理解析=
给定某流网络，假想网络中各节点两两之间用弹簧相互连接，嵌入要做的即调整弹簧，使其松弛长度等于流距离。初始时各节点在欧式空间的坐标完全随机，以此我们可以计算节点两两之间的欧式距离。如果两节点间的流距离大于相应的欧式距离，表示弹簧被压缩了，那么我们应该拉伸这两节点间的弹簧；反之，我们应该压缩该弹簧。具体拉伸或者压缩多少取决于弹簧系数k及流距离和欧式距离的偏差x，即形变量=k*x。我们用该形变量去更新相应的节点欧式位置。反复这一过程，直至收敛。（注：实际运用之，我们常常不考虑流网络的源和汇节点，即流距离矩阵中不需要含有源和汇的项）

=嵌入程序（Matlab版本）=
==初始准备==
此处程序没有考虑流矩阵中无穷大的情况
===输入变量流距离矩阵-flow_distance,节点编号-nodes_num，嵌入维数dim===
<syntaxhighlight lang="matlab">

dim=10;
max_distance=max(max(flow_distance));
max_x=max_distance;
max_y=max_x;

</syntaxhighlight>
===%为每个节点生成随机位置===
<syntaxhighlight lang="matlab">

initial_pos=[];
for i=1:length(nodes_num)
tmp=[];
for j=1:dim
tmp=[tmp max_x-2*max_x*rand(1)];
end
initial_pos=[initial_pos;tmp];
end

</syntaxhighlight>
==第一步优化，按照欧氏距离和流距离的差别大小受力==
===开始运动===
<syntaxhighlight lang="matlab">

current_pos=initial_pos;
iterations=1;
complete_temperature_iterations = 0;
current_target= compute_distance(initial_pos,flow_distance );
basic_size=20;
target=[];
all_temperature=[];
probility=[];

while iterations< 200
iterations
%-------------产生新的坐标-------------%
new_pos=[];
distance_matrix=compute_distance_matrix(current_pos);
k=1.0000e-0;
for i=1:size(current_pos,1)
%计算每一个节点受到的合力
%计算节点到其他节点距离和流距离的差值
diff_distance=(distance_matrix(i,:)-flow_distance(i,:))';
force_direction=-1*(repmat(current_pos(i,:),size(current_pos,1),1)-current_pos);
force_direction=force_direction./repmat(sqrt(sum(force_direction.^2,2)),1,size(current_pos,2)) ;
force_direction(isnan(force_direction))=0;

force_vector=diff_distance;
final_force_vector=force_direction.*repmat(force_vector,1,size(current_pos,2));
all_force=sum(final_force_vector,1)/length(diff_distance);
new_pos=[new_pos;current_pos(i,:)+all_force*k];
end

%-------------比较新的位置与原来的位置-------------%
current_distance=compute_distance(current_pos,flow_distance);
new_distance=compute_distance(new_pos,flow_distance);
diff = abs(current_distance - new_distance);
target=[target;current_distance];

if complete_temperature_iterations >= 10
temperature = cooling_rate*temperature;
complete_temperature_iterations = 0;
end
iterations = iterations + 1;
complete_temperature_iterations = complete_temperature_iterations + 1;
current_pos=new_pos;

end

</syntaxhighlight>
===第二步优化，按照每个节点失真程度大小受力===
===开始运动===
<syntaxhighlight lang="matlab">

initial_pos=new_pos;

iterations=1;
complete_temperature_iterations = 0;

current_target= compute_distance(initial_pos,flow_distance );
cooling_rate=0.9;
basic_size=20;
target=[];
all_temperature=[];
probility=[];
while iterations< 200
iterations

%-------------产生新的坐标-------------%
new_pos=[];

distance_matrix=compute_distance_matrix(current_pos);
distortion=distance_matrix./flow_distance-1;
distortion(isnan(distortion))=0;
k=30.0000e-0;
for i=1:size(current_pos,1)
%计算每一个节点受到的合力
%计算节点到其他节点距离和流距离的差值
diff_distance=distortion(:,i);
force_direction=-1*(repmat(current_pos(i,:),size(current_pos,1),1)-current_pos);
force_direction=force_direction./repmat(sqrt(sum(force_direction.^2,2)),1,size(current_pos,2)) ;
force_direction(isnan(force_direction))=0;
force_vector=diff_distance;
final_force_vector=force_direction.*repmat(force_vector,1,size(current_pos,2));
all_force=sum(final_force_vector,1)/length(diff_distance);
new_pos=[new_pos;current_pos(i,:)+all_force*k];
end

%-------------比较新的位置与原来的位置-------------%

current_distance=compute_distance(current_pos,flow_distance);
new_distance=compute_distance(new_pos,flow_distance);
diff = abs(current_distance - new_distance);
target=[target;current_distance];

iterations = iterations + 1;

all_temperature=[all_temperature;temperature];
end
</syntaxhighlight>
==计算拟合效果==
===横轴是流距离，纵轴是欧式距离===
<syntaxhighlight lang="matlab">

figure
tmp=bone_L;
tmp(logical(eye(size(bone_L,1))))=1;
tmp_L=flow_distance(find(tmp>0));
tmp_dis=distance_matrix(find(tmp>0));
plot(tmp_L(:),tmp_dis(:),'.')
hold on
plot([0; max([max(tmp_dis(:)) max(tmp_dis(:))])],[0;max([max(tmp_dis(:)) max(tmp_dis(:))])],'r-')

</syntaxhighlight>
===平均失真程度===
<syntaxhighlight lang="matlab">

distortion=tmp_dis./tmp_L;
distortion(isnan(distortion))=0;

distortion(find(distortion<1&distortion>0))=1./distortion(find(distortion<1&distortion>0));
average_distortion=sum(sum(distortion))/length(find(distortion>0))

</syntaxhighlight>
==中间计算距离的函数==
===现在距离与流距离的总差值===
<syntaxhighlight lang="matlab">

function [ target_distance ] = compute_distance(new_pos,flow_distance )
%UNTITLED Summary of this function goes here
%Detailed explanation goes here
new_distance_matrix=compute_distance_matrix(new_pos);
target_distance=sum(sum((new_distance_matrix-flow_distance).^2));
end
</syntaxhighlight>
===根据节点坐标，计算节点之间距离===
<syntaxhighlight lang="matlab">

function [ distance_matrix] = compute_distance_matrix( nodes )
%UNTITLED3 Summary of this function goes here
% Detailed explanation goes here
distance_matrix=zeros(size(nodes,1),size(nodes,1));
for i=1:size(nodes,1)
for j=i+1:size(nodes,1)
distance_matrix(i,j)=sqrt((nodes(i,1)-nodes(j,1))^2+(nodes(i,2)-nodes(j,2))^2);
distance_matrix(j,i)= distance_matrix(i,j);
end
end
flow_distance=distance_matrix;
end
</syntaxhighlight>

=嵌入程序（python版本）=
==输入参数==
<syntaxhighlight lang="python">

def spring_algorithm(_c, _dim):
pass

</syntaxhighlight >
_c：'''去掉源和汇节点'''的流距离矩阵，且默认已经将流距离无穷大的节点做处理了

_dim：嵌入维数
==初始化==
<syntaxhighlight lang="python">

# 最大小幅变化次数（可弹性调整）
max_count = 10

# 弹簧系数（可弹性调整，一般取在(0,1)之间，且k1>k2）
k1, k2 = [0.8, 0.5]

# 阈值（可弹性调整，依据具体问题而定）
threshold1, threshold2 = [0.05, 0.01]

# 各节点初始位置（dim维）
initial_pos = np.random.rand(_c.shape[0], _dim) # [0, 1)

</syntaxhighlight >

==Step1 按照欧氏距离和流距离的差别大小受力（快速调整）==
<syntaxhighlight lang="python">

# step1 按照欧氏距离和流距离的差别大小受力
total_pos_mat = np.mat(initial_pos)

t1 = 0
# 连续小幅度变化的次数
small_count1 = 0
# 与真实流距离的误差
error_list = []
# 上一次迭代的总距离误差
last_distance_sum_error = float("inf")
while True:
print '距离差受力第%d次迭代' % t1
for _i in range(_c.shape[0]):
node_i_pos_mat = total_pos_mat[_i]
repeat_node_i_pos_mat = np.matlib.repmat(node_i_pos_mat, _c.shape[0], 1) # 重复len(_c)*1次

##### 节点至其它节点的欧式距离euclidean_distance #####
distance_offset_mat = repeat_node_i_pos_mat - total_pos_mat # 当前节点i与其它节点实际位置偏移矩阵
dis_square = np.multiply(distance_offset_mat, distance_offset_mat) # 对应元素相乘
euclidean_distance = np.mat(np.sqrt(np.sum(dis_square, 1))) # 列向量：当前节点到其余所有节点（包括自己）的欧式距离

# 欧式距离和流距离的偏差行向量，即弹簧形变量x
diff_distance = np.mat(_c[_i, :] - euclidean_distance.T)

# 防止除以该向量中的0元素出现Nan
modifier_euclidean_distance = np.ones((_c.shape[0], 1)) - np.sign(euclidean_distance) + euclidean_distance

# 受力方向:单位向量当前节点距其余节点的距离向量/|当前节点距其余节点的距离向量|
direction = distance_offset_mat / modifier_euclidean_distance

##### 距离差受力矢量 F = k*x*direction #####
force = np.multiply(diff_distance.T, direction) * k1

##### 位置调整 #####
sum_force = np.sum(force, 0)
total_pos_mat[_i] += np.divide(sum_force, _c.shape[0]).astype(np.float64)

##### 误差项 #####
node_i_error = np.sum(np.multiply(diff_distance, diff_distance))
error_list.append(node_i_error)

sum_distance_error = sum(error_list)
print sum_distance_error
if abs(last_distance_sum_error - sum_distance_error) <= threshold1:
small_count1 += 1
if small_count1 >= max_count:
break

t1 += 1
last_distance_sum_error = sum_distance_error
error_list = []
</syntaxhighlight >

==Step2 按照每个节点失真程度大小受力（微调）==
<syntaxhighlight lang="python">

# step2 按照每个节点失真程度大小受力
t2 = 0
# 连续小幅度变化的次数
small_count2 = 0
# 失真度误差
distortion_list = []
# 上一次迭代的总失真度误差
last_sum_distortion_error = float("inf")
while True:
print '失真度受力第%d次迭代' % t2
for _i in range(_c.shape[0]):
node_i_pos_mat = total_pos_mat[_i]
repeat_node_i_pos_mat = np.matlib.repmat(node_i_pos_mat, _c.shape[0], 1) # 重复len(_c)*1次

##### 节点至其它节点的欧式距离euclidean_distance #####
distance_offset_mat = repeat_node_i_pos_mat - total_pos_mat # 当前节点i与其它节点实际位置差距矩阵
dis_square = np.multiply(distance_offset_mat, distance_offset_mat) # 对应元素相乘
euclidean_distance = np.mat(np.sqrt(np.sum(dis_square, 1))) # 列向量：当前节点到其余所有节点（包括自己）的欧式距离

# 防止除以该向量中的0元素出现Nan
modifier_euclidean_distance = np.ones((_c.shape[0], 1)) - np.sign(euclidean_distance) + euclidean_distance

# 失真度=max(_c[_i] / modifier_euclidean_distance, modifier_euclidean_distance / _c[_i])
# 这里为方便，只采用失真度=_c[_i] / modifier_euclidean_distance
distortion_ratio = _c[_i, :] / modifier_euclidean_distance.T
modifier_distortion = np.ones((1, _c.shape[0])) - np.sign(distortion_ratio) + distortion_ratio - 1

# 受力方向:单位向量当前节点距其余节点的距离向量/\当前节点距其余节点的距离向量\
direction = distance_offset_mat / modifier_euclidean_distance

##### 失真度受力矢量 #####
force = np.multiply(modifier_distortion.T, direction) * k2

##### 位置调整 #####
sum_force = np.sum(force, 0)
total_pos_mat[_i] += np.divide(sum_force, _c.shape[0]).astype(np.float64)

##### 误差项 #####
node_i_error = np.sum(np.multiply((distortion_ratio - 1), (distortion_ratio - 1)))
distortion_list.append(node_i_error)

sum_distortion_error = sum(distortion_list)
print sum_distortion_error

if abs(last_sum_distortion_error - sum_distortion_error) <= threshold2:
small_count2 += 1
if small_count2 >= max_count:
break

t2 += 1
last_sum_distortion_error = sum_distortion_error
distortion_list = []

return total_pos_mat

</syntaxhighlight>

==测试样例==
===思路===
随机生成n个dim维欧式空间中的点，计算各点间的欧式距离得到n*n维欧式距离矩阵，以此当作c矩阵输入到弹簧算法中，得到n个dim维嵌入向量。若dim>2,则降维到2维，可视化前后两种点坐标表示，比较
===程序===
<syntaxhighlight lang="python">

n, dim = 10, 2
# n各点初始位置（2维）
initial_pos = np.random.rand(n, dim) # [0, 1)
plt.subplot(121)
plt.title('original position')
plt.plot(initial_pos[:, 0], initial_pos[:, 1], 'ro')
for i, (_x, _y) in enumerate(zip(initial_pos[:, 0], initial_pos[:, 1])):
plt.text(_x, _y, i)

# 各节点间的欧式距离
edistance_list = []
for i in range(n):
distance_offset_vec = initial_pos[i] - initial_pos # 当前节点i与所有节点（包括自己）欧式位置差距
dis_square = np.multiply(distance_offset_vec, distance_offset_vec) # 对应元素相乘
edistance = np.sqrt(np.sum(dis_square, 1)).T.tolist() # 行向量
edistance_list.append(edistance)
edistance_matrix = np.mat(edistance_list)

# 嵌入（此处采用迭代500次作为终止条件，也可换成采用阈值，但是需要手动调整大小）
embed_vec = spring_algorithm(edistance_matrix, 2)

plt.subplot(122)
plt.title('embedding position')
plt.plot(embed_vec[:, 0], embed_vec[:, 1], 'bo')
for i, (_x, _y) in enumerate(zip(embed_vec[:, 0], embed_vec[:, 1])):
plt.text(_x, _y, i)

plt.show()

</syntaxhighlight>
===结果===
结果如下：

[[File:figure.png|600px]]

可以看到原始图和嵌入图非常相似，这证明弹簧算法的正确性。

'''注意：'''两幅图之间可能存在平移、旋转、对称线性变换关系，但是这属于正常现象。（要想完全吻合，需要加入摩擦系数）

=补充：[[Deepwalk|Deepwalk方式嵌入]]=
请参考[[Word2Vec与流网络]]

=参考文献=
<references />
[[category:旧词条迁移]]

小趣木木

113

个编辑

更改

流网络根据流距离嵌入空间 (查看源代码)

2020年10月16日 (五) 19:43的版本

导航菜单

搜索