Graph (abstract data type)


[Figure 1: A directed graph with three vertices (blue circles) and three edges (black arrows). (Directed.svg)]


In computer science, a graph is an abstract data type that is meant to implement the undirected graph and directed graph concepts from the field of graph theory within mathematics.


A graph data structure consists of a finite (and possibly mutable) set of vertices (also called nodes or points), together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph. These pairs are known as edges (also called links or lines), and for a directed graph are also known as arrows. The vertices may be part of the graph structure, or may be external entities represented by integer indices or references.


A graph data structure may also associate to each edge some edge value, such as a symbolic label or a numeric attribute (cost, capacity, length, etc.).


Operations


The basic operations provided by a graph data structure G usually include:[1]

  • adjacent(G, x, y): tests whether there is an edge from the vertex x to the vertex y;
  • neighbors(G, x): lists all vertices y such that there is an edge from the vertex x to the vertex y;
  • add_vertex(G, x): adds the vertex x, if it is not there;
  • remove_vertex(G, x): removes the vertex x, if it is there;
  • add_edge(G, x, y): adds the edge from the vertex x to the vertex y, if it is not there;
  • remove_edge(G, x, y): removes the edge from the vertex x to the vertex y, if it is there;
  • get_vertex_value(G, x): returns the value associated with the vertex x;
  • set_vertex_value(G, x, v): sets the value associated with the vertex x to v.


Structures that associate values to the edges usually also provide:[1]

  • get_edge_value(G, x, y): returns the value associated with the edge (x, y);
  • set_edge_value(G, x, y, v): sets the value associated with the edge (x, y) to v.
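
The following is a minimal sketch of such an ADT in Python, backed by an adjacency list (a dictionary from each vertex to a dictionary of its out-neighbours and their edge values); the class and attribute names are illustrative, not part of any standard interface.

<syntaxhighlight lang="python">
# A sketch only: a directed graph ADT backed by an adjacency list.
# _adj maps each vertex to a dict {neighbour: edge value}; vertex values
# are kept in a separate dict. All names here are illustrative.

class Graph:
    def __init__(self):
        self._adj = {}            # vertex -> {neighbour: edge value}
        self._vertex_value = {}   # vertex -> value

    def adjacent(self, x, y):
        return x in self._adj and y in self._adj[x]

    def neighbors(self, x):
        return list(self._adj.get(x, {}))

    def add_vertex(self, x):
        self._adj.setdefault(x, {})

    def remove_vertex(self, x):
        self._adj.pop(x, None)
        self._vertex_value.pop(x, None)
        for neighbours in self._adj.values():   # drop edges pointing at x
            neighbours.pop(x, None)

    def add_edge(self, x, y, value=None):
        self.add_vertex(x)
        self.add_vertex(y)
        self._adj[x][y] = value

    def remove_edge(self, x, y):
        if x in self._adj:
            self._adj[x].pop(y, None)

    def get_vertex_value(self, x):
        return self._vertex_value.get(x)

    def set_vertex_value(self, x, v):
        self._vertex_value[x] = v

    def get_edge_value(self, x, y):
        return self._adj[x][y]

    def set_edge_value(self, x, y, v):
        self._adj[x][y] = v


g = Graph()
g.add_edge("a", "b", value=3)   # numeric attribute on the edge
print(g.adjacent("a", "b"))     # True
print(g.neighbors("a"))         # ['b']
</syntaxhighlight>

Storing the out-neighbours per vertex keeps neighbors and add_edge cheap, at the price of scanning all vertices in remove_vertex.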


Representations

Different data structures for the representation of graphs are used in practice:

Adjacency list[2]

Vertices are stored as records or objects, and every vertex stores a list of adjacent vertices. This data structure allows the storage of additional data on the vertices. Additional data can be stored if edges are also stored as objects, in which case each vertex stores its incident edges and each edge stores its incident vertices.
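
As a small illustration of the object-based variant described above (all names are hypothetical), each Vertex object can keep its incident Edge objects while each Edge object keeps references to its endpoint vertices, so additional data can be attached to both:

<syntaxhighlight lang="python">
# Sketch of the object-based variant (hypothetical names): each Vertex keeps
# its incident (outgoing) Edge objects, and each Edge keeps references to its
# incident vertices, so extra data can be stored on both.

class Vertex:
    def __init__(self, label):
        self.label = label
        self.incident_edges = []        # edges leaving this vertex

class Edge:
    def __init__(self, tail, head, weight=None):
        self.tail, self.head = tail, head   # incident vertices
        self.weight = weight                # optional edge data
        tail.incident_edges.append(self)

a, b = Vertex("a"), Vertex("b")
Edge(a, b, weight=2.5)
print([e.head.label for e in a.incident_edges])   # ['b']
</syntaxhighlight>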

Adjacency matrix[3]

A two-dimensional matrix, in which the rows represent source vertices and columns represent destination vertices. Data on edges and vertices must be stored externally. Only the cost for one edge can be stored between each pair of vertices.
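
A minimal sketch of this layout (illustrative only): vertices are numbered 0..n−1, a cell holds the cost of the edge from its row to its column, and a missing edge is marked with ∞, matching the convention used in the complexity table below.

<syntaxhighlight lang="python">
# Sketch only: an n-by-n cost matrix for a directed graph with vertices
# 0..n-1. Rows are source vertices, columns are destination vertices, and a
# missing edge is represented by infinity.

INF = float("inf")

def empty_adjacency_matrix(n):
    return [[INF] * n for _ in range(n)]

matrix = empty_adjacency_matrix(3)
matrix[0][1] = 4.0              # edge 0 -> 1 with cost 4
print(matrix[0][1] < INF)       # O(1) adjacency test: True
print(matrix[1][0] < INF)       # no edge 1 -> 0: False
</syntaxhighlight>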

Incidence matrix[4]

A two-dimensional Boolean matrix, in which the rows represent the vertices and columns represent the edges. The entries indicate whether the vertex at a row is incident to the edge at a column.
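
A small sketch of such a matrix (hypothetical helper, assuming vertices are numbered 0..n−1 and edges are given as endpoint pairs):

<syntaxhighlight lang="python">
# Sketch only: a |V|-by-|E| Boolean incidence matrix. Entry [v][e] is True
# exactly when vertex v is an endpoint of edge e.

def incidence_matrix(num_vertices, edge_list):
    m = [[False] * len(edge_list) for _ in range(num_vertices)]
    for e, (u, v) in enumerate(edge_list):
        m[u][e] = True
        m[v][e] = True
    return m

edges = [(0, 1), (1, 2), (2, 0)]
print(incidence_matrix(3, edges)[1])   # vertex 1 lies on edges 0 and 1: [True, True, False]
</syntaxhighlight>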


The following table gives the time complexity cost of performing various operations on graphs, for each of these representations, with |V| the number of vertices and |E| the number of edges.[citation needed] In the matrix representations, the entries encode the cost of following an edge. The cost of edges that are not present is assumed to be ∞.


{| class="wikitable"
! !! Adjacency list !! Adjacency matrix !! Incidence matrix
|-
! Store graph
| [math]\displaystyle{ O(|V|+|E|) }[/math] || [math]\displaystyle{ O(|V|^2) }[/math] || [math]\displaystyle{ O(|V|\cdot|E|) }[/math]
|-
! Add vertex
| [math]\displaystyle{ O(1) }[/math] || [math]\displaystyle{ O(|V|^2) }[/math] || [math]\displaystyle{ O(|V|\cdot|E|) }[/math]
|-
! Add edge
| [math]\displaystyle{ O(1) }[/math] || [math]\displaystyle{ O(1) }[/math] || [math]\displaystyle{ O(|V|\cdot|E|) }[/math]
|-
! Remove vertex
| [math]\displaystyle{ O(|E|) }[/math] || [math]\displaystyle{ O(|V|^2) }[/math] || [math]\displaystyle{ O(|V|\cdot|E|) }[/math]
|-
! Remove edge
| [math]\displaystyle{ O(|V|) }[/math] || [math]\displaystyle{ O(1) }[/math] || [math]\displaystyle{ O(|V|\cdot|E|) }[/math]
|-
! Are vertices x and y adjacent (assuming that their storage positions are known)?
| [math]\displaystyle{ O(|V|) }[/math] || [math]\displaystyle{ O(1) }[/math] || [math]\displaystyle{ O(|E|) }[/math]
|-
! Remarks
| Slow to remove vertices and edges, because it needs to find all vertices or edges || Slow to add or remove vertices, because matrix must be resized/copied || Slow to add or remove vertices and edges, because matrix must be resized/copied
|}


Adjacency lists are generally preferred because they efficiently represent sparse graphs. An adjacency matrix is preferred if the graph is dense, that is, the number of edges |E| is close to the number of vertices squared, |V|², or if one must be able to quickly look up whether there is an edge connecting two vertices.[5][6]
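
As a rough illustration of this rule of thumb (the 0.5 threshold below is an arbitrary assumption, not a standard value), one might pick a representation from the edge density |E| / |V|²:

<syntaxhighlight lang="python">
# Illustration of the rule of thumb only; the 0.5 threshold is an arbitrary
# assumption. The graph counts as dense when |E| approaches |V|^2.

def suggest_representation(num_vertices, num_edges, dense_threshold=0.5):
    density = num_edges / (num_vertices * num_vertices)
    return "adjacency matrix" if density >= dense_threshold else "adjacency list"

print(suggest_representation(1_000, 5_000))      # sparse -> "adjacency list"
print(suggest_representation(1_000, 600_000))    # dense  -> "adjacency matrix"
</syntaxhighlight>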


Parallel Graph Representations

The parallelization of graph problems faces significant challenges: Data-driven computations, unstructured problems, poor locality and high data access to computation ratio.[7][8] The graph representation used for parallel architectures plays a significant role in facing those challenges. Poorly chosen representations may unnecessarily drive up the communication cost of the algorithm, which will decrease its scalability. In the following, shared and distributed memory architectures are considered.


Shared memory

In the case of a shared memory model, the graph representations used for parallel processing are the same as in the sequential case,[9] since parallel read-only access to the graph representation (e.g. an adjacency list) is efficient in shared memory.


Distributed memory

In the distributed memory model, the usual approach is to partition the vertex set [math]\displaystyle{ V }[/math] of the graph into [math]\displaystyle{ p }[/math] sets [math]\displaystyle{ V_0, \dots, V_{p-1} }[/math]. Here, [math]\displaystyle{ p }[/math] is the amount of available processing elements (PE). The vertex set partitions are then distributed to the PEs with matching index, together with the corresponding edges. Every PE has its own subgraph representation, where edges with an endpoint in another partition require special attention. For standard communication interfaces like MPI, the ID of the PE owning the other endpoint has to be identifiable. During computation in a distributed graph algorithm, passing information along these edges implies communication.[9]
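
A minimal sketch of this idea (no real MPI calls; all names are illustrative): vertices 0..n−1 are split into p consecutive blocks, each PE keeps only the adjacency lists of its own block, and an edge leading to a vertex on another PE is stored together with the index of the PE that owns it, so that messages can later be addressed to that PE.

<syntaxhighlight lang="python">
# Sketch only, no real MPI calls; helper names are made up. Vertices 0..n-1
# are split into p consecutive blocks. Each PE stores the adjacency lists of
# its own block, and every cross-partition edge carries the index of the PE
# that owns the other endpoint.

def owner(v, n, p):
    """PE index that owns vertex v under a block partition."""
    block = (n + p - 1) // p            # ceil(n / p) vertices per PE
    return v // block

def local_subgraph(rank, n, p, edges):
    sub = {}
    for u, v in edges:
        if owner(u, n, p) == rank:
            sub.setdefault(u, []).append((v, owner(v, n, p)))  # (target, its PE)
    return sub

edges = [(0, 5), (5, 9), (9, 0)]
print(local_subgraph(0, n=10, p=2, edges=edges))   # {0: [(5, 1)]}
</syntaxhighlight>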


Partitioning the graph needs to be done carefully: there is a trade-off between low communication and even size partitioning.[10] But partitioning a graph is an NP-hard problem, so it is not feasible to compute optimal partitions. Instead, the following heuristics are used.


1D partitioning: Every processor gets [math]\displaystyle{ n/p }[/math] vertices and the corresponding outgoing edges. This can be understood as a row-wise or column-wise decomposition of the adjacency matrix. For algorithms operating on this representation, this requires an All-to-All communication step as well as [math]\displaystyle{ \mathcal{O}(m) }[/math] message buffer sizes, as each PE potentially has outgoing edges to every other PE.[11]
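
A sketch of this row-wise decomposition (assuming, for simplicity, that p divides n; names are illustrative): PE i owns vertices i·n/p to (i+1)·n/p − 1 together with their outgoing edges, and before the all-to-all step it sorts those edges into one outgoing buffer per destination PE.

<syntaxhighlight lang="python">
# Sketch only (assumes p divides n; names are made up). PE `rank` owns the
# block of n // p consecutive vertices starting at rank * (n // p), i.e. a
# contiguous band of rows of the adjacency matrix, and it groups its outgoing
# edges into one message buffer per destination PE for the all-to-all step.

def outgoing_buffers(rank, n, p, adjacency):
    block = n // p
    buffers = {pe: [] for pe in range(p)}            # up to p - 1 partners
    for u in range(rank * block, (rank + 1) * block):
        for v in adjacency.get(u, []):
            buffers[v // block].append((u, v))       # PE that owns vertex v
    return buffers

adjacency = {0: [3, 7], 1: [4]}
print(outgoing_buffers(0, n=8, p=2, adjacency=adjacency))
# {0: [(0, 3)], 1: [(0, 7), (1, 4)]}
</syntaxhighlight>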


2D partitioning: Every processor gets a submatrix of the adjacency matrix. Assume the processors are aligned in a rectangle [math]\displaystyle{ p = p_r \times p_c }[/math], where [math]\displaystyle{ p_r }[/math] and [math]\displaystyle{ p_c }[/math] are the amount of processing elements in each row and column, respectively. Then each processor gets a submatrix of the adjacency matrix of dimension [math]\displaystyle{ (n/p_r)\times(n/p_c) }[/math]. This can be visualized as a checkerboard pattern in a matrix.[11] Therefore, each processing unit can only have outgoing edges to PEs in the same row and column. This bounds the amount of communication partners for each PE to [math]\displaystyle{ p_r + p_c - 1 }[/math] out of [math]\displaystyle{ p = p_r \times p_c }[/math] possible ones.
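
A sketch of the checkerboard layout (assuming p_r and p_c divide n; names are illustrative): matrix entry (i, j), i.e. the edge from vertex i to vertex j, lives on the PE in grid row i/(n/p_r) and grid column j/(n/p_c), and the possible communication partners of a PE are exactly the PEs in its grid row and grid column.

<syntaxhighlight lang="python">
# Sketch only (assumes p_r and p_c divide n; names are made up). The PE grid
# has p_r rows and p_c columns; matrix entry (i, j), i.e. the edge from
# vertex i to vertex j, lives on the PE in grid row i // (n // p_r) and
# grid column j // (n // p_c).

def owner_2d(i, j, n, p_r, p_c):
    return (i // (n // p_r), j // (n // p_c))

def partners(grid_row, grid_col, p_r, p_c):
    """PEs a given PE may need to talk to: its own grid row and grid column."""
    same_row = {(grid_row, c) for c in range(p_c)}
    same_col = {(r, grid_col) for r in range(p_r)}
    return (same_row | same_col) - {(grid_row, grid_col)}

print(owner_2d(3, 10, n=12, p_r=3, p_c=4))   # edge (3, 10) -> PE (0, 3)
print(len(partners(0, 3, p_r=3, p_c=4)))     # 5 other PEs; p_r + p_c - 1 = 6 counting itself
</syntaxhighlight>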


See also

  • Graph rewriting for rule-based transformations of graphs (graph data structures)

References

  1. See, e.g. Goodrich & Tamassia (2015), Section 13.1.2: Operations on graphs, p. 360. For a more detailed set of operations, see Mehlhorn, K.; Näher, S. (1999), "Chapter 6: Graphs and their data structures", LEDA: A platform for combinatorial and geometric computing, Cambridge University Press, pp. 240–282.
  2. Cormen et al. (2001), pp. 528–529; Goodrich & Tamassia (2015), pp. 361–362.
  3. Cormen et al. (2001), pp. 529–530; Goodrich & Tamassia (2015), p. 363.
  4. Cormen et al. (2001), Exercise 22.1-7, p. 531.
  5. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), "Section 22.1: Representations of graphs", Introduction to Algorithms (Second ed.), MIT Press and McGraw-Hill, pp. 527–531, ISBN 0-262-03293-7.
  6. Goodrich, Michael T.; Tamassia, Roberto (2015), "Section 13.1: Graph terminology and representations", Algorithm Design and Applications, Wiley, pp. 355–364.
  7. Bader, David; Meyerhenke, Henning; Sanders, Peter; Wagner, Dorothea (January 2013). Graph Partitioning and Graph Clustering. Contemporary Mathematics. 588. American Mathematical Society. doi:10.1090/conm/588/11709. ISBN 978-0-8218-9038-7. http://www.ams.org/conm/588/.
  8. Lumsdaine, Andrew; Gregor, Douglas; Hendrickson, Bruce; Berry, Jonathan (March 2007). "Challenges in Parallel Graph Processing". Parallel Processing Letters. 17 (1): 5–20. doi:10.1142/s0129626407002843. ISSN 0129-6264.
  9. Sanders, Peter; Mehlhorn, Kurt; Dietzfelbinger, Martin; Dementiev, Roman (2019). Sequential and Parallel Algorithms and Data Structures: The Basic Toolbox. Springer International Publishing. ISBN 978-3-030-25208-3. https://www.springer.com/gp/book/9783030252083.
  10. "Parallel Processing of Graphs" (PDF).
  11. "Parallel breadth-first search on distributed memory systems | Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis". dl.acm.org. doi:10.1145/2063384.2063471. Retrieved 2020-02-06.


External links

  • GraphMatcher, a Java program to align directed/undirected graphs.
  • GraphBLAS, a specification for a library interface for operations on graphs, with a particular focus on sparse graphs.



Category:Graph theory

Category:Abstract data types

Category:Graphs

Category:Hypergraphs



This page was moved from wikipedia:en:Graph (abstract data type). Its edit history can be viewed at 图(抽象数据类型)/edithistory