Graphs

The graphs are very useful and fairly common data structures. They are used to describe a wide variety of relationships between objects and in practice can be related to almost everything. As we will see later, trees are a subset of the graphs and also lists are special cases of trees and thus of graphs, i.e. the graphs represent a generalized structure that allows modeling of very large set of real-world situations.

Graph is a data structure that consists of following two components:

  • A finite set of vertices also called as node.
  • A finite set of ordered pair of the form (u, v) called as edge.

The pair is ordered because (u, v) is not same as (v, u) in case of directed graph(di-graph). The pair of form (u, v) indicates that there is an edge from vertex u to vertex v. The edges may contain weight/value/cost.One of the most common example of graphs to us would be in social networks like LinkedIn, facebook. For example, in facebook, each person is represented with a vertex(or node). Each node is a structure and contains information like person id, name, gender and locale.

Terminology

Vertex

A vertex (also called a “node”) is a fundamental part of a graph. It can have a name, which we will call the “key.” A vertex may also have additional information. We will call this additional information the “payload.

Edge

An edge (also called an “arc”) is another fundamental part of a graph. An edge connects two vertices to show that there is a relationship between them. Edges may be one-way or two-way. If the edges in a graph are all one-way, we say that the graph is a directed graph, or a digraph. The class prerequisites graph shown above is clearly a digraph since you must take some classes before others.

Weight

Edges may be weighted to show that there is a cost to go from one vertex to another. For example in a graph of roads that connect one city to another, the weight on the edge might represent the distance between the two cities.
With those definitions in hand we can formally define a graph. A graph can be represented by G where G=(V,E). For the graph G, V is a set of vertices and E is a set of edges. Each edge is a tuple (v,w) where w,v∈V. We can add a third component to the edge tuple to represent a weight. A subgraph s is a set of edges e and vertices v such that e⊂E and v⊂V.

Directed Graph

Graphs with directed edges are called Directed Graphs. The number of edges with one endpoint on a given vertex is called that vertex’s degree. In a directed graph, the number of edges that point to a given vertex is called its in-degree, and the number that point from it is called its out-degree. Often, we may want to be able to distinguish between different nodes and edges. We can associate labels with either. We call such a graph labeled.

Undirected Graph

In a directed graph, the edges point from one vertex to another, while in an undirected graph, they merely connect two vertices.

Path

A path in a graph is a sequence of vertices that are connected by edges. Formally we would define a path as w1,w2,…,wn such that (wi,wi+1)∈E for all 1≤i≤n−1. The unweighted path length is the number of edges in the path, specifically n−1. The weighted path length is the sum of the weights of all the edges in the path. For example in the above figure the path from V3 to V1 is the sequence of vertices (V3,V4,V0,V1). The edges are {(v3,v4,7),(v4,v0,1),(v0,v1,5)}.

Cycle

A cycle in a directed graph is a path that starts and ends at the same vertex. For example, in Figure 2 the path (V5,V2,V3,V5) is a cycle. A graph with no cycles is called an acyclic graph. A directed graph with no cycles is called a directed acyclic graph or a DAG. We will see that we can solve several important problems if the problem can be represented as a DAG.

The Graph Abstract Data Type

The graph abstract data type (ADT) is defined as follows:

Graph() creates a new, empty graph.
addVertex(vert) adds an instance of Vertex to the graph.
addEdge(fromVert, toVert) Adds a new, directed edge to the graph that connects two vertices.
addEdge(fromVert, toVert, weight) Adds a new, weighted, directed edge to the graph that connects two vertices.
getVertex(vertKey) finds the vertex in the graph named vertKey.
getVertices() returns the list of all vertices in the graph.
it returns True for a statement of the form vertex in graph, if the given vertex is in the graph, False otherwise.

Representation

  • Adjacency Matrix
  • Adjacency List

 Adjacencey Matrix

Each cell aij of an adjacency matrix contains 0, if there is an edge between i-th and j-th vertices, and otherwise. Before discussing the advantages and disadvantages of this kind of representation, let us see an example.

Graph sample Adjacency matrix for the graph
Graph Adjacency matrix
Edge (2, 5) Cells for edge (2, 5)
Edge (2, 5) Cells for the edge (2, 5)
Graph sample Adjacency matrix for the graph
Edge (1, 3) Cells for the edge (1, 3)

The graph presented by example is undirected. It means that its adjacency matrix is symmetric. Indeed, in undirected graph, if there is an edge (2, 5) then there is also an edge (5, 2). This is also the reason, why there are two cells for every edge in the sample. Loops, if they are allowed in a graph, correspond to the diagonal elements of an adjacency matrix.

Advantages: Adjacency matrix is very convenient to work with. Add (remove) an edge can be done in O(1) time, the same time is required to check, if there is an edge between two vertices. Also it is very simple to program and in all our graph tutorials we are going to work with this kind of representation.

Disadvantages:

  • Adjacency matrix consumes huge amount of memory for storing big graphs. All graphs can be divided into two categories, sparse and dense graphs. Sparse ones contain not much edges (number of edges is much less, that square of number of vertices, |E| << |V|2). On the other hand, dense graphs contain number of edges comparable with square of number of vertices. Adjacency matrix is optimal for dense graphs, but for sparse ones it is superfluous.
  • Next drawback of the adjacency matrix is that in many algorithms you need to know the edges, adjacent to the current vertex. To draw out such an information from the adjacency matrix you have to scan over the corresponding row, which results in O(|V|) complexity. For the algorithms like DFS or based on it, use of the adjacency matrix results in overall complexity of O(|V|2), while it can be reduced to O(|V| + |E|), when using adjacency list.
  • The last disadvantage, we want to draw you attention to, is that adjacency matrix requires huge efforts for adding/removing a vertex. In case, a graph is used for analysis only, it is not necessary, but if you want to construct fully dynamic structure, using of adjacency matrix make it quite slow for big graphs.

To sum up, adjacency matrix is a good solution for dense graphs, which implies having constant number of vertices.

Adjacency list

This kind of the graph representation is one of the alternatives to adjacency matrix. It requires less amount of memory and, in particular situations even can outperform adjacency matrix. For every vertex adjacency list stores a list of vertices, which are adjacent to current one. Let us see an example.

Graph sample Adjacency list for the graph
Graph Adjacency list
Graph sample Adjacency list for the graph
Vertices, adjacent to {2} Row in the adjacency list

Advantages: Adjacent list allows us to store graph in more compact form, than adjacency matrix, but the difference decreasing as a graph becomes denser. Next advantage is that adjacent list allows to get the list of adjacent vertices in O(1) time, which is a big advantage for some algorithms.

Disadvantages:

  • Adding/removing an edge to/from adjacent list is not so easy as for adjacency matrix. It requires, on the average, O(|E| / |V|) time, which may result in cubical complexity for dense graphs to add all edges.
  • Check if there is an edge between two vertices can be done in O(|E| / |V|) when list of adjacent vertices is unordered or O(log2(|E| / |V|)) when it is sorted. This operation stays quite cheap.
  • Adjacent list doesn’t allow us to make an efficient implementation, if dynamically change of vertices number is required. Adding new vertex can be done in O(V), but removal results in O(E) complexity.

To sum up, adjacency list is a good solution for sparse graphs and lets us changing number of vertices more efficiently, than if using an adjacent matrix. But still there are better solutions to store fully dynamic graphs.

Implementation for adjacency matrix

class Graph {
private:
      bool** adjacencyMatrix;
      int vertexCount;
public:
      Graph(int vertexCount) {
            this->vertexCount = vertexCount;
            adjacencyMatrix = new bool*[vertexCount];
            for (int i = 0; i < vertexCount; i++) {
                  adjacencyMatrix[i] = new bool[vertexCount];
                  for (int j = 0; j < vertexCount; j++)
                         adjacencyMatrix[i][j] = false;
             }
       }

       void addEdge(int i, int j) {
             if (i >= 0 && i < vertexCount && j > 0 && j < vertexCount) {
                   adjacencyMatrix[i][j] = true;
                   adjacencyMatrix[j][i] = true;
             }
       }

       void removeEdge(int i, int j) {
             if (i >= 0 && i < vertexCount && j > 0 && j < vertexCount) {
                   adjacencyMatrix[i][j] = false;
                   adjacencyMatrix[j][i] = false;
             }
       }

       bool isEdge(int i, int j) {
             if (i >= 0 && i < vertexCount && j > 0 && j < vertexCount)
                  return adjacencyMatrix[i][j];
            else
                  return false;
      }
 
      ~Graph() {
            for (int i = 0; i < vertexCount; i++)
                  delete[] adjacencyMatrix[i];
            delete[] adjacencyMatrix;
      }
};

Graph Traversals

Shortest Path

Minimum Spanning Tree

Applications

  • Electronic circuits
    • Printed circuit board
    • Integrated circuit
  • Transportation networks
    • Highway network
    • Flight network
  • Computer networks
    • Local area network
    • Internet
    • Web
  • Databases
    • Entity-relationship diagram
  • Questions
    • Are two points connected?
    • What is the shortest path between