Data Structure and Algorithm
Data Structure
Disjoint Set
Introduction to Disjoint Set

Disjoint Set Data Structure (Union-Find)

A Disjoint Set data structure (often called Union-Find) is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets.


Key Operations

Disjoint Set supports two primary operations:

  1. Find: Determine which subset a particular element belongs to. This can be used for checking if two elements are in the same subset.
  2. Union: Join two disjoint subsets into a single subset.

Parent Array Representation

A disjoint set can be represented using a Parent Array of size N:

  • The index of the array represents the element.
  • The value at each index represents the parent of that element.
  • If parent[i] === -1 (or parent[i] === i), it means i is the root/representative of its own subset.

Naive Find Operation:

To find the representative of the set containing element i, we recursively traverse parent pointers until we reach a node that is its own parent.

function find(parent, i) {
    if (parent[i] === -1) {
        return i;
    }
    return find(parent, parent[i]);
}

Naive Union Operation:

To perform the union of two elements x and y:

  1. Find the representatives (roots) of the subsets containing x and y.
  2. If the representatives are different, make the root of one subset the parent of the root of the other subset.
function union(parent, x, y) {
    let xRoot = find(parent, x);
    let yRoot = find(parent, y);
 
    if (xRoot !== yRoot) {
        parent[xRoot] = yRoot; // Attach x's tree under y's tree
    }
}

Application: Detecting Cycles in an Undirected Graph

We can use the Union-Find operations to check if a given undirected graph contains a cycle.

Algorithm:

  1. Initialize the parent array with -1 for all vertices.
  2. For each edge (u, v) in the graph:
    • Find the subsets (representatives) of u and v.
    • If they have the same representative, then a cycle is detected.
    • Otherwise, perform a union on their subsets.

JavaScript Implementation:

function find(parent, i) {
    if (parent[i] === -1) return i;
    return find(parent, parent[i]);
}
 
function union(parent, x, y) {
    let xSet = find(parent, x);
    let ySet = find(parent, y);
    if (xSet !== ySet) {
        parent[xSet] = ySet;
    }
}
 
function isCycle(vertices, edges) {
    // Allocate memory for parent array
    const parent = new Array(vertices).fill(-1);
 
    // Iterate through all edges of graph
    for (let i = 0; i < edges.length; i++) {
        let x = find(parent, edges[i][0]);
        let y = find(parent, edges[i][1]);
 
        if (x === y) {
            return true; // Cycle detected
        }
 
        union(parent, x, y);
    }
    return false;
}
 
// Driver code
const vertices = 3;
const edges = [
    [0, 1],
    [1, 2],
    [2, 0]
];
 
if (isCycle(vertices, edges)) {
    console.log("Graph contains cycle");
} else {
    console.log("Graph doesn't contain cycle");
}
// Output: Graph contains cycle

Complexity of Naive Approach

  • Time Complexity:
    • Find: O(N) in the worst case (if the tree becomes a skewed linear chain).
    • Union: O(N) (since it depends on Find).
  • Space Complexity: O(N) for the parent array.

These operations are inefficient in the worst-case. In the next lesson, we will explore optimizations: Union by Rank and Path Compression.