Disjoint Set Union Algorithm

Applying Path Compression and Rank by Size Techniques for Optimal Performance.

Introduction

Disjoint Set Union (DSU) algorithm is a data structure that is used to maintain a collection of disjoint sets that supports two main operations: Union and Find. The Union operation is used to merge two sets into a single set, while the Find operation is used to determine which set an element belongs to.

This algorithm is widely used in various applications such as Kruskal’s algorithm for finding the Minimum Spanning Tree, image segmentation, clustering, and network connectivity.

Implementation

The Disjoint Set Union algorithm can be implemented using an array or a tree. In the array implementation, each element of the array represents a set, and the value stored at that index represents the parent of that set. If an element has a parent of -1, then it is the root of its set. The Find operation is implemented by traversing the parent array until a root element is reached. The Union operation is implemented by making the root of one set the parent of the root of the other set.

Example:

Suppose we have a collection of sets: {1}, {2}, {3}, {4}, {5}. Initially, the parent of each set is set to -1. We can represent this using an array:

parent = [-1, -1, -1, -1, -1]

Now, let’s say we want to merge sets {1} and {2} into a single set. We can do this by setting the parent of the root of set {1} (which is 1) to the root of set {2} (which is 2). This gives us the following array:

parent = [2, -1, -1, -1, -1]

We can see that the root of set {1} is now 2. Similarly, we can merge sets {4} and {5} into a single set by setting the root of set {4} (which is 4) to the root of set {5} (which is 5). This gives us the following array:

parent = [2, -1, -1, 5, 5]

Now, let’s say we want to determine which set element 4 belongs to. We can do this by traversing the parent array until we reach a root element:

i = 4
while parent[i] != -1:
    i = parent[i]
print(i) # output: 5

We can see that element 4 belongs to set {5}.

Time Complexity

The time complexity of the Disjoint Set Union algorithm depends on the implementation. Using the array implementation, the time complexity of both the Find and Union operations is O(log n), where n is the number of elements in the collection. However, by using the path compression technique, the time complexity of the Find operation can be reduced to O(α(n)), where α(n) is the inverse Ackermann function. This gives the Disjoint Set Union algorithm a near-constant time complexity.

Path Compression

The path compression technique is a method of reducing the height of the tree formed by the DSU algorithm. In the basic implementation of the DSU algorithm, the Find operation involves traversing the parent array until a root element is reached. This can be time-consuming if the tree is tall.

Path compression works by setting the parent of each visited element to the root element. This helps to flatten the tree and reduce its height of the tree. When the Find operation is called on an element, it will traverse the tree only once and set the parent of each visited element to the root element. This makes subsequent Find operations on the same set much faster.

Here is an example of how path compression works. Suppose we have a collection of sets: {1}, {2}, {3}, {4}, {5}. Initially, the parent of each set is set to -1. We can create a DSU object to represent this:

Rank by Size

The rank-by-size technique is a method of optimizing the union operation in the DSU algorithm. In the basic implementation of the DSU algorithm, the union operation involves making the root of one set the parent of the root of the other set. This can result in a tree that is tall and unbalanced, which can affect the time complexity of the Find operation.

Rank by size works by making the smaller set the child of the larger set. This helps to balance the tree and reduce the height of the tree. When the union operation is called on two sets, the rank (or size) of each set is compared. The smaller set is made the child of the larger set, and the rank of the larger set is increased by one. This helps to balance the tree and reduce the height of the tree.

here is the implementation of the Disjoint Set Union algorithm

import java.util.*;

class DSU {
    int[] parent;

    public DSU(int n) {
        parent = new int[n];
        Arrays.fill(parent, -1);
    }

    public int find(int x) {
        if (parent[x] == -1) return x;
        return parent[x] = find(parent[x]);
    }

    public void union(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);
        if (rootX != rootY) {
            parent[rootX] = rootY;
        }
    }
}

Example

Suppose we have a collection of sets: {1}, {2}, {3}, {4}, {5}. Initially, the parent of each set is set to -1. We can create a DSU object to represent this:

DSU dsu = new DSU(5);

Now, let’s say we want to merge sets {1} and {2} into a single set. We can do this by calling the union method:

dsu.union(1, 2);

We can see that the root of set {1} is now 2 by calling the find method on element 1:

System.out.println(dsu.find(1)); // output: 2

Similarly, we can merge sets {4} and {5} into a single set:

dsu.union(4, 5);
System.out.println(dsu.find(4)); // output: 5

Now, let’s say we want to determine which set element 4 belongs to. We can do this by calling the find method on element 4:

System.out.println(dsu.find(4)); // output: 5

We can see that element 4 belongs to set {5}.

Conclusion

In this article, we have explained the path compression and rank by size techniques in the Disjoint Set Union algorithm. These techniques can help to optimize the time complexity of the Find and Union operations in the DSU algorithm. By using both techniques, we can achieve a time complexity of O(α(n)), where α(n) is the inverse Ackermann function, which is a very slow-growing function. Therefore, the DSU algorithm is an efficient and practical data structure for maintaining a collection of disjoint sets.