.. _Disjoint Set Union Find: Disjoint Set (Union-Find) ========================== The Disjoint Set data structure, also known as Union-Find, is a data structure that tracks a set of elements partitioned into multiple disjoint (non-overlapping) subsets. It's particularly useful in network connectivity(eg. :numref:`Kruskal Algorithm`), least common ancestors, image processing, and more. The primary operations of this data structure are: - **Union(x,y)**: Merge two subsets, represented by x and y, into a single subset. - **Find(x)**: Determine which subset a particular element x is in. This can be used for determining if two elements are in the same subset. Optimization Technique ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The efficiency of these operations is crucial for many algorithms, and the Union-Find data structure provides a way to perform them efficiently using three optimizations. - **Path Compression**: In a naive Union-Find implementation, the Find operation can become inefficient if the trees become very deep. In the worst case, you might have to traverse a long chain of elements just to find the root. With Path Compression, it flatten the structure of the tree whenever **Find()** is used, by making every node point directly to the root. This dramatically reduces the time complexity of subsequent find operations. .. figure:: union_find_path_comp.png :align: center :name: union_find_path_comp - **Path Havling**: Flatten the structure of the tree whenever **Find()** is used, by making every node point directly to the root. This dramatically reduces the time complexity of subsequent find operations. - **Union by Size**: In **Union()**, attach the root of the tree with fewer elements to the root of the tree with more elements to avoid increasing the tree height unnecessarily. The implementation could be the following: .. literalinclude:: code/union_find.h :language: c++ :linenos: :lines: 51-93 In addition to **Union(x,y)** and **Find(x)**, there are three more functions: **get_set(x)**, **number_of_groups()**, and **connected()**. **Related Questions**: - Groups of Strings(:numref:`uf_group_strings`) - Longest Consecutive Sequence(:numref:`uf_longest_consecutive_sequence`) - Multi-Language Translator(:numref:`uf_translator`) - Maze Generation(:numref:`uf_maze_gen`) **Reference**: - Time complexity: https://en.wikipedia.org/wiki/Disjoint-set_data_structure#Time_complexity - Path compression: https://www.youtube.com/watch?v=VHRhJWacxis **Song**: .. code-block:: c++ Title: Union Find Lyrics: Henry Fuheng Wu (Verse 1) Disjoint Set a.k.a Union-Find It helps us track sets with all its might.(all its might) They are just bunch of trees, For efficiency, we put them in some arrays(use two arrays) (Chorus) wowo Union-Find, a super star, Find your boss No matter who you are. Mazes and graphs, it conquers them all, Union-Find answers the programmer's call. (Verse 2) As the name implies, core functions are union and find, Yes we can, optimize it with union by size (winner takes all) In the find method, compress and halve the path, It flatten the trees so code is fast. (Outro) wowo Union-Find, a super star, Connecting the elements No matter how far. Kruskal minimizes spanning tree You conquers problems and set us free(set us free)