I was randomly solving problems when I bumped into this interesting problem New Roads Queries. I initially felt that this problem will be completely out of reach, but in the end I came up with an interesting solution for the same and learnt a lot in this process.
The problem statement goes like:
There are N cities in Byteland but no roads between them. However, each day, a new road will be built. There will be a total of M roads. Your task is to process q queries of the form: "after how many days can we travel from city a to city b for the first time?"
A small detour 🛣️
Let's start with a simple and familiar problem:
Given a graph of N nodes and queries of the form (A, B) check whether the 2 nodes A and B are in the same connected component or not.
Although this particular problem can be done using a simple DFS or BFS on the graph, it can be done easily using the data structure of Disjoint Set. Let's see how.
Disjoint Set gives us a way to do two operations very efficiently on N disjoint sets:
Check whether two nodes are part of the same set
Merge the sets containing given two nodes
So if we add all the edges to the Disjoint Set one by one using the merge operation, in the end we would have a disjoint sets of nodes. Then all we need to do for each query is to answer whether each node is part of the same set or not.
With path-compression and light-to-heavy merging, we can answer each query in amortised log* N, where N is the number of nodes in the graph and log* N is the Inverse Ackermann function.
The code for DSU is also fairly simple and below in an example of the same in C++.
classDisjointSet{public: vector<int> parent; vector<int> size;DisjointSet(intn){parent.resize(n);size.resize(n,1); /** Initially each element belongs to one set which is itself */for(int i =0; i < n; i++)parent[i]= i;} /** Find the root of any element */introot(inta){while(parent[a]!= a){parent[a]=parent[parent[a]]; /** Path compression */ a =parent[a];}return a;} /** Find whether two elements belong to the same set */intfind(inta,intb){returnroot(a)==root(b);} /** Merge two sets which contain element a, b */voidmerge(inta,intb){int root_a =root(a);int root_b =root(b); /** Light to Heavy merging */if(root_a != root_b){if(size[root_a]<size[root_b]){parent[root_a]=parent[root_b];size[root_b]+=size[root_a];}else{parent[root_b]=parent[root_a];size[root_a]+=size[root_b];}}}};
NOTE: All Disjoint Set uses to store things are two arrays/vectors.
Coming back 🔙
Now that we know what Disjoint Set is, let's try to apply this to the problem we have at hand. One notices that the queries we need to answer are time ⏰ based, which means we need to have a snapshot of the graph or the Disjoint Set (representing the graph) at each point in time. Let's number these Disjoint Sets representing the graphs at each time by D0, D1, ... DM. But we need to do this smartly without copying the entire Disjoint Set each time.
For the time being, let's assume that we have some efficient way to get the copy of the Disjoint Set at any time t, still how do we solve the problem at hand ??
💡 On careful observation we can see that if two nodes are connected at time ti, then they will remain connected in all time tj where j >= i. That means the function of whether two nodes are connected is monotonic over time.
This brings us to the idea of using Binary Search to find the least time t where any given two nodes are connected. This will take us O(log M) time where M is the number of edges in the graph. Formally, we are trying to find a Disjoint Set Di such that A and B have the same root in Di and i is minimum.
We seem to be getting somewhere now....
Now how do we solve the problem of persistence ??
We notice that our Disjoint Set just uses arrays. We can replace the arrays in the DSU with a Persistent Array to create a Persistent Disjoint Set data structure. This will allows us to query the Disjoint Set at any point in time!!
There are other interesting ways to solve this same problem which requires knowledge of some concepts like Heavy Light Decomposition etc. The rough outline of the solution is given here.