Number of Good Paths⚓︎

Description⚓︎

There is a tree (i.e. a connected, undirected graph with no cycles) consisting of n nodes numbered from 0 to n - 1 and exactly n - 1 edges.

You are given a 0-indexed integer array vals of length n where vals[i] denotes the value of the i-th node. You are also given a 2D integer array edges where edges[i] = [ai, bi] denotes that there exists an undirected edge connecting nodes ai and bi.

A good path is a simple path that satisfies the following conditions:

The starting node and the ending node have the same value.
All nodes between the starting node and the ending node have values less than or equal to the starting node (i.e. the starting node's value should be the maximum value along the path).

Return the number of distinct good paths.

Note that a path and its reverse are counted as the same path. For example, 0 -> 1 is considered to be the same as 1 -> 0. A single node is also considered as a valid path.

Example 1:

Input: vals = [1,3,2,1,3], edges = [[0,1],[0,2],[2,3],[2,4]]
Output: 6
Explanation:

There are 5 good paths consisting of a single node.
There is 1 additional good path: 1 -> 0 -> 2 -> 4.
(The reverse path 4 -> 2 -> 0 -> 1 is treated as the same as 1 -> 0 -> 2 -> 4.)
Note that 0 -> 2 -> 3 is not a good path because vals[2] > vals[0].

Example 2:

Input: vals = [1,1,2,2,3], edges = [[0,1],[1,2],[2,3],[2,4]]
Output: 7
Explanation:

1 2	`There are 5 good paths consisting of a single node. There are 2 additional good paths: 0 -> 1 and 2 -> 3.`

Example 3:

Input: vals = [1], edges = []
Output: 1
Explanation: The tree consists of only one node, so there is one good path.

Constraints:

n == vals.length
1 <= n <= 3 * 10^4
0 <= vals[i] <= 10^5
edges.length == n - 1
edges[i].length == 2
0 <= ai, bi < n
ai != bi
edges represents a valid tree.

Solution⚓︎

class Solution {
private:
    vector<int> p;
    vector<int> maxCnt;

    void build(int n) {
        p.resize(n);
        maxCnt.resize(n);

        for (int i = 0; i < n; i++) {
            p[i] = i;
            maxCnt[i] = 1;
        }
    }

    int find(int i) {
        if (i != p[i]) p[i] = find(p[i]);
        return p[i];
    }

    int unionSets(int x, int y, vector<int>& vals) {
        int fx = find(x), fy = find(y);
        int path = 0;
        if (vals[fx] > vals[fy]) {
            p[fy] = fx;
        } else if (vals[fx] < vals[fy]) {
            p[fx] = fy;
        } else {
            path = maxCnt[fx] * maxCnt[fy];
            p[fy] = fx;
            maxCnt[fx] += maxCnt[fy];
        }
        return path;
    }

public:
    int numberOfGoodPaths(vector<int>& vals, vector<vector<int>>& edges) {
        int n = vals.size();
        build(n);
        int res = n;

        sort(edges.begin(), edges.end(), [&](auto& lhs, auto& rhs) {
            return max(vals[lhs[0]], vals[lhs[1]]) < max(vals[rhs[0]], vals[rhs[1]]);
        });

        for (auto edge : edges) {
            res += unionSets(edge[0], edge[1], vals);
        }
        return res;
    }
};

The Union-Find data structure is used here with two main arrays:

p[]: Parent array to keep track of the root of each set. Initially, each node is its own parent.
maxCnt[]: This array counts the number of times the representative value occurs within the set. This is crucial for determining the number of good paths that involve nodes with the same value.

Algorithm Steps:

Initialization:
Each node is initialized as its own parent, indicating that initially, each node is in its own set.
maxCnt[i] is initialized to 1 for each node, indicating that initially, the representative value (which is the node's own value) occurs exactly once within its own set.
Sorting Edges:
The edges are sorted not just by the values of the nodes they connect but specifically to ensure processing happens in a way that facilitates the formation of good paths under the problem's constraints.
Processing Edges:
For each edge, the algorithm tries to union the sets of the two endpoints. The key here is how maxCnt is updated:
- If the two endpoints have different values, the standard union by rank or size is performed, without any special treatment for maxCnt.
- If the two endpoints have the same value (and thus are part of the "good path" definition), the maxCnt of the root node is updated to reflect the total occurrence of this representative value across the merged sets. This update is essential for counting good paths correctly.

The initial count of good paths is set to n, considering each node as a trivial path of length 0 (a single node). The algorithm then adds to this count based on the maxCnt updates during the union operations. Whenever two sets with the same representative value are merged, the number of new good paths formed is determined by the product of the maxCnt values of the two sets before merging. This is because any pair of nodes with the same value in the two sets can form a good path.

maxCnt tracks the occurrence of the representative (maximum) value within each set. When two sets with the same maximum value are merged, the potential combinations of endpoints that can form good paths increase. The product of the maxCnt values before merging gives the number of new good paths formed by the merge operation because it represents all possible pairs of nodes with the representative value from both sets.