Top K Frequent Elements⚓︎
Description⚓︎
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
Example 1:
- Input:
nums = [1,1,1,2,2,3], k = 2 - Output:
[1,2]
Example 2:
- Input:
nums = [1], k = 1 - Output:
[1]
Constraints:
1 <= nums.length <= 10^5-10^4 <= nums[i] <= 10^4kis in the range[1, the number of unique elements in the array].- It is guaranteed that the answer is unique.
Solution⚓︎
Using Max Heap⚓︎
Explanation:
- Frequency Mapping: First, the algorithm creates a hash map (
unordered_map) to count the frequency of each element in the input vectornums. - Max Heap Creation: It then uses a max heap (
priority_queue) to store pairs of (frequency, element). The max heap is used to efficiently retrieve the element with the highest frequency. - Heap Manipulation: For each element in the frequency map, the algorithm pushes a pair of (frequency, element) into the max heap. If the size of the max heap exceeds the number of unique elements minus
k, it means we have found one of the topkfrequent elements. This element is added to the result vector, and then removed from the max heap. - Result Compilation: This process is repeated until all top
kfrequent elements are found and added to the result vector.
Time Complexity Analysis:
- Frequency Mapping: \(O(N)\), where \(N\) is the number of elements in
nums. Each element is visited once to update the frequency map. - Heap Operations:
- Inserting an element into the heap takes \(O(\log M)\) time, where \(M\) is the number of unique elements in
nums. - The for loop iterates \(M\) times, and each iteration may involve a heap insertion.
- Therefore, the total time for heap operations is \(O(M \log M)\).
Overall, the time complexity is \(O(N + M \log M)\). In the worst case, where all elements are unique, \(M\) becomes \(N\), and the time complexity is \(O(N \log N)\).
Space Complexity Analysis:
- Frequency Map: \(O(M)\), where \(M\) is the number of unique elements in
nums. - Max Heap: \(O(M)\), for storing the frequency and element pairs.
- Result Vector: \(O(K)\), for storing the top \(k\) frequent elements.
Hence, the total space complexity is \(O(M + K)\). In the worst case, this becomes \(O(N)\), where \(N\) is the number of elements in nums and all elements are unique.
Using Min Heap⚓︎
The min heap now keeps the least frequent elements at the top. When its size exceeds k, the least frequent element is popped out, ensuring that only the k most frequent elements remain.
- Time complexity: \(O(N\log k)\);
- Space complexity: \(O(N + k)\).
Counting Sort⚓︎
Code with comments:
Explanation:
- Frequency Mapping: The algorithm starts by creating a hash map (
unordered_map) to count the frequency of each element in the input vectornums. - Frequency of Frequencies: It then creates a vector
countwherecount[i]represents how many elements appear exactlyitimes innums. This is a key step that differentiates this solution from others, as it essentially performs a counting sort on the frequencies. - Finding the Threshold: The algorithm then determines the minimum frequency (
threshold) that is needed to ensure that the sum of the counts of elements with frequency higher thanthresholdis at leastk. This step ensures that we include only the topkfrequent elements. - Collecting Results: Finally, it iterates over the frequency map again and adds elements to the result vector
resif their frequency is greater than the determinedthreshold.
Time Complexity Analysis:
- Frequency Mapping: \(O(N)\), where N is the number of elements in
nums. - Counting Frequencies: \(O(N)\), as it iterates over the frequency map which can have at most \(N\) unique elements.
- Finding the Threshold: \(O(N)\), in the worst case, it might iterate through the entire
countvector. - Collecting Results: \(O(N)\), as it iterates over the frequency map again.
Overall, the time complexity is \(O(N)\), which is quite efficient as it linearly scales with the input size.
Space Complexity Analysis:
- Frequency Map: \(O(N)\), for storing the frequency of each element.
- Count Vector: \(O(N)\), for counting the frequencies of frequencies.
- Result Vector: \(O(K)\), for storing the top \(k\) frequent elements.
Thus, the total space complexity is \(O(N + K)\), which is mainly dominated by the size of the input and the frequency map.