Find Median from Data Stream⚓︎
Description⚓︎
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
- For example, for
arr = [2,3,4], the median is3. - For example, for
arr = [2,3], the median is(2 + 3) / 2 = 2.5.
Implement the MedianFinder class:
MedianFinder() initializes the MedianFinder object.
void addNum(int num) adds the integer num from the data stream to the data structure.
double findMedian() returns the median of all elements so far. Answers within 10^-5 of the actual answer will be accepted.
Example 1:
- Input:
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"][[], [1], [2], [], [3], []]
- Output:
[null, null, null, 1.5, null, 2.0]
Explanation:
Constraints:
-10^5 <= num <= 10^5- There will be at least one element in the data structure before calling
findMedian. - At most
5 * 10^4calls will be made toaddNumandfindMedian.
Solution⚓︎
Class Structure⚓︎
- Class MedianFinder: Contains two priority queues
smallandlarge. small: A max heap (standard priority queue in C++) that stores the smaller half of the numbers.large: A min heap (priority queue withgreater<int>comparison) that stores the larger half of the numbers.
Constructor⚓︎
- MedianFinder(): Initializes the MedianFinder object. It doesn't require any specific initialization for the heaps as their constructors handle that.
Function addNum(int num)⚓︎
- The function ensures that the two heaps are either equal in size or have a size difference of one.
- If
smallhas more or equal elements thanlarge, the new number is added tosmalland then the maximum element fromsmallis moved tolarge. - Otherwise, the new number is added to
largeand then the minimum element fromlargeis moved tosmall. - This approach maintains the heaps such that
smallcontains the smaller half of the numbers, andlargecontains the larger half. The top ofsmallis always less or equal to the top oflarge.
Function findMedian()⚓︎
- Returns the median of the current data stream.
- If
smallhas more elements, the median is the top ofsmall. - If
largehas more elements, the median is the top oflarge. - If they are of equal size, the median is the average of the tops of both
smallandlarge.
Complexity Analysis⚓︎
- Time Complexity:
addNum(int num)- Insertion into a priority queue: \(O(\log n)\), where \(n\) is the number of elements in the heap.
- Extracting the top and inserting into the other heap: \(O(\log n)\).
- Total for each insertion: \(O(\log n)\).
findMedian()- Returning the top of a heap or calculating the average of the tops of two heaps: \(O(1)\).
- Space Complexity: The space complexity is \(O(n)\), where \(n\) is the number of elements in the data stream. This is because the data structure stores all the elements in the two heaps.
Follow-Up⚓︎
If 99% of all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?