Find Median from Data Stream
Expert Answer & Key Takeaways
A complete guide to understanding and implementing Heap / Priority Queue.
Find Median from Data Stream
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
Implement the
MedianFinder class:MedianFinder()initializes theMedianFinderobject.void addNum(int num)adds the integernumfrom the data stream to the data structure.double findMedian()returns the median of all elements so far. Answers within10^-5of the actual answer will be accepted.
Visual Representation
Stream: [2, 3, 4]
addNum(2) -> [2] (Median = 2.0)
addNum(3) -> [2, 3] (Median = (2+3)/2 = 2.5)
addNum(4) -> [2, 3, 4] (Median = 3.0)
Two Heaps State:
Lower (Max-Heap) | Upper (Min-Heap)
[2] | []
[2] | [3]
[2, 3](pop->[2]) | [3, 4](push 3)
[2, 3] | [4]Examples
Input: ["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output: [null, null, null, 1.5, null, 2.0]
Approach 1
Level I: Array and Insertion Sort
Intuition
Maintain a dynamically growing array. Every time a new number is added, keep the array sorted by placing the new number in its correct position. When calculating the median, simply access the middle element(s).
Thought Process
- Store numbers in a standard list or array.
- On
addNum(n):- Find the correct position to insert
nso the collection remains sorted. - Insert
n.
- Find the correct position to insert
- On
findMedian():- If the size is odd, return
list[size / 2]. - If the size is even, return
(list[size / 2 - 1] + list[size / 2]) / 2.0.
- If the size is odd, return
⏱ O(N) for `addNum` (to shift elements during insertion), O(1) for `findMedian`💾 O(N) to store the stream
Detailed Dry Run
Calls: add(1), add(2), find(), add(3), find()
- add(1): Array = [1]
- add(2): Insert 2. Array = [1, 2]
- find(): Even size 2. Return (1+2)/2.0 = 1.5
- add(3): Insert 3. Array = [1, 2, 3]
- find(): Odd size 3. Middle is index 1. Return 2.0
Approach 2
Level II: TreeMap of Counts
Intuition
Use a sorted map (TreeMap in Java, SortedDict in Python) to store the frequency of each number. To find the median, we iterate through the map to find the middle element(s) based on the total count.
⏱ O(log N) for `addNum`, O(N) for `findMedian` (worst case if we iterate the whole map)💾 O(K) where K is the number of unique elements
Detailed Dry Run
Calls: add(1), add(2), find(), add(3), find()
- add(1): map={1:1}, total=1
- add(2): map={1:1, 2:1}, total=2
- find(): total=2. Median is at pos 1, 2. Summing counts: at 1, count=1(pos 0-0), at 2, count=1(pos 1-1). Avg of 1 and 2 is 1.5.
- add(3): map={1:1, 2:1, 3:1}, total=3
- find(): total=3. Median is at pos 1. Summing: pos 1 is val 2. Return 2.0
Approach 3
Level III: Two Heaps (Optimal)
Intuition
To find the median instantly, we only need access to the middle one or two values. We can split the stream into a lower half (stored in a Max-Heap) and an upper half (stored in a Min-Heap). The Max-Heap holds the maximum of the lower values, while the Min-Heap holds the minimum of the upper values.
Thought Process
low: A Max-Heap to store the smaller half of the numbers.high: A Min-Heap to store the larger half of the numbers.- When adding
num:- Always push to
low, then pop the largest fromlowand push it tohigh(to ensure every element inlowis smaller than elements inhigh). - Balance the heaps: If
highhas more elements thanlow, pophighand push tolow.
- Always push to
- When finding median:
- If sizes are unequal, the extra element is in
low, so return the top oflow. - If equal, return
(low.top() + high.top()) / 2.0.
- If sizes are unequal, the extra element is in
⏱ O(log N) for `addNum`, O(1) for `findMedian`💾 O(N) to store the elements in two heaps
Detailed Dry Run
add(1): low pulls 1, pushes to high; high size > low, wait, step-by-step:
- Add 1: Push to low -> low=[1]. Pop low (1) -> Push to high. high=[1]. high size(1) > low size(0) -> Pop high(1) -> push to low. low=[1], high=[].
- Add 2: Push to low -> low=[2,1]. Pop low(2) -> push to high. high=[2], low=[1]. Sizes equal.
- Median: (1 + 2)/2.0 = 1.5
- Add 3: Push to low -> low=[3,1]. Pop low(3) -> push to high. high=[2,3], low=[1]. high size(2) > low size(1) -> Pop high(2) -> push to low. low=[2,1], high=[3].
- Median: low has extra. Return 2.0.
Course4All Technical Board
Verified ExpertSenior Software Engineers & Algorithmic Experts
Our DSA content is authored and reviewed by engineers from top tech firms to ensure optimal time and space complexity analysis.
Pattern: 2026 Ready
Updated: Weekly
Found an issue or have a suggestion?
Help us improve! Report bugs or suggest new features on our Telegram group.