Counting Distinct Events in Streams
Big data statistics in distributed settings
11 min read · Sep 2, 2022
Imagine an infinite stream of incoming symbols. We’d like to know the number of distinct values received so far at any point in time.
This problem has many practical uses. One is tracking the number of distinct visitors to a heavily visited website over a given time period, say the past month.
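To make the problem statement concrete, here is a minimal sketch of the exact, brute-force way to answer it: remember every distinct value in a set and report the set's size after each symbol. The function name `running_distinct_counts` and the sample visitor IDs are made up for illustration. The point of the sketch is the cost: memory grows with the number of distinct values, which is what becomes a problem on a stream that never ends.

```python
from typing import Hashable, Iterable, Iterator


def running_distinct_counts(stream: Iterable[Hashable]) -> Iterator[int]:
    """Yield the number of distinct values seen after each incoming symbol."""
    seen = set()  # holds every distinct value, so memory grows with the count
    for symbol in stream:
        seen.add(symbol)
        yield len(seen)


# Hypothetical visitor IDs arriving one at a time.
visitors = ["alice", "bob", "alice", "carol", "bob"]
counts = list(running_distinct_counts(visitors))
print(counts)  # [1, 2, 2, 3, 3]
```

Because the function is a generator, the count is available at any point in the stream without waiting for it to finish.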