0
5.5kviews
DGIM and Applications of DGIM
1 Answer
3
688views
  • DGIM allows to estimate number of 1's in the window with an error of no more than 50%

  • Here each bit of the stream has a time stamp, the position in which it arrives.

  • It is required to distinguish positions within the window of length N, we shall represent time stamp module N, so they can be represented by $Log_2$ N bits.

  • If we store the total number of bits ever seen in the stream modulo N, then we can determine from a time stamp modulo N, where in the current window the bit with that timestamp is.

  • The window is divided into buckets consisting of:

1) The timestamp of its right end

2) The number of 1's in the bucket. the number must be power of 2, and we refer to the no of 1's as the size of the bucket.

  • There are six ovules to be followed to represent a stream by buckets.

  • The right end of a bucket is always a position with a 1

  • Every position with a 1 is in same bucket.

  • No. position is in more than one bucket

  • There are one or two buckets of any given size, upto some maximum size.

  • All sizes must be a power of 2.

  • Buckets cannot decrease in size as we move to the left.

Updating Buckets:

  • When the new bit comes in, drop the last bit if its end time is prior to N time units before the current time.

  • If the current bit is zero there is no change.

  • If the current bit is 1:

  • Create new bucket of size 1, for just that bit, end timestamp = current time.

  • If there are three buckets of size 1, combine the oldest two into a bucket of size 2.

  • If there are new three buckets of size 2, combine the oldest two into a bucket of size 4 and so on.

  • To estimate the no of 1's in the most recent N bits:

1) Sum the size of all buckets but last.

2) And half the size of last bucket.

For eg.

enter image description here

Count $ = 1 + 1 + 2 + 4 + \frac{4}{2} = 10 $

so no of 1's are 10

enter image description here

so no of 1's will be

count $ = 1 + 1 + 2 + 4 + \frac{8}{2} = 12 $

Applications of DGIM:

  • In an e-commerce website there are thousands of products and millions of transactions.

  • We can have an vector of 0/1 to indicate if a product X is sold in any nth transaction.

  • The query on this stream is like - how many times we have sold X in the last K sales.

  • On the same line we can represent the users who purchased the products using 0/1

  • So the query can be asked like how many users did purchased products in last K transactions.

Please log in to add an answer.