Dynamic random choice datastructure

One of the basic usages of random numbers is chosing an item at random. Most of the time, we want a uniform distribution, but sometimes we want a weighted one: the probability of selecting an element is proportional to its weight. We would use Numpy’s random.choice(), or C++’s discrete_distribution.

The algorithm would pick a random number, and perform a binary search (libstdc++ implementation) to find the associated item. But some randomized algorithms need to update the weights during execution: with those algorithms, we need to precompute all the weights each time we do a change.

It’s actually possible to do it in O(log n) time, by maintaining a binary tree: each node contains the sum of the weights of its child. This allows us to do a similar binary search (log n), and to access only a few nodes when a weight is updated (log n). I read about an implementation based on red-black trees on HN.

I wrote a similar datastructure based on an implicit binary tree – getting read of the pointers and the memory allocations, using about 3 times less memory when using doubles. It is available on Github.

Leave a Reply

Your email address will not be published. Required fields are marked *