## Dynamic random choice datastructure

One of the basic usages of random numbers is chosing an item at random. Most of the time, we want a uniform distribution, but sometimes we want a weighted one: the probability of selecting an element is proportional to its weight. We would use Numpy’s random.choice(), or C++’s discrete_distribution.

The algorithm would pick a random number, and perform a binary search (libstdc++ implementation) to find the associated item. But some randomized algorithms need to update the weights during execution: with those algorithms, we need to precompute all the weights each time we do a change.

It’s actually possible to do it in O(log n) time, by maintaining a binary tree: each node contains the sum of the weights of its child. This allows us to do a similar binary search (log n), and to access only a few nodes when a weight is updated (log n). I read about an implementation based on red-black trees on HN.

I wrote a similar datastructure based on an implicit binary tree – getting read of the pointers and the memory allocations, using about 3 times less memory when using doubles. It is available on Github.

## Incremental evaluation is often overlooked

As a community, we focus a lot on the heuristics available to solve a problem. But my experience in the field is that we don’t spend much time on the heuristics themselves: the cost function will get most of the work.

## Incrementality

When solving a problem using a metaheuristic, it is important to quickly evaluate incumbent solutions. When the number of variables is small, it is possible to compute the cost of a solution from scratch, but, if the problem has millions of variables, even a linear time-complexity will become limiting.

This is where incremental evaluation comes into play: for many cost functions, we can compute the result quickly for a small change of inputs. This is perfect for local search, which can now scale to very large problems.