How to retrieve a row, biased by popularity?

Abigail abigail at abigail.be
Fri Aug 24 13:06:01 BST 2012


On Tue, Aug 21, 2012 at 07:35:55PM -0700, Yitzchak Scott-Thoennes wrote:
> The single pass approach, as given in perlfaq "How do I select a
> random line from a file?" adapts to work here too.
> 
> my $total_weight = 0;
> my $selected;
> while ( my $record = get_next_record() ) {
>     my $weight = weight($record);
>     $total_weight += $weight;
>     $selected = $record if $weight > $total_weight * rand;
> }
> 
> Weights can be integers or floating point, but must be >= 0.  A record
> is guaranteed to be selected unless all weights are 0 or there are no
> records.


It's a classical problem and, IIRC, discussed in Knuth's The Art of Computer
Programming, Vol 2 (section 3.4.2, on random sampling).
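
For anyone who wants to convince themselves that the quoted loop is right,
here is a minimal sketch that runs it over an in-memory list and tallies the
results. The record layout ([name, weight] pairs), the pick_weighted() name
and the sample weights are made up for illustration; they stand in for
get_next_record() and weight() above. The reasoning: record i replaces the
current choice with probability w_i / (w_1 + ... + w_i), and each later
record j leaves it alone with probability (w_1 + ... + w_(j-1)) / (w_1 + ...
+ w_j). The product telescopes to w_i divided by the sum of all weights.

```perl
use strict;
use warnings;

# Hypothetical in-memory records: [name, weight] pairs.
my @records = ( [ a => 1 ], [ b => 2 ], [ c => 7 ] );

# Single-pass weighted pick: each record replaces the current choice
# with probability (its weight) / (total weight seen so far).
sub pick_weighted {
    my @recs = @_;
    my $total_weight = 0;
    my $selected;
    for my $record (@recs) {
        my $weight = $record->[1];
        $total_weight += $weight;
        $selected = $record if $weight > $total_weight * rand;
    }
    return $selected;    # undef if no records or all weights are 0
}

# Tally many draws; the frequencies should approach 0.1, 0.2 and 0.7.
my $draws = 100_000;
my %count;
$count{ pick_weighted(@records)->[0] }++ for 1 .. $draws;
printf "%s: %.3f\n", $_, $count{$_} / $draws for sort keys %count;
```

The exact figures vary from run to run, since nothing seeds rand here.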



Abigail

