[OT] benchmarking "typical" programs

Simon Wistow simon at thegestalt.org
Fri Sep 21 08:56:34 BST 2012

On Thu, Sep 20, 2012 at 12:35:18PM +0100, Nicholas Clark said:
> Lots of "one trick pony" type benchmarks exist, but very few that actually
> try to look like they are doing typical things typical programs do, at the
> typical scales real programs work out, so

As a (recovering) search engineer, I'm inclined to say: get a corpus of
docs, build an inverted index out of it, and then run some searches. This
will test:

1) File I/O performance (reading in the corpus)
2) Text manipulation (tokenizing, stop word removal, stemming)
3) Data structure performance (building the index)
4) Maths calculation (performing TF-IDF searches)

All in pretty good, discrete steps. Plus, by tweaking the size of the
corpus, you can stress memory as well.
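
Roughly, the four steps could be sketched like this. This is only an
illustration in Python, not a finished benchmark: the stop word list, the
crude suffix-stripping "stemmer", the *.txt corpus layout and all of the
function names are assumptions made up for the example, and a real harness
would swap in a proper stemmer and a much larger corpus.

```python
import math
import re
import time
from collections import Counter, defaultdict
from pathlib import Path

# Hypothetical stop word list; a real benchmark would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


def read_corpus(directory):
    """Step 1: file I/O. Assumes one plain-text document per *.txt file."""
    return {
        path.name: path.read_text(encoding="utf-8", errors="replace")
        for path in sorted(Path(directory).glob("*.txt"))
    }


def stem(token):
    """Naive suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Step 2: lowercase, split into words, drop stop words, stem."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(word) for word in words if word not in STOP_WORDS]


def build_index(docs):
    """Step 3: build an inverted index of term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, num_docs, query):
    """Step 4: rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term)
        if not postings:
            continue
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def timed(label, fn, *args):
    """Run fn(*args), print the elapsed wall-clock time, return the result."""
    start = time.perf_counter()
    result = fn(*args)
    print(f"{label}: {time.perf_counter() - start:.4f}s")
    return result
```

Each step can then be timed separately, e.g.
docs = timed("read", read_corpus, "corpus/") followed by
index = timed("index", build_index, docs), where "corpus/" is a placeholder
path. Pointing it at a bigger directory is the knob for memory pressure.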

