An interesting problem

Simon Matthews spm at
Thu Jan 4 14:52:29 GMT 2007

* Paul Orrock (paulo at [070104 10:25]:
> Thanks everyone for your input. I wrote a script to brute force it using 
> Spreadsheet::Perl and summing every possible range. I've attached a 
> sample sheet. As you can see there are large blocks of zero values so 
> I've managed to cut it down quite a lot by working out where the first 
> and last point of non zero data are and just iterating over the blocks 
> between them. In the attached dataset this means it only works over the 
> ranges between U11 to BC55.
> It works but it is incredibly slow, like over a hour for 650,000 ranges 
> on a P4. I'm sure the Spreadsheet::Perl range sum calculation is taking 
> the time so I'll rip it out and write one myself. For the attached sheet 
> my code makes AH34 to AP32 the most positive range at 97.57.
> McGlinchy, Alistair wrote:
> >>I think that the previous mention of a 2D autocorrelation will be  
> >>good to explore, but you can use the "moving average" trick to avoid  
> >>lots of extra cycles while summing neighborhoods.  I've implemented  
> >>this in C before and gotten speedups of 1x to 50x, depending on the  
> >>neighborhood size.
> >
> >Aye, That's the trick. Run time from 10 mins to 16secs for the 30x100
> >case. Great fun!
> Wow, that's a big difference.

One thing that might speed it up even further would be to pre-cache the 
total of (0..x, 0..y) for each x and y.

Then for a particular tile (l..r, t..b):

0   l   r
|...|///|    |
----+---+----- t
|\\\|XXX|    |
|\\\|XXX|    |
----+---+----- b
|   |   |    |

Total of ... will be T(l,t)
Total of /// will be T(r,t) - T(l,r)
Total of \\\ will be T(l,b) - T(l,r)

So the total of XXX will be:
   T(r,b) - ( total of /// + total of \\\ + total of ... )
 = T(r,b) - ( T(r,t) - T(l,r) + T(l,b) - T(l,r) + T(l,r) )
 = T(r,b) - T(r,t) + T(l,t) - T(l,b)


More information about the mailing list