abuse at cabal.org.uk
Thu Jan 14 12:22:48 GMT 2010
On 13 Jan 2010, at 17:53, David Cantrell wrote:
> Other hashing algorithms exist and are faster but more prone to
> inadvertent collisions. If you've got a lot of data to compare, I'd
> use one of them (eg one of the variations on a CRC) and then only
> bring out the big SHA guns when that finds a collision.
That's a premature optimisation which just complicates the code, unless you really do mean *a lot* of data, as in the rdiff algorithm.
For de-duping purposes, SHA can still hash data faster than you can pull the files off the disk, so a cheaper secondary hash is unnecessary.
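
To illustrate the single-hash approach, here is a minimal sketch in Python (chosen only for illustration). The function names, the choice of SHA-256 and the 1 MiB chunk size are assumptions of mine, not anything prescribed in this thread. Each file is read once, in chunks, and files are grouped by digest with no CRC pre-pass:

```python
import hashlib
import os
from collections import defaultdict


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files never sit in memory whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under root by digest; return groups of 2 or more."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[sha256_of_file(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling find_duplicates("/some/dir") returns a list of lists of paths whose contents hash identically. If this ever turned out to be CPU-bound rather than disk-bound, that would be the point to measure before adding a cheaper first-pass hash.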