SHA question

Alexander Clouter alex at
Wed Jan 13 14:25:58 GMT 2010

Roger Burton West <roger at> wrote:
> You may want to be slightly cleverer about it - taking a SHAsum is
> computationally expensive, and it's only worth doing if the files have
> the same size.
> If you don't require a pure-Perl solution, bear in mind that all this
> has been done for you in the "fdupes" program, already in Debian or at
> .

The following gives the duplicated hashes (you might prefer '-D' instead
of '-d'):

md5sum /path/to/pdfs/* | sort | uniq -d -w32

Note the '-w32': uniq has to compare only the 32-character MD5 hash,
otherwise the differing filenames mean no two lines ever match.

Replace the '-d' with '-u' if you want to just see the unique ones.

I'll leave it as an exercise for the reader to pipe the output of '-D'
into some xargs action to 'rm' and 'ln -s' the duplicates.
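For the impatient, one way that exercise might look (a hedged sketch, not a tested solution: it assumes GNU md5sum, filenames without whitespace, and a made-up 'dedup' function name):

```shell
#!/bin/sh
# Sketch: replace each duplicate file with a symlink to the
# first-seen copy of the same content.
dedup() {
    md5sum "$1"/* | sort | awk '
        $1 == prev { print keep, $2 }      # duplicate: emit "keeper dupe"
        $1 != prev { prev = $1; keep = $2 }
    ' | while read -r keep dupe; do
        rm -- "$dupe"               # drop the duplicate...
        ln -s -- "$keep" "$dupe"    # ...and point it at the keeper
    done
}
```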


Alexander Clouter
.sigmonster says: For fast-acting relief, try slowing down.
