SHA question

Alexander Clouter alex at digriz.org.uk
Wed Jan 13 14:25:58 GMT 2010


Roger Burton West <roger at firedrake.org> wrote:
>
> You may want to be slightly cleverer about it - taking a SHAsum is
> computationally expensive, and it's only worth doing if the files have
> the same size.
>
> If you don't require a pure-Perl solution, bear in mind that all this
> has been done for you in the "fdupes" program, already in Debian or at
> http://netdial.caribe.net/~adrian2/programs/ .
>
*sigh*

The following gives the duplicated hashes; the '-w32' makes uniq compare
only the 32-character MD5 hash rather than the whole line, filename
included (you might prefer '-D' instead of '-d' to list every duplicate
rather than one per group):
----
md5sum /path/to/pdfs/* | sort | uniq -w32 -d
----

Replace the '-d' with '-u' if you want to see just the unique ones
instead.
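
On Roger's point about hashing being expensive: a minimal sketch of the
size-first approach, assuming GNU find, xargs and coreutils (the awk
filter passes through only files that share their size with at least one
other file; filenames containing tabs or newlines will break it):
----
# only hash files whose size matches another file's size
find /path/to/pdfs -type f -printf '%s\t%p\n' | sort -n |
awk -F'\t' '$1 == size { if (!printed) print prev; print $2; printed = 1 }
            $1 != size { size = $1; prev = $2; printed = 0 }' |
xargs -r -d '\n' md5sum | sort | uniq -w32 -d
----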

I'll leave it as an exercise for the reader to pipe the output of '-D'
into some xargs action to 'rm' and 'ln -s' the duplicates.
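
For the curious, one way that exercise might go; a sketch only, assuming
GNU uniq (for '--all-repeated=separate', which puts a blank line between
groups) and filenames without leading whitespace or embedded newlines:
----
md5sum /path/to/pdfs/* | sort | uniq -w32 --all-repeated=separate |
while read -r sum file; do
    if [ -z "$sum" ]; then      # blank line: end of a duplicate group
        keep=
    elif [ -z "$keep" ]; then   # first file in the group: keep it
        keep=$file
    else                        # later duplicates: symlink to the kept one
        rm -- "$file" && ln -s -- "$keep" "$file"
    fi
done
----
A plain xargs invocation can't remember which file of each group to
keep, hence the shell loop instead.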

Cheers

-- 
Alexander Clouter
.sigmonster says: For fast-acting relief, try slowing down.


