file grouping answer.
michael
michael at galton.ucl.ac.uk
Wed Mar 29 13:51:32 BST 2006
I was going to ask:-
I've got a load of files I want to efficently sort into groups of 4.2Gb(1)
for burning to DVD. How do I go about doing this?
I did some digging and found the answer and thought I'd save time by
posting the answer as well.
The hard part was finding out what the problem was called I guessed it was
going to be some varient on the 'Travelling salesman problem' (2) which
ment that any any home rolled solution would be to say the least sub
optimal....
I started out by Googling on algorithm group, algorithm sort and perl
algorithm but came up with little of use, finally I tried just algorithm
which took me to the wikipedia Algorithm page (3). I flicked through it
and finally noticed a reference to Approximation algorithms (4). The
Travelling salesman is NP-Complete (5), however Nicholas Clark YAPC::Braga
talk 'When Perl is not quite fast enough' made the point that even if a
problem is NP-complete there are faster approximation algorithms which
will get a nearly optimal solution (6). I was in luck it mentioned the
'bin packing problem' (7)
Search CPAN on bin packing got me almost straight to:-
Algorithm::BinPack and via BinPack to Algorithm::Bucketizer (which called
in it bin-packing and was hence missed in future I think I may just google
site:http://search.cpan.org)
(1) when a disk starts to fail IME it often starts from the outer edge,
not filling it provides a buffer zone
(2) <http://en.wikipedia.org/wiki/Travelling_salesman>
(3) <http://en.wikipedia.org/wiki/Algorithm>
(4) <http://en.wikipedia.org/wiki/Approximation_algorithms>
(5) ie the time to get an exact answer increases exponentaily with the
number of variables
(6) if your wavering about going to YAPC::Birmingham this is proof it will
save you time! <http://www.ccl4.org/~nick/P/Fast_Enough/>
(7) <http://en.wikipedia.org/wiki/Bin_packing_problem>
--
Michael
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Michael John Lush PhD Tel:44-20-7679-5027
Nomenclature Bioinformatician Fax:44-20-7387-3496
HUGO Gene Nomenclature Committee Email: nome at galton.ucl.ac.uk
The Galton Laboratory
University College London, UK
URL: http://www.gene.ucl.ac.uk/nomenclature/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
More information about the london.pm
mailing list