A difficult filesystem

Eric Wilhelm scratchcomputing at gmail.com
Wed Jun 20 23:14:50 BST 2007


# from Andy Armstrong
# on Wednesday 20 June 2007 05:30 am:

>Any of the normal hard-link-preserving copying methods run out of  
>memory pretty early - for obvious reasons.

I assume that includes rsync?

If you're transferring the whole partition, any chance of getting lvm to 
adopt the filesystem and migrate it across pv's?  (That's iffy, so try 
it on a disposable setup first.)
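Something like this for the migrate-across-PVs half (untested sketch; 
device and VG names are made up, and it assumes the data is already 
sitting on an LV in vg0 -- getting it adopted in the first place is 
the iffy bit):

  pvcreate /dev/sdb1        # new disk becomes a physical volume
  vgextend vg0 /dev/sdb1    # add it to the volume group
  pvmove /dev/sda1          # migrate extents off the old PV
  vgreduce vg0 /dev/sda1    # then drop the old PV from the group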

Maybe more feasible would be to create a mirror with mdadm and then just 
unplug the original once /proc/mdstat gives you the all-clear.
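Roughly (totally untested, device names made up; --build makes a 
superblock-less "legacy" array so it shouldn't clobber the existing 
filesystem, but I'm not certain the initial resync direction is what 
you want, so try it on scratch disks first):

  umount /dev/sda1            # safer to do the sync offline
  mdadm --build /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  cat /proc/mdstat            # wait for the resync to finish
  mdadm --stop /dev/md0       # then /dev/sdb1 should be a standalone copy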

>So I copied the whole filesystem (dd if=x of=y) onto the new device 

Is that `dd if=/dev/sda1 of=/dev/sdb1`? (pointing at the device or 
partition?)  Can you use an lv as a dd target (thus allowing lvextend)?
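An LV should work fine as a raw dd target as far as I know.  
Hypothetical sketch (sizes and names made up; assumes ext2/3):

  lvcreate -L 200G -n backup vg0              # LV at least as big as the source
  dd if=/dev/sda1 of=/dev/vg0/backup bs=1M    # raw copy of the partition
  lvextend -L +100G /dev/vg0/backup           # later: grow the LV...
  e2fsck -f /dev/vg0/backup
  resize2fs /dev/vg0/backup                   # ...and the filesystem with it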

>While I'm waiting does anyone have tips for other tools that might be
>   useful?

Assuming you're talking about a dirvish-like setup, it might be fastest 
to simply reconstruct the process that originally created the trees 
(but applying it to the repository->repository transfer rather than the 
typical nightly host->repository usage).

I'm guessing $local_backup_nightly_job_time * $number_of_nightlies is 
less than two weeks.  You know a few rules about your data that 
rsync/cp/tar don't.  E.g. "links only within a tree/vault" and "links 
only to previous date".

That is, you have something like:

  bank/target/2005-01-01/tree
  bank/target/2005-01-02/tree
  ...
  bank/target/2007-06-19/tree
  bank/target/2007-06-20/tree

For each target, transfer the original/oldest (2005-01-01) tree to 
dest/target first.  Then, for each subsequent date, `cp -al` the 
previous date's copy within dest/target to the new date directory and 
rsync that date from bank/target/ over it -- rsync leaves unchanged 
files (and their hard links) alone and only rewrites what differs.
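Untested sketch of that loop (the script name and the /bank and /dest 
paths are placeholders):

  #!/bin/sh
  # migrate-target.sh -- replay one target's date trees into the new
  # repository (hypothetical name; layout as above)
  set -e
  bank=/bank
  dest=/dest
  target=$1

  mkdir -p "$dest/$target"
  prev=''
  for date in $(ls "$bank/$target" | sort); do
    if [ -z "$prev" ]; then
      # oldest tree: plain copy
      rsync -aH "$bank/$target/$date/" "$dest/$target/$date/"
    else
      # seed from the previous date with hard links, then let rsync
      # rewrite (and un-link) only the files that changed that night
      cp -al "$dest/$target/$prev" "$dest/$target/$date"
      rsync -aH --delete "$bank/$target/$date/" "$dest/$target/$date/"
    fi
    prev=$date
  done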

You can also run one fork per target, which should scale per cpu if your 
disk hardware is bored and the fs implementation isn't bottlenecking.
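E.g. (assuming the migrate-target.sh sketch above):

  for t in $(ls /bank); do
    ./migrate-target.sh "$t" &    # one fork per target
  done
  wait                            # block until they all finish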

If it works, send me your code :-D

--Eric
-- 
As an old bass player friend of mine used to say: throw money, don't 
clap.
--Tony Parisi
---------------------------------------------------
    http://scratchcomputing.com
---------------------------------------------------


