[Noisebridge-discuss] Data integrity, rdiff-backup, Reed-Solomon codes
John Magolske
listmail at b79.net
Mon Dec 14 07:39:56 UTC 2009
* Jason Dusek <jason.dusek at gmail.com> [091213 18:40]:
> 2009/12/12 John Magolske <listmail at b79.net>:
> > In general, what would be some recommended tools & strategies
> > to ensure ongoing data integrity?
>
> It is very important to have a few backups -- snapshots from
> times past. A corrupted cell is not the only way to get an
> rsync backup that is broken, after all. If you only have one
> and you don't check its integrity, you can easily find
> yourself in a situation where you back up, trash something
> important and then lose your drive -- leaving you with no good
> copies.
>
> Maybe `rdiff-backup` is just the thing?
>
> http://rdiff-backup.nongnu.org/
Oh yes, incremental backups make sense for lots of reasons...thanks
for the reminder, must bump this up on my todo list. I remember
trying to decide between dirvish and rdiff-backup a while back, but I
think rdiff-backup looks like the way to go. Though I've heard good
things about dirvish, I've read that its reliance on hard link trees
means "...Apart from not being a 1:1 backup (you lose hard links!),
the filesystem metadata storage explodes for any reasonable sized
filesystem and any reasonable frequency of backup."
http://lists.debian.org/debian-user/2009/07/msg02022.html
Also, archfs sounds pretty cool -- a FUSE virtual filesystem that
allows you to mount a backup created by rdiff-backup and browse each
increment as though it were a regular directory structure.
http://code.google.com/p/archfs/
http://packages.debian.org/sid/archfs
*
I found the following interesting...maybe incorporate this into a
backup routine? :
Shielding your files with Reed-Solomon codes
http://ttsiodras.googlepages.com/rsbep.html
Some commentary about the above on Slashdot:
http://hardware.slashdot.org/article.pl?sid=08/08/03/197254
From which I gleaned...
* Current hard drives employ some such error correction (but how much?
Are some drives better than others in this regard?):
http://hardware.slashdot.org/comments.pl?sid=634559&no_d2=1&cid=24459631
* PAR will protect against an occasional bit error, but the
above-mentioned R-S scheme will protect against bad sectors:
http://hardware.slashdot.org/comments.pl?sid=634559&no_d2=1&cid=24462959
* CD-ROMs by design employ some such error correction; evidently
dvdisaster can add additional levels of error correction:
http://hardware.slashdot.org/comments.pl?sid=634559&no_d2=1&cid=24462527
http://hardware.slashdot.org/comments.pl?sid=634559&no_d2=1&cid=24462733
http://dvdisaster.net/en/index.html
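
A toy illustration of why the difference between an occasional bit error
and a whole bad sector matters. Error-correcting codes such as
Reed-Solomon can only repair a limited number of damaged symbols per
codeword, so a run of consecutive bad bytes is much worse than scattered
ones unless the codewords are interleaved on disk. The sketch below is
not Reed-Solomon itself and is not taken from rsbep; the sizes (8
codewords of 16 symbols, an 8-symbol burst) and all function names are
made up for the demo. It only counts how many symbols of each codeword a
burst wipes out, with and without interleaving.

```python
def interleave(codewords):
    """Emit symbol 0 of every codeword, then symbol 1 of every codeword, ..."""
    length = len(codewords[0])
    return [cw[i] for i in range(length) for cw in codewords]


def deinterleave(stream, count):
    """Inverse of interleave(): rebuild `count` codewords from the stream."""
    return [stream[k::count] for k in range(count)]


def burst(stream, start, length):
    """Copy of `stream` with `length` consecutive symbols wiped (set to None)."""
    out = list(stream)
    for i in range(start, min(start + length, len(out))):
        out[i] = None
    return out


def worst_hit(codewords):
    """Largest number of wiped symbols found in any single codeword."""
    return max(cw.count(None) for cw in codewords)


# Hypothetical layout: 8 codewords of 16 symbols each.
COUNT, LENGTH = 8, 16
codewords = [[(k, i) for i in range(LENGTH)] for k in range(COUNT)]

# Case 1: codewords stored end to end, then an 8-symbol "bad sector".
plain = [sym for cw in codewords for sym in cw]
plain_damaged = burst(plain, 16, 8)
plain_cw = [plain_damaged[k * LENGTH:(k + 1) * LENGTH] for k in range(COUNT)]

# Case 2: same burst, but the codewords were interleaved before storage.
inter_damaged = burst(interleave(codewords), 16, 8)
inter_cw = deinterleave(inter_damaged, COUNT)

print(worst_hit(plain_cw))  # 8: one codeword takes the whole burst
print(worst_hit(inter_cw))  # 1: every codeword loses a single symbol
```

With interleaving, a code that can fix even one bad symbol per codeword
would survive this burst; without it, one codeword takes all eight hits.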
This brings to mind the idea of reassembling a collection of files
from, say, a series of backups on a bunch of aging CD-ROMs, each with
different errors, using BitTorrent to stitch the pieces back together.
Not sure if this was ever implemented or just imagined...can't find a
link ATM.
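
For what it's worth, the core of that stitching idea can be sketched in
a few lines. This is only a guess at how it might work, not a
description of any existing tool: it assumes a list of per-piece SHA-1
hashes was recorded when the backup was made and kept somewhere safe
(much like the piece hashes in a .torrent file), and that damage changes
bytes in place rather than shifting them. The names `piece_hashes`,
`stitch`, and `PIECE_SIZE` are made up for the example.

```python
import hashlib

PIECE_SIZE = 4  # absurdly small, just so the demo is easy to follow


def piece_hashes(data, piece_size=PIECE_SIZE):
    """SHA-1 of each fixed-size piece, computed while the data is still good."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def stitch(copies, hashes, piece_size=PIECE_SIZE):
    """For each piece, take it from the first copy whose hash checks out.

    Returns (data, missing): `data` is the rebuilt bytes, or None if some
    piece is bad in every copy; `missing` lists those piece numbers.
    """
    pieces, missing = [], []
    for n, want in enumerate(hashes):
        start = n * piece_size
        for copy in copies:
            candidate = copy[start:start + piece_size]
            if hashlib.sha1(candidate).digest() == want:
                pieces.append(candidate)
                break
        else:
            missing.append(n)
    return (b"".join(pieces) if not missing else None), missing


original = b"noisebridge backup data!"
hashes = piece_hashes(original)

# Two hypothetical aging discs, each damaged in a different place.
disc_a = bytearray(original)
disc_a[1] ^= 0xFF
disc_b = bytearray(original)
disc_b[10] ^= 0xFF

recovered, missing = stitch([bytes(disc_a), bytes(disc_b)], hashes)
print(recovered == original, missing)
```

If every copy is damaged in the same piece, `stitch` reports that piece
number in `missing` instead of guessing, which is where something like
the Reed-Solomon shielding above would have to take over.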
John
--
John Magolske
http://B79.net/contact