[Noisebridge-discuss] [RUG] Data integrity, SMART and SSD's

Joel Jaeggli joelja at bogus.com
Sun Dec 13 09:15:57 UTC 2009



John Magolske wrote:
> I'm looking for strategies around maintaining data integrity. The
> primary machine in question where data is gathered, manipulated &
> stored is a laptop with an SSD hard-drive (actually a CF card + CF to
> IDE adapter). This is rsync'd to backups on external USB hard-drives.
> 
> At the last RUG (Rsync Users Group) meeting there was mention of
> the usefulness of SMART (Self-Monitoring, Analysis and Reporting
> Technology) systems, which are built into many (all?) ATA, IDE and
> SCSI-3 hard drives.

SMART is useful for telling you your disk is failing, which is good to
know, but it's not an integrity mechanism.

> From SMARTCTL(8):
> 
>     The purpose of SMART is to monitor the reliability of the hard
>     drive and predict drive failures, and to carry out different
>     types of drive self-tests.
> 
> Issuing `smartctl -a` to my Transcend & SanDisk CF cards gets:
> 
>     SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87
>     don't show if SMART is enabled.
> 
>     SMART support is: Unavailable - Packet Interface Devices
>     [this device: Write-once (optical disk)] don't support ATA SMART
> 
> I also understand that SMART doesn't work with USB hard drives. 

It didn't for a long time, but in point of fact smartmontools on Linux
does now support a number of USB bridges.
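
For bridges that pass ATA commands through, smartctl's `-d sat` device
type is the usual way in. A minimal sketch, assuming the enclosure
speaks SAT and the disk shows up as /dev/sdb (both are assumptions about
your setup, and the helper names are made up for illustration):

```python
import subprocess


def smartctl_cmd(device, device_type="sat", flags=("-H",)):
    """Build a smartctl command line for a disk behind a USB bridge.

    device_type="sat" is an assumption: it only works if the bridge
    supports SCSI-to-ATA translation. "-H" asks for the overall
    health assessment; pass ("-a",) for the full report.
    """
    return ["smartctl", "-d", device_type, *flags, device]


def run_health(device, device_type="sat"):
    """Run smartctl and return (exit status, output text).

    Usually needs root. smartctl's exit status is a bitmask, so a
    nonzero value is worth reading alongside the output rather than
    treating it as a plain failure.
    """
    result = subprocess.run(
        smartctl_cmd(device, device_type),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

If the bridge doesn't do pass-through you'll get an error back instead
of a health report, which at least tells you where you stand.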

> So it
> appears that nowhere would the benefits of SMART be present in this
> particular situation. Which leaves me wondering...
> 
> How helpful is SMART in maintaining data integrity?
> 
> Is there a way to gain the benefits of SMART in the above scenario?
> I'm thinking maybe by doing the rsync backups to a hard-drive built
> into another computer... Or maybe there's an external hard-drive with
> a little computer built in that allows SMART to function?
> 
> I'm a bit concerned about keeping the primary repository of a
> collection of data on a CF card. If some cells fail on read (is that
> possible? 

yeah it is

> or do they only fail on write?

nope

> and that is rsync'd to an
> external hard-drive target, what are the chances those errors will
> be propagated without being detected?

If you're collecting syslog, you're going to be informed of an
unrecoverable read if a block fails to be read or fails its parity check.

> In general, what would be some
> recommended tools & strategies to ensure ongoing data integrity?

Generate hashes of the files. The hashes will be much smaller than the
files, and at some future date you can verify against the hash that a
file hasn't been altered.
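
A minimal sketch of that idea, assuming SHA-256 is an acceptable hash
and that the manifest is kept somewhere other than the disk being
checked (the function names here are made up for illustration):

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def verify(root, manifest):
    """Return relative paths that are missing or whose digest changed."""
    bad = []
    for rel, digest in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_sha256(full) != digest:
            bad.append(rel)
    return bad
```

Build the manifest from the laptop copy, then run verify() against the
rsync'd copy on the external drive using the same manifest. Anything it
returns either changed in transit or has rotted since. From the shell,
sha256sum and `sha256sum -c` do the same job.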

> TIA for any suggestions,
> 
> John
> 
> 


