On Sun, Mar 15, 2015 at 3:50 AM, Felix Miata wrote:
> NAICT, smartctl thinks this HD is OK, but is it really?

As long as the drive reallocates on writes, and none of the pre-fail attribute values are at threshold, the drive will report itself as healthy. There are now quite a few published papers showing that the drive's self-assessment isn't helpful a significant minority of the time. Of all the problems SMART reports, an increasing number of bad sectors is one that does correlate with impending failure. The fact that you now have a corrupt file system is consistent with that.

Two options:

a. Dispose of the drive. If you do that, consider using hdparm to issue the drive's built-in ATA Security Erase command. This is the only way to erase data on sectors that no longer have an LBA mapping. It is also a ton faster than writing zeros with dd.
   http://mackonsti.wordpress.com/2011/11/22/ssd-secure-erase-ata-command

b. Use badblocks -svw. This is destructive, and does ~4 passes, each one a write followed by a read. I'd let it do at least two write/read passes before canceling it, or just let it complete.

If the drive is actually working normally, no errors will be recorded in dmesg or by badblocks. The drive will remap those bad sectors internally at write time. Any failures are essentially fatal:

- A write failure means there are no more reserve sectors for reallocation.
- A read failure means the drive firmware mistook a persistent write failure for a transient one.
- A corruption count means some kind of silent data corruption, like a torn write.

So if it comes up clean, honestly I still wouldn't trust it; I'd relegate it to Btrfs use only (which would have self-healed in this instance so long as the default DUP metadata was used). If it has errors, then obliterate it with hdparm and retire it.

--
Chris Murphy
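
For what it's worth, the reading of badblocks results described above can be sketched in a few lines of Python. This is only a sketch under assumptions: it assumes the summary line looks like "Pass completed, N bad blocks found. (R/W/C errors)" with the counts ordered read/write/corruption, and the function name and finding strings are made up for illustration. Check the output of your own badblocks version before relying on it.

```python
import re

# Assumed shape of the badblocks summary line; the three counts are
# taken to be read/write/corruption errors, in that order.
SUMMARY_RE = re.compile(
    r"Pass completed, (\d+) bad blocks found\. \((\d+)/(\d+)/(\d+) errors\)"
)


def interpret_badblocks(summary):
    """Return a list of findings for one summary line; empty means clean.

    Hypothetical helper: the wording of each finding follows the
    interpretation given in this message, not any official tool output.
    """
    match = SUMMARY_RE.search(summary)
    if match is None:
        raise ValueError("unrecognized badblocks summary line")
    _bad_blocks, read_err, write_err, corrupt = (int(g) for g in match.groups())

    findings = []
    if read_err:
        findings.append(
            "read errors: firmware took a persistent write failure for a transient one"
        )
    if write_err:
        findings.append(
            "write errors: no reserve sectors left for reallocation"
        )
    if corrupt:
        findings.append(
            "corruption: silent data corruption, e.g. a torn write"
        )
    return findings


if __name__ == "__main__":
    line = "Pass completed, 0 bad blocks found. (0/0/0 errors)"
    result = interpret_badblocks(line)
    print("clean" if not result else "\n".join(result))
```

An empty list only says that badblocks saw nothing; it does not make the drive trustworthy, and dmesg is worth checking separately.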