[opensuse] WD Green AV HD: rsync read errors mapping
Running OS 13.1:

2017/08/27 03:38:05 [2308] building file list
2017/08/27 03:38:08 [2308] .d..t...... movie/
2017/08/27 03:40:09 [2308] >f+++++++++ movie/bigBang0113-201312151700GDMX10wd.ts
2017/08/27 03:40:09 [2308] rsync: read errors mapping "/disks/esata/movie/bigBang0113-201312151700GDMX10wd.ts": Input/output error (5)
2017/08/27 03:42:30 [2308] >f+++++++++ movie/bigBang0220-201310061730GDMX10wd.ts
2017/08/27 03:42:30 [2308] rsync: read errors mapping "/disks/esata/movie/bigBang0220-201310061730GDMX10wd.ts": Input/output error (5)
...
2017/08/27 06:15:56 [2308] >f+++++++++ movie/CLO/closer0406-201306270230GDMX09.ts
2017/08/27 06:15:56 [2308] rsync: read errors mapping "/disks/esata/movie/CLO/closer0406-201306270230GDMX09.ts": Input/output error (5)
...
2017/08/27 06:24:14 [2308] >f+++++++++ movie/CLO/closer0411-201307180130GDMX09.ts
2017/08/27 06:24:14 [2308] rsync: read errors mapping "/disks/esata/movie/CLO/closer0411-201307180130GDMX09.ts": Input/output error (5)
...

Above is from the rsync log. Before interrupting the rsync and restarting it I logged output from hdparm and smartctl: http://fm.no-ip.com/Tmp/Hardware/Disk/hdpd-201207-fi965azbme-wd20eurs-os131....

Similar logs from the first WD Green AV disk, the first WD I ever bought: http://fm.no-ip.com/Tmp/Hardware/Disk/smart-wdAVGP3200AVVS.txt

The smaller 320GB device I consider functional but unreliable; it was removed from SD DVR service shortly after the warranty expired and the DVR began producing inexplicable errors. The larger (the disk of current interest) has been in a satellite receiver, a Dreambox FOSSware variant of MIPS Linux running a 3.92 kernel, which originally created the files that errored. I was able to delete the 4 uncopyable files, then restore them from backup.
smartctl -x reports from my few other WD disks all seem to have identical values:

Reallocated_Sector_Ct    200 200 140
Reallocated_Event_Count  200 200 0
Current_Pending_Sector   200 200 0
Offline_Uncorrectable    100 253 0

except for 200 173 0 for Current_Pending_Sector on the bad 320.

Can anyone suggest what most likely caused the 4 files to be repeatedly uncopyable? A driver error, or some other error in the Dreambox software, maybe?

Before the 320 AV and the 2TB AV I had never purchased a WD disk, due to miserable experiences with used WD HDs that had come into my possession over the years. I was a Quantum user for as long as Quantum lasted, eventually switching to Seagate after nearly as bad experiences with Toshiba and Maxtor.

-- "The wise are known for their understanding, and pleasant words are persuasive." Proverbs 16:21 (New Living Translation) Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/

-- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On 2017-08-28 03:54, Felix Miata wrote:
Running OS 13.1:
2017/08/27 03:38:05 [2308] building file list
2017/08/27 03:38:08 [2308] .d..t...... movie/
2017/08/27 03:40:09 [2308] >f+++++++++ movie/bigBang0113-201312151700GDMX10wd.ts
2017/08/27 03:40:09 [2308] rsync: read errors mapping "/disks/esata/movie/bigBang0113-201312151700GDMX10wd.ts": Input/output error (5)
2017/08/27 03:42:30 [2308] >f+++++++++ movie/bigBang0220-201310061730GDMX10wd.ts
2017/08/27 03:42:30 [2308] rsync: read errors mapping "/disks/esata/movie/bigBang0220-201310061730GDMX10wd.ts": Input/output error (5)
...
2017/08/27 06:15:56 [2308] >f+++++++++ movie/CLO/closer0406-201306270230GDMX09.ts
2017/08/27 06:15:56 [2308] rsync: read errors mapping "/disks/esata/movie/CLO/closer0406-201306270230GDMX09.ts": Input/output error (5)
...
2017/08/27 06:24:14 [2308] >f+++++++++ movie/CLO/closer0411-201307180130GDMX09.ts
2017/08/27 06:24:14 [2308] rsync: read errors mapping "/disks/esata/movie/CLO/closer0411-201307180130GDMX09.ts": Input/output error (5)
...
Above is from rsync log. Before interrupting the rsync and restarting it I logged output from hdparm and smartctl: http://fm.no-ip.com/Tmp/Hardware/Disk/hdpd-201207-fi965azbme-wd20eurs-os131....
Serial Number: WD-WCAZAE933621
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector  -O--CK 200 200 000 - 6
SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged. [To run self-tests, use: smartctl -t]
Similar logs from the first WD Green AV disk, first WD I ever bought: http://fm.no-ip.com/Tmp/Hardware/Disk/smart-wdAVGP3200AVVS.txt
Serial Number: WD-WCAV18791785
User Capacity: 320,072,933,376 bytes [320 GB]
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector  -O--CK 195 173 000 - 306
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description    Status                  Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline    Completed: read failure 90%       27071           252915937
# 2 Extended offline    Completed: read failure 90%       27070           252915955
# 3 Conveyance offline  Completed without error 00%       27051           -
# 4 Extended offline    Completed: read failure 90%       27046           460886821
# 5 Short offline       Completed without error 00%       3               -
Can anyone suggest what most likely caused the 4 files to be repeatedly uncopyable? Driver error or other error in the Dreambox software maybe?
That you have media errors. On both disks.

You can try to recover with dd*rescue. Then run "badblocks" to try to find the bad sectors, then overwrite them or, better, overwrite the entire disk (both). Finally, restore from backup.

-- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
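That sequence can be sketched as a script. This is only a dry-run sketch under assumptions: the device name /dev/sdX and image path /backup/rescue.img are placeholders, and "dd*rescue" is taken to mean GNU ddrescue (dd_rescue has different options). The plan is printed rather than executed, because every step is destructive or very long-running:

```shell
#!/bin/sh
# Dry-run sketch of the suggested recovery sequence: image what is still
# readable, locate bad sectors, overwrite the disk, restore from backup.
# DEV and IMG are placeholders -- substitute the real failing device and
# a path on a *different*, healthy disk before running anything for real.
DEV=/dev/sdX
IMG=/backup/rescue.img

plan="ddrescue -d $DEV $IMG $IMG.map   # salvage readable data; map file records bad areas
badblocks -sv $DEV                     # locate bad sectors (read-only by default)
dd if=/dev/zero of=$DEV bs=1M          # overwrite everything so the firmware can remap
# ...then restore the data from backup"

# Only print the plan; nothing here touches the disk.
printf '%s\n' "$plan"
```

Note badblocks without -n or -w only reads; the full-disk dd is what gives the firmware a chance to remap the pending sectors.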
On Mon, Aug 28, 2017 at 9:19 AM, Carlos E. R. <robin.listas@telefonica.net> wrote:
On 2017-08-28 03:54, Felix Miata wrote:
Running OS 13.1:
2017/08/27 03:38:05 [2308] building file list
2017/08/27 03:38:08 [2308] .d..t...... movie/
2017/08/27 03:40:09 [2308] >f+++++++++ movie/bigBang0113-201312151700GDMX10wd.ts
2017/08/27 03:40:09 [2308] rsync: read errors mapping "/disks/esata/movie/bigBang0113-201312151700GDMX10wd.ts": Input/output error (5)
2017/08/27 03:42:30 [2308] >f+++++++++ movie/bigBang0220-201310061730GDMX10wd.ts
2017/08/27 03:42:30 [2308] rsync: read errors mapping "/disks/esata/movie/bigBang0220-201310061730GDMX10wd.ts": Input/output error (5)
...
2017/08/27 06:15:56 [2308] >f+++++++++ movie/CLO/closer0406-201306270230GDMX09.ts
2017/08/27 06:15:56 [2308] rsync: read errors mapping "/disks/esata/movie/CLO/closer0406-201306270230GDMX09.ts": Input/output error (5)
...
2017/08/27 06:24:14 [2308] >f+++++++++ movie/CLO/closer0411-201307180130GDMX09.ts
2017/08/27 06:24:14 [2308] rsync: read errors mapping "/disks/esata/movie/CLO/closer0411-201307180130GDMX09.ts": Input/output error (5)
...
Above is from rsync log. Before interrupting the rsync and restarting it I logged output from hdparm and smartctl: http://fm.no-ip.com/Tmp/Hardware/Disk/hdpd-201207-fi965azbme-wd20eurs-os131....
Serial Number: WD-WCAZAE933621 User Capacity: 2,000,398,934,016 bytes [2.00 TB]
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector  -O--CK 200 200 000 - 6
SMART Extended Self-test Log Version: 1 (1 sectors) No self-tests have been logged. [To run self-tests, use: smartctl -t]
Similar logs from the first WD Green AV disk, first WD I ever bought: http://fm.no-ip.com/Tmp/Hardware/Disk/smart-wdAVGP3200AVVS.txt
Serial Number: WD-WCAV18791785 User Capacity: 320,072,933,376 bytes [320 GB]
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector  -O--CK 195 173 000 - 306
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description    Status                  Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline    Completed: read failure 90%       27071           252915937
# 2 Extended offline    Completed: read failure 90%       27070           252915955
# 3 Conveyance offline  Completed without error 00%       27051           -
# 4 Extended offline    Completed: read failure 90%       27046           460886821
# 5 Short offline       Completed without error 00%       3               -
Can anyone suggest what most likely caused the 4 files to be repeatedly uncopyable? Driver error or other error in the Dreambox software maybe?
That you have media errors. On both disks.
You can try to recover with dd*rescue. Then run "badblocks" to try to find the bad sectors, then overwrite them or, better, overwrite the entire disk (both). Finally, restore from backup.
I have never seen a positive comment about the WD Green drives. I refuse to buy them. (I bought a few 10 years ago. Bad results at the time.)

Greg
On 28/08/17 14:36, Greg Freemyer wrote:
That you have media errors. On both disks.
You can try to recover with dd*rescue. Then run "badblocks" to try to find the bad sectors, then overwrite them or, better, overwrite the entire disk (both). Finally, restore from backup.
I'd be inclined to just "dd if=/dev/zero of=/dev/sdX" over them. Provided there isn't a genuine fault in the drives, this will wipe them and fix any bad sectors. Bear in mind, trying to second-guess what the manufacturers have put in their firmware by way of error correction etc. is an exercise in futility...
I have never seen a positive comment about the WD Green drives.
I refuse to buy them (I bought a few 10 years ago. Bad results at the time.)
Seems to me, manufacturers are optimising the hell out of firmware for specific use cases, and forgetting that some of us want cheap general-purpose drives.

As far as I can make out, "green" drives are optimised for *backup*, or for laptops with oodles of RAM that are configured to cache disk up to the gunwales and delay disk access as much as possible. They aggressively go to sleep, and don't like being woken up. Put them in a Linux desktop and they'll go through their warranty (maximum acceptable sleep/wake cycles) in a matter of months.

I want cheap drives I can put in a desktop raid - 5400 rpm is fine, SCT/ERC is a must. Can I find the two in the same drive? NO! Who was the idiot that decided 3 minutes or more is an acceptable time for a desktop drive to hang when searching for data?! But any drive over 1TB certified for desktop use no longer supports SCT/ERC (the ERC standing for Error Recovery Control, i.e. allowing the user to tell the drive what to do if it can't read the data!). Properly certified raid drives are massive overkill for a desktop :-(

I think about the only *real* reason for all these drive types is shingled drives, where re-writing data comes at a real cost. Apart from that, I think drives should have a "universal" firmware that can be configured to the user's requirements, and maybe a server and a desktop variant built to different tolerances - not oodles and oodles of different drives optimised for use cases that don't match user requirements.

(Oh, and yes, I've heard very little by way of good stories about green drives, mostly because they are optimised for a usage pattern that does not match typical Linux use.)

Cheers, Wol
On 2017-08-30 20:44, Wols Lists wrote:
On 28/08/17 14:36, Greg Freemyer wrote:
That you have media errors. On both disks.
You can try to recover with dd*rescue. Then run "badblocks" to try to find the bad sectors, then overwrite them or, better, overwrite the entire disk (both). Finally, restore from backup.
I'd be inclined to just "dd if=/dev/zero of=/dev/sdX" over them. Provided there isn't a genuine fault in the drives, this will wipe them and fix any bad sectors.
Yes. However, I have noticed more than once that using badblocks to discover where those bad sectors are makes them disappear. It apparently caused the disk to remap them. I'm not sure what happened, but I'm certain that they disappeared from the smartctl output after a long test.

I also have a policy of not throwing away my disks the first time they have bad sectors. Allow remapping, then keep them under observation. If there are no more bad sectors (after a long test, perhaps a full dd write-over), I feel safe. If bad sectors keep appearing, then replace fast. I have used disks this way for several years after the first error without a new one. But then, I use Seagates.

-- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
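One way to keep a disk "under observation" like that is to watch the raw Current_Pending_Sector count drop back to zero after a rewrite. A minimal sketch, assuming smartctl's usual `-A` table layout (attribute name in the second column, raw value in the last); the function name `pending` is mine, and the sample line is the one from the 2TB WD log quoted earlier in the thread:

```shell
#!/bin/sh
# Pull the raw Current_Pending_Sector count out of `smartctl -A` output.
# Reads stdin, so it works on live output or on a saved log:
#   smartctl -A /dev/sdX | pending
pending() {
    awk '$2 == "Current_Pending_Sector" { print $NF }'
}

# Demonstration against a saved smartctl line (a real run would pipe in
# `smartctl -A /dev/sdX` instead):
pending <<'EOF'
197 Current_Pending_Sector  -O--CK   200   200   000    -    6
EOF
```

Run it before and after the overwrite; if the count goes 6 -> 0 and stays there, the sectors were remapped or rewritten successfully.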
On 30/08/17 20:06, Carlos E. R. wrote:
On 2017-08-30 20:44, Wols Lists wrote:
On 28/08/17 14:36, Greg Freemyer wrote:
That you have media errors. On both disks.
You can try to recover with dd*rescue. Then run "badblocks" to try to find the bad sectors, then overwrite them or, better, overwrite the entire disk (both). Finally, restore from backup.
I'd be inclined to just "dd if=/dev/zero of=/dev/sdX" over them. Provided there isn't a genuine fault in the drives, this will wipe them and fix any bad sectors.
Yes.
However, I have noticed more than once that using badblocks to discover where those bad sectors are makes them disappear. It apparently caused the disk to remap them.
I'm not sure of what happened, but I'm certain that they disappeared from the smartctl output with long test.
Note that magnetism decays. If the drive has trouble reading something in a test, it should transparently rewrite it behind the scenes, and if that is the problem then it's fixed. No remapping or anything. As I've said before, think of DRAM needing refreshing. And now drives are packing so much into such a small space, if the sector next to yours gets rewritten, some of the write can leak and damage your data. If that happens ten or twenty times your data is now unreadable ... (okay, it's a lot more reliable than that, but that's roughly what's going on :-)
I also have the theory of not throwing away my disks the first time they have bad sectors. Allow remapping, then keep them under observation. If there are no more bad sectors (after long test, perhaps full dd writeover), I feel safe. If bad sectors keep appearing, then replace fast.
I have used disks this way several years after the first error without a new one.
But then, I use Seagates.
So do I :-) Just don't get the drive I've got - a 3TB Barracuda. Actually, I would be very surprised if Seagate haven't fixed it by now, but they had a design fault that primed them for early failure.

Dunno if you've seen the report, but some virtual hosting provider bought a load of drives in the aftermath of the Thailand flood, and just happened to buy a load of these particular drives. The report - with plenty of hard data! - absolutely slates them for reliability. Just the 3TB version ...

Cheers, Wol
On 31/08/17 07:51 AM, Wols Lists wrote:
Note that magnetism decays. If the drive has trouble reading something in a test, it should transparently rewrite it behind the scenes, and if that is the problem then it's fixed. No remapping or anything.
As I've said before, think of DRAM needing refreshing. And now drives are packing so much into such a small space, if the sector next to yours gets rewritten, some of the write can leak and damage your data. If that happens ten or twenty times your data is now unreadable ... (okay, it's a lot more reliable than that, but that's roughly what's going on :-)
LOL!

Roll on the days when we have sub-atomic data storage, using the spin of each electron in orbit around each atom in a (probably) diamond (or other carbon crystalline) lattice. That's nice if you subscribe to the planetary model of atoms, as opposed to the probability density cloud model that is used to explain semiconductors. Hmm, it is also a model that makes valency theory (and hence all of chemistry) a lot easier to explain and put on a mathematical basis, as well as the operation of electrolytes. When I got to university and learnt this aspect of semiconductor theory, a lot of the stuff that made no sense in high school chemistry suddenly made sense. Not that it was any use to me from the POV of the next level of exams or my career, mind you.

You can bet these spin-polarized electrons are going to need a lot of refreshing! Why do I think of the stage jugglers that balanced lots of spinning plates on poles: http://livingflow.ca/wp-content/uploads/2016/01/spinning-plates-forest.jpg

-- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On 31/08/17 15:27, Anton Aylward wrote:
On 31/08/17 07:51 AM, Wols Lists wrote:
Note that magnetism decays. If the drive has trouble reading something in a test, it should transparently rewrite it behind the scenes, and if that is the problem then it's fixed. No remapping or anything.
As I've said before, think of DRAM needing refreshing. And now drives are packing so much into such a small space, if the sector next to yours gets rewritten, some of the write can leak and damage your data. If that happens ten or twenty times your data is now unreadable ... (okay, it's a lot more reliable than that, but that's roughly what's going on :-)
LOL!
Roll on the days when we have sub-atomic data storage, using the spin of each electron in orbit around each atom in a (probably) diamond (or other carbon crystalline) lattice.
Does that mean you've invented the perfect laser-guided tightly-focused magnetic beam that writes *exactly* where you want on the disk? And magnetic media that only changes state when you want it to, and not when subjected to stray magnetic fields?

Or am I simply describing that phenomenon where writes get smudged and, on the basis of "if you can't beat them, join them", shingled disks deliberately do exactly that. If you write to track 10 on a shingled disk, it will overwrite most of track 9, obliterate track 11, and probably take out track 12 to boot. By design.

Okay, normal drives don't (want to) do that, but writing to track 10 will send stray magnetic field over 9 and 11, weakening the data. The way to deal with that is to increase the gap between tracks, except that manufacturers want to decrease the gap to increase the amount of data they can store. The result is that data stored on magnetic disk *will* slowly fade, exacerbated by writes to neighbouring tracks. And yes, writes are probably good for several years, but if you have lots of old files on your disk that you haven't touched for years, there is a reasonable chance they will fail to read.

Cheers, Wol
On 09/01/2017 08:55 AM, Wols Lists wrote:
Or am I simply describing that phenomenon, where writes get smudged and, on the basis of "if you can't beat them join them" shingled disks deliberately do exactly that.
This reminds me of the old daze when I used SMD rack-mounted disks, maybe in the late 1980's. The process of reading/writing sectors would, over time, cause them to wander around their tracks. After a period of time, the sectors would start to overlap themselves, leading to all kinds of chaos. The only solution was to do a low-level format and restore the data from 9-track tapes.

As long as I'm reminiscing, I also remember open-platter HP disks in the 1970's that required a manual cleaning of the heads with alcohol once every thirty days.

How far we've come! I just fired up a RAID 6 array of 24 10TB Seagate disks. They're working perfectly so far, and they're not shingled! I'm getting about 1.9 GB/sec of continuous write bandwidth.

Regards, Lew
On 01/09/17 11:55 AM, Wols Lists wrote:
Does that mean you've invented the perfect laser-guided tightly-focused magnetic beam that writes *exactly* where you want on the disk?
Why do you assume it is *me* that does the inventing of such things? Would you be so sceptical if you heard of IBM doing it, given that they have already announced the manipulation of single atoms? Go google?

And why this back reference to a _disk_ rather than a crystal or lattice array?

-- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On 01/09/17 22:57, Anton Aylward wrote:
On 01/09/17 11:55 AM, Wols Lists wrote:
Does that mean you've invented the perfect laser-guided tightly-focused magnetic beam that writes *exactly* where you want on the disk?
Why do you assume it is *me* that does the inventing of such things? Would you be so sceptical if you heard of IBM doing it, given that they have already announced the manipulation of single atoms? Go google?
Well, you did respond to my comment with "LOL", which to me implies disbelief.
And why this back reference to a _disk_ rather than a crystal or lattice array?
So I was just explaining in rather more detail. Yes, it would be great if we could have your crystal or lattice memory, but I suspect it will suffer greatly from quantum degradation :-(

As some great physicist once said (Niels Bohr?) - "Man proposes, nature opposes". Computers are fast being miniaturized to the point at which classical physics ceases to work ...

Cheers, Wol
On 01/09/17 06:53 PM, Anthony Youngman wrote:
Why do you assume it is *me* that does the inventing of such things? Would you be so sceptical if you heard of IBM doing it, given that they have already announced the manipulation of single atoms? Go google?
Well, you did respond to my comment with "LOL", which to me implies disbelief.
And why this back reference to a _disk_ rather than a crystal or lattice array?
So I was just explaining in rather more detail. Yes it would be great if we can have your crystal or lattice memory, but I suspect it will suffer greatly from quantum degradation :-(
I really think you missed my point. The "LOL" was irony.

We already have atom-level recording in the laboratory: http://gizmodo.com/record-setting-hard-drive-writes-information-one-atom-a-1... "Record-Setting Hard Drive Writes Information One Atom At a Time"

And this goes back to 2012: https://arxiv.org/abs/1202.1131 Which is important because the first article needs liquid nitrogen temperatures.

Let me pile irony on irony. We'll use SATA and SCSI interfaces when the commercial versions are available :-)

My original irony was 'what happens next'. Yes, you are right about degradation, but so what? I'm old enough to remember using computers where the 'core' memory didn't degrade when you turned the computer off. No need for 'suspend to disk'; you just turned the power back on and it began running again from where it left off. Then we got 'semiconductor' memory that lost its contents when you turned the power off, and people like you said how useless that was. Next along came memory that forgot after a few seconds anyway, so there had to be extra hardware to refresh each 'row' periodically. I had a Z-80 based board with memory like that. Next along we had memory that flipped bits because of the trace radioactives in the packaging, or was it cosmic rays? Or both? Or maybe it was quantum effects, who knows. I think we dealt with that by including extra checksum bits and using Hamming coding, all built in to the chip so that we never bothered with it at a higher level. Or something.

Engineers are pretty good at solving problems. And at creating them. It goes hand in hand - yin/yang. And heck, if we have quantum effect computers, why not quantum effect memory? Isn't that how the human brain works?

-- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
Then we got 'semiconductor' memory that lost its contents when you turned the power off, and people like you said how useless that was. Next along came memory that forgot after a few seconds anyway, so there had to be extra hardware to refresh each 'row' periodically. I had a Z-80 based board with memory like that.
Since this is already way offtopic, have you seen the trick of freezing the RAM to keep it from degrading: https://www.youtube.com/watch?v=JDaicPIgn9U

A great tool for sophisticated thieves! And the tools are cheap, unfortunately.

Greg
Greg Freemyer composed on 2017-09-01 20:13 (UTC-0400):
this is already way offtopic,
Seriously. Back to topic:

Badblocks on the WD Green AV HD has reached 13.60% completion with (0/0/0 errors) in 15:59:30. :-(

-- "The wise are known for their understanding, and pleasant words are persuasive." Proverbs 16:21 (New Living Translation) Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/
On 2017-09-02 03:03, Felix Miata wrote:
Greg Freemyer composed on 2017-09-01 20:13 (UTC-0400):
this is already way offtopic,
Seriously.
Back to topic:
Badblocks on the WD Green AV HD has reached 13.60% completion with (0/0/0 errors) in 15:59:30. :-(
Yes, it is very slow to run. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 02/09/17 06:01 AM, Carlos E. R. wrote:
On 2017-09-02 03:03, Felix Miata wrote:
Greg Freemyer composed on 2017-09-01 20:13 (UTC-0400):
this is already way offtopic,
Seriously.
Back to topic:
Badblocks on the WD Green AV HD has reached 13.60% completion with (0/0/0 errors) in 15:59:30. :-(
Yes, it is very slow to run.
Less than 1%/hour, apparently. That's a 2T drive, isn't it? ${DEITY} help us when we have to check out the half dozen or so 4T drives for a RAID array!

I'm back to thinking that a RAID array of smaller drives, say 350G - HIGHLY redundant, HIGHLY striped, full 64-bit wide + Hamming dual error detection, (72,64) - might make more sense. What? Failure rates? Oh, right, yes.

Now THIS is interesting:
https://www.backblaze.com/blog/hard-drive-failure-rates-q2-2016/
https://www.backblaze.com/blog/hard-drive-failure-stats-q2-2017/

There seem to be some reliable 4T drives out there and the trend with 8T is looking good. Still, a RAID6 with 8T drives is a bit large for a home setting :-)

https://www.comparitech.com/blog/cloud-online-backup/how-long-do-hard-drives...
http://it.toolbox.com/blogs/marketing-strategies/is-data-backup-really-that-...

I have a colleague/developer whose home system is set up to run continuous backup of his /home-equivalent (he's a Windows user and I'm not sure what the terminology is) to a USB portable drive-of-notable-size for each of his machines. He told me once that the 'backup' drives were less reliable than his main drive! I found that ironic. He couldn't understand why I was laughing.

https://www.xilinx.com/support/documentation/application_notes/xapp645.pdf
https://www.google.com/patents/US3656109
https://github.com/sultanqasim/hamming_7264_secded

-- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
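For what it's worth, the "less than 1%/hour" figure follows directly from Felix's first progress report (13.60% after 15:59:30, i.e. 57,570 seconds); a bit of awk arithmetic projects the total runtime:

```shell
#!/bin/sh
# Extrapolate total badblocks runtime from the reported progress:
# 13.60% done after 15:59:30 elapsed (= 15*3600 + 59*60 + 30 = 57570 s).
awk 'BEGIN {
    pct = 13.60; elapsed = 57570          # percent done, seconds elapsed
    total = elapsed / (pct / 100)         # projected total runtime, seconds
    printf "%.1f hours total, %.2f %%/hour\n", total / 3600, pct / (elapsed / 3600)
}'
```

That projects roughly 117.6 hours (about five days) at 0.85%/hour; plugging in the later report (38.00% after 47:13:45) projects about 124 hours, so the rate really is under 1%/hour throughout.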
On Sat, 2 Sep 2017 08:59:16 -0400 Anton Aylward <opensuse@antonaylward.com> wrote:
On 02/09/17 06:01 AM, Carlos E. R. wrote:
On 2017-09-02 03:03, Felix Miata wrote:
Greg Freemyer composed on 2017-09-01 20:13 (UTC-0400):
this is already way offtopic,
Seriously.
Back to topic:
Badblocks on the WD Green AV HD has reached 13.60% completion with (0/0/0 errors) in 15:59:30. :-(
Yes, it is very slow to run.
Less than 1%/hour, apparently. That's a 2T drive, isn't it?
${DEITY} help us when we have to check out the half dozen or so 4T drives for a RAID array!
I'm back to thinking that a RAID array, a HIGHLY redundant, HIGHLY striped, full 64-bit wide+Hamming dual error detection, (72,64) of smaller drives, say 350G, might make more sense. What? Failure rates? Oh, right, yes.
Now THIS is interesting https://www.backblaze.com/blog/hard-drive-failure-rates-q2-2016/ https://www.backblaze.com/blog/hard-drive-failure-stats-q2-2017/ There seem to be some reliable 4T drives out there and the trend with 8T is looking good. Still, a RAID6 with 8T drives is a bit large for a home setting :-)
Really interesting links, thanks. My prejudice is to never buy WDC or Seagate and to buy HGST when I can, so it's good to have some confirmation.
https://www.comparitech.com/blog/cloud-online-backup/how-long-do-hard-drives...
http://it.toolbox.com/blogs/marketing-strategies/is-data-backup-really-that-...
I have a colleague/developer whose home system is set up to run continuous backup of him /home-equivalent (he's a Windows user and I'm not sure what the terminology is) to a USB portable drive-of-notable-size for each of his machines. He told me once that the 'backup' drives were less reliable than his main drive!
I found that ironic. He couldn't understand why I was laughing.
https://www.xilinx.com/support/documentation/application_notes/xapp645.pdf https://www.google.com/patents/US3656109 https://github.com/sultanqasim/hamming_7264_secded
On 02/09/17 22:59, Anton Aylward wrote:
On 02/09/17 06:01 AM, Carlos E. R. wrote:
On 2017-09-02 03:03, Felix Miata wrote:
Greg Freemyer composed on 2017-09-01 20:13 (UTC-0400):
this is already way offtopic,
Seriously.
Back to topic:
Badblocks on the WD Green AV HD has reached 13.60% completion with (0/0/0 errors) in 15:59:30. :-(
Yes, it is very slow to run.
Less than 1%/hour, apparently. That's a 2T drive, isn't it? [pruned]
Now THIS is interesting https://www.backblaze.com/blog/hard-drive-failure-rates-q2-2016/ https://www.backblaze.com/blog/hard-drive-failure-stats-q2-2017/ There seem to be some reliable 4T drives out there and the trend with 8T is looking good. Still, a RAID6 with 8T drives is a bit large for a home setting :-)
https://www.comparitech.com/blog/cloud-online-backup/how-long-do-hard-drives...
http://it.toolbox.com/blogs/marketing-strategies/is-data-backup-really-that-...
I have a colleague/developer whose home system is set up to run continuous backup of him /home-equivalent (he's a Windows user and I'm not sure what the terminology is) to a USB portable drive-of-notable-size for each of his machines. He told me once that the 'backup' drives were less reliable than his main drive!
I found that ironic. He couldn't understand why I was laughing.
What I find strange is why Backblaze keep buying/installing Seagate drives when they have such a high failure rate (according to their own statistics). It's like repeatedly watching the same movie hoping that it will have a different ending.

BC

-- You are NOT entitled to your opinion. You are entitled to your INFORMED opinion. Nobody is entitled to be ignorant. Harlan Ellison
Anton Aylward composed on 2017-09-02 08:59 (UTC-0400):
Carlos E. R. wrote:
Felix Miata wrote:
Badblocks on the WD Green AV HD has reached 13.60% completion with (0/0/0 errors) in 15:59:30. :-(
Yes, it is very slow to run.
Less than 1%/hour, apparently. That's a 2T drive, isn't it?
Badblocks on the WD Green AV HD has reached 38.00% completion with (0/0/0 errors) in 47:13:45. :-(

-- "The wise are known for their understanding, and pleasant words are persuasive." Proverbs 16:21 (New Living Translation) Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/
On 2017-09-03 10:18, Felix Miata wrote:
Anton Aylward composed on 2017-09-02 08:59 (UTC-0400):
Carlos E. R. wrote:
Felix Miata wrote:
Badblocks on the WD Green AV HD has reached 13.60% completion with (0/0/0 errors) in 15:59:30. :-(
Yes, it is very slow to run.
Less than 1%/hour, apparently. That's a 2T drive, isn't it?
Badblocks on the WD Green AV HD has reached 38.00% completion with (0/0/0 errors) in 47:13:45. :-(
I remember it being very slow. I think it also had a high CPU load, am I right?
--
Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
Carlos E. R. composed on 2017-09-03 13:46 (UTC+0200):
Felix Miata wrote:
Badblocks on the WD Green AV HD has reached 38.00% completion with (0/0/0 errors) in 47:13:45. :-(
I remember it being very slow. I think it also had a high CPU load, am I right?
It's varying between 1.6% and 2.4%.
Badblocks on the WD Green AV HD has reached 41.8% completion with (0/0/0 errors) in 51:51:00.
On 2017-09-03 14:55, Felix Miata wrote:
Carlos E. R. composed on 2017-09-03 13:46 (UTC+0200):
Felix Miata wrote:
Badblocks on the WD Green AV HD has reached 38.00% completion with (0/0/0 errors) in 47:13:45. :-(
I remember it being very slow. I think it also had a high CPU load, am I right?
It's varying between 1.6% and 2.4%.
Badblocks on the WD Green AV HD has reached 41.8% completion with (0/0/0 errors) in 51:51:00.
Well, you will have to be patient and wait...
Felix Miata composed on 2017-09-03 08:55 (UTC-0400):
Carlos E. R. composed on 2017-09-03 13:46 (UTC+0200):
Felix Miata wrote:
Badblocks on the WD Green AV HD has reached 38.00% completion with (0/0/0 errors) in 47:13:45. :-(
I remember it being very slow. I think it also had a high CPU load, am I right?
It's varying between 1.6% and 2.4%.
Badblocks on the WD Green AV HD has reached 41.8% completion with (0/0/0 errors) in 51:51:00.
Badblocks 1.42.11 on the WD Green AV HD has reached 51.26% completion with (4/0/0 errors) in 65:25:00, 4 bad sectors in sequence, but the log's block numbers have me somewhat perplexed. The output was continuing with 0/0/0 until well after 41.8% was reached. Now the log shows:

984662252 984662253 984662254 984662255

As I didn't specify -b 4096 (and should have, given its sloth), those numbers look like they must be badblocks' own default 1024-byte block numbers rather than logical sector numbers or filesystem block numbers, not really what I would expect it to log on a filesystem formatted with 4k block size (as tune2fs reports).

# fdisk -l
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 4096 3907029167 1953512536 83 Linux

Is it possible to know whether 1 or 2 4k physical sectors comprise the logged group?

Something else doesn't make sense. Tune2fs reports lifetime writes of 1223GB, but the 1863GiB filesystem is 93% full, and certainly has had many files replaced over its 5-year installed life. I would expect lifetime writes to be somewhere in the 1.5-3X filesystem size range if not more.

Another thing: I don't see in the badblocks man page what the components of (#/#/#) are supposed to represent. :-(
04.09.2017 05:30, Felix Miata пишет:
Felix Miata composed on 2017-09-03 08:55 (UTC-0400):
Carlos E. R. composed on 2017-09-03 13:46 (UTC+0200):
Felix Miata wrote:
Badblocks on the WD Green AV HD has reached 38.00% completion with (0/0/0 errors) in 47:13:45. :-(
I remember it being very slow. I think it also had a high CPU load, am I right?
It's varying between 1.6% and 2.4%.
Badblocks on the WD Green AV HD has reached 41.8% completion with (0/0/0 errors) in 51:51:00.
Badblocks 1.42.11 on the WD Green AV HD has reached 51.26% completion with (4/0/0 errors) in 65:25:00, 4 bad sectors in sequence, but the log's block numbers have me somewhat perplexed. The output was continuing with 0/0/0 until well after 41.8% was reached. Now the log shows:
984662252 984662253 984662254 984662255
As I didn't specify -b4096 (and should have, given its sloth), those numbers look like they must be badblocks' own default 1024b block size numbers rather than logical sector numbers or filesystem block numbers, not really what I would expect it to log on a filesystem formatted with 4k block size (as tune2fs reports).
How is filesystem block size related? badblocks works with device, not filesystem. From the man page "For this reason, it is strongly recommended that users not run badblocks directly, but rather use the -c option of the e2fsck and mke2fs programs".
# fdisk -l
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 4096 3907029167 1953512536 83 Linux
Is it possible to know whether 1 or 2 4k physical sectors comprise the logged group?
Something else doesn't make sense. Tune2fs reports lifetime writes of 1223GB, but the 1863GiB filesystem is 93% full, and certainly has had many files replaced over its 5 year installed life. I would expect lifetime writes to be somewhere in the 1.5-3X filesystem size range if not more.
Another thing: I don't see in the badblocks man page what the components of (#/#/#) are supposed to represent. :-(
Count of (read errors, write errors, corruption errors) where "corruption" means block read from device differs from block written to device.
Andrei Borzenkov composed on 2017-09-04 06:27 (UTC+0300):
Felix Miata composed:
Badblocks 1.42.11 on the WD Green AV HD has reached 51.26% completion with (4/0/0 errors) in 65:25:00, 4 bad sectors in sequence, but the log's block numbers have me somewhat perplexed. The output was continuing with 0/0/0 until well after 41.8% was reached. Now the log shows:
984662252 984662253 984662254 984662255
As I didn't specify -b4096 (and should have, given its sloth), those numbers look like they must be badblocks' own default 1024b block size numbers rather than logical sector numbers or filesystem block numbers, not really what I would expect it to log on a filesystem formatted with 4k block size (as tune2fs reports).
How is filesystem block size related? badblocks works with device,
It's not that hard to confuse the concepts of filesystem and device when working in a state of frustration over a single process estimated to consume 5+ days. :-(

Quoting myself from https://lists.opensuse.org/opensuse/2017-09/msg00045.html (1 Sep 2017 17:15:43 -0400):

I have running now: badblocks -o bb-wd20eurs -s -n /dev/sdb1
not filesystem. From the man page "For this reason, it is strongly recommended that users not run badblocks directly, but rather use the -c option of the e2fsck and mke2fs programs".

Knowing it would be a lengthy process, I wanted to start badblocks before going to bed, hours after I should already have been in bed asleep. Whether I actually saw that, was oblivious to it, or ignored it, I can only guess.
Another thing: I don't see in the badblocks man page what the components of (#/#/#) are supposed to represent. :-(
Count of (read errors, write errors, corruption errors) where "corruption" means block read from device differs from block written to device.
Thank you! Where did you find it?
04.09.2017 06:54, Felix Miata пишет:
Another thing: I don't see in the badblocks man page what the components of (#/#/#) are supposed to represent. :-(
Count of (read errors, write errors, corruption errors) where "corruption" means block read from device differs from block written to device.
Thank you! Where did you find it?
In e2fsprogs sources.
On 2017-09-04 05:27, Andrei Borzenkov wrote:
How is filesystem block size related? badblocks works with device, not filesystem. From the man page "For this reason, it is strongly recommended that users not run badblocks directly, but rather use the -c option of the e2fsck and mke2fs programs".
I use it on the entire disk across several partitions, so the e2fsck route would be irrelevant. I expected the numbers to be LBAs.
Felix Miata composed on 2017-09-03 22:30 (UTC-0400)
Felix Miata composed on 2017-09-03 08:55 (UTC-0400):
Carlos E. R. composed on 2017-09-03 13:46 (UTC+0200):
Felix Miata wrote:
Badblocks on the WD Green AV HD has reached 38.00% completion with (0/0/0 errors) in 47:13:45. :-(
I remember it being very slow. I think it also had a high CPU load, am I right?
It's varying between 1.6% and 2.4%.
Badblocks on the WD Green AV HD has reached 41.8% completion with (0/0/0 errors) in 51:51:00.
Badblocks 1.42.11 on the WD Green AV HD has reached 51.26% completion with (4/0/0 errors) in 65:25:00, 4 bad sectors in sequence, but the log's block numbers have me somewhat perplexed. The output was continuing with 0/0/0 until well after 41.8% was reached. Now the log shows:
984662252 984662253 984662254 984662255
As I didn't specify -b4096 (and should have, given its sloth), those numbers look like they must be badblocks' own default 1024b block size numbers rather than logical sector numbers or filesystem block numbers, not really what I would expect it to log on a filesystem formatted with 4k block size (as tune2fs reports).
# fdisk -l
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 4096 3907029167 1953512536 83 Linux
Is it possible to know whether 1 or 2 4k physical sectors comprise the logged group?
Badblocks 1.42.11 on the WD Green AV HD has reached 100.00% completion with (4/0/0 errors) in 129:44:20, 4 bad sectors in sequence.

Smartctl -x before process:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate POSR-K 200 200 051 - 64
  3 Spin_Up_Time POS--K 168 166 021 - 6558
  4 Start_Stop_Count -O--CK 099 099 000 - 1863
  5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
  7 Seek_Error_Rate -OSR-K 200 200 000 - 0
  9 Power_On_Hours -O--CK 071 071 000 - 21875
 10 Spin_Retry_Count -O--CK 100 100 000 - 0
 11 Calibration_Retry_Count -O--CK 100 100 000 - 0
 12 Power_Cycle_Count -O--CK 100 100 000 - 877
192 Power-Off_Retract_Count -O--CK 200 200 000 - 414
193 Load_Cycle_Count -O--CK 200 200 000 - 1448
194 Temperature_Celsius -O---K 114 094 000 - 36
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 6
198 Offline_Uncorrectable ----CK 100 253 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 37
200 Multi_Zone_Error_Rate ---R-- 100 253 000 - 0
    ||||||_ K auto-keep
    |||||__ C event count
    ||||___ R error rate
    |||____ S speed/performance
    ||_____ O updated online
    |______ P prefailure warning

Smartctl after process:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate POSR-K 200 200 051 - 183
  3 Spin_Up_Time POS--K 168 166 021 - 6575
  4 Start_Stop_Count -O--CK 099 099 000 - 1891
  5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
  7 Seek_Error_Rate -OSR-K 200 200 000 - 0
  9 Power_On_Hours -O--CK 070 070 000 - 22100
 10 Spin_Retry_Count -O--CK 100 100 000 - 0
 11 Calibration_Retry_Count -O--CK 100 100 000 - 0
 12 Power_Cycle_Count -O--CK 100 100 000 - 904
192 Power-Off_Retract_Count -O--CK 200 200 000 - 416
193 Load_Cycle_Count -O--CK 200 200 000 - 1474
194 Temperature_Celsius -O---K 113 094 000 - 37
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 1
198 Offline_Uncorrectable ----CK 100 253 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 37
200 Multi_Zone_Error_Rate ---R-- 100 253 000 - 0
    ||||||_ K auto-keep
    |||||__ C event count
    ||||___ R error rate
    |||____ S speed/performance
    ||_____ O updated online
    |______ P prefailure warning

Diff -u:

ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
-  1 Raw_Read_Error_Rate POSR-K 200 200 051 - 64
+  1 Raw_Read_Error_Rate POSR-K 200 200 051 - 183
-  9 Power_On_Hours -O--CK 071 071 000 - 21875
+  9 Power_On_Hours -O--CK 070 070 000 - 22100
 196 Reallocated_Event_Count -O--CK 200 200 000 - 0
-197 Current_Pending_Sector -O--CK 200 200 000 - 6
+197 Current_Pending_Sector -O--CK 200 200 000 - 1
 198 Offline_Uncorrectable ----CK 100 253 000 - 0
 199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 37
 200 Multi_Zone_Error_Rate ---R-- 100 253 000 - 0

Any comments?
On 2017-09-06 21:00, Felix Miata wrote:
Badblocks 1.42.11 on the WD Green AV HD has reached 100.00% completion with (4/0/0 errors) in 129:44:20, 4 bad sectors in sequence.
Smartctl -x before process:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
  9 Power_On_Hours -O--CK 071 071 000 - 21875
192 Power-Off_Retract_Count -O--CK 200 200 000 - 414
193 Load_Cycle_Count -O--CK 200 200 000 - 1448
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 6
198 Offline_Uncorrectable ----CK 100 253 000 - 0
Smartctl after process:
Ah, let's see.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
197 Current_Pending_Sector -O--CK 200 200 000 - 1
198 Offline_Uncorrectable ----CK 100 253 000 - 0
Any comments?
One Pending out of six! Much better.

Well, what about the long test? Do that.

The thing to do now is overwrite the bad sector with anything. Assuming the badblocks program printed the LBA, it is possible to find where that sector is. Finding the file is far from trivial, though, so you may have to overwrite the entire partition.
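For the record, the block-to-file lookup is mechanical once the LBA is known; the usual route (as in the smartmontools bad-block HOWTO linked later in the thread) is to convert the LBA to a filesystem block and ask debugfs which inode owns it. A sketch with illustrative numbers and device name, which only prints the debugfs commands rather than running them:

```shell
#!/bin/sh
# Convert an absolute 512-byte LBA to an ext2/3/4 filesystem block number,
# then show the debugfs commands that find the owning inode and file name.
# LBA, partition start, and device below are illustrative placeholders.
LBA=1969328600          # failing sector (absolute LBA)
PART_START=4096         # first LBA of the partition, from fdisk -l
FS_BLOCK=4096           # filesystem block size, from tune2fs -l

fs_block=$(( (LBA - PART_START) * 512 / FS_BLOCK ))
echo "filesystem block: $fs_block"
# icheck: block number -> inode; ncheck: inode -> path name
echo "debugfs -R \"icheck $fs_block\" /dev/sdb1"
echo "debugfs -R \"ncheck <inode-from-icheck>\" /dev/sdb1"
```

If icheck reports the block as free, no file is affected and a plain overwrite of that block suffices.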
Carlos E. R. composed on 2017-09-07 02:00 (UTC+0200):
Felix Miata wrote:
Badblocks 1.42.11 on the WD Green AV HD has reached 100.00% completion with (4/0/0 errors) in 129:44:20, 4 bad sectors in sequence.
Smartctl -x before process:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
  9 Power_On_Hours -O--CK 071 071 000 - 21875
192 Power-Off_Retract_Count -O--CK 200 200 000 - 414
193 Load_Cycle_Count -O--CK 200 200 000 - 1448
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 6
198 Offline_Uncorrectable ----CK 100 253 000 - 0
Smartctl after process:
Ah, let's see.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
197 Current_Pending_Sector -O--CK 200 200 000 - 1
198 Offline_Uncorrectable ----CK 100 253 000 - 0
Any comments?
One Pending out of six!
The number was 6, but the listed bad were 4:

984662252 984662253 984662254 984662255

Each I assume to be a badblocks block of 1024 bytes. Internally that could be as few as one 4k sector, and no more than two.
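The arithmetic can settle the one-or-two question. A minimal sketch in shell, assuming badblocks' default 1024-byte blocks counted from the start of the partition, and the geometry fdisk reported earlier (partition start at LBA 4096, 512-byte logical / 4096-byte physical sectors):

```shell
#!/bin/sh
# Map badblocks' default 1024-byte block numbers (relative to the partition)
# to absolute 512-byte LBAs and to 4 KiB physical sector numbers.
PART_START_LBA=4096     # first LBA of /dev/sdc1, from fdisk -l
BB_BLOCK=1024           # badblocks default block size when -b is omitted
LBA_SIZE=512            # logical sector size
PHYS_SIZE=4096          # physical sector size

for bb in 984662252 984662253 984662254 984662255; do
    first_lba=$(( PART_START_LBA + bb * BB_BLOCK / LBA_SIZE ))
    phys=$(( first_lba * LBA_SIZE / PHYS_SIZE ))
    echo "badblocks block $bb -> LBA $first_lba-$(( first_lba + 1 )), physical sector $phys"
done
```

For these four block numbers the loop prints the same physical sector for all of them: the block numbers are divisible by 4 and the partition start is 4 KiB-aligned, so the run of four 1024-byte blocks covers exactly one physical sector, not two.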
Much better.
Well, what about the long test? Do that.

What long test? I just tied up a workspace for 5.4 days to do badblocks using the -n (non-destructive read-write mode) switch, and in sloth mode (without a -b 4096 switch)!!!
badblocks -o bb-wd20eurs -s -n /dev/sdb1
The thing to do now is overwrite the bad sector with anything. Assuming the badblocks program printed the LBA, it is possible to find where that sector is. Finding the file is far from trivial, though, so you may have to overwrite the entire partition.
1.863TiB (whole disk) partition. :-( It would probably take less time to dd the whole disk and then restore from backup, probably most of another day at least, assuming I didn't destroy the device half an hour ago when I tripped and the external case it was in went flying onto the floor (fdisk does still find it).

Meanwhile I'm trying to get things in order, including trying to figure out why, with my 24/7 box freshly upgraded from 42.1 to 42.3, KDE3 no longer reports USB stick plugins (yet does with OM), anticipating an attack from the strongest Atlantic hurricane on record. :-(
On 07/09/2017 08:19, Felix Miata wrote:

1.863TiB (whole disk) partition. :-( Probably take less time to dd whole disk and then restore from backup, probably most of another day at least, assuming I didn't destroy the device half an hour ago when I tripped and the external case it was in went flying onto the floor (fdisk does still find it). Meanwhile I'm trying to get things in order, including trying to figure out why with my freshly upgraded 24/7 box, from 42.1 to 42.3, KDE3 no longer reports USB stick plugins (yet does with OM), anticipating an attack from the strongest Atlantic hurricane on record. :-(

You can take the failed sector number from smartctl -l selftest and paste it into hdparm --write-sector:

--write-sector
Repair/overwrite a (possibly bad) sector directly on the media (VERY DANGEROUS)

Just follow the instructions that come up when you try to execute it the first time. If that doesn't fix the bad sector then there isn't any space left for reallocating sectors, or the disk's firmware is faulty. Repeat until smartctl -t long completes. Seagate have ISOs for reflashing firmware; I don't know if WD have. It's a lot quicker than badblocks.

Dave P
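Dave's recipe, condensed into a shell function. Everything here is a hedged sketch: the device and LBA are arguments you would supply from the self-test log, and the overwrite destroys that sector's contents, so the guard refuses to run without both:

```shell
#!/bin/sh
# Sketch of the repair sequence Dave describes: take the failing LBA from
# the self-test log and overwrite that sector so the drive's firmware can
# remap it. DESTRUCTIVE to the sector; device and LBA are placeholders.
repair_sector() {
    dev=${1:-}; lba=${2:-}
    if [ -z "$dev" ] || [ -z "$lba" ]; then
        echo "usage: repair_sector /dev/sdX LBA" >&2
        echo "(get LBA from: smartctl -l selftest /dev/sdX)" >&2
        return 1
    fi
    # Confirm the sector really is unreadable before clobbering it:
    hdparm --read-sector "$lba" "$dev" || true
    # Overwrite with zeros; hdparm requires its explicit safety flag:
    hdparm --yes-i-know-what-i-am-doing --write-sector "$lba" "$dev"
    # Re-check the pending/reallocated counts afterwards:
    smartctl -A "$dev" | grep -E 'Pending|Reallocated'
}
```

After the write, Current_Pending_Sector should drop (or the sector gets reallocated); re-run the long self-test and repeat for any further LBA it reports.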
On 2017-09-07 08:19, Felix Miata wrote:
Carlos E. R. composed on 2017-09-07 02:00 (UTC+0200):
Felix Miata wrote:
Badblocks 1.42.11 on the WD Green AV HD has reached 100.00% completion with (4/0/0 errors) in 129:44:20, 4 bad sectors in sequence.
Smartctl -x before process:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
  9 Power_On_Hours -O--CK 071 071 000 - 21875
192 Power-Off_Retract_Count -O--CK 200 200 000 - 414
193 Load_Cycle_Count -O--CK 200 200 000 - 1448
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 6
198 Offline_Uncorrectable ----CK 100 253 000 - 0
Smartctl after process:
Ah, let's see.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
197 Current_Pending_Sector -O--CK 200 200 000 - 1
198 Offline_Uncorrectable ----CK 100 253 000 - 0
Any comments?
One Pending out of six!
The number was 6, but the listed bad were 4:
984662252 984662253 984662254 984662255
I think the other two got cleared at the start of the test.
Each I assume to be a badblocks block of 1024 bytes. Internally that could be as few as one 4k sector, and no more than two.
Much better.
Well, what about the long test? Do that.

What long test? I just tied up a workspace for 5.4 days to do badblocks using the -n (non-destructive read-write mode) switch, and in sloth mode (without a -b 4096 switch)!!!
smartctl --test=long /dev/sdb

It runs on the disk firmware; you can continue working almost normally. It will print an estimate of when it will finish, perhaps in 4 hours. You see the result with "smartctl -a /dev/sda", for instance. You have to look at this block:

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 5650 -
# 2 Short offline Completed without error 00% 5641 -
# 3 Short offline Completed without error 00% 5617 -
# 4 Short offline Completed without error 00% 5602 -
# 5 Short offline Completed without error 00% 5585 -
# 6 Short offline Completed without error 00% 5571 -
# 7 Extended offline Completed without error 00% 5559 -

In my case, the short tests are run automatically by a daemon. If there is an error, the test will abort and print the LBA in the last column.
badblocks -o bb-wd20eurs -s -n /dev/sdb1
The thing to do now is overwrite the bad sector with anything. Assuming the badblocks program printed the LBA, it is possible to find where that sector is. Finding the file is far from trivial, though, so you may have to overwrite the entire partition.
1.863TiB (whole disk) partition. :-( Probably take less time to dd whole disk and then restore from backup, probably most of another day at least, assuming I didn't destroy the device half an hour ago when I tripped and the external case it was in went flying onto the floor (fdisk does still find it). Meanwhile I'm trying to get things in order, including trying to figure out why with my freshly upgraded 24/7 box, from 42.1 to 42.3, KDE3 no longer reports USB stick plugins (yet does with OM), anticipating an attack from the strongest Atlantic hurricane on record. :-(
I don't know about that kde3 problem. Yes, I heard about that hurricane. That's climate change in action; there is more energy in the system. I wish good luck to all.

We can investigate whether, knowing the LBA, we can find what file is affected. What filesystem is it?
Carlos E. R. composed on 2017-09-07 15:17 (UTC+0200): ...
We can investigate whether knowing the LBA we can find what file is affected. What filesystem is it?
EXT3. Now it's several hours into

dd if=/dev/zero of=/dev/sdb1 bs=32768

which reports no progress state.

# smartctl -x /dev/sdb | grep rent_Pending

reports

197 Current_Pending_Sector -O--CK 200 200 000 - 0

so what needed doing seems to have been done already, but I plan to let dd run to completion before beginning the restore process.
On 2017-09-07 15:28, Felix Miata wrote:
Carlos E. R. composed on 2017-09-07 15:17 (UTC+0200):
...
We can investigate whether knowing the LBA we can find what file is affected. What filesystem is it?
EXT3.
https://www.smartmontools.org/browser/trunk/www/badblockhowto.xml#rfile
Now it's several hours into
dd if=/dev/zero of=/dev/sdb1 bs=32768
Then there is no point in the above, but you should read it anyway. And give dd a block size of at least 1 MB (one track).
which reports no progress state.
From the dd man page:

status=LEVEL
The LEVEL of information to print to stderr; 'none' suppresses everything but error messages, 'noxfer' suppresses the final transfer statistics, 'progress' shows periodic transfer statistics
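To see that in action without touching a disk, a sketch against a scratch file (on the real device the of= target would be the partition, with a block size of 1 MiB or more):

```shell
#!/bin/sh
# status=progress makes GNU dd print periodic transfer statistics on
# stderr, instead of the silence seen above. A scratch file stands in
# for /dev/sdb1 here.
scratch=$(mktemp)
dd if=/dev/zero of="$scratch" bs=1M count=4 status=progress
wc -c < "$scratch"     # 4194304 bytes written
rm -f "$scratch"
```

status=progress needs a reasonably recent coreutils (8.24 or later); on older systems, sending the dd process SIGUSR1 makes it print the same statistics once.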
# smartctl -x /dev/sdb | grep rent_Pending reports
197 Current_Pending_Sector -O--CK 200 200 000 - 0
so what needed doing seems to have been done already, but I plan to let dd run to completion before beginning restore process.
Don't forget to run "smartctl --test=long /dev/sdb". And later make sure the smartd daemon is enabled to run at least the short test weekly.
Carlos E. R. composed on 2017-09-07 16:10 (UTC+0200):
Felix Miata wrote:
Carlos E. R. composed on 2017-09-07 15:17 (UTC+0200):
...
We can investigate whether knowing the LBA we can find what file is affected. What filesystem is it?
EXT3.
https://www.smartmontools.org/browser/trunk/www/badblockhowto.xml#rfile
Now it's several hours into
dd if=/dev/zero of=/dev/sdb1 bs=32768
Then no point in the above, but you should read it, anyway.
And give dd 1 MB size at least (one track).
which reports no progress state.
status=LEVEL The LEVEL of information to print to stderr; 'none' suppresses everything but error messages, 'noxfer' suppresses the final transfer statistics, 'progress' shows periodic transfer statistics
# smartctl -x /dev/sdb | grep rent_Pending reports
197 Current_Pending_Sector -O--CK 200 200 000 - 0
so what needed doing seems to have been done already, but I plan to let dd run to completion before beginning restore process.
Don't forget to run "smartctl --test=long /dev/sdb".
And later make sure the smartd daemon is enabled to run weekly at least the short test.
Arghhhh! 6+ days on this so far, including 6.75 hours to restore, and now with the hurricane almost on my doorstep I'm facing another lengthy process of unknown length. How long is this supposed to take?

The smartctl man page is an abomination. It's so big I can't find anything about background/foreground/progress. I tried

smartctl --test=long -t offline /dev/sdg

and it tells me it can only run one test at a time. How can I run it in foreground and have it report progress?
On 08/09/2017 02:06, Felix Miata wrote:

Smartctl man page is an abomination. It's so big I can't find anything about background/foreground/progress. I tried

smartctl --test=long -t offline /dev/sdg

and it tells me it can only run one test at a time. How can I run it in foreground and have it report progress?

You're right, the man page has become unreadable since I last used it. I think the foreground option is -C (--captive), but AFAIR it doesn't show anything useful apart from when it exits. You can use either -c or "-l selftest" to get the progress of the selftest. I see that the bulk of Harvey hits on Sunday; you're in my thoughts.

Dave P
On 2017-09-08 02:06, Felix Miata wrote:
Carlos E. R. composed on 2017-09-07 16:10 (UTC+0200):
Don't forget to run "smartctl --test=long /dev/sdb".
And later make sure the smartd daemon is enabled to run weekly at least the short test.
Arghhhh! 6+ days on this so far including 6.75 hours to restore, and now with the hurricane almost on my doorstep I'm facing another lengthy process of unknown length. How long is this supposed to take?
Smartctl man page is an abomination. It's so big I can't find anything about background/foreground/progress. I tried
smartctl --test=long -t offline /dev/sdg
and it tells me it can only run one test at a time. How can I run it in foreground and have it report progress?
I told you; it is printed above, don't go reading the man page now:

smartctl --test=long /dev/sdb

It runs in the background. No, you don't make it run in the foreground. No, the computer doesn't run the test: the disk runs the test internally, using its own CPU, RAM and ROM. The manufacturer defines the test; it is his code.

The command prints out how long it will take. Example:

Telcontar:~ # smartctl --test=long /dev/sdc
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 226 minutes for test to complete.       <=================
Test will complete after Fri Sep 1 08:49:06 2017    <=============
Use smartctl -X to abort test.
Telcontar:~ #

Telcontar:~ # fdisk -l /dev/sdc
Disk /dev/sdc: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors

Wait, your disk will take:

Total time to complete Offline data collection: (40500) seconds.

You gave that info in your first post. Your disk is slow; that's 675 minutes.

You can get the progress info by running "smartctl -a /dev/sdb"; one line in all that text says the percent of the test done. There is another option that gives shorter info, but I don't remember it. And of course you can abort the test. I don't remember about retaking the test from the same point.

If all you want is making a backup of the disk in a hurry, forget testing.
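The percent-done line Carlos means can be pulled out with grep; a sketch run against a captured excerpt (the sample text mimics smartctl's wording and stands in for piping in live "smartctl -a /dev/sdb" output):

```shell
#!/bin/sh
# While a self-test is running, smartctl -a (or -c) contains a
# "Self-test execution status" block like this sample; extract the
# percentage of the test remaining from it.
sample='Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.'
echo "$sample" | grep -oE '[0-9]+% of test remaining'
```

Wrapped in "watch", the same one-liner against the live device makes a crude progress meter; smartctl only updates the figure in 10% steps, though.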
On 08/09/2017 16:21, Carlos E. R. wrote:
You can get the progress info by running "smartctl -a /dev/sdb"; one line in all that text says the percent of the test done. There is another option that gives shorter info, but I don't remember it.

Both "smartctl -c" and "smartctl -l selftest" give progress information for the self test.

Dave P
On 2017-09-09 06:26, Dave Plater wrote:
On 08/09/2017 16:21, Carlos E. R. wrote:
You can get the progress info by running "smartctl -a /dev/sdb"; one line in all that text says the percent of the test done. There is another option that gives shorter info, but I don't remember it.

Both "smartctl -c" and "smartctl -l selftest" give progress information for the self test.
I have used the latter, but I forget it. One can't remember so many commands... I see now in my notes "smartctl --log=selftest /dev/hda", so they are old notes...
Dave Plater composed on 2017-09-09 06:26 (UTC+0200):
Carlos E. R. wrote:
You can get the progress info by running "smartctl -a /dev/sdb", one line in all that text says the percent of the test done. There is another option that gives shorter info but I don't remember it.
Both "smartctl -c" and "smartctl -l selftest" give progress information for the self test. New WD Purple WD20PURX replacement arrived several hours ago. https://www.monoprice.com/product?p_id=19359 I don't see any evidence of (42.3) progress info:
# smartctl --test=long /dev/sdb smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.87-25-default] (SUSE RPM) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION === Sending command: "Execute SMART Extended self-test routine immediately in off-line mode". Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful. Testing has begun. Please wait 276 minutes for test to complete. Test will complete after Wed Sep 27 04:47:35 2017 Use smartctl -X to abort test. # smartctl -X /dev/sdb smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.87-25-default] (SUSE RPM) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION === Sending command: "Abort SMART off-line mode self-test routine". Self-testing aborted! # smartctl --test=long -l selftest /dev/sdb smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.87-25-default] (SUSE RPM) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Extended offline Aborted by host 90% 1 - === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION === Sending command: "Execute SMART Extended self-test routine immediately in off-line mode". Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful. Testing has begun. Please wait 276 minutes for test to complete. Test will complete after Wed Sep 27 04:56:03 2017 Use smartctl -X to abort test. # date Wed Sep 27 00:20:55 EDT 2017 -- "Wisdom is supreme; therefore get wisdom. Whatever else you get, get wisdom." Proverbs 4:7 (New Living Translation) Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! 
Felix Miata *** http://fm.no-ip.com/
-- 
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On 27/09/2017 06:33, Felix Miata wrote:
New WD Purple WD20PURX replacement arrived several hours ago.
https://www.monoprice.com/product?p_id=19359
I don't see any evidence of (42.3) progress info:
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Aborted by host               90%         1         -
You aborted the test with 90% of the disk still to be tested; it counts down the percentage as the test nears completion. If you use -c, this is the part that gives the same percentage:

Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.

Regards,
Dave P
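While a self-test runs, "smartctl -c" reports the percentage remaining, as Dave describes. A minimal sketch of pulling that number out in a script follows; the captured sample text and the /tmp path are illustrative stand-ins, since real usage would read live output from `smartctl -c /dev/sdb`:

```shell
# Hypothetical captured output from "smartctl -c" mid-test; the format
# mirrors the 90%-remaining log shown earlier in the thread.
sample='Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.'

# Extract the remaining percentage (GNU grep/cut assumed).
remaining=$(printf '%s\n' "$sample" | grep -o '[0-9]\+% of test remaining' | cut -d% -f1)
printf '%s\n' "$remaining" > /tmp/smart-remaining.txt
echo "remaining: ${remaining}%"
```

On a live disk you would simply re-run `smartctl -c /dev/sdb` periodically and watch the figure count down toward 0%.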
Carlos E. R. composed on 2017-09-07 16:10 (UTC+0200):
Felix Miata wrote:
Now it's several hours into
dd if=/dev/zero of=/dev/sdb1 bs=32768
Then no point in the above, but you should read it, anyway.
And give dd 1 MB size at least (one track).
???
which reports no progress state.
status=LEVEL
    The LEVEL of information to print to stderr; 'none' suppresses
    everything but error messages, 'noxfer' suppresses the final transfer
    statistics, 'progress' shows periodic transfer statistics
# smartctl -x /dev/sdb | grep rent_Pending reports
197 Current_Pending_Sector -O--CK 200 200 000 - 0
so what needed doing seems to have been done already, but I plan to let dd run to completion before beginning restore process.
Don't forget to run "smartctl --test=long /dev/sdb".
# smartctl -s on -t offline /dev/sdf
http://fm.no-ip.com/Tmp/Hardware/Disk/smartcx-wd20eurs-20170908.txt

Current pending sector and offline uncorrectable have gone from 0 to 53. :-(
And later make sure the smartd daemon is enabled to run weekly at least the short test.
It's out of a satellite STB DVB receiver that has no smart* to run:

# smartctl --help
-sh: smartctl: not found
root@azbme:~# smart --help
-sh: smart: not found

-- 
Felix Miata *** http://fm.no-ip.com/
On 2017-09-08 15:27, Felix Miata wrote:
Carlos E. R. composed on 2017-09-07 16:10 (UTC+0200):
Felix Miata wrote:
Now it's several hours into
dd if=/dev/zero of=/dev/sdb1 bs=32768
Then no point in the above, but you should read it, anyway.
And give dd 1 MB size at least (one track).
???
bs=1M

I was told disk tracks are 1MB in size. So use that size or more for better speed:

dd if=/dev/zero of=/dev/sdb1 bs=1M status=progress
Don't forget to run "smartctl --test=long /dev/sdb".
# smartctl -s on -t offline /dev/sdf http://fm.no-ip.com/Tmp/Hardware/Disk/smartcx-wd20eurs-20170908.txt
Current pending sector and offline uncorrectable have gone from 0 to 53. :-(
Argh. That disk is asking for retirement fast.
And later make sure the smartd daemon is enabled to run weekly at least the short test.
It's out of a satellite STB DVB receiver that has no smart* to run.

# smartctl --help
-sh: smartctl: not found
root@azbme:~# smart --help
-sh: smart: not found
Too bad. Same here, but terrestrial TV. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On Fri, Sep 8, 2017 at 1:49 PM, Carlos E. R. <robin.listas@telefonica.net> wrote:
On 2017-09-08 15:27, Felix Miata wrote:
Carlos E. R. composed on 2017-09-07 16:10 (UTC+0200):
Felix Miata wrote:
Now it's several hours into
dd if=/dev/zero of=/dev/sdb1 bs=32768
Then no point in the above, but you should read it, anyway.
And give dd 1 MB size at least (one track).
???
bs=1M
I was told disk tracks are 1MB in size. So use that size or more, better speed.
dd if=/dev/zero of=/dev/sdb1 bs=1M status=progress
That changes over time; as density increases, so does disk track capacity. 1MB is a several-year-old number.

The advantage of writing out a full track at a time is that it significantly reduces the likelihood of the drive running out of data in the middle of writing a track. If it runs out, it has to wait 10 msec or so for the platter to spin back around. The more of those 10 msec delays, the longer the whole wipe takes.

I tend to use 10MB just so I don't have to think about the capacity of a single track. 10MB is plenty big to keep the disk's write buffers full, but small enough not to impact system RAM.

Greg
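Greg's advice boils down to a single dd command. A sketch, demonstrated against a scratch file so nothing real is wiped; the /tmp path and count= are placeholders, while on a real wipe you would point of= at the device (after triple-checking the device name!) and omit count=:

```shell
# Wipe sketch: a large block size (10M) keeps the drive's write buffer fed;
# status=progress prints periodic transfer statistics to stderr.
target=/tmp/wipe-demo.img    # stand-in for a real /dev/sdXn

dd if=/dev/zero of="$target" bs=10M count=3 status=progress

ls -l "$target"    # 3 x 10 MiB written
```

Compared with the bs=32768 run earlier in the thread, the larger block size means far fewer write() calls and a much better chance of streaming full tracks.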
On 2017-09-08 20:43, Greg Freemyer wrote:
On Fri, Sep 8, 2017 at 1:49 PM, Carlos E. R. <> wrote:
On 2017-09-08 15:27, Felix Miata wrote:
Carlos E. R. composed on 2017-09-07 16:10 (UTC+0200):
I was told disk tracks are 1MB in size. So use that size or more, better speed.
dd if=/dev/zero of=/dev/sdb1 bs=1M status=progress
That changes over time, as density increases so does disk tracks capacity.
1MB is a several year old number.
And the advantage of writing out a full track at a time is that it significantly reduces the likelihood of the drive running out of data in the middle of writing a track. If it runs out, it has to wait 10 msec or so for the platter to spin back around.
The more of those 10 msec delays, the longer the whole wipe takes.
I tend to use 10MB just so I don't have to think about the capacity of a single track. 10MB is plenty big to keep the disk's write buffers full, but small enough not to impact system RAM.
Noted, thanks :-) -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 08/09/17 19:53, Carlos E. R. wrote:
The more of those 10 msec delays, the longer the whole wipe takes.
I tend to do 10MB just so I don't have to think about the capacity of a single track. 10MB is plenty big to keep the disks write buffers full, but small enough not to impact the system RAM.
Noted, thanks :-)
Note also that CHS (cylinders, heads, sectors), which used to define track size, is now archaic and meaningless. Modern drives now use "constant angular velocity", which may mean that declaring a drive as 5200rpm or 7000rpm is also a little meaningless :-)

Basically, the physical size of a sector is now constant. So a track near the centre of a disk may have a capacity of 1MB, say. Move further out, double the radius say, and you've doubled the physical length of the track, if I remember my maths right. So this track now will store 2MB. Move out the same distance again and that track will store 3MB.

In other words, talking about the capacity of a single track is meaningless, a hangover from a simpler world when it really was like that.

Cheers,
Wol
On 2017-09-08 21:22, Wols Lists wrote:
On 08/09/17 19:53, Carlos E. R. wrote:
The more of those 10 msec delays, the longer the whole wipe takes.
I tend to do 10MB just so I don't have to think about the capacity of a single track. 10MB is plenty big to keep the disks write buffers full, but small enough not to impact the system RAM.
Noted, thanks :-)
Note also that CHS (cylinders, heads, sectors), which used to define track size, is now archaic and meaningless.
Yes, I know that.
Modern drives now use "constant angular velocity". Which may mean that declaring a drive as 5200rpm or 7000rpm is also a little meaningless :-)
I don't understand this. You mean they change rotational speed (rpm) based on some criteria?
Basically, the physical size of a sector is now constant. So a track near the centre of a disk may have a capacity of 1MB, say. Move further out, double the radius say, and you've doubled the physical length of the track if I remember my maths right. So this track now will store 2MB. Move out the same distance again and that track will store 3MB.
Ah, yes. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On Fri, Sep 8, 2017 at 5:57 PM, Carlos E. R. <robin.listas@telefonica.net> wrote:
On 2017-09-08 21:22, Wols Lists wrote:
On 08/09/17 19:53, Carlos E. R. wrote:
The more of those 10 msec delays, the longer the whole wipe takes.
I tend to do 10MB just so I don't have to think about the capacity of a single track. 10MB is plenty big to keep the disks write buffers full, but small enough not to impact the system RAM.
Noted, thanks :-)
Note also that CHS (cylinders, heads, sectors), which used to define track size, is now archaic and meaningless.
Yes, I know that.
Modern drives now use "constant angular velocity". Which may mean that declaring a drive as 5200rpm or 7000rpm is also a little meaningless :-)
I don't understand this. You mean they change rotational speed (rpm) based on some criteria?
They do NOT change rotational speed. They do change transfer rates. Track 0 is near the outer circumference, so you have a lot more data passing under the disk head per revolution. Transfer rates can drop in half by the time you're reading from the tracks at the end of the drive (near the center).
Basically, the physical size of a sector is now constant. So a track near the centre of a disk may have a capacity of 1MB, say. Move further out, double the radius say, and you've doubled the physical length of the track if I remember my maths right. So this track now will store 2MB. Move out the same distance again and that track will store 3MB.
Ah, yes.
Agreed, but I'm pretty sure you can't do 3x. There is dead space at the center of the platter that isn't used. The radius of the outer tracks is only about 2x the radius of the inner tracks.

Greg
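The geometry claim is easy to sanity-check: with constant linear bit density, per-track capacity scales with circumference, hence with radius. A back-of-envelope sketch; all numbers are illustrative, not from any datasheet:

```shell
# If the outer radius is ~2x the inner radius (Greg's figure) and the
# linear density is constant, the outer track holds ~2x what the inner
# track does.
awk 'BEGIN {
  inner_r   = 1.0        # arbitrary units
  outer_r   = 2.0        # outer radius about 2x inner
  inner_cap = 1.0        # say the innermost track holds 1 MB
  outer_cap = inner_cap * outer_r / inner_r
  printf "outer track capacity: %.1f MB\n", outer_cap
}' > /tmp/track-cap.txt
cat /tmp/track-cap.txt
```

This also matches Greg's transfer-rate observation: at constant rpm, twice the data per track means roughly twice the sustained throughput on the outer tracks.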
On 2017-09-09 00:17, Greg Freemyer wrote:
On Fri, Sep 8, 2017 at 5:57 PM, Carlos E. R. <> wrote:
Modern drives now use "constant angular velocity". Which may mean that declaring a drive as 5200rpm or 7000rpm is also a little meaningless :-)
I don't understand this. You mean they change rotational speed (rpm) based on some criteria?
They do NOT change rotational speed.
They do change transfer rates. Track 0 is near the outer circumference so you have a lot more data passing under the disk head per revolution.
Transfer rates can drop in half by the time you're reading from the tracks at the end of the drive (near the center).
Makes sense. They either do that or adapt sector density, or both. My guess is the latter.

-- 
Cheers / Saludos,
Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
Carlos E. R. composed on 2017-09-08 19:49 (UTC+0200):
Felix Miata wrote:
# smartctl -s on -t offline /dev/sdf http://fm.no-ip.com/Tmp/Hardware/Disk/smartcx-wd20eurs-20170908.txt
Current pending sector and offline uncorrectable have gone from 0 to 53. :-(
Argh.
That disk is asking for retirement fast.
In the process of replacing it yesterday, I got about 58 of these:

2017/09/27 22:17:28 [2343] ERROR: movie/<title>.ts failed verification -- update discarded.

after which I aborted and redid the rsync from backup. Peak SATA rsync speeds were around 120 MB/sec instead of 70.

# hdparm -t /dev/sdb

/dev/sdb:
 Timing buffered disk reads: 440 MB in 3.00 seconds = 146.53 MB/sec

More than fast enough for a simple STB tuner/recorder. :-)

Smartctl long self-test gave these on the replacement:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K  100   253   051    -    0
  3 Spin_Up_Time            POS--K  100   253   021    -    0
  4 Start_Stop_Count        -O--CK  100   100   000    -    1
  5 Reallocated_Sector_Ct   PO--CK  200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K  200   200   000    -    0
  9 Power_On_Hours          -O--CK  100   100   000    -    6
 10 Spin_Retry_Count        -O--CK  100   253   000    -    0
 11 Calibration_Retry_Count -O--CK  100   253   000    -    0
 12 Power_Cycle_Count       -O--CK  100   100   000    -    1
192 Power-Off_Retract_Count -O--CK  200   200   000    -    0
193 Load_Cycle_Count        -O--CK  200   200   000    -    1
194 Temperature_Celsius     -O---K  113   113   000    -    34
196 Reallocated_Event_Count -O--CK  200   200   000    -    0
197 Current_Pending_Sector  -O--CK  200   200   000    -    0
198 Offline_Uncorrectable   ----CK  100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK  200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--  200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

-- 
Felix Miata *** http://fm.no-ip.com/
On 09/03/2017 06:46 AM, Carlos E. R. wrote:
I remember it been very slow. I think it also had a high cpu load, am I right?
It was slow on 80G 7200 RPM drives -- on the order of hours to complete. I can just imagine how long it would take on terabyte drives....

-- 
David C. Rankin, J.D., P.E.
On 02/09/17 06:01 AM, Carlos E. R. wrote:
Yes, it is very slow to run.
Yes, but that seems excessive to me.

I realise that there are a number of tunable parameters to the 'badblocks' command, such as the block size. And yes, if you have a 4K physical sector size then setting the block size to 4K rather than its default 1K will probably help, certainly on drives over 2T.

But are there any faster methods? What does SMART have to offer? Surely by now the vendors will have found a use for all that cheap memory and CPU power they have on the embedded controller boards:

- one revolution of the disk to write a test pattern to the complete track of every platter
- the next revolution to read it back, verifying on the fly 'cos the CPU is so much faster than the disk
- the next revolution to write a different pattern
- step and repeat through the pattern set
- now go to the next track

Of course it is a destructive test, but we knew that to start with.

-- 
A: Yes.
> Q: Are you sure?
>> A: Because it reverses the logical flow of conversation.
>>> Q: Why is top posting frowned upon?
On 2017-09-03 14:30, Anton Aylward wrote:
On 02/09/17 06:01 AM, Carlos E. R. wrote:
Yes, it is very slow to run.
Yes, but that seems excessive to me.
I realise that there are a number of tunable parameters to the 'badblocks' command, such as the block size. And yes, if you have a 4K physical then setting the block size to 4k rather than its default 1K will probably help. Certainly on drives over 2T.
If this is true, it should be automatic. Next time I'll have to remember this.
But are there any faster methods? What does SMART have to offer? Surely by now the vendors will have found a use for all that cheap memory and cpu power they have on the embedded controller boards ???
- one revolution of the disk to write a test pattern to the complete track of every platter
- the next revolution to read it back, verifying on the fly 'cos the CPU is so much faster than the disk
- the next revolution to write a different pattern
- step and repeat through the pattern set
- now go to the next track
Of course it is a destructive test, but we knew that to start with.
But badblocks is not destructive by default. I don't know exactly how badblocks works.

Typically, I first run a long SMART test, which is a matter of a few hours; I leave it running during the night. However, if it finds a single surface error, it stops there. If you want to find all other possible errors and locate their LBA addresses, you need some tool like badblocks. The idea is to manually overwrite those sectors to force a remap.

However, you can simply overwrite the entire disk (or partition) with zeroes. This will cause a remap of any bad sector, but you will not know which, nor how many. So I prefer to use badblocks.

I have also noticed that, in my cases and a few others I read about, it failed to find the bad sectors that the long SMART test said existed. Running smartctl again, it also failed to find those sectors, so the empirical guess is that it can cause "repair" of sectors transparently.

-- 
Cheers / Saludos,
Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
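The "overwrite a bad sector to force a remap" step mentioned above boils down to a targeted dd. A sketch against a scratch file, so nothing real is harmed; the image path and the LBA are hypothetical, while on a real disk you would take the LBA from the SMART self-test log and write to the device itself, destroying that sector's contents:

```shell
img=/tmp/sector-demo.img
lba=100          # hypothetical failing LBA from a SMART error log

# Build a 200-"sector" fake disk, then zero exactly one 512-byte sector.
dd if=/dev/urandom of="$img" bs=512 count=200 2>/dev/null
dd if=/dev/zero of="$img" bs=512 count=1 seek="$lba" conv=notrunc 2>/dev/null

# Verify: read that sector back and compare against 512 zero bytes.
dd if="$img" of=/tmp/sector-read.bin bs=512 skip="$lba" count=1 2>/dev/null
dd if=/dev/zero of=/tmp/sector-zero.bin bs=512 count=1 2>/dev/null
cmp -s /tmp/sector-read.bin /tmp/sector-zero.bin && echo "sector zeroed" > /tmp/sector-result.txt
cat /tmp/sector-result.txt
```

On a real drive the write is what prompts the firmware to remap the sector if it is truly bad; conv=notrunc matters only when the target is a file, and is harmless on a device.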
On 03/09/17 09:36 AM, Carlos E. R. wrote:
On 2017-09-03 14:30, Anton Aylward wrote:
On 02/09/17 06:01 AM, Carlos E. R. wrote:
Yes, it is very slow to run.
Yes, but that seems excessive to me.
I realise that there are a number of tunable parameters to the 'badblocks' command, such as the block size. And yes, if you have a 4K physical then setting the block size to 4k rather than its default 1K will probably help. Certainly on drives over 2T.
If this is true, it should be automatic. Next time I'll have to remember this.
The man page states that the default is 1K.
Of course it is a destructive test, but we knew that to start with.
But badblocks is not destructive. I don't know exactly how badblocks work.
The man page seems pretty clear:

    Normally, badblocks will refuse to do a read/write or a non-destructive
    test on a device which is mounted, since either can cause the system to
    potentially crash and/or damage the filesystem even if it is mounted
    read-only. This can be overridden using the -f flag, but should almost
    never be used --- if you think you're smarter than the badblocks
    program, you almost certainly aren't.

But I don't think that applies if you are testing a new device.

    -n  Use non-destructive read-write mode. By default only a
        non-destructive read-only test is done. This option must not be
        combined with the -w option, as they are mutually exclusive.

I think that means 'read-write-readback'. Of course, if the write back of the original data goes bad ...

    -w  Use write-mode test. With this option, badblocks scans for bad
        blocks by writing some patterns (0xaa, 0x55, 0xff, 0x00) on every
        block of the device, reading every block and comparing the
        contents. This option may not be combined with the -n option, as
        they are mutually exclusive.

To my mind, that should be used when testing a new drive; that's what I used on my newly acquired 1T drive. IIRC the test ran within 12 hours.
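What badblocks -w does per pattern is essentially "write, read back, compare". A sketch of one such pass against a scratch file; the paths and sizes are illustrative, and badblocks itself cycles through 0xaa, 0x55, 0xff, 0x00 across the whole device:

```shell
img=/tmp/bb-demo.img
blocks=64    # pretend device: 64 blocks of 4096 bytes

# Write pass: fill the "device" with the 0xaa pattern (octal 252).
dd if=/dev/zero bs=4096 count=$blocks 2>/dev/null | tr '\0' '\252' > "$img"

# Read pass: regenerate the expected pattern and compare block-for-block.
dd if=/dev/zero bs=4096 count=$blocks 2>/dev/null | tr '\0' '\252' > /tmp/bb-expect.img
if cmp -s "$img" /tmp/bb-expect.img; then
  echo "0 bad blocks" > /tmp/bb-result.txt
else
  echo "mismatch" > /tmp/bb-result.txt
fi
cat /tmp/bb-result.txt
```

The real invocation on a new, empty drive would be something like `badblocks -wsv /dev/sdX` (destructive), with `-b 4096` worth adding on 4K-sector drives, as discussed above.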
Op zondag 3 september 2017 15:36:52 CEST schreef Carlos E. R.:
On 2017-09-03 14:30, Anton Aylward wrote:
On 02/09/17 06:01 AM, Carlos E. R. wrote:
Yes, it is very slow to run.
Yes, but that seems excessive to me.
I realise that there are a number of tunable parameters to the 'badblocks' command, such as the block size. And yes, if you have a 4K physical then setting the block size to 4k rather than its default 1K will probably help. Certainly on drives over 2T.
If this is true, it should be automatic. Next time I'll have to remember this.
But are there any faster methods? What does SMART have to offer? Surely by now the vendors will have found a use for all that cheap memory and cpu power they have on the embedded controller boards ???
- one revolution of the disk to write a test pattern to the complete track of every platter
- the next revolution to read it back, verifying on the fly 'cos the CPU is so much faster than the disk
- the next revolution to write a different pattern
- step and repeat through the pattern set
- now go to the next track
Of course it is a destructive test, but we knew that to start with.
But badblocks is not destructive. I don't know exactly how badblocks work.
Typically, I first run a long SMART test, which is a matter of a few hours. I leave it running during the night.
However, if it finds a single surface error, it stops there. If you want to find all other possible errors, and locate their LBA address, you need some tool like badblocks. The idea is to manually overwrite those sectors to force a remap.
However, you can simply overwrite the entire disk (or partition) with zeroes. This will cause remap of any bad sector. But you will not know which, nor how many.
So I prefer to use badblocks. I have also noticed that, in my cases, and a few others I read about, it failed to find the bad sectors that the long SMART test said existed. Running smartctl again it also failed to find those sectors, so the empirical guess is that it can cause "repair" of sectors transparently.
In my SystemV Unix days, finding bad blocks on a 600 MB disk meant calling NCR and having the disk replaced, then restoring a backup from tape. That took about four to five hours. The SLA from NCR simply did not allow us to proceed with a disk that had (just one or a couple of) bad blocks.

-- 
Gertjan Lettink, a.k.a. Knurpht
openSUSE Board Member
openSUSE Forums Team
On 2017-09-03 21:51, Knurpht - Gertjan Lettink wrote:
Op zondag 3 september 2017 15:36:52 CEST schreef Carlos E. R.:
On 2017-09-03 14:30, Anton Aylward wrote:
On 02/09/17 06:01 AM, Carlos E. R. wrote:
So I prefer to use badblocks. I have also noticed that, in my cases, and a few others I read about, it failed to find the bad sectors that the long SMART test said existed. Running smartctl again it also failed to find those sectors, so the empirical guess is that it can cause "repair" of sectors transparently.
In my SystemV Unix days finding badblocks on a 600 MB disk meant call NCR and have it replaced, then restore a backup from tape. But that took about four to five hours. The SLA from NCR did simply not allow us to proceed with a disk that had ( just one or a couple of ) badblocks.
Back in my MS-DOS days, initially we had hard disks with stepper motors for the head (currently they use "voice coils"), and capacities of 10..32 megs. Yes, MBytes.

Those disks came with a paper label listing known defects, ie, sectors known to the manufacturer to be bad! We had to initialize the disk with code that ran from the BIOS, using debug. I think we just told the thing to start running from a certain address, probably residing inside the hard disk controller. As data, we had to enter the interleave and the defect list.

An interleave of three meant that after sector #1 came two other sectors, then (IIRC) sector #2; ie, while the computer processed sector #1 the disk had time to continue rotating, and it would get to sector #2 just as the CPU was ready for it, after skipping two other sectors. Setting no interleave would mean that when the CPU was ready for #2, the disk head would be at #4, thus having to rotate one full turn more before reaching #2 again. On my computer I think I needed an interleave of 13. Yes, I tested and timed all numbers from 1 to 13.

That was "low level format". Later came the partitioning and the formatting.

Having seen that paper label with the list of bad sectors, I do not consider some bad blocks as final :-) . The important thing, to me, is that the list doesn't grow.

Also, MS-DOS could mark a sector as bad in the FAT. Some Linux filesystems can do that; others could not, at least initially (eg, reiserfs).

-- 
Cheers / Saludos,
Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
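The interleave arithmetic above can be sketched numerically. Assuming the simple scheme where logical sector k lands in physical slot (k*N) mod S for interleave N (real controllers varied), an 8-sector track with interleave 3 lays out like this:

```shell
# Print the logical sector stored in each physical slot, walking once
# around the track: consecutive logical sectors end up 3 slots apart.
awk 'BEGIN {
  S = 8; N = 3
  for (k = 0; k < S; k++) phys[(k * N) % S] = k
  for (p = 0; p < S; p++) printf "%s%d", (p ? " " : ""), phys[p]
  printf "\n"
}' > /tmp/interleave.txt
cat /tmp/interleave.txt
```

Reading across the output, logical sector 1 appears three slots after sector 0, with two other sectors in between, exactly as described above. (This mapping only works when N and S share no common factor.)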
On 03/09/17 07:47 PM, Carlos E. R. wrote:
An interleave of three meant that after sector #1 came two other sectors, then (IIRC) sector #2; ie, while the computer processed sector #1 the disk had time to continue rotating, and it would get to sector #2 just as the cpu was ready for it, after skipping two other sectors. Setting no interleave would mean that when the cpu was ready for #2, the disk head would be at #4, thus having to rotate one full turn more before reaching #1 again. On my computer I think I needed an interleave of 13. Yes, I tested and timed all numbers from 1 to 13.
Yes, I do recall all this. It was an emergent property of the high cost of components, in this case of memory. Now that memory is cheap, the disk controller reads a whole track at a time. I think it writes a whole track at a time too, or maybe a few if the head covers that much.

Now that I come to think about it in this manner, it makes me wonder what badblocks is actually doing. If badblocks is clumping 64 4096-byte blocks at a time, what kind of granularity are we really working with; what is the real visibility? If not all tracks have the same capacity, then trying to force this fixed-size scan onto the way the disk controller is dealing with buffering is going to make the efficiency and interaction .... weird.

But let's face it; we're forcing a long-standing BUFSIZE of 512 bytes from the program POV, a result of the archaic disk block size, onto a file system model that parametrized the disk block size long ago. The Berkeley Fast File System of the early 1980s used a 4K internal model, so there's nothing new about this magic number.

Of course, when we get to deal with SSDs this all becomes moot.
Carlos E. R. composed on 2017-08-28 15:19 (UTC+0200):
You can try to recover with dd*rescue. Then run "badblocks" to try find the bad sectors, then overwrite them, or better, overwrite the entire disk (both). Finally, restore from backup.
I recovered by deleting the 4 files and restoring from backup. Rsync from it to a new Seagate ST2000DM006 Barracuda took about 6.2 hours using eSATA for both.

I have running now:

badblocks -o bb-wd20eurs -s -n /dev/sdb1

It's reached 10% completion with (0/0/0 errors) in about 11.75 hours. :-(

-- 
Felix Miata *** http://fm.no-ip.com/
participants (13)
- Andrei Borzenkov
- Anthony Youngman
- Anton Aylward
- Basil Chupin
- Carlos E. R.
- Dave Howorth
- Dave Plater
- David C. Rankin
- Felix Miata
- Greg Freemyer
- Knurpht - Gertjan Lettink
- Lew Wolfgang
- Wols Lists