Re: [SLE] little problem with reiserfs and bad blocks
On 2004-01-30 at 23:54 -0700, c_nelson77 wrote:
Any chance you can help me down this path a little more? What is it I am supposed to do? I found this: http://linux.about.com/library/cmd/blcmdl8_hdparm.htm
Simply "man hdparm" -- too many cookies there.
The best thing I see is "-D Enable/disable the on-drive defect management feature"
That's right, I was thinking about that. I think it is enabled by default, but you can enable it if in doubt.
Could you elaborate on what I should do?
Not right now, I have to go out in a few minutes. We talked about this same issue on the list just a few months ago.

Simply trying to write to a bad block triggers the relocation. I did that by copying the partition with errors elsewhere, reformatting twice (as ext3, checking for bad blocks, nothing found, then back to reiser) and restoring everything. Too convoluted, I know: I was testing. If the sector is known, write to it, or simply move the file somewhere else. Or do a badblocks test with writing; I don't know if it is destructive.

Issue this command:

    smartctl --all /dev/hda|less

You will see, amongst other things, a log of your hard disk errors, as seen by SMART (if it is enabled). For example:

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000f   059   054   025    Pre-fail  -           176948510
      3 Spin_Up_Time            0x0003   097   096   000    Pre-fail  -           0
      4 Start_Stop_Count        0x0032   099   099   020    Old_age   -           1184
      5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  -           23
    ....

That table shows some parameters that predict whether the HD is near failure. For a disk with solved errors, I see:

    Error 325 occurred at disk power-on lifetime: 4275 hours
    When the command that caused the error occurred, the device was active or idle.

    After command completion occurred, registers were:
    ER:40 SC:64 SN:9c CL:16 CH:70 D/H:51 ST:51

    Sequence of commands leading to the command that caused the error were:
    DCR FR SC SN CL CH D/H CR Timestamp
     00 d0 00 00 15 70  51 40     3.514
     00 d0 00 00 14 70  51 40     3.476
     00 d0 00 00 13 70  51 40     7.384
     00 d0 00 00 12 70  51 40     3.537
     00 d0 00 00 11 70  51 40     3.499

You can test the disk (long and short tests) while it is in use, without stopping the OS:

    SMART Self-test log, version number 1
    Num  Test_Description   Status                   Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short off-line     Completed                00%        5613             -
    # 2  Extended off-line  Completed                00%        4918             -
    # 3  Short off-line     Completed                00%        4915             -
    # 4  Short off-line     Completed: read failure  90%        4272             0x0170169c

You see, the info is very complete. I don't know if it is saved in the disk's EPROM memory or on a track.

-- 
Regards,
Carlos Robinson
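A side note on reading that output: the registers in the error log and the LBA_of_first_error of the failed self-test appear to point at the same sector. Below is a small Python sketch of that check. It assumes the drive was addressed in 28-bit LBA mode, so that SN, CL, CH and the low four bits of D/H together hold the sector address. The function names are made up for illustration and are not part of smartctl. The second helper just pulls the RAW_VALUE column out of attribute lines like the ones quoted above.

```python
# Sketch only: helper names are hypothetical, not part of smartmontools.

def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    Assumes LBA28 addressing: SN holds bits 0-7, CL bits 8-15,
    CH bits 16-23, and the low nibble of D/H bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def raw_value(attribute_lines, name):
    """Return the RAW_VALUE column (as int) for the named attribute, or None."""
    for line in attribute_lines:
        fields = line.split()
        # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE WHEN_FAILED RAW_VALUE
        if len(fields) >= 9 and fields[1] == name:
            return int(fields[8])
    return None


# Values copied from the smartctl output quoted in this message.
error_lba = lba28_from_registers(sn=0x9C, cl=0x16, ch=0x70, dh=0x51)
selftest_lba = 0x0170169C

attributes = [
    "  1 Raw_Read_Error_Rate     0x000f   059   054   025    Pre-fail  -  176948510",
    "  5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  -  23",
]

print(hex(error_lba), error_lba == selftest_lba)
print(raw_value(attributes, "Reallocated_Sector_Ct"))
```

With the numbers from this disk, the decoded address matches the LBA reported by self-test #4, and the reallocated sector count read back is 23.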