[opensuse] worried: failing disk or raid10?
On my 12.2 system my syslog keeps reporting (warnings?):

Jun 29 11:00:43 storage dbus-daemon[954]: **** CHANGED /sys/devices/virtual/block/md127
Jun 29 11:00:45 storage dbus-daemon[954]: **** POLL SYNCING MD /sys/devices/virtual/block/md127
Jun 29 11:00:45 storage dbus-daemon[954]: **** CHANGING /sys/devices/virtual/block/md127
Jun 29 11:00:45 storage dbus-daemon[954]: **** UPDATING /sys/devices/virtual/block/md127
Jun 29 11:00:45 storage dbus-daemon[954]: **** UPDATING /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:01.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1
Jun 29 11:00:45 storage dbus-daemon[954]: **** UPDATING /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:01.0/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1
Jun 29 11:00:45 storage dbus-daemon[954]: **** UPDATING /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:01.0/ata3/host2/target2:0:0/2:0:0:0/block/sdc/sdc1
Jun 29 11:00:45 storage dbus-daemon[954]: **** UPDATING /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:01.0/ata4/host3/target3:0:0/3:0:0:0/block/sdd/sdd1
Jun 29 11:00:45 storage dbus-daemon[954]: **** EMITTING CHANGED for /sys/devices/virtual/block/md127
Jun 29 11:00:45 storage dbus-daemon[954]: **** CHANGED /sys/devices/virtual/block/md127
Jun 29 11:00:47 storage dbus-daemon[954]: **** POLL SYNCING MD /sys/devices/virtual/block/md127
Jun 29 11:00:47 storage dbus-daemon[954]: **** CHANGING /sys/devices/virtual/block/md127
Jun 29 11:00:47 storage dbus-daemon[954]: **** UPDATING /sys/devices/virtual/block/md127
Jun 29 11:00:47 storage dbus-daemon[954]: **** UPDATING /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:01.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1
Jun 29 11:00:47 storage dbus-daemon[954]: **** UPDATING /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:01.0/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1
Jun 29 11:00:47 storage dbus-daemon[954]: **** UPDATING /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:01.0/ata3/host2/target2:0:0/2:0:0:0/block/sdc/sdc1
Jun 29 11:00:47 storage dbus-daemon[954]: **** UPDATING /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:01.0/ata4/host3/target3:0:0/3:0:0:0/block/sdd/sdd1
Jun 29 11:00:47 storage dbus-daemon[954]: **** EMITTING CHANGED for /sys/devices/virtual/block/md127
Jun 29 11:00:47 storage dbus-daemon[954]: **** CHANGED /sys/devices/virtual/block/md127

iostat -1 tells me that all four disks are continuously reading at a rate of 60 MB/s. And once every couple of seconds, four blocks are written.

Even when nothing happens on the box, top gives:

447 root 20 0 0 0 0 R 36.3 0.0 2:59.07 md127_raid10
4526 root 20 0 0 0 0 D 12.2 0.0 1:08.64 md127_resync

Should I be worried, or just leave the system at rest for a while ;-)

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
Hans Witvliet wrote:
On my 12.2 system my syslog keeps reporting (warnings?):
[big snip]
iostat -1 tells me that all four disks are continuously reading at a rate of 60 MB/s. And once every couple of seconds, four blocks are written.
Even when nothing happens on the box, top gives:
447 root 20 0 0 0 0 R 36.3 0.0 2:59.07 md127_raid10
4526 root 20 0 0 0 0 D 12.2 0.0 1:08.64 md127_resync
Should I be worried, or just leave the system at rest for a while ;-)
What does "cat /proc/mdstat" reveal?
--
Per Jessen, Zürich (13.4°C)
http://www.dns24.ch/ - free DNS hosting, made in Switzerland.
-----Original Message-----
From: Per Jessen <per@computer.org>
To: opensuse@opensuse.org
Subject: Re: [opensuse] worried: failing disk or raid10?
Date: Sat, 29 Jun 2013 13:04:09 +0200

[big snip]
What does "cat /proc/mdstat" reveal?
-----Original Message-----

Hard to tell right now. The system died during the coffee break and became unbootable ;-(

I will disconnect the drives of the RAID and try to repair the system while only the system disk is connected...

I wonder if this may be the right moment (while the RAID is disconnected) to upgrade from 12.2 to 12.3.
-----Original Message-----
What does "cat /proc/mdstat" reveal?
-----Original Message-----

Thanks Per for reminding me. After a quick repair, mdstat said:

storage:~ # cat /proc/mdstat
Personalities : [raid10]
md127 : active raid10 sdb1[1] sdd1[3] sdc1[2] sda1[0]
      5860529856 blocks super 1.0 32K chunks 2 near-copies [4/4] [UUUU]
      [======>..............]  resync = 30.2% (1771608384/5860529856) finish=599.7min speed=113630K/sec
      bitmap: 31/44 pages [124KB], 65536KB chunk

unused devices: <none>

So, it seems I'm forced to clean up my desk ;-) I presume that any read activity will only slow down the lot...

Hans
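The finish estimate in that mdstat output can be checked by hand: it is the remaining block count divided by the current speed. The following is a minimal Python sketch of that arithmetic, assuming the two counters in parentheses and the speed figure use the same 1K unit; the function name resync_estimate is made up for illustration and is not part of any md tool.

```python
import re

# Progress line copied from the /proc/mdstat output quoted in this thread.
MDSTAT_LINE = ("[======>..............] resync = 30.2% "
               "(1771608384/5860529856) finish=599.7min speed=113630K/sec")


def resync_estimate(line):
    """Recompute (percent_done, minutes_remaining) from the raw counters.

    Assumes done/total are in 1K blocks and speed is in K per second.
    """
    m = re.search(r"\((\d+)/(\d+)\).*?speed=(\d+)K/sec", line)
    if m is None:
        raise ValueError("no resync progress found in line")
    done, total, speed_k = (int(g) for g in m.groups())
    percent = 100.0 * done / total
    minutes = (total - done) / speed_k / 60.0
    return percent, minutes


percent, minutes = resync_estimate(MDSTAT_LINE)
print(f"{percent:.1f}% done, about {minutes:.1f} min "
      f"({minutes / 60:.1f} h) remaining")
```

The recomputed values agree with the 30.2% and 599.7min printed by the kernel, i.e. roughly ten hours at the current speed. The estimate moves whenever the speed changes, so other I/O on the array will stretch it.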
Hans Witvliet wrote:
So, it seems I'm forced to clean up my desk ;-) I presume that any read activity will only slow down the lot...
Having an md127 is indicative of something having gone wrong - unless you actually have 127 RAID arrays :-)
--
Per Jessen, Zürich (15.6°C)
http://www.dns24.ch/ - free DNS hosting, made in Switzerland.
On Sun, 30 Jun 2013 10:16:35 +0200, Per Jessen <per@computer.org> wrote:
Having an md127 is indicative of something having gone wrong - unless you actually have 127 RAID arrays :-)
Not really. It has been pretty much normal since the default metadata format became 1.x.
Andrey Borzenkov wrote:
On Sun, 30 Jun 2013 10:16:35 +0200, Per Jessen <per@computer.org> wrote:
Having an md127 is indicative of something having gone wrong - unless you actually have 127 RAID arrays :-)
Not really. It is pretty much normal since default metadata format became 1.x.
Whenever I've seen an md126 or md127, it's been because something went wrong, usually at startup. Sometimes metadata was found where it was no longer used, etc. Just my experience. If you search bugzilla for md127, I think there are a few hits.
--
Per Jessen, Zürich (21.2°C)
http://www.dns24.ch/ - free DNS hosting, made in Switzerland.
On Sat, Jun 29, 2013 at 5:54 AM, Hans Witvliet <suse@a-domani.nl> wrote:
On my 12.2 system my syslog keeps reporting (warnings?):
[big snip]
iostat -1 tells me that all four disks are continuously reading at a rate of 60 MB/s. And once every couple of seconds, four blocks are written.
Even when nothing happens on the box, top gives:
447 root 20 0 0 0 0 R 36.3 0.0 2:59.07 md127_raid10
4526 root 20 0 0 0 0 D 12.2 0.0 1:08.64 md127_resync
Should I be worried, or just leave the system at rest for a while ;-)
Your logs, etc. are consistent with a RAID resync. A resync is pretty common, so in and of itself nothing is wrong.

A resync can be caused by an uncontrolled shutdown; on reboot the array has to be verified. If that was the trigger, great.

Also, a resync can be run via cron every once in a while to scan for bad media blocks. If that is the trigger, maybe you should move it to the middle of the night so it won't bother you in the future.

Both of the above are good things.

A resync can also be caused by a disk coming and going from the SATA bus. That is a bad thing. If this might be the cause, look through the logs for that.

Greg
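To tell Greg's three cases apart, one place to look is the md sysfs attribute sync_action, which the kernel exposes per array (for the array in this thread that would be /sys/block/md127/md/sync_action). Below is a minimal Python sketch that reads it. The state names are the ones the kernel md documentation lists; the short explanations attached to them are my own reading of Greg's message, not something the kernel reports, and the helper name sync_state is hypothetical.

```python
import tempfile
from pathlib import Path

# State names come from the kernel md documentation. The explanations are
# an interpretation that follows the three cases described above.
MEANING = {
    "idle": "no background sync running",
    "resync": "full resync, for example after an unclean shutdown",
    "check": "read-only scrub, typically started by a scheduled job",
    "repair": "scrub that also rewrites mismatched blocks",
    "recover": "rebuilding onto a member that was added or re-added",
}


def sync_state(array, sysfs_root="/sys/block"):
    """Return (state, explanation) for one md array name such as 'md127'."""
    path = Path(sysfs_root) / array / "md" / "sync_action"
    state = path.read_text().strip()
    return state, MEANING.get(state, "other state, see the kernel md docs")


if __name__ == "__main__":
    # Demo against a fake sysfs tree so the sketch runs on any machine.
    with tempfile.TemporaryDirectory() as root:
        md_dir = Path(root) / "md127" / "md"
        md_dir.mkdir(parents=True)
        (md_dir / "sync_action").write_text("resync\n")
        print(sync_state("md127", sysfs_root=root))
```

On a real system the call would simply be sync_state("md127"). A state of "recover" that keeps reappearing would point towards the bad case, a member dropping off the bus and being re-added, and would be a reason to go through the kernel log as suggested.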
-----Original Message-----
From: Greg Freemyer <greg.freemyer@gmail.com>
To: opensuse@opensuse.org
Subject: Re: [opensuse] worried: failing disk or raid10?
Date: Sat, 29 Jun 2013 16:11:06 -0400

[big snip]
-----Original Message-----

Thanks Greg,

I think I had two issues: a flaky LSI SCSI card causing hard freezes of the entire system, and a 4*3TB RAID needing time to rebuild.

mdstat said it needed about 10 hours to complete, so I let the machine/disks run without any extra burden.

It seems to be running fine, so it is rsyncing the obs right now..

hw
participants (4)
- Andrey Borzenkov
- Greg Freemyer
- Hans Witvliet
- Per Jessen