mdadm: Raid1 array constantly degrading.
For several weeks my Raid1 array has been constantly degrading on Tumbleweed. I am running two HDDs, and one of them suddenly becomes inactive. Re-adding it is no problem. What I have done so far:
- Backups as usual, and hoping that things change due to some kind of regression being sorted out via update.
- Checking several logs with the help of the forum. I could not find a single hint of what is going on - even after running "dd if=/dev/sdb of=/dev/null bs=4M" on both HDDs.
- Changing the SATA cables.
Please find a more detailed description of what has been done via this link: https://forums.opensuse.org/showthread.php/577384-Raid1-array-constantly-deg...
I would highly appreciate any suggestions on how to sort this out.
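A read test with dd only flags hard read errors if its exit status is actually checked; a silent pass can otherwise be mistaken for a clean disk. A minimal sketch of that check (using /dev/null as a stand-in device here; on the affected machine this would be /dev/sdb or the other mirror member):

```shell
# Stand-in device for illustration only; on the real system this
# would be /dev/sdb (and /dev/sda for the other mirror member).
dev=/dev/null

# dd exits non-zero on a hard read error, so the exit status
# distinguishes a clean full-surface read from a failed one.
if dd if="$dev" of=/dev/null bs=4M status=none; then
    echo "read test passed"
else
    echo "read test FAILED"
fi
```

Note that a clean pass here only rules out unreadable sectors at that moment; it says nothing about intermittent link or power problems.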
On 10.11.22 at 07:34, Mark Neugebauer wrote:
For several weeks my Raid1 array has been constantly degrading on Tumbleweed. I am running two HDDs, and one of them suddenly becomes inactive. Re-adding it is no problem. What I have done so far:
- Backups as usual, and hoping that things change due to some kind of regression being sorted out via update.
- Checking several logs with the help of the forum. I could not find a single hint of what is going on - even after running "dd if=/dev/sdb of=/dev/null bs=4M" on both HDDs.
- Changing the SATA cables.
Please find a more detailed description on what has been done via this link: https://forums.opensuse.org/showthread.php/577384-Raid1-array-constantly-deg...
I would highly appreciate any suggestions on how to sort this out.
Sounds strange. Is your /etc/mdadm.conf correctly set up? I.e. something like:
DEVICE containers partitions
ARRAY /dev/md0 metadata=... UUID=...
Yep, strange. The config is straightforward (I think). This config ran without problems on Ubuntu for >5 years and on TW for about 6 months. Then the strange part began.
MAILADDR xxx@yyy.zzz
# definitions of existing MD arrays
ARRAY /dev/md/d3417-server:0 metadata=1.2 name=d3417-server:0 UUID=3dd9c7b0:7f330c86:56a22768:18efe755
Hi Mark, On Thu, 10 Nov 2022, 07:34:48 +0100, Mark Neugebauer wrote:
For several weeks my Raid1 array has been constantly degrading on Tumbleweed. I am running two HDDs, and one of them suddenly becomes inactive. Re-adding it is no problem. What I have done so far:
- Backups as usual, and hoping that things change due to some kind of regression being sorted out via update.
- Checking several logs with the help of the forum. I could not find a single hint of what is going on - even after running "dd if=/dev/sdb of=/dev/null bs=4M" on both HDDs.
- Changing the SATA cables.
Please find a more detailed description on what has been done via this link: https://forums.opensuse.org/showthread.php/577384-Raid1-array-constantly-deg...
the last time I had this, several years ago, I swapped SATA cables, which helped for some time. Did you try other power cables serving the disks? Maybe there are some very short power outages due to the plug(s) no longer being reliably connected.
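A brown-out on the power rail usually surfaces in the kernel log as a SATA link reset rather than a filesystem error, so filtering the kernel log for link events is one way to check the cable/power theory. A sketch, using a hypothetical log excerpt (on a live system the input would come from `journalctl -k` or `dmesg`):

```shell
# Hypothetical kernel log excerpt; messages of this shape are
# emitted by libata when a drive briefly drops off the bus.
log='ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
ata2: SATA link down (SStatus 0 SControl 300)
ata2: hard resetting link'

# Link-down and link-reset lines point at cabling or power,
# not at the filesystem or the md layer itself.
printf '%s\n' "$log" | grep -Ei 'link (down|reset)|hard resetting'
```

If lines like these appear shortly before md marks a member faulty, the drive side (cable, connector, PSU rail) is the more likely culprit than mdadm.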
I would highly appreciate any suggestions on how to sort this out.
HTH, cheers. l8er manfred
Nope, I did not try the power cables yet ... but I had already thought about that. To be specific, I thought about logging the voltage levels. The one thing that I don't understand: even if this happens due to a "brown-out" (too low a voltage level for a short period of time), there should be some kind of read error leading to some kind of log entry. I am far from being a Linux expert, but if there is one thing I have understood, it is this: if you dig deep enough, you will find some kind of log about what is happening (I might be wrong about that). Degrading a RAID is a major issue of concern from my point of view. So why is mdadm so dead silent about it?
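mdadm is not entirely silent: the kernel's view of array health lives in /proc/mdstat, where a failed mirror member shows up as an underscore in the [UU] status field. A sketch of detecting that programmatically (the mdstat content below is a hypothetical degraded-array sample):

```shell
# Hypothetical /proc/mdstat excerpt for a two-disk RAID1 with one
# member missing; on a live system: mdstat=$(cat /proc/mdstat)
mdstat='md0 : active raid1 sda1[0]
      1953382400 blocks super 1.2 [2/1] [U_]'

# Each underscore in the status field ([U_]) is a failed/missing member.
missing=$(printf '%s\n' "$mdstat" | grep -o '\[U*_*U*\]' | tr -cd '_' | wc -c)
echo "missing members: $missing"
```

On a healthy array the field reads [UU] and the count is 0. The MAILADDR line in mdadm.conf serves exactly this purpose: with the mdmonitor service running, mdadm --monitor mails that address when a member fails, so it is worth checking that the monitor service is actually enabled.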
participants (3)
-
Manfred Hollstein
-
Manfred Schwarb
-
Mark Neugebauer