[opensuse] /var failed to mount, md raid/LVM volume woes - 11.1
My /var partition failed to mount this morning with the error "resize inode not valid". I have had no soft or hard crashes that I am aware of and have made no recent changes to the structure and layout of my volumes. /var is a logical volume managed by LVM running on my /dev/md2 device. I have three md devices on two disks. My volume setup is as follows:

/dev/md0: formatted as ext3, contains the root OS
/dev/md1: has LVM volume group /dev/vgHome and a single logical volume (/dev/vgHome/home) mounted under /home
/dev/md2: has LVM volume group /dev/vgData and two logical volumes (/dev/vgData/data and /dev/vgData/var) mounted under /data and /var respectively

What I have done so far is basically run fsck.ext3 -f on all my volumes from a recovery console; all volumes are reported as clean. I can actually mount /var manually, but the system will not boot with /var in fstab.

cat /proc/mdstat and mdadm verify that the md devices are running and in order; however, I have noticed the following issues. When booting, boot.md reports as failed. Here is a snippet from boot.msg:

Starting MD Raid
mdadm: /dev/md/1 has been started with 2 drives.
mdadm: /dev/md/2 has been started with 2 drives.
failed

I have also noticed "File descriptor 3 left open" messages from LVM during boot, but I would have thought that if there were problems, fsck.ext3 would pick that up?

So how do I troubleshoot "resize inode not valid" on my /var?

Cheers
Graham

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
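The checks described in this message can be collected into one read-only sketch, assuming the device names given above (/dev/md2 and /dev/vgData/var). The `run` helper, the `check_var_stack` function and the `DRY_RUN` variable are hypothetical names introduced for this illustration. With `DRY_RUN=1`, the default here, the commands are only printed. Running them for real needs root, and /var should not be mounted while e2fsck examines it.

```shell
#!/bin/sh
# Sketch only: prints (or, with DRY_RUN=0, runs) read-only checks for the
# RAID, LVM and ext3 layers underneath /var. Device names come from the post.
DRY_RUN="${DRY_RUN:-1}"
MD_DEV="${MD_DEV:-/dev/md2}"
LV_DEV="${LV_DEV:-/dev/vgData/var}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

check_var_stack() {
    run cat /proc/mdstat             # state of all md arrays
    run mdadm --detail "$MD_DEV"     # members and sync state of one array
    run lvs                          # logical volumes known to LVM
    run dumpe2fs -h "$LV_DEV"        # superblock summary, including the feature list
    run e2fsck -f -n "$LV_DEV"       # forced check that answers "no" to every fix
}

check_var_stack
```

Because `e2fsck -n` opens the filesystem read-only, this pass reports problems without changing anything on the volume, which fits the wish to understand the fault before altering it.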
Graham Anderson wrote:
My /var partition failed to mount this morning with the error "resize inode not valid". I have had no soft or hard crashes that I am aware of and have made no recent changes to the structure and layout of my volumes.
I know nothing about ext3, but googling 'resize inode not valid' returns quite a few hits from the last 2-3 years. Maybe one of those will give you a hint.
What I have done so far is basically fsck.ext3 -f on all my volumes from a recovery console, all volumes are reported as clean. I can actually manually mount /var but the system will not boot with /var in fstab.
I believe the only difference between those two is that the latter does an fsck before mounting it. The mount command should be the same, and therefore should produce the same result (unless something else changes the state of the filesystem).

/Per
--
/Per Jessen, Zürich
On Monday 29 December 2008 13:41:57 Per Jessen wrote:
My /var partition failed to mount this morning with the error "resize inode not valid". I have had no soft or hard crashes that I am aware of and have made no recent changes to the structure and layout of my volumes.
I know nothing about ext3, but googling 'resize inode not valid' returns quite a few hits from the last 2-3 years. Maybe one of those will give you a hint.
Indeed, I did a quick google myself before posting. Many of the issues with that term seem to be related to SELinux, and I understand 11.1 brings support for SELinux; however, I'm unwilling to go full steam ahead and make the changes suggested to my volumes without fully understanding the problem.

As my /var volume would mount OK and was usable up until this morning, I'm looking for some sage advice on what may have gone wrong. Why would it mount one day and not the next? Is this SELinux related at all? These are all questions I would prefer to know the answers to before I go tweaking my volumes!

Cheers
Graham
Graham Anderson wrote:
Indeed, I did a quick google myself before posting. Many of the issues with that term seem to be related to SELinux and I understand 11.1 brings support for SELinux however I'm unwilling to go full steam ahead and make changes suggested to my volumes without fully understanding the problem.
As my /var volume would mount OK and was usable up until this morning, I'm looking for some sage advice on what may have gone wrong. Why would it mount one day and not the next? Is this SELinux related at all? These are all questions I would prefer to know the answers to before I go tweaking my volumes!
I doubt that SELinux is involved. openSUSE does have the support, but still uses AppArmor by default. To me it's very important that you can mount the volume manually, but that it can't be mounted automatically at boot-up. I think you need to look at that to understand what has gone wrong.

/Per
--
/Per Jessen, Zürich
Per Jessen said the following on 12/29/2008 08:25 AM:
To me it's very important that you can mount the volume manually, but that it can't be mounted automatically at boot-up. I think you need to look at that to understand what has gone wrong.
I've stepped through the relevant /etc/init.d/boot stuff, eliminating the parallelism and all that ... wolf fencing. It's the return from the 'parallel' fsck with the "-R A -M -a -t noopts=nofail" option list. For some reason it gives a return code of 3. Running the same on /var manually at the command line doesn't. The strange thing is that even in the boot sequence the verbiage says /var is clean. It's as if there is some 'overload' when fsck deals with a particular LVM volume.

--
The early bird gets the worm, but the second mouse gets the cheese.
Anton Aylward wrote:
Per Jessen said the following on 12/29/2008 08:25 AM:
To me it's very important that you can mount the volume manually, but that it can't be mounted automatically at boot-up. I think you need to look at that to understand what has gone wrong.
I've stepped through the relevant /etc/init.d/boot stuff, eliminating the parallelism and all that ... wolf fencing.
It's the return from the 'parallel' fsck with the "-R A -M -a -t noopts=nofail" option list. For some reason it gives a return code of 3.
fsck rc=3 means 'File system errors corrected' + 'System should be rebooted'. With the '-R -A' options, you get all filesystems except root checked - any chance that another filesystem is failing?

/Per
--
/Per Jessen, Zürich
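The reading of rc=3 above follows from fsck's exit status being a bit mask. According to fsck(8), when several filesystems are checked in one run the status is the bitwise OR of the individual results, so a 3 from the boot-time run need not come from /var alone. A minimal sketch of a decoder follows; the function name `decode_fsck_rc` is made up for this example, and the bit values are the ones listed in the man page.

```shell
#!/bin/sh
# Decode an fsck exit status into its component flags.
# Bit values are the ones documented in fsck(8).
decode_fsck_rc() {
    rc="$1"
    if [ "$rc" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    for pair in "1:errors corrected" "2:system should be rebooted" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage or syntax error" "32:canceled by user request" \
                "128:shared library error"; do
        bit="${pair%%:*}"     # number before the first colon
        text="${pair#*:}"     # description after it
        if [ $(( rc & bit )) -ne 0 ]; then
            out="${out:+$out + }$text"
        fi
    done
    echo "$out"
}

decode_fsck_rc 3    # prints: errors corrected + system should be rebooted
```

To compare the boot-time behaviour with a manual run, the status can be captured right after the command with `$?` and passed to the function, for example `fsck.ext3 -f /dev/vgData/var; decode_fsck_rc $?`.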
Per Jessen said the following on 12/29/2008 08:48 AM:
Anton Aylward wrote:
Per Jessen said the following on 12/29/2008 08:25 AM:
To me it's very important that you can mount the volume manually, but that it can't be mounted automatically at boot-up. I think you need to look at that to understand what has gone wrong.
I've stepped through the relevant /etc/init.d/boot stuff, eliminating the parallelism and all that ... wolf fencing.
It's the return from the 'parallel' fsck with the "-R A -M -a -t noopts=nofail" option list. For some reason it gives a return code of 3.
fsck rc=3 means 'File system errors corrected' + 'System should be rebooted'. With the '-R -A' options, you get all filesystems except root checked - any chance that another filesystem is failing?
Yes, I read the MAN page :-) Yes, I tweaked /etc/sysconfig so that there were no parallel processes and /etc/fstab so that there were no other FS except /boot (which wasn't LVM) and swap (which wasn't LVM). I can wolf-fence :-)

--
One should guard against preaching to young people success in the customary form as the main aim in life. The most important motive for work in school and in life is pleasure in work, pleasure in its result, and the knowledge of the value of the result to the community. --Albert Einstein
Graham Anderson said the following on 12/29/2008 08:10 AM:
Indeed, I did a quick google myself before posting. Many of the issues with that term seem to be related to SELinux and I understand 11.1 brings support for SELinux however I'm unwilling to go full steam ahead and make changes suggested to my volumes without fully understanding the problem.
You and me both, bro' !
As my /var volume would mount OK and was usable up until this morning, I'm looking for some sage advice on what may have gone wrong. Why would it mount one day and not the next? Is this SELinux related at all? These are all questions I would prefer to know the answers to before I go tweaking my volumes!
I'm rapidly approaching the point where I give up on 11.1. All these small annoyances are getting me down, the proverbial 'death of a thousand duck bites'. If anyone has any advice on rolling back to 11.0 I'd welcome hearing from them.

--
The emphasis should be on "why" we do a job - W. Edwards Deming
Per Jessen said the following on 12/29/2008 07:41 AM:
Graham Anderson wrote:
My /var partition failed to mount this morning with the error "resize inode not valid". I have had no soft or hard crashes that I am aware of and have made no recent changes to the structure and layout of my volumes.
I know nothing about ext3, but googling 'resize inode not valid' returns quite a few hits from the last 2-3 years. Maybe one of those will give you a hint.
What I have done so far is basically fsck.ext3 -f on all my volumes from a recovery console, all volumes are reported as clean. I can actually manually mount /var but the system will not boot with /var in fstab.
I believe the only difference between those two is that the latter does an fsck before mounting it. The mount command should be the same, and therefore should produce the same result (unless something else changes the state of the filesystem).
I don't think that's it. I've been getting this with /var ever since I moved to 11.1 and thought it was a mistake I'd made until I saw this posting. I too run LVM, and in my case /var is ReiserFS.

My investigation involved eliminating the concurrency of the parallel fsck and setting the 6th field of /etc/fstab. As far as I can make out, fsck is returning exit code 2 on /var even after finding that it is perfectly good. This happens whether I place the line for /var right after root or much later in the list of filesystems.

The real problem is that when /var doesn't pass I get strings of errors from other processes in /etc/init.d, for example, unable to create PID files. The strange thing is that the error message says that it can't write to a read-only FS.

If /var errored and can't be mounted then why those errors? Surely it defaults back to the root FS, the /var on root that the mount would overlay? I'm forced to conclude that /dev/vgmain/var mounts in read-only mode. WHY? And more to the point, why is this intermittent? And why "/var"? And why with 11.1 when 11.0 worked fine?

I HATE marginal, intermittent problems!

--
"Quality is not a sprint; it is a long-distance event." Daniel Hunt.
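One way to tell "mounted read-only" apart from "not mounted at all" is to look at the options column of /proc/mounts from a shell at the point of failure. The following is a minimal sketch; the function name `mount_mode` and the sample data are made up for illustration (only the /dev/vgmain/var name is taken from the message above).

```shell
#!/bin/sh
# Report whether a mount point is read-only, read-write or not mounted,
# based on a file in /proc/mounts format:
#   device mountpoint fstype options dump pass
mount_mode() {
    mountpoint="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" '
        $2 == mp { opts = $4 }       # keep the last match: the topmost mount
        END {
            if (opts == "") { print "not mounted"; exit }
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "ro") { print "read-only"; exit }
            print "read-write"
        }
    ' "$mounts_file"
}

# Demonstration on made-up sample data, so nothing on the real system is read.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
/dev/md0 / ext3 rw,relatime 0 0
/dev/vgmain/var /var reiserfs ro,relatime 0 0
EOF
mount_mode /var "$sample"     # prints: read-only
mount_mode /home "$sample"    # prints: not mounted
rm -f "$sample"
```

On a live system, `mount_mode /var` and `mount_mode /` read /proc/mounts directly. If /var turns out to be "not mounted", the mode reported for / shows whether the writes are failing against the root filesystem instead.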
Anton Aylward wrote:
If /var errored and can't be mounted then why those errors? Surely it defaults back to the root FS, the /var on root that the mount would overlay?
No, the start-up will/must fail when the required filesystems cannot be mounted. Required = what's specified in /etc/fstab.

/Per
--
/Per Jessen, Zürich
Per Jessen said the following on 12/29/2008 08:35 AM:
Anton Aylward wrote:
If /var errored and can't be mounted then why those errors? Surely it defaults back to the root FS, the /var on root that the mount would overlay?
No, the start-up will/must fail when the required filesystems cannot be mounted. Required = what's specified in /etc/fstab.
Yes, the start-up fails. I get the long line of messages saying things can't be done because /var is read only.

Why is /var read only? If it hasn't mounted then the /var on the rootFS is RO. No, that's not the case. If it has mounted RO then why? Because the FSCK failed? Somehow? Yes, there was an 'error' in red on the right hand side back there. But if FSCK failed why did it mount?

I know, basically, what SHOULD happen. I'm reporting what DOES happen.

--
I will not attack your doctrines nor your creeds if they accord liberty to me. If they hold thought to be dangerous - if they aver that doubt is a crime, then I attack them one and all, because they enslave the minds of men. --Robert Ingersoll (The Ghosts)
Anton Aylward wrote:
Per Jessen said the following on 12/29/2008 08:35 AM:
Anton Aylward wrote:
If /var errored and can't be mounted then why those errors? Surely it defaults back to the root FS, the /var on root that the mount would overlay?
No, the start-up will/must fail when the required filesystems cannot be mounted. Required = what's specified in /etc/fstab.
Yes, the start-up fails. I get the long line of messages saying things can't be done because /var is read only.
Why is /var read only?
Wild guess - as long as the filesystem is actually fine, it may have been mounted read-only (for whatever reason), and was not yet mounted read-write - but during boot-up, lots of stuff will want to write to files in /var/log/, so it really needs to be up and running very fast. I think I would list /var with 6th field = 1 in fstab. What do you have now?

/Per
--
/Per Jessen, Zürich
Per Jessen said the following on 12/29/2008 09:09 AM:
Anton Aylward wrote:
Per Jessen said the following on 12/29/2008 08:35 AM:
Anton Aylward wrote:
If /var errored and can't be mounted then why those errors? Surely it defaults back to the root FS, the /var on root that the mount would overlay?
No, the start-up will/must fail when the required filesystems cannot be mounted. Required = what's specified in /etc/fstab.
Yes, the start-up fails. I get the long line of messages saying things can't be done because /var is read only.
Why is /var read only?
Wild guess - as long as the filesystem is actually fine, it may have been mounted read-only (for whatever reason), and was not yet mounted read-write - but during boot-up, lots of stuff will want to write to files in /var/log/, so it really needs to be up and running very fast. I think I would list /var with 6th field = 1 in fstab. What do you have now?
That's interesting. I checked my backup, and the installation sets everything except "/" to "2". That includes not only "/var" but "/boot" and "/tmp" and "/usr" as well.

I agree with you, yes, it needs to be up and running PDQ. That's why I have it in the 4th position: root, boot, swap, var

And yes, I had it like that in 11.0

--
Whitehead's Law: The obvious answer is always overlooked.
Anton Aylward wrote:
Per Jessen said the following on 12/29/2008 09:09 AM:
Wild guess - as long as the filesystem is actually fine, it may have been mounted read-only (for whatever reason), and was not yet mounted read-write - but during boot-up, lots of stuff will want to write to files in /var/log/, so it really needs to be up and running very fast. I think I would list /var with 6th field = 1 in fstab. What do you have now?
That's interesting. I checked my backup, and the installation sets everything except "/" to "2". That includes not only "/var" but "/boot" and "/tmp" and "/usr" as well.
Yes, the norm is for everything but root to have a 2.
I agree with you, yes it needs to be up and running PDQ. That's why I have it in the 4th position: root, boot, swap, var
And yes, I had it like that in 11.0
Sounds like something's changed in the boot-up sequence from 11.0 to 11.1 - to see if the position in fstab makes any difference, you could try "root, var, boot, swap" - you don't need /boot nor swap for booting up.

/Per
--
/Per Jessen, Zürich
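Both the 6th field and the order of entries discussed in this exchange can be listed with a short script. This is a minimal sketch: the function name `list_passno` is made up, and the sample fstab is illustrative only (the /boot and swap device names are hypothetical, not taken from the thread). It shows the suggested "root, var, boot, swap" order with the usual pass numbers. Per fstab(5), a missing or zero 6th field means the filesystem is not checked at boot.

```shell
#!/bin/sh
# List position, mount point and fsck pass number (6th field) for each
# fstab entry, skipping comments and blank lines.
list_passno() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }
        { n++; printf "%d %s pass=%s\n", n, $2, (NF >= 6 ? $6 : 0) }
    ' "${1:-/etc/fstab}"
}

# Demonstration on a made-up fstab; device names for /boot and swap are
# hypothetical placeholders.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
# device           mountpoint  type  options   dump pass
/dev/md0           /           ext3  defaults  1    1
/dev/vgData/var    /var        ext3  defaults  1    2
/dev/sda1          /boot       ext3  defaults  1    2
/dev/sda2          swap        swap  defaults  0    0
EOF
list_passno "$sample"
rm -f "$sample"
```

Run without an argument, the function reads /etc/fstab, which makes it easy to compare the layout before and after moving the /var line.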
participants (3)
- Anton Aylward
- Graham Anderson
- Per Jessen