I have seen many folk who have signatures stating that the machine has been running for something like 400 days. Whilst I applaud this type of reliability, I am wondering about the actual fs as the machine isn't rebooted so that fsck can check the partitions. I am perhaps paranoid about keeping the fs in tip top shape but it is the basis that we all rely on.

so my question is this: Can a system that has such uptime have all its fs checked and not be rebooted?

Could a Linux boot floppy specific to the system be inserted, mounted, and then be told to unmount the running system's partitions listed in fstab (except the floppy), fsck them, and then re-mount them, without losing the uptime figure?

Just wondering about the fs of those MAJOR uptime hosts

--
========================================================================
Hylton Conacher - Linux user # 229959 at http://counter.li.org
Currently using SuSE 9.0 Professional with KDE 3.1
========================================================================
On Friday 01 April 2005 11:57, Hylton Conacher (ZR1HPC) wrote:
I have seen many folk who have signatures stating that the machine has been running for something like 400 days.
Whilst I applaud this type of reliability, I am wondering about the actual fs as the machine isn't rebooted so that fsck can check the partitions.
Why? This isn't a FAT partition that gets screwed up at the drop of a hat.
I am perhaps paranoid about keeping the fs in tip top shape but it is the basis that we all rely on.
Perhaps. But if you look at the system and the fsck tools, you will see that you can set the number of mounts, or the time interval, before the check runs (on ext2/ext3 this is done with tune2fs). I'm not sure what the max is, but what if you could set it to 100 mounts, or 500 days? As long as it is still running and you aren't getting any errors, the file system is fine. One of the nice things about Linux is that, with the way the file system logs things, you would know long before there are any real problems.
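A sketch of how that knob is turned with tune2fs (the device name /dev/hda1 is hypothetical - substitute your own partition; the script defaults to printing the commands rather than running them, since tune2fs needs root and a real device):

```shell
# Hypothetical device; substitute your own partition.
DEV=${DEV:-/dev/hda1}
# Default to printing the commands instead of executing them.
# Set DRY_RUN=0 (as root) to actually apply the settings.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}

# Show current settings, including maximum mount count and check interval.
run tune2fs -l "$DEV"

# Force a check every 100 mounts (-c) or every 500 days (-i),
# whichever comes first.
run tune2fs -c 100 -i 500d "$DEV"
```

Setting -c to -1 and -i to 0 disables the periodic checks entirely, which is how long-uptime boxes avoid the surprise fsck at reboot.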
so my question is this: Can a system that has such uptime have all its fs checked and not be rebooted?
Depends on what it's doing. If there is a lot of disk activity, then I wouldn't take a partition off-line. That's a way to have a disaster for sure. As I've already mentioned, it isn't something that is really required with today's file systems. Well, except for the m$ ones..

Mike
--
Powered by SuSE 9.2 Kernel 2.6.8 KDE 3.3.0 Kmail 1.7.1
For Mondo/Mindi backup support go to http://www.mikenjane.net/~mike
8:14pm up 2 days 23:36, 3 users, load average: 2.07, 2.15, 2.08
The Friday 2005-04-01 at 11:57 +0200, Hylton Conacher (ZR1HPC) wrote:
I have seen many folk who have signatures stating that the machine has been running for something like 400 days.
Whilst I applaud this type of reliability, I am wondering about the actual fs as the machine isn't rebooted so that fsck can check the partitions.
I am perhaps paranoid about keeping the fs in tip top shape but it is the basis that we all rely on.
YES! You are! :-P X'-)
so my question is this: Can a system that has such uptime have all its fs checked and not be rebooted?
No need to check them... this is linux, stability is the word here ;-)
Could a Linux boot floppy specific to the system be inserted, mounted, and then be told to unmount the running system's partitions listed in fstab (except the floppy), fsck them, and then re-mount them, without losing the uptime figure?
Hah! That floppy has no chance. If it is a boot floppy, it cannot boot - unless you reboot. If it is a program, it has no more access than the running system has.

The only possibility is to unmount a partition, one that is not in use at the time, then fsck it. That means you will never be able to check "/". In fact, you cannot even check "/home" without logging everybody off first.

The only theoretical possibility would be, when using a RAID 1 mirror, putting it into single-disk mode (if that is possible in Linux, I don't know). One of the halves would stay active while the other was checked. The problem, and a big one, comes when reactivating the RAID: files will have changed in the meantime, so how can you resync both drives...? A nightmare.

I know a machine that can do that, kind of, but not a linux machine, and not a PC, and with a really huge price tag.

--
Cheers,
Carlos Robinson
Fri, 01 Apr 2005, by hylton@global.co.za:
I have seen many folk who have signatures stating that the machine has been running for something like 400 days.
Whilst I applaud this type of reliability, I am wondering about the actual fs as the machine isn't rebooted so that fsck can check the partitions.
I'd be more worried about (kernel) patches that haven't been applied, or that have been downloaded but are not incorporated into the running processes. Keeping a server running for a very long time is often asking for trouble when it goes down unplanned, too (a power outage, for example). That doesn't apply to dedicated hosts with little or no human users or processes that need maintenance, of course.

Theo
--
Theo v. Werkhoven Registered Linux user# 99872 http://counter.li.org
ICBM 52 13 26N , 4 29 47E. + ICQ: 277217131
SUSE 9.2 + Jabber: muadib@jabber.xs4all.nl
Kernel 2.6.8 + See headers for PGP/GPG info.
On Friday 01 April 2005 03:57 am, Hylton Conacher (ZR1HPC) wrote:
I have seen many folk who have signatures stating that the machine has been running for something like 400 days.
Whilst I applaud this type of reliability, I am wondering about the actual fs as the machine isn't rebooted so that fsck can check the partitions.
I am perhaps paranoid about keeping the fs in tip top shape but it is the basis that we all rely on.
Yeah, the paranoia's unfounded. A good filesystem doesn't have problems like FAT. :)
so my question is this: Can a system that has such uptime have all its fs checked and not be rebooted?
Could a Linux boot floppy specific to the system be inserted, mounted, and then be told to unmount the running system's partitions listed in fstab (except the floppy), fsck them, and then re-mount them, without losing the uptime figure?
A boot floppy won't do it. Well, maybe if you gave VMware access to the physical drive and booted another system, but that's just asking for trouble.

What you're looking for is "telinit". If you just want to maintain uptime, but can take the server off-line for a time, run "telinit 1" to drop down to single-user mode. Then either unmount the partitions one at a time, or, as long as they don't need major work, remember that most fscks can handle a mounted partition if it's mounted read-only. Run "mount /path/to/partition -o remount,ro" to remount the partition read-only, and then "mount /path/to/partition -o remount,rw" to re-enable read-write.

To bring the system back up to full multi-user mode, use either "telinit 3" or "telinit 5", depending on whether you want console or xdm login, respectively.
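The sequence above, sketched as a script (the mount point /home is a hypothetical example; the script defaults to printing each command rather than executing it, since telinit and remounting need root and would kick users off a live box):

```shell
# Hypothetical mount point to check; substitute your own.
MNT=${MNT:-/home}
# Print the commands rather than executing them by default.
# Set DRY_RUN=0 (as root, at the console) to actually do it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}

run telinit 1                      # drop to single-user mode
run mount "$MNT" -o remount,ro     # make the filesystem read-only
run fsck -f "$MNT"                 # forced check; safe on a read-only mount
run mount "$MNT" -o remount,rw     # back to read-write
run telinit 3                      # back to multi-user (5 for xdm login)
```

The uptime counter never resets, because the kernel itself never stops; only the userland services are cycled.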
Just wondering about the fs of those MAJOR uptime hosts
They don't need fsck - they use filesystems that self-maintain. Like most anything that Windows doesn't use (ext, reiser, HFS, etc).

--Danny, whose longest-uptime machine is just barely a year (stupid extended power failure), and isn't directly accessible from the internet
Carlos E. R. wrote:
The Friday 2005-04-01 at 11:57 +0200, Hylton Conacher (ZR1HPC) wrote:
I have seen many folk who have signatures stating that the machine has been running for something like 400 days.
Whilst I applaud this type of reliability, I am wondering about the actual fs as the machine isn't rebooted so that fsck can check the partitions.
I am perhaps paranoid about keeping the fs in tip top shape but it is the basis that we all rely on.
YES! You are! :-P X'-)
so my question is this: Can a system that has such uptime have all its fs checked and not be rebooted?
No need to check them... this is linux, stability is the word here ;-)
Could a Linux boot floppy specific to the system be inserted, mounted, and then be told to unmount the running system's partitions listed in fstab (except the floppy), fsck them, and then re-mount them, without losing the uptime figure?
Hah! That floppy has no chance. If it is a boot floppy, it cannot boot - unless you reboot. If it is a program, it has no more access than the running system has.
The only possibility is to unmount a partition, one that is not in use at the time, then fsck it. That means you will never be able to check "/". In fact, you cannot even check "/home" without logging everybody off first.
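For a partition nothing is using, that unmount-check-remount dance looks something like this (a sketch; /dev/hda3 and /srv/data are hypothetical names, and the commands are printed rather than executed by default, since they need root):

```shell
# Hypothetical spare partition and its mount point; substitute your own.
PART=${PART:-/dev/hda3}
MNT=${MNT:-/srv/data}
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 (as root) to actually run

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}

run fuser -vm "$MNT"      # list any processes still using the filesystem
run umount "$MNT"         # fails if anything still has files open there
run fsck -f "$PART"       # full check of the now-unmounted filesystem
run mount "$PART" "$MNT"  # put it back in service
```

The umount step is the gatekeeper: it refuses to proceed while any process holds an open file on the partition, which is exactly why "/" can never be checked this way on a running system.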
The only theoretical possibility would be, when using a RAID 1 mirror, putting it into single-disk mode (if that is possible in Linux, I don't know). One of the halves would stay active while the other was checked. The problem, and a big one, comes when reactivating the RAID: files will have changed in the meantime, so how can you resync both drives...? A nightmare.
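With Linux software RAID (md), something along these lines is at least mechanically possible: mark one mirror half failed, check it read-only, then re-add it and let the kernel rewrite the whole half from the live copy, which answers the resync question by brute force at the cost of running degraded meanwhile. A sketch, assuming a hypothetical array /dev/md0 with an old-style (0.90) superblock at the end of each member, so each half carries the filesystem directly; commands are printed rather than executed by default:

```shell
MD=${MD:-/dev/md0}        # hypothetical RAID 1 array
HALF=${HALF:-/dev/hdb1}   # the mirror half to detach and check
DRY_RUN=${DRY_RUN:-1}     # set DRY_RUN=0 (as root) to actually run

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}

# Detach one half of the mirror; the array keeps running degraded.
run mdadm "$MD" --fail "$HALF" --remove "$HALF"

# Read-only check (-n) of the detached, now-stale copy.
run fsck -fn "$HALF"

# Re-add it; md resyncs the entire half from the active disk.
run mdadm "$MD" --add "$HALF"
```

This checks only a stale snapshot, not the live filesystem, so it is a diagnostic rather than a repair - but it does run without a reboot.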
I know a machine that can do that, kind of, but not a linux machine, and not a PC, and with a really huge price tag.

Since I am unemployed, I'll have to give that machine a skip, Carlos. :)
Tnx though. I guess I just gotta start trusting the Linux OS and STOP comparing it to that other GUI OS.

P.S.: The next characters are an emoticon test: :')

--
========================================================================
Hylton Conacher - Linux user # 229959 at http://counter.li.org
Currently using SuSE 9.0 Professional with KDE 3.1
========================================================================
Danny Sauer wrote:
On Friday 01 April 2005 03:57 am, Hylton Conacher (ZR1HPC) wrote:
I have seen many folk who have signatures stating that the machine has been running for something like 400 days.
Whilst I applaud this type of reliability, I am wondering about the actual fs as the machine isn't rebooted so that fsck can check the partitions.
I am perhaps paranoid about keeping the fs in tip top shape but it is the basis that we all rely on.
[snip]

Tnx Danny. Had a look at the telinit man page and see it is linked to init. Messing with the father of all processes does not bode well, and I am just going to have to start trusting the kernel developers that it is going to look after itself, unlike the other OS that I am used to and grew up with.

--
========================================================================
Hylton Conacher - Linux user # 229959 at http://counter.li.org
Currently using SuSE 9.0 Professional with KDE 3.1
========================================================================
The Wednesday 2005-04-06 at 16:35 +0200, Hylton Conacher (ZR1HPC) wrote:
I know a machine that can do that, kind of, but not a linux machine, and not a PC, and with a really huge price tag.
Since I am unemployed, I'll have to give that machine a skip Carlos. :)
Me too; by huge price, I mean it :-p Anyway, it is a slow machine, not a general purpose computer. It doesn't even have a TCP stack.
Tnx though. I guess I just gotta start trusting the Linux OS and STOP comparing to that other GUI OS.
Right! Nevertheless, backups are always a must.
P.S.: The next characters are an emoticon test: :')
Worked 8-)

--
Cheers,
Carlos Robinson
On Wednesday 06 April 2005 09:46 am, Hylton Conacher (ZR1HPC) wrote:
Danny Sauer wrote:
On Friday 01 April 2005 03:57 am, Hylton Conacher (ZR1HPC) wrote:
I have seen many folk who have signatures stating that the machine has been running for something like 400 days.
Whilst I applaud this type of reliability, I am wondering about the actual fs as the machine isn't rebooted so that fsck can check the partitions.
I am perhaps paranoid about keeping the fs in tip top shape but it is the basis that we all rely on.
[snip]
Tnx Danny.
Had a look at the telinit man page and see it is linked to init. Messing with the father of all processes does not bode well, and I am just going to have to start trusting the kernel developers that it is going to look after itself, unlike the other OS that I am used to and grew up with.
It's really not that big of a deal. Yeah, it sounds scary, but it doesn't change anything other than killing some running programs and then starting them back up again. In the worst case, you reboot and everything's happy again. Granted, in this case you don't actually need to do anything, but it's seriously no worse than running any other program. There are several legitimate reasons for wanting to drop to single-user mode temporarily - most of which involve changing mount points and otherwise working with disks...

--Danny
participants (5)
- Carlos E. R.
- Danny Sauer
- Hylton Conacher (ZR1HPC)
- Mike
- Theo v. Werkhoven