Hello - I have an OpenSuSE x64 15.3 system that has become unhappy about a 4 TB disk drive. When I start up the system, a "start job" with no time limit is automatically started. I kinda suspect it is actually running fsck, but I have no way to find out, and the message I do see is pretty useless. Is there a way to find out what this "start job" is actually running? I let it run for 5 days before I found it had stopped (crashed, perhaps?).

So this morning I restarted the computer, and once again this "start job" started and has now been running for a few hours. So my question is: what is a reasonable amount of time to let this run? Can it be bypassed? Should I simply consider the drive irreparable? I don't know what a reasonable amount of time is, since I have never seen fsck take so long. But I have also seen fsck do what I would almost call a miracle and restore drives completely after a disk crash. So I am not sure what I should do at this point, and if it is indeed fsck that is running, I would like it to repair the drive if it can.

Thanks for any advice...

Marc

--
*"The Truth is out there" - Spooky*
*_ _ . . . . . . _ _ . _ _ _ _ . . . . _ . . . . _ _ . _ _ _ . . . . _ _ . _ . . _ . _ _ _ _ . _ . _ . _ . _ . *
Computers: the final frontier. These are the voyages of the user Marc. His mission: to explore strange new hardware. To seek out new software and new applications. To boldly go where no Marc has gone before!
(/This email is digitally signed and the OpenPGP electronic signature is added as an attachment. If you know how, you can use my public key to prove this email indeed came from me and has not been modified in transit. My public key, which can be used for sending encrypted email to me also, can be found at - https://keys.openpgp.org/search?q=marc@marcchamberlin.com or just ask me for it and I will send it to you as an attachment. If you don't understand all this geek speak, no worries, just ignore this explanation and ignore the OpenPGP signature key attached to this email (it will look like gibberish if you open it) and/or ask me to explain it further if you like./)
Hello,

In the message with subject "Unhappy computer":

It may be due to the filesystem's periodic check settings.

That is, under certain conditions Linux runs an fsck filesystem check at boot time. When that check runs, booting can take a long time, resulting in unexpectedly long downtime.

So, what is the result of:
# tune2fs -l /dev/sdaX | grep -A 4 "Mount count"
where sdaX is the system device.

On 5/1/22 17:54, Masaru Nomiya wrote:
> where sdaX is the system device.

Thanks for your reply, Masaru. Since I cannot boot my system to either a command shell or a desktop, I am not sure how you expected me to run this command. So I took a guess and downloaded and set up a Live USB stick for OpenSuSE 15.3. I then mounted all the partitions from the drives that are part of my disk-based OpenSuSE 15.3 system. Then I ran your command on the partition that would be mounted on "/", and the results were:

    Mount count 3
    Maximum mount count -1
    Last checked - Mon May 2 13:15:55 2022
    Check interval 0 (none)
    Lifetime writes 34 TB

HTHs, Marc..
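For readers who want to interpret that output: in tune2fs terms, a maximum mount count of 0 or -1 together with a check interval of 0 means periodic checks are switched off, which would suggest the hang is not a scheduled fsck. A minimal sketch follows, assuming real `tune2fs -l` output (which puts a colon after each field name). `periodic_fsck_status` is a hypothetical helper name, not part of e2fsprogs.

```shell
# Hypothetical helper: read `tune2fs -l` output on stdin and report whether
# periodic (mount-count or time-based) fsck is enabled for that filesystem.
periodic_fsck_status() {
    awk -F: '
        /^Maximum mount count/ { gsub(/[ \t]/, "", $2); max = $2 }
        /^Check interval/      { split($2, a, " "); interval = a[1] }
        END {
            if (max == "" || interval == "")
                print "unknown"
            else if (max + 0 <= 0 && interval + 0 == 0)
                print "periodic fsck disabled"
            else
                print "periodic fsck enabled"
        }'
}

# Intended use from the live system (not run here; sdaX is a placeholder):
#   tune2fs -l /dev/sdaX | periodic_fsck_status
```

With the values reported above (maximum mount count -1, check interval 0) this sketch would report that periodic checking is disabled.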
On 02.05.2022 03:36, Marc Chamberlin wrote:
> So this morning I restarted the computer again and once again this "start job" started and has now been running for a few hours. So my question is, what is a reasonable amount of time to let this run? Can it be bypassed?

To let run *what*? We do not even know what the computer is doing.
On 5/1/22 21:34, Andrei Borzenkov wrote:
> To let run *what*? We do not even know what the computer is doing.

Hi Andrei, and again thanks for responding. I guess I don't understand your question either; I am simply trying to boot up my computer. Very shortly after I select which OS to boot from the GRUB menu, the startup process reaches a point where the "start job" I mentioned runs, and it keeps running for days if I let it.

I am not sure what the "start job" is doing either, since the output from the boot scripts doesn't tell me anything more than that a "start job" is running with no time-out limit. It is a rather worthless message, but that is all I have to work with. However, there appears to be a clue in some of the messages before this "start job" message. In my reply to Carlos, I am going to try to post a picture of the startup messages to a pastebin. There I see an error message about sda6, which is the partition that contains the "BIOS Boot Partition". At the moment I am trying to figure out how to check and repair this partition. If you know, I sure would appreciate any help; Google is not being very friendly in my attempts to find out how to repair a boot partition.

HTHs, Marc
On 2022-05-03 18:24, Marc Chamberlin wrote:
> There I see an error message about sda6 and this is the partition that contains the "BIOS Boot Partition" At the moment I am trying to figure out how one can check and repair this partition.

No need to check and repair. If it is bad, you "just" need to write it over again. I don't know if there is a simple command to do this, but IMO the partition is good if you have loaded the kernel, and you have.

So maybe there is a wrong entry in fstab that checks that partition when it should not. Try to post the fstab file, using the live system.

--
Cheers / Saludos,
Carlos E. R. (from 15.3 x86_64 at Telcontar)
On 2022-05-02 02:36, Marc Chamberlin wrote:
> When I start up the system, a "start job" with no time limit, is automatically started (I kinda suspect it is actually running fsck but I have no way to find out and the message I do see is pretty useless.

Take a photo and post it (on susepaste).

> I let it run for 5 days before I found it had stopped (crashed perhaps?)

But does it finish booting? Can you use the computer?

--
Cheers / Saludos,
Carlos E. R. (from 15.3 x86_64 at Telcontar)
On 5/2/22 03:07, Carlos E. R. wrote:
> Take a photo and post it (on susepaste)

Hi Carlos, and thank you for replying. OK, I posted a photo on susepaste at https://susepaste.org/92807944 and I hope you can read it. I had to stand below the monitor, looking up at it with my phone. (This is our media center monitor, a rather large one about 8 feet above the floor.)

> But does it end booting?

No.

> Can you use the computer?

No, but I can use a Live USB stick instead.

HTHs, Marc....
On 03.05.2022 20:23, Marc Chamberlin wrote:
> Is there a way to find out what this "start job" is actually running?

It is waiting for a disk device that is not there. Judging by the errors on the screen, you have a physical issue with your disk: the disk itself, the cable, or the disk controller (which may mean the motherboard). Try reseating the cable as the first step.
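On the question of how to see what a "start job" is waiting for: systemd can list its queued jobs with `systemctl list-jobs`. A rough sketch follows, assuming you can reach a shell while the boot is stuck (for example the debug shell that systemd offers on tty9 when `systemd.debug-shell=1` is added to the kernel command line). `running_jobs` is a hypothetical helper that only filters the listing.

```shell
# Hypothetical helper: filter `systemctl list-jobs --no-legend` output
# (columns: JOB UNIT TYPE STATE) down to the jobs that are currently running.
running_jobs() {
    awk '$4 == "running" { print $2 " (" $3 ")" }'
}

# Intended use on the stuck system (not run here):
#   systemctl list-jobs --no-legend | running_jobs
```

A unit name ending in `.device` in that list would point at a disk or partition that never appeared, which matches the kind of wait described in this thread.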
On 5/3/22 10:57, Andrei Borzenkov wrote:
> Try reseating cable as the first step.

Thanks, Andrei. Reseating the drive cables was one of the first things I did, but at your suggestion I decided to reseat them again. During that process I discovered that one of the power cord connectors, to one of the drives, had a loose/broken socket fitting. It was not easily seen either; I had to look closely at it. So I used a different connector from the power supply, and that made my computer a happy camper once again!

Glad you were able to interpret those error messages; to me it really looked like some sort of disk crash that fsck was having a hard time finding/fixing. I much appreciate your help and the encouragement to take a second look at the cables.

Marc...
On 2022-05-03 20:45, Marc Chamberlin wrote:
> Glad you were able to interpret those error messages, to me it really looked like some sort of disk crash that fsck was having a hard time finding/fixing.

It is clearly seen in the photo :-p

I/O error, dev sda, sector etc.

:-)

--
Cheers / Saludos,
Carlos E. R. (from 15.3 x86_64 at Telcontar)
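Kernel lines of the kind Carlos points to can also be pulled out of a log mechanically. A small sketch follows, assuming messages of the form `I/O error, dev sda, sector N` (the exact prefix varies between kernel versions). `io_error_devices` is a hypothetical name.

```shell
# Hypothetical helper: count block-layer I/O error lines per device in kernel
# log text read from stdin. Prints one "device count" pair per line, sorted.
io_error_devices() {
    awk '
        match($0, /I\/O error, dev [^, ]+/) {
            # Skip the 15-character prefix "I/O error, dev " to get the device name.
            dev = substr($0, RSTART + 15, RLENGTH - 15)
            count[dev]++
        }
        END { for (d in count) print d, count[d] }' | sort
}

# Intended use (not run here):
#   dmesg | io_error_devices
#   journalctl -k | io_error_devices
```

Errors concentrated on a single device would be consistent with a problem on that drive or its cabling, although the log alone cannot tell those apart.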
On 2022-05-03 22:31, Carlos E. R. wrote:
> It is clearly seen in the photo :-p
> I/O error, dev sda, sector etc.

Now that the system is running, you should test the disk with smartctl.

--
Cheers / Saludos,
Carlos E. R. (from 15.3 x86_64 at Telcontar)
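One way to act on that: `smartctl -A` prints the drive's attribute table, and a few attributes are commonly watched after an incident like this one. A sketch follows, assuming an ATA drive whose table uses the usual ten-column layout with the raw value in the last column. The attribute selection and the helper name `smart_suspects` are assumptions, not something from the thread.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print the
# watched attributes whose raw value (10th column) is greater than zero.
# UDMA_CRC_Error_Count counts link-level transfer errors, which are often
# associated with cabling or connector problems rather than the platters.
smart_suspects() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ &&
        $10 + 0 > 0 { print $2, $10 }'
}

# Intended use (not run here; needs root and the smartmontools package):
#   smartctl -A /dev/sda | smart_suspects
```

For a fuller check, `smartctl -t long /dev/sda` starts an extended self-test and `smartctl -l selftest /dev/sda` shows the result once it has finished.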
On 03/05/2022 at 19:23, Marc Chamberlin wrote:
> at https://susepaste.org/92807944 I hope you can read it,

Yes, it's readable. Your disk sda is failing (sector unreadable), so it's unlikely to be something else, like a cable.

> Is there a way to find out what this "start job" is actually running?

Most probably the start job runs fsck, which may take very long on a failing disk and may also wear the disk out even more.

> No, but can use a Live USB stick instead.

Good. Is there any important data on this disk? If so, boot from a USB stick, try to mount the relevant partition, and back up the data. Then you should remove the disk and set it aside for future tests, and use a new one from now on.

Then, if you can, make a ddrescue copy to a file. From memory it's something like

    ddrescue /dev/sd?? my-failing-disk.img

(with your disk in a USB enclosure), but you need time (it may take very long) and another disk large enough to receive the data.

https://www.gnu.org/software/ddrescue/

The last line of the posted image gives the UID of the disk, most probably /dev/sda (on your system).

Of course, all of this is deduction and may be wrong :-(

jdd
--
http://dodin.org
http://valeriedodin.com
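A note on the syntax: GNU ddrescue takes the input, the output, and an optional mapfile as positional arguments rather than dd-style `if=`/`of=` keywords. The sketch below only prints a two-pass plan in the spirit of the example in the ddrescue manual and does not run anything. The function name, the default mapfile name, and the choice of three retry passes are assumptions.

```shell
# Hypothetical helper: print (do not execute) a two-pass ddrescue plan.
#   $1 = source device, $2 = image file, $3 = mapfile (optional)
rescue_plan() {
    src=$1
    img=$2
    map=${3:-$img.map}
    # Pass 1: copy everything that reads easily, skipping the scraping phase (-n).
    echo "ddrescue -n $src $img $map"
    # Pass 2: return to the bad areas with direct disc access (-d), 3 retry passes (-r3).
    echo "ddrescue -d -r3 $src $img $map"
}

# Example (prints two command lines for review; /dev/sdX is a placeholder):
#   rescue_plan /dev/sdX my-failing-disk.img
```

Keeping the mapfile between the two passes is what lets the second run resume where the first one left off instead of starting over.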
Sun, 1 May 2022 17:36:44 -0700 Marc Chamberlin:
> Thanks for any advice...

In grub2, press 'e' to edit the kernel cmdline. Look for the line starting with 'linux' or 'linuxefi' and append ' emergency' to it. Press either F10 or ctrl-x to leave grub and boot this modified entry.

Systemd will ask for the root password and open a shell. Run 'blkid -s TYPE' (or plain 'blkid') to see what was detected, and compare this with the output of 'cat /etc/fstab'. Unless btrfs is in use, try 'fsck -A' to check all filesystems.

Good luck.

Olaf
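The blkid/fstab comparison described here can be roughly automated. A sketch follows, assuming fstab entries that use the `UUID=` form and a saved copy of plain `blkid` output. `missing_fstab_uuids` and both file arguments are hypothetical.

```shell
# Hypothetical helper: list fstab entries whose UUID does not appear in saved
# blkid output. Prints "mountpoint uuid" for every entry with no matching device.
#   $1 = fstab file, $2 = file holding the output of `blkid`
missing_fstab_uuids() {
    fstab=$1
    blkid_out=$2
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1, $2 }' "$fstab" |
    while read -r uuid mountpoint; do
        # The leading space keeps PARTUUID="..." from matching as UUID="...".
        grep -qF -- " UUID=\"$uuid\"" "$blkid_out" || echo "$mountpoint $uuid"
    done
}

# Intended use (not run here; /tmp/blkid.txt is an arbitrary scratch file):
#   blkid > /tmp/blkid.txt
#   missing_fstab_uuids /etc/fstab /tmp/blkid.txt
```

Any mount point this prints refers to a filesystem the system could not see at that moment, which is the kind of entry that can leave a boot waiting on a device.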
On 5/2/22 10:01, Olaf Hering wrote:
> Append ' emergency' to this line.
> Unless btrfs is in use, try 'fsck -A' to check all filesystems.

Hi Olaf, thanks for replying, but no joy. With ' emergency' appended to the line starting with 'linux', the boot process still reaches the point where the 'startup job' runs and stays there. Let me ask this: I don't use btrfs file systems anywhere, and I can only boot a Live USB, so should I run fsck -A after doing so?

Marc...
participants (6)
- Andrei Borzenkov
- Carlos E. R.
- jdd@dodin.org
- Marc Chamberlin
- Masaru Nomiya
- Olaf Hering