Re: [opensuse-kubic] MicroOS 20200815 Rendered System Broken
Thank you for pointing that blog post out; I did not know about the
change. That being said, I am not sure it is relevant, as the post
is from July 27th!
Personally, I did not follow the blog post, since of course I did not
know of the change, and because I am one of the people who need the
/tmp partition on disk rather than in memory, so I see no reason why
I should change anything at the moment.
cheers.
On Tue, 18 Aug 2020 at 10:16, contact@ffreitas.io wrote:
Did you follow the blog post about the migration to tmpfs for the /tmp partition [1]? Maybe it's correlated.
Regards,
Francisco
[1]: https://kubic.opensuse.org/blog/2020-07-27-tmp_on_tmpfs/
On Tue, Aug 18, 2020 at 06:57, The Undertaker wrote:
I would like to confirm this; I started seeing this exact behaviour earlier as well.
On Tue, 18 Aug 2020 at 07:58, Jim Heald wrote:
Hello! I don't know how to communicate "where" the breakage occurred, but with my latest snapshot (which was 10x bigger than my other snapshots, at 206MB) I observed the following behavior:
1. `/tmp` somehow became a mountpoint for `/`. As such, ls'ing /tmp and / returned the same results, and /tmp became ro, breaking some of my Docker services
2. `sudo` took an unreasonable amount of time. Before I decided to just become root, running something trivial such as `sudo echo hi` took 25 seconds (I typed my password in a previous sudo command)
I would also like to add a few other points that I have observed in this case, though still not having any idea what went wrong here.
1. The system is generally slower at pretty much everything compared to before the update, including booting, logging in, and running commands in general.
2. Upon boot, before the login prompt is shown, I now receive 5 lines of "failed to start login service" and one line of "failed to start NTP client/server".
3. My HPE Gen 10 DL380 server is in a small server room on a different floor, and now the NICs no longer come up, so I can only service it in person or through the HP iLO rather than remotely.
4. The only out of the ordinary prompt during the update process that I noticed was:
"(33/70) Installing: gtk3-tools-3.24.22-1.1.x86_64 [...............done]
Additional rpm output:
update-alternatives: warning: forcing reinstallation of alternative /usr/bin/gtk-update-icon-cache-3.0 because link group gtk-update-icon-cache is broken
update-alternatives: warning: skip creation of /usr/share/man/man1/gtk-update-icon-cache.1.gz because associated file /usr/share/man/man1/gtk-update-icon-cache-3.0.1.gz (of link group gtk-update-icon-cache) doesn't exist"
but I don't have any clue how that pertains to this issue, if at all.
Doing `transactional-update rollback last` worked flawlessly,
In my case I rebooted > Start bootloader from read-only snapshot > chose the previous working snapshot > ran "snapper rollback" once the system had booted successfully.
Anything I can help with, count me in as well.
Cheers.
--
To unsubscribe, e-mail: opensuse-kubic+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-kubic+owner@opensuse.org
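The read-only /tmp symptom reported above can be checked from the mount table. What follows is only a minimal sketch, not something shipped with MicroOS: `tmp_mount_state` is a hypothetical helper name, and it assumes a mount table in the usual /proc/mounts format (device, mount point, filesystem type, options). It prints the filesystem type of /tmp and whether it is mounted ro or rw.

```shell
# Hypothetical helper: report how /tmp is mounted.
# $1 is a file in /proc/mounts format; defaults to the live mount table.
tmp_mount_state() {
    awk '
        # If /tmp is mounted more than once, the last entry is the visible one.
        $2 == "/tmp" { fstype = $3; opts = $4; found = 1 }
        END {
            if (!found) { print "not-a-mountpoint"; exit }
            # Look for "ro" as a whole option, not as a substring.
            mode = (("," opts ",") ~ /,ro,/) ? "ro" : "rw"
            print fstype, mode
        }
    ' "${1:-/proc/mounts}"
}

# Usage on a running system (left commented out on purpose):
# tmp_mount_state
```

A result such as `btrfs ro` would match the behaviour described in this thread, while `tmpfs rw` is what one would expect after the migration from the blog post.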
It might be a misunderstanding on my part, but reading the blog post, it seems you need to take a few actions even if you want to keep /tmp on disk.
Cheers,
Francisco
Hi,
this is a genuine bug: https://bugzilla.opensuse.org/show_bug.cgi?id=1175379
The change in microos-tools did not take into account that /tmp might not be on tmpfs, which is the case for systems installed before July. This is why openQA did not catch the issue.
Cheers,
Fabian
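Whether a given installation belongs to the affected group can be estimated from its fstab. This is only a sketch under an assumption suggested by the workaround discussed in this thread, namely that systems installed before July carry a /tmp line in /etc/fstab. `fstab_has_tmp` is a hypothetical helper name, not an existing MicroOS tool.

```shell
# Hypothetical helper: succeed (exit 0) if an fstab-format file
# lists /tmp as a mount point, ignoring commented-out lines.
# $1 is the file to inspect; defaults to /etc/fstab.
fstab_has_tmp() {
    awk '
        $1 !~ /^#/ && $2 == "/tmp" { found = 1 }
        END { exit !found }
    ' "${1:-/etc/fstab}"
}

# Usage on a running system (left commented out on purpose):
# if fstab_has_tmp; then
#     echo "/tmp is configured in /etc/fstab (not the tmpfs default)"
# fi
```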
Hi,
Thank you for letting us know about the currently open bug, which
completely explains what is happening. My only question is: will this
bug be addressed any time soon, or will all the people who installed
prior to July, myself included, have to apply the workaround?
Cheers.
Hi,
On Tuesday, 18 August 2020, 11:33:12 CEST, The Undertaker wrote:
Thank you for letting us know about the currently open bug, which completely explains what is happening. My only question is: will this bug be addressed any time soon, or will all the people who installed prior to July, myself included, have to apply the workaround?
A microos-tools with the culprit removed was put into the TW update channel, so the next installation of updates will fix it.
Cheers,
Fabian
Hi,
There are two ways to "fix" this:
1. do a rollback
   - login on the console
   - transactional-update rollback last
   - systemctl reboot
   - login again and now transactional-update dup should update your system without running again into this problem
2. Edit /etc/fstab
   - Remove line with /tmp
   - Reboot
We will improve our health-checker to test if /tmp and systemd-logind are Ok, too.
Thorsten
--
Thorsten Kukuk, Distinguished Engineer, Senior Architect SLES & MicroOS
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany
Managing Director: Felix Imendoerffer (HRB 36809, AG Nürnberg)
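The fstab edit from option 2 could be scripted roughly as follows. This is a sketch rather than an official procedure: `remove_tmp_entry` is a hypothetical helper name, and keeping a `.bak` copy next to the file is an added precaution that the steps above do not mention. It drops every non-comment line whose mount point is /tmp and leaves all other lines untouched.

```shell
# Hypothetical helper: remove the /tmp entry from an fstab-format file.
# $1 is the file to edit; a copy of the original is kept as "$1.bak".
remove_tmp_entry() {
    cp -p "$1" "$1.bak" &&
    # Print every line except non-comment lines that mount on /tmp.
    awk '!($1 !~ /^#/ && $2 == "/tmp")' "$1.bak" > "$1"
}

# Usage as root (left commented out on purpose), followed by the
# reboot from the steps above:
# remove_tmp_entry /etc/fstab
# systemctl reboot
```

Running the function against a copy of the file first and comparing it with the original is a cheap way to confirm that only the /tmp line disappears.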
On Tue, 2020-08-18 at 11:34 +0200, Thorsten Kukuk wrote:
There are two ways to "fix" this:
[...]
2. Edit /etc/fstab - Remove line with /tmp - Reboot
Yep! FWIW, I faced the very same issue yesterday, did exactly this thing (remove /tmp from fstab) and everything got back to working.
As a matter of fact, I was actually planning to move /tmp on tmpfs myself at some point, even before knowing that we were planning on doing that "officially"... So, at least for me, this is an actual improvement rather than a fix or a workaround! :-P
Jokes apart, it was of course an annoying issue, but it's cool to see that it's being addressed quickly, and also that it led to this:
We will improve our health-checker to test if /tmp and systemd-logind are Ok, too.
Thanks and Regards
--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/