[opensuse] System is booting up. See pam_nologin(8)
I upgraded/patched a 13.2 xen guest just now and after rebooting, I see $SUBJ when I login via ssh. What does $SUBJ mean? /run/nologin exists. -- Per Jessen, Zürich (9.1°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland. -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On 05/16/2016 05:27 AM, Per Jessen wrote:
I upgraded/patched a 13.2 xen guest just now and after rebooting, I see $SUBJ when I login via ssh. What does $SUBJ mean?
/run/nologin exists.
What is the ownership/permission and the contents of that file? I presume you've read the nologin man page? http://man7.org/linux/man-pages/man8/pam_nologin.8.html Can any user login? -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
Anton Aylward wrote:
What is the ownership/permission and the contents of that file?
# ls -l /run/nologin
-rw-r--r-- 1 root root 40 May 16 11:26 /run/nologin
I presume you've read the nologin man page? http://man7.org/linux/man-pages/man8/pam_nologin.8.html
Yes, but it offers no clue as to why this suddenly started happening.
Can any user login?
Nope, only root. -- Per Jessen, Zürich (9.3°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
16.05.2016 13:53, Per Jessen wrote:
Yes, but it offers no clue as to why this suddenly started happening.
Normally it should be removed by systemd-user-sessions.service on bootup. If you remove this file and reboot - does it appear again?
Andrei Borzenkov wrote:
Normally it should be removed by systemd-user-sessions.service on bootup. If you remove this file and reboot - does it appear again?
Just tried that, the file did not appear again. Weird. -- Per Jessen, Zürich (9.5°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
Per Jessen wrote:
Just tried that, the file did not appear again. Weird.
Seems it is not being removed -
192.168.2.132:/run # touch nologin
192.168.2.132:/run # ls -l /run/nologin
-rw-r--r-- 1 root root 0 May 16 13:24 /run/nologin
192.168.2.132:/run # reboot
Connection to 192.168.2.132 closed by remote host.
Connection to 192.168.2.132 closed.
per@io64:~> ssh root@192.168.2.132
System is booting up. See pam_nologin(8)
-- Per Jessen, Zürich (10.2°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
16.05.2016 14:29, Per Jessen wrote:
Seems it is not being removed -
192.168.2.132:/run # touch nologin
192.168.2.132:/run # ls -l /run/nologin
-rw-r--r-- 1 root root 0 May 16 13:24 /run/nologin
192.168.2.132:/run # reboot
Connection to 192.168.2.132 closed by remote host.
Connection to 192.168.2.132 closed.
per@io64:~> ssh root@192.168.2.132
System is booting up. See pam_nologin(8)
Is the file still the same? pam_nologin displays the content of this file, and you created an empty file. Does it still have the same modification time, and is it still empty?
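The question above (is the file absent, empty, or filled with text?) can be answered with a small helper. This is only a sketch for illustration: the function name nologin_state is made up here and is not part of pam or systemd, and a POSIX shell is assumed.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from any package):
# report whether a nologin file is absent, empty, or has content.
nologin_state() {
    file=${1:-/run/nologin}
    if [ ! -e "$file" ]; then
        echo "absent"
    elif [ -s "$file" ]; then
        # Non-empty: pam_nologin displays this text at login.
        echo "non-empty"
    else
        echo "empty"
    fi
}
```

Running nologin_state before and after a reboot, together with ls -l /run/nologin for the timestamp, would show whether the hand-made empty file survived or was replaced by a newly written one.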
Andrei Borzenkov wrote:
Is the file still the same? pam_nologin displays the content of this file, and you created an empty file. Does it still have the same modification time, and is it still empty?
It is no longer empty; it now contains the message from above. The timestamp looks the same, but I'll try it again. -- Per Jessen, Zürich (12.6°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Andrei Borzenkov wrote:
Is the file still the same? pam_nologin displays the content of this file, and you created an empty file. Does it still have the same modification time, and is it still empty?
Just repeated the exercise - created an empty /run/nologin, rebooted, the file is then updated with the "system is booting up" text. The problem seems to be that this file is never removed? -- Per Jessen, Zürich (12.5°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Per Jessen wrote:
Just repeated the exercise - created an empty /run/nologin, rebooted, the file is then updated with the "system is booting up" text. The problem seems to be that this file is never removed?
Have also just reproduced on a brand-new 13.2 xen guest. -- Per Jessen, Zürich (12.2°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
* Per Jessen
Have also just reproduced on a brand-new 13.2 xen guest.
I see this screen msg, "system is booting up...", occasionally when I try to ssh into a box which has just been rebooted and hasn't completed its startup. After a few moments I can access it without seeing the msg. Perhaps whatever is supposed to clear the msg when booting is complete is not doing its job, but I have no idea what that would be. -- (paka)Patrick Shanahan Plainfield, Indiana, USA @ptilopteri http://en.opensuse.org openSUSE Community Member facebook/ptilopteri http://wahoo.no-ip.org Photo Album: http://wahoo.no-ip.org/gallery2 Registered Linux User #207535 @ http://linuxcounter.net
16.05.2016 17:08, Patrick Shanahan wrote:
* Per Jessen [05-16-16 09:36]:
Just repeated the exercise - created an empty /run/nologin, rebooted, the file is then updated with the "system is booting up" text. The problem seems to be that this file is never removed?
Well, /run is in memory, so it is lost after reboot. Something must create it.
Have also just reproduced on a brand-new 13.2 xen guest.
I see this screen msg, "system is booting up...", occasionally when I try to ssh into a box which has just been rebooted and hasn't completed its startup. After a few moments I can access it without seeing the msg.
Perhaps whatever is supposed to clear the msg when booting is complete is not doing its job, but I have no idea what that would be.
The file is *created* on boot by this tmpfiles snippet:
/usr/lib/tmpfiles.d/systemd-nologin.conf
What is the content of this file?
This should be executed only by the service systemd-tmpfiles-setup.service. The file it creates (/run/nologin) should be removed by systemd-user-sessions.service. So the first step is to check the status of both services (including execution time).
Check that /run is indeed tmpfs.
Boot with systemd.log_level=debug and with "quiet" removed from the kernel command line; the log should then show when each service is being started.
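The "check that /run is indeed tmpfs" step can be sketched as a small shell function. The name fstype_of and its optional second argument are assumptions made for this illustration; the function only reads a mounts table in the /proc/self/mounts format.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type mounted at a mount point.
# Input is a mounts table in /proc/self/mounts format:
#   device mountpoint fstype options dump pass
fstype_of() {
    mountpoint=$1
    mounts_file=${2:-/proc/self/mounts}
    # Keep the last matching entry, since later mounts shadow earlier ones.
    awk -v mp="$mountpoint" '
        $2 == mp { fstype = $3 }
        END { if (fstype != "") print fstype }
    ' "$mounts_file"
}
```

On a system set up as described, fstype_of /run would be expected to print tmpfs. The state and start time of the two services can be read from "systemctl status systemd-tmpfiles-setup.service systemd-user-sessions.service".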
Andrei Borzenkov wrote:
The file is *created* on boot by this tmpfiles snippet:
/usr/lib/tmpfiles.d/systemd-nologin.conf
What is the content of this file?
<-------------
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
# See tmpfiles.d(5) and systemd-forbid-user-logins.service(5).
# This file has special suffix so it is not run by mistake.
F! /run/nologin 0644 - - - "System is booting up. See pam_nologin(8)"
---------->
Same as on more backlevel 13.2 systems.
This should be executed only by the service systemd-tmpfiles-setup.service. This file (/run/nologin) should be removed by systemd-user-sessions.service. So the first step is to check status of both services (including execution time).
Both look normal (I'll be happy to post the output if anyone wants to have a look).
Check that /run is indeed tmpfs.
Checked, it is.
Boot with systemd.log_level=debug and "quiet" removed from kernel command line, it should show when each service is being started.
I'll post some output in a minute. -- Per Jessen, Zürich (12.6°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Per Jessen wrote:
Andrei Borzenkov wrote:
Boot with systemd.log_level=debug and "quiet" removed from kernel command line, it should show when each service is being started.
I'll post some output in a minute.
http://files.jessen.ch/temp78-dmesg-bootup1.txt
http://files.jessen.ch/temp78-journal-bootup.txt
/Per
-- Per Jessen, Zürich (11.2°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
16.05.2016 18:22, Per Jessen wrote:
http://files.jessen.ch/temp78-dmesg-bootup1.txt http://files.jessen.ch/temp78-journal-bootup.txt
This looks pretty normal. I assume /run/nologin was not present this time?
Per Jessen wrote:
http://files.jessen.ch/temp78-dmesg-bootup1.txt http://files.jessen.ch/temp78-journal-bootup.txt
Seems to be a bit of a Heisenbug. I have two test systems: "calcium" (xen guest, been running longer, was patched today) and "temp78" (freshly installed xen guest), both on the same xen host. Both systems showed the problem today after upgrading.
After booting temp78 with "systemd.log_level=debug", at first I suddenly seemed unable to reproduce, also when I removed that parameter. Now the problem has reappeared. Then I rebooted both, virtually at the same time, neither system had a problem. Then I rebooted temp78, problem is back.
-- Per Jessen, Zürich (11.1°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
16.05.2016 18:49, Per Jessen wrote:
Seems to be a bit of a Heisenbug. I have two test systems: "calcium" (xen guest, been running longer, was patched today) and "temp78" (freshly installed xen guest), both on the same xen host. Both systems showed the problem today after upgrading.
After booting temp78 with "systemd.log_level=debug", at first I suddenly seemed unable to reproduce, also when I removed that parameter. Now the problem has reappeared. Then I rebooted both, virtually at the same time, neither system had a problem. Then I rebooted temp78, problem is back.
Yes, it is apparently some race here. Unfortunately it means you are the only one who can possibly debug it :( Adding log_level=debug also changes timings so it may hide this problem. Try at least removing "quiet" from kernel command line, can you still reproduce it?
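Since the race depends on which boot parameters are active, it helps to record them for each test boot. The following is a minimal sketch; the function name cmdline_has and its optional file argument are assumptions for illustration, and the only system interface it relies on is /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if a parameter appears as a whole
# word on the kernel command line. The file defaults to /proc/cmdline.
cmdline_has() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each parameter on its own line, then match the whole line literally.
    tr -s ' \t' '\n\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, "cmdline_has quiet && echo quiet is set" and "cmdline_has systemd.log_level=debug && echo debug logging is on" could be run after each reboot, next to the check for /run/nologin.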
Andrei Borzenkov wrote:
16.05.2016 18:49, Per Jessen wrote:
Seems to be a bit of a Heisenbug. I have two test systems: "calcium" (xen guest, been running longer, was patched today) and "temp78" (freshly installed xen guest), both on the same xen host. Both systems showed the problem today after upgrading.
After booting temp78 with "systemd.log_level=debug", at first I suddenly seemed unable to reproduce, also when I removed that parameter. Now the problem has reappeared. Then I rebooted both, virtually at the same time, neither system had a problem. Then I rebooted temp78, problem is back.
Yes, it is apparently some race here. Unfortunately it means you are the only one who can possibly debug it :(
I wonder if it might be interesting to compare with a 13.2 system that does not have this issue? On real HW though, not xen.
Adding log_level=debug also changes timings so it may hide this problem.
Right.
Try at least removing "quiet" from kernel command line, can you still reproduce it?
I rarely use "quiet", so yes, always reproducible without. In 3-4-5 consecutive reboots, the problem is reproducible with a single system, without log_level=debug. In 9-10 consecutive reboots with log_level=debug, I didn't get a single case. As soon as I removed log_level=debug, I got the first case. -- Per Jessen, Zürich (9.9°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On Tue, May 17, 2016 at 10:47 AM, Per Jessen wrote:
Try at least removing "quiet" from kernel command line, can you still reproduce it?
I rarely use "quiet", so yes, always reproducible without.
Can you make available full "journalctl -b" after boot with this problem? Without "quiet" logs should contain at least timing of services start.
Andrei Borzenkov wrote:
Can you make available full "journalctl -b" after boot with this problem? Without "quiet" logs should contain at least timing of services start.
Yep, this is the most recent one: http://files.jessen.ch/temp78-journal-system-is-booting-up.txt
Just noticed a separate issue - nscd is having trouble starting:
May 17 10:04:18 temp78 nscd[257]: 257 /var/run/nscd/nscd.pid: No such file or directory
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/passwd; no persistent database used
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/group; no persistent database used
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/hosts; no sharing possible
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/services; no persistent database used
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/netgroup; no persistent database used
May 17 10:04:18 temp78 nscd[257]: 257 /var/run/nscd/socket: No such file or directory
-- Per Jessen, Zürich (11.6°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
On Tue, May 17, 2016 at 11:09 AM, Per Jessen wrote:
Yep, this is the most recent one: http://files.jessen.ch/temp78-journal-system-is-booting-up.txt
Well ...
May 17 10:04:20 temp78 systemd[1]: Starting Create Volatile Files and Directories...
May 17 10:04:20 temp78 systemd[1]: Started Permit User Sessions.
May 17 10:04:20 temp78 systemd[1]: Started Permit User Sessions.
May 17 10:04:20 temp78 systemd[1]: Started Create Volatile Files and Directories.
So they are started basically at the same time. Whoever wins.
Now something is extremely wrong. "Create Volatile Files and Directories" should run BEFORE sysinit.target, which means before any other service including "Permit User Sessions". I am sure you have some loop in service dependencies. Could you please start once more with systemd.log_level=debug and this time provide the full output of "journalctl -b", not just the lines you think are relevant.
Just noticed a separate issue - nscd is having trouble starting:
May 17 10:04:18 temp78 nscd[257]: 257 /var/run/nscd/nscd.pid: No such file or directory
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/passwd; no persistent database used
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/group; no persistent database used
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/hosts; no sharing possible
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/services; no persistent database used
May 17 10:04:18 temp78 nscd[257]: 257 cannot create /var/run/nscd/netgroup; no persistent database used
May 17 10:04:18 temp78 nscd[257]: 257 /var/run/nscd/socket: No such file or directory
This is most likely just a side effect of not running Create Volatile Files and Directories early enough.
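The ordering problem visible in the log excerpt above can be tested for mechanically on a saved journal. This is a sketch only: the function name permit_before_tmpfiles is invented for this example, and it relies on nothing but the order of lines in the file.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the journal text in file $1 shows
# "Permit User Sessions" finishing before "Create Volatile Files and
# Directories" finishes. Only the first occurrence of each line is used.
permit_before_tmpfiles() {
    awk '
        /Started Permit User Sessions/ && !permit { permit = NR }
        /Started Create Volatile Files and Directories/ && !tmpfiles { tmpfiles = NR }
        END { exit !(permit && tmpfiles && permit < tmpfiles) }
    ' "$1"
}
```

A possible use is "journalctl -b --no-pager > /tmp/boot.txt" followed by "permit_before_tmpfiles /tmp/boot.txt && echo race hit on this boot", which makes it easier to count bad boots over a series of reboots.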
Andrei Borzenkov wrote:
Now something is extremely wrong. "Create Volatile Files and Directories" should run BEFORE sysinit.target, which means before any other service including "Permit User Sessions". I am sure you have some loop in service dependencies. Could you please start once more with systemd.log_level=debug and this time provide the full output of "journalctl -b", not just the lines you think are relevant.
Okay: http://files.jessen.ch/temp78-journal-with-systemd-loglevel-debug.txt
Just noticed a separate issue - nscd is having trouble starting:
This is most likely just a side effect of not running Create Volatile Files and Directories early enough.
On the boot-up above, nscd had no problems. -- Per Jessen, Zürich (12.9°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
Per Jessen wrote:
Okay:
http://files.jessen.ch/temp78-journal-with-systemd-loglevel-debug.txt
With more precise timestamps: http://files.jessen.ch/temp78-journal-with-systemd-loglevel-debug2.txt
temp78:~ # egrep 'Create Volatile Files|Permit User' /tmp/temp78-journal-with-systemd-loglevel-debug2.txt
May 17 11:24:48.993423 temp78 systemd[1]: Starting Create Volatile Files and Directories...
May 17 11:24:49.068520 temp78 systemd[1]: Started Create Volatile Files and Directories.
May 17 11:24:49.188530 temp78 systemd[1]: Starting Permit User Sessions...
May 17 11:24:49.511029 temp78 systemd[1]: Started Permit User Sessions.
-- Per Jessen, Zürich (13.1°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
Per Jessen wrote:
With more precise timestamps: http://files.jessen.ch/temp78-journal-with-systemd-loglevel-debug2.txt
temp78:~ # egrep 'Create Volatile Files|Permit User' /tmp/temp78-journal-with-systemd-loglevel-debug2.txt
May 17 11:24:48.993423 temp78 systemd[1]: Starting Create Volatile Files and Directories...
May 17 11:24:49.068520 temp78 systemd[1]: Started Create Volatile Files and Directories.
May 17 11:24:49.188530 temp78 systemd[1]: Starting Permit User Sessions...
May 17 11:24:49.511029 temp78 systemd[1]: Started Permit User Sessions.
Looking at another 13.2 box (real hw), slightly backlevel, just rebooted:
sogo:~ # journalctl -b -o short-precise --no-pager | egrep 'Create Volatile Files|Permit User'
May 17 11:49:34.150276 sogo systemd[1]: Starting Create Volatile Files and Directories...
May 17 11:49:34.229376 sogo systemd[1]: Started Create Volatile Files and Directories.
May 17 11:49:34.459178 sogo systemd[1]: Starting Permit User Sessions...
May 17 11:49:34.774590 sogo systemd[1]: Started Permit User Sessions.
On other machines, 13.1 or Leap, I see more like a 15 second gap between "Create Volatile Files and Directories" and "Permit User Sessions".
-- Per Jessen, Zürich (13.2°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
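The gap between the two services can be computed from "journalctl -b -o short-precise" output rather than read off by eye. The sketch below is illustrative only: the function name tmpfiles_to_sessions_gap is an assumption, and it expects the timestamp in the third whitespace-separated field, as in the lines quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the seconds between "Started Create Volatile
# Files and Directories" and "Starting Permit User Sessions" in short-precise
# journal output saved in file $1.
# Assumes both lines are from the same day (no midnight rollover).
tmpfiles_to_sessions_gap() {
    awk '
        # Convert HH:MM:SS.ffffff to seconds since midnight.
        function secs(hms,   t) {
            split(hms, t, ":")
            return t[1] * 3600 + t[2] * 60 + t[3]
        }
        /Started Create Volatile Files and Directories/ { created = secs($3) }
        /Starting Permit User Sessions/ { permit = secs($3) }
        END { if (created && permit) printf "%.3f\n", permit - created }
    ' "$1"
}
```

Fed the temp78 lines above it prints 0.120, and for the sogo lines 0.230, so both 13.2 systems leave well under a second between the two steps.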
On Tue, May 17, 2016 at 12:29 PM, Per Jessen wrote:
Okay:
http://files.jessen.ch/temp78-journal-with-systemd-loglevel-debug.txt
Well ... that's actually an upstream bug, and I'm immensely surprised it was not hit more often. Here is what happens.
Normally systemd is started first in the initrd; before switching root it saves its state (which includes information about units that are active) on "disk" (tmpfs actually) and reloads that state after it is started a second time from the real root. When systemd is started in the initrd, it goes through a sequence very close to real-root initialization, which also includes sysinit.target and basic.target (the standard targets for low-level system initialization). These targets are expected to be stopped before switching root, but in your case root is switched too fast, so systemd records the sysinit.target and basic.target states as Started. This means that all other services that normally would wait for basic.target are started immediately, causing great confusion.
Note that your logs show
Cannot add dependency job for unit initrd-udevadm-cleanup-db.service, ignoring: Unit initrd-udevadm-cleanup-db.service failed to load: No such file or directory.
The root switch is ordered after initrd-udevadm-cleanup-db.service, and it is the *ONLY* service that would add some delay to root switching. It is possible that some bug was introduced that made this unit unavailable, thus exposing this race condition.
But adding initrd-udevadm-cleanup-db.service is not a real fix; it just hides the fundamental issue. So it's bugzilla time.
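To see whether other units are missing in the same way, the "failed to load" messages can be pulled out of a saved journal. This is a rough sketch; the function name missing_units is made up for this example, and it matches only the exact message wording quoted above.

```shell
#!/bin/sh
# Hypothetical helper: list units that systemd reported as failing to load
# because the unit file was missing, one unit name per line, without repeats.
missing_units() {
    sed -n 's/.*Unit \([^ ]*\) failed to load: No such file or directory.*/\1/p' "$1" | sort -u
}
```

Run against the debug journal from the affected guest, this would be expected to list initrd-udevadm-cleanup-db.service; "rpm -qf" on the unit path then shows, on a system where the file still exists, which package ships it.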
Andrei Borzenkov wrote:
Note that your logs show
Cannot add dependency job for unit initrd-udevadm-cleanup-db.service, ignoring: Unit initrd-udevadm-cleanup-db.service failed to load: No such file or directory.
On a backlevel openSUSE 13.2 system that file does exist:
sogo:~ # rpm -qf /usr/lib/systemd/system/initrd-udevadm-cleanup-db.service
udev-210.1456152170.f2b9ea6-25.34.1.i586
On one of the patched test systems:
temp78:~ # rpm -q udev
udev-210.1459453449.5237776-25.37.1.i586
Root switch is ordered after initrd-udevadm-cleanup-db.service, and it is the *ONLY* service that would add some delay to root switching. It is possible that some bug was introduced that made this unit unavailable, thus exposing this race condition.
But adding initrd-udevadm-cleanup-db.service is not real fix, it just hides fundamental issue. So it's bugzilla time.
I guess there are two issues. I'll report back with the ticket#s.

-- Per Jessen, Zürich (14.9°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
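For anyone wanting to check other machines for the missing unit file without going through rpm, a rough Python sketch follows. The find_unit() helper and the UNIT_DIRS tuple are hypothetical names; the three directories are the usual system unit locations, but this only looks at the installed filesystem and says nothing about what ended up inside the initrd image.

```python
import os

# Common system unit directories, relative to the root being inspected.
UNIT_DIRS = (
    "etc/systemd/system",
    "run/systemd/system",
    "usr/lib/systemd/system",
)


def find_unit(name, root="/"):
    """Return every path under root where a unit file called name exists."""
    hits = []
    for unit_dir in UNIT_DIRS:
        candidate = os.path.join(root, unit_dir, name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits
```

On a patched system, an empty result from find_unit("initrd-udevadm-cleanup-db.service") would match the "failed to load: No such file or directory" message in the logs.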
Per Jessen wrote:
I guess there are two issues. I'll report back with the ticket#s.
systemd/initrd: https://bugzilla.opensuse.org/show_bug.cgi?id=980324
initrd-udevadm-cleanup-db.service: https://bugzilla.opensuse.org/show_bug.cgi?id=980325

-- Per Jessen, Zürich (15.0°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 05/16/2016 09:14 AM, Per Jessen wrote:
Andrei Borzenkov wrote:
Is file still the same? pam_nologin displays content of this file and you created empty file. Does it still have the same modification time and is empty?
Just repeated the exercise - created an empty /run/nologin, rebooted, the file is then updated with the "system is booting up" text. The problem seems to be that this file is never removed?
If /run is a tmpfs then it should be removed. Perhaps the design assumes that? Try removing it (as it would be if a virgin /run was created as a tmpfs on boot) then rebooting.
On 05/16/2016 11:20 AM, Anton Aylward wrote:
On 05/16/2016 09:14 AM, Per Jessen wrote:
Andrei Borzenkov wrote:
Is file still the same? pam_nologin displays content of this file and you created empty file. Does it still have the same modification time and is empty?
Just repeated the exercise - created an empty /run/nologin, rebooted, the file is then updated with the "system is booting up" text. The problem seems to be that this file is never removed?
If /run is a tmpfs then it should be removed. Perhaps the design assumes that?
Try removing it (as it would be if a virgin /run was created as a tmpfs on boot) then rebooting.
OK, just seen the post about /usr/lib/tmpfiles.d/systemd-nologin.conf. Forget my responses then.
On 2016-05-16 13:18, Per Jessen wrote:
Andrei Borzenkov wrote:
16.05.2016 13:53, Per Jessen пишет:
Anton Aylward wrote:
On 05/16/2016 05:27 AM, Per Jessen wrote:
I upgraded/patched a 13.2 xen guest just now and after rebooting, I see $SUBJ when I login via ssh. What does $SUBJ mean?
/run/nologin exists.
What is the ownership/permission and the contents of that file?
# ls -l /run/nologin -rw-r--r-- 1 root root 40 May 16 11:26 /run/nologin
I presume you've read the nologin man page? http://man7.org/linux/man-pages/man8/pam_nologin.8.html
Yes, but it offers no clue as to why this suddenly started happening.
Normally it should be removed by systemd-user-sessions.service on bootup. If you remove this file and reboot - does it appear again?
Just tried that, the file did not appear again. Weird.
I don't think that file should be removed by the system. It is a file that you create, empty, to signal that you do not want users to login till you manually remove that file.

The manual says it should be either /var/run/nologin or /etc/nologin, there is no mention of "/run/nologin".

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
Carlos E. R. wrote:
On 2016-05-16 13:18, Per Jessen wrote:
Andrei Borzenkov wrote:
16.05.2016 13:53, Per Jessen пишет:
Anton Aylward wrote:
On 05/16/2016 05:27 AM, Per Jessen wrote:
I upgraded/patched a 13.2 xen guest just now and after rebooting, I see $SUBJ when I login via ssh. What does $SUBJ mean?
/run/nologin exists.
What is the ownership/permission and the contents of that file?
# ls -l /run/nologin -rw-r--r-- 1 root root 40 May 16 11:26 /run/nologin
I presume you've read the nologin man page? http://man7.org/linux/man-pages/man8/pam_nologin.8.html
Yes, but it offers no clue as to why this suddenly started happening.
Normally it should be removed by systemd-user-sessions.service on bootup. If you remove this file and reboot - does it appear again?
Just tried that, the file did not appear again. Weird.
I don't think that file should be removed by the system. It is a file that you create, empty, to signal that you do not want users to login till you manually remove that file.
Clearly it is being created by the system. It's probably part of startup/shutdown.

-- Per Jessen, Zürich (12.7°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 2016-05-16 15:12, Per Jessen wrote:
I don't think that file should be removed by the system. It is a file that you create, empty, to signal that you do not want users to login till you manually remove that file.
Clearly it is being created by the system. It's probably part of startup/shutdown.
Ah. That's a different thing completely. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 05/16/2016 07:14 AM, Andrei Borzenkov wrote:
Normally it should be removed by systemd-user-sessions.service on bootup. If you remove this file and reboot - does it appear again?
Hmm. Why did I think that this was supposed to be in /etc rather than /run?

Hmm, Per, ancestrally (that is 13.{1,2}), /run is tmpfs so that it is created anew on boot and initially unpopulated until services start doing things like putting their PIDs in it.

I was under the impression that the shutdown created nologin and the boot process cleared it.

Tell me, is "pam_nologin.so" the first line of /etc/pam.d/login?
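Whether /run really is a tmpfs on a given guest can be checked from /proc/mounts. Below is a small Python sketch under that assumption; fstype_of() and run_is_tmpfs() are hypothetical helper names, and the parser only relies on the first three whitespace-separated fields of each line (device, mount point, filesystem type).

```python
def fstype_of(mount_point, mounts_text):
    """Return the filesystem type of the last mount at mount_point, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[1] == mount_point:
            found = fields[2]  # later mounts shadow earlier ones
    return found


def run_is_tmpfs(mounts_path="/proc/mounts"):
    """True if /run is currently mounted as tmpfs according to mounts_path."""
    with open(mounts_path) as fh:
        return fstype_of("/run", fh.read()) == "tmpfs"
```

If run_is_tmpfs() returns True, a /run/nologin that survives a reboot cannot be a leftover from the previous boot; something must have recreated it during startup.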
Anton Aylward wrote:
On 05/16/2016 07:14 AM, Andrei Borzenkov wrote:
Normally it should be removed by systemd-user-sessions.service on bootup. If you remove this file and reboot - does it appear again?
Hmm. Why did I think that this was supposed to be in /etc rather than /run ??
Hmm, Per, ancestrally (that is 13.{1,2}), /run is tmpfs so that it is created anew on boot and initially unpopulated until services start doing things like putting their PIDs in it.
I was under the impression that the shutdown created nologin and the boot process cleared it.
Tell me, is "pam_nologin.so" the first line of /etc/pam.d/login?
Yup, it is.

-- Per Jessen, Zürich (11.1°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
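For reference, the behaviour the man page describes for pam_nologin (only root gets in while the file exists, and the file's content is shown to the user) can be sketched in a few lines of Python. This is an illustration, not the module's code: nologin_check() is a hypothetical name, the default paths are the two the man page mentions, and the idea that /var/run resolves to /run on this system is an assumption.

```python
import os

# Paths named in pam_nologin(8); /var/run is assumed to resolve to /run here.
NOLOGIN_PATHS = ("/var/run/nologin", "/etc/nologin")


def nologin_check(uid, paths=NOLOGIN_PATHS):
    """Return (allowed, message) for a login attempt by the given uid."""
    for path in paths:
        if os.path.isfile(path):
            with open(path, errors="replace") as fh:
                message = fh.read()
            # While the file exists, only root (uid 0) may log in.
            return (uid == 0, message)
    # No nologin file present: this check places no restriction on logins.
    return (True, "")
```

That matches what was seen in this thread: with /run/nologin present, root could log in and everyone else got the "System is booting up" text.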
participants (5)
- Andrei Borzenkov
- Anton Aylward
- Carlos E. R.
- Patrick Shanahan
- Per Jessen