[Bug 1141925] New: sssd ldap be goes into the "Backend is offline" state at boot, because dhcp hasn't gotten an ip address yet, and never recovers
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925

            Bug ID: 1141925
           Summary: sssd ldap be goes into the "Backend is offline" state at boot, because dhcp hasn't gotten an ip address yet, and never recovers
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 15.1
          Hardware: x86-64
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Other
          Assignee: bnc-team-screening@forge.provo.novell.com
          Reporter: tabmcleo@cs.ubc.ca
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36 OPR/62.0.3331.43
Build Identifier:

Opened this problem in the support forum:
https://forums.opensuse.org/showthread.php/536482-sssd-ldap-be-goes-into-quo...

Both responses in the support forum advised opening a bug report. The following is cut and pasted from the problem posted in the support forum:

Hello,

My department has run into a problem with openSUSE Leap 15.1, LDAP and sssd.

In short, it appears that sssd starts before DHCP obtains an IP address for the network interface. At that point, sssd ldap be goes into the "Backend is offline" state and never recovers. It appears to never recover because inotify never informs it when a DHCP address is obtained and resolv.conf is updated. Consequently, either a console login followed by "systemctl restart sssd.service" or a reboot is required before non-local users can log in.
Some Ubuntu users have run into the same problem:
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1723350

I've modified our sssd.service file and placed it in /etc/systemd/system to override the default file in /usr/lib/systemd/system:

[Unit]
Description=System Security Services Daemon
# SSSD must be running before we permit user sessions
After=network-online.target     <== Added this line
Before=systemd-user-sessions.service nss-user-lookup.target
Wants=nss-user-lookup.target network-online.target     <== Added "network-online.target"
PartOf=network-online.target     <== Added this line

[Service]
Environment=DEBUG_LOGGER=--logger=files
EnvironmentFile=-/etc/sysconfig/sssd
ExecStart=/usr/sbin/sssd -i ${DEBUG_LOGGER}
Type=notify
NotifyAccess=main
PIDFile=/var/run/sssd.pid

[Install]
WantedBy=multi-user.target

Note that this is a workaround and not a fix. It is my belief that sssd should recover once DHCP provides an IP address.

We are using wicked.

So far, this is reproducible with desktop computers running openSUSE Leap 15.1 using DHCP, sssd and an LDAP backend for authentication.

I've followed the methodology in the Ubuntu bug report. In a nutshell, it appears that sssd never recovers because /etc/resolv.conf is a symlink. Inotify reports when the file pointed to is changed, but not on the symlink itself.

Reproducible: Always

Steps to Reproduce:
1. Boot the computer
2. Only local logins work, e.g. root
3. "systemctl status sssd.service" reveals sssd ldap be is "offline"
4. "systemctl restart sssd.service" resolves the problem

Actual Results:
At boot time, sssd ldap be goes into an offline state because DHCP hasn't yet acquired an IP address. When an IP address is acquired, sssd ldap be stays in the offline state unless a "systemctl restart sssd.service" is performed.

Expected Results:
After booting the computer, non-local users should be able to log in.
Even though sssd is started before DHCP obtains an IP address and goes into an "offline" state, it should recover once an IP address is obtained and /etc/resolv.conf contains one or more DNS server entries.

--
You are receiving this mail because:
You are on the CC list for the bug.
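The report's explanation hinges on the difference between the /etc/resolv.conf symlink and the file it points to. The following is a minimal Python sketch of that distinction, not of sssd's actual inotify code: it uses plain stat polling instead of inotify, and the temporary file names and the nameserver address are made up for illustration. It shows that rewriting the target changes the target's modification time while the link itself stays untouched, so anything that observes only the link sees no change.

```python
import os
import tempfile


def link_info(path):
    """Return (is_symlink, resolved_target) for a path such as /etc/resolv.conf."""
    return os.path.islink(path), os.path.realpath(path)


def mtime_ns(path, follow_symlinks):
    """Modification time of the target (follow_symlinks=True) or of the link itself."""
    return os.stat(path, follow_symlinks=follow_symlinks).st_mtime_ns


def demo():
    """Rewrite a symlink's target and report which of the two timestamps moved."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "resolv.conf.real")  # hypothetical target name
        link = os.path.join(tmp, "resolv.conf")
        with open(target, "w") as fh:
            fh.write("# no nameserver yet\n")
        os.symlink(target, link)

        link_before = mtime_ns(link, follow_symlinks=False)
        target_before = mtime_ns(link, follow_symlinks=True)

        # Simulate the network stack rewriting the target once a lease arrives.
        with open(target, "w") as fh:
            fh.write("nameserver 192.0.2.53\n")  # documentation-range address
        # Force a clearly later timestamp so the result does not depend on
        # the filesystem's clock resolution.
        later = target_before + 10**9
        os.utime(target, ns=(later, later))

        return {
            "is_symlink": link_info(link)[0],
            "link_changed": mtime_ns(link, follow_symlinks=False) != link_before,
            "target_changed": mtime_ns(link, follow_symlinks=True) != target_before,
        }


if __name__ == "__main__":
    print(demo())
```

On an affected machine, calling link_info("/etc/resolv.conf") is a quick way to confirm whether the file is a symlink and where it resolves to.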
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c2
--- Comment #2 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c3
--- Comment #3 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c4
--- Comment #4 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c7
--- Comment #7 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c8
--- Comment #8 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c10
--- Comment #10 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c11
--- Comment #11 from Samuel Cabrero
> Hello, Samuel.
>
> I was finally able to get back to this matter. I installed a new openSUSE Leap 15.1 machine today. There are no updates pending. The version of sssd installed is sssd-1.16.1-lp151.7.12.1.x86_64. Our workaround, the custom sssd.service file, is not installed on this machine.
>
> The problem does NOT appear to be fixed.
>
> Upon booting, I can only log in with cached credentials. The sssd be service is offline. When I check /var/log/messages, it appears that sssd comes up, then DHCP provides an IP address, and then sssd be goes offline and never recovers.
>
> I note that the version of sssd installed is newer than the one referenced in bug 1136139.
>
> I wonder if the problem was re-introduced with the newer version of sssd that we are using?
>
> Trevor
Hi Trevor, that version is the latest available and it contains the relevant patches. I tried to reproduce the problem manually and it works fine for me. Could you attach the sssd log files at debug level 10, covering the time the IP address is assigned?
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c12
--- Comment #12 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c13
--- Comment #13 from Samuel Cabrero
> First, I assume, this involves inserting a “debug_level = 10” statement into the sssd.conf file.

Exactly.

> Secondly, there are multiple sections within our file. Each section is prefaced with “[something]” where “something” in our environment is one of:
>
> sssd
> nss
> pam
> domain/LDAP

Please set it in [sssd] and [domain/LDAP].

> Finally, there are sssd logs for all of the above. Do you require all of them?

No, only the ones where you enabled debug level 10, and please make sure the logs cover the time the machine gets the IP address.
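For anyone following along, the requested change (debug_level = 10 in [sssd] and [domain/LDAP]) can be sketched in Python as below. This is an illustration under assumptions, not an official sssd tool: it treats sssd.conf as a generic INI file via Python's configparser, which drops comments when writing, and the SAMPLE configuration and the set_debug_level helper are hypothetical. Editing the file by hand works just as well.

```python
import configparser
import io

# Hypothetical sssd.conf contents with the four sections named in the thread.
SAMPLE = """\
[sssd]
services = nss, pam
domains = LDAP

[nss]

[pam]

[domain/LDAP]
id_provider = ldap
"""


def set_debug_level(conf_text, sections=("sssd", "domain/LDAP"), level=10):
    """Return conf_text with debug_level set in each of the named sections."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    for name in sections:
        if not parser.has_section(name):
            raise KeyError(f"section [{name}] not found")
        parser.set(name, "debug_level", str(level))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(set_debug_level(SAMPLE))
```

The function works on strings rather than on /etc/sssd/sssd.conf directly, so the result can be reviewed before it replaces the real file.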
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c14
--- Comment #14 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c15
--- Comment #15 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c16
--- Comment #16 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c17
--- Comment #17 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c18
--- Comment #18 from Trevor McLeod
http://bugzilla.opensuse.org/show_bug.cgi?id=1141925#c19
--- Comment #19 from Samuel Cabrero