[Bug 242520] New: passwd: compat, group:compat in nsswitch.conf causes bad LDAP performance
https://bugzilla.novell.com/show_bug.cgi?id=242520

           Summary: passwd: compat, group:compat in nsswitch.conf causes bad LDAP performance
           Product: openSUSE 10.3
           Version: unspecified
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: forsberg@cendio.se
         QAContact: qa@suse.de

When YaST is used to configure SUSE (as well as SLED and SLES) to use an LDAP directory for authentication and user/group data access, the following is written to /etc/nsswitch.conf:

    passwd: compat
    group: compat
    passwd_compat: ldap
    group_compat: ldap

This is very bad for performance, because the initgroups function (used at login time by sshd, kdm, gdm and similar to find out which groups the user belongs to) will then use the "enumerate all groups and see if the current username belongs to any of them" method (suitable for flat-file databases) instead of the LDAP-optimized version of initgroups available in nss_ldap, which queries the LDAP server for the groups the user is a member of.

The latter is very much faster than the former: the former not only has to enumerate all groups, but in many cases also has to translate DNs into usernames, which generates as many LDAP queries as there are members of all groups, instead of _one_ LDAP query for the LDAP-optimized initgroups.

All other distributions I've seen, as well as the nss_ldap documentation, recommend an /etc/nsswitch.conf with the following contents:

    passwd: files ldap
    group: files ldap

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
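The difference between the two lookup strategies described in the report can be illustrated with a small simulation. This is only a sketch: `FakeDirectory` is a hypothetical in-memory stand-in that counts searches, not nss_ldap or a real LDAP client, and the DNs and group sizes are made up.

```python
from dataclasses import dataclass


@dataclass
class FakeDirectory:
    """Hypothetical in-memory stand-in for an LDAP server that counts searches."""
    groups: dict          # group name -> list of member DNs (groupOfNames style)
    uids: dict            # member DN -> uid
    searches: int = 0

    def list_groups(self):
        self.searches += 1            # one search to enumerate every group
        return list(self.groups.items())

    def uid_for_dn(self, dn):
        self.searches += 1            # one search per DN-to-uid translation
        return self.uids[dn]

    def groups_of(self, dn):
        self.searches += 1            # one filtered search on the member attribute
        return [name for name, members in self.groups.items() if dn in members]


def groups_by_enumeration(directory, username):
    """Flat-file style: walk every group and resolve every member DN."""
    found = []
    for name, members in directory.list_groups():
        resolved = [directory.uid_for_dn(dn) for dn in members]
        if username in resolved:
            found.append(name)
    return found


def groups_by_member_query(directory, user_dn):
    """LDAP-optimized style: ask the server which groups list this DN."""
    return directory.groups_of(user_dn)


def build_directory():
    dns = [f"cn=u{i},o=example" for i in range(100)]
    uids = {dn: f"u{i}" for i, dn in enumerate(dns)}
    groups = {"staff": dns, "dev": dns[:50], "ops": dns[90:]}
    return FakeDirectory(groups=groups, uids=uids)


if __name__ == "__main__":
    slow = build_directory()
    print(groups_by_enumeration(slow, "u5"), slow.searches)
    fast = build_directory()
    print(groups_by_member_query(fast, "cn=u5,o=example"), fast.searches)
```

In this model the enumeration cost grows with the total number of group members, while the member query stays at a single search regardless of group size.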
https://bugzilla.novell.com/show_bug.cgi?id=242520

mhorvath@novell.com changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.provo.novell.com |jsuchome@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=242520

jsuchome@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rhafer@novell.com
             Status|NEW                         |NEEDINFO
      Info Provider|                            |kukuk@novell.com

------- Comment #1 from jsuchome@novell.com 2007-02-13 02:50 MST -------
Thorsten, Ralf, could you comment? I remember we wanted to stick with 'compat' instead of using the 'files' value.
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #2 from rhafer@novell.com 2007-02-13 03:25 MST -------
IIRC we used nss_compat with ldap because we needed an easy way to deny shell access to all LDAP users on a machine (without touching all the separate pam config files). But if nss_compat really uses setgrent/getgrent to emulate the initgroups function, we had better switch away from it.

From a quick look at the source, however, it seems that if the underlying module (nss_ldap in this case) supports initgroups, nss_compat uses it and only falls back if initgroups is not implemented in the module. I must admit, though, that I am not very familiar with the nss_compat code. Thorsten might know it better ...
https://bugzilla.novell.com/show_bug.cgi?id=242520

kukuk@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kukuk@novell.com
             Status|NEEDINFO                    |NEW
      Info Provider|kukuk@novell.com            |

------- Comment #3 from kukuk@novell.com 2007-02-13 03:28 MST -------
We want to stay with 'compat' for several reasons. One was that we would need a reboot if we changed it to 'files' on a running system; otherwise all running system daemons would run into problems.

I don't remember anymore what I implemented in nss_compat, and since this code was changed a lot afterwards (not by me), I don't know the current status. nss_compat should use the initgroups function. If it does not, somebody needs to fix the code, or implement it if it is really missing.
https://bugzilla.novell.com/show_bug.cgi?id=242520 rhafer@novell.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |NEEDINFO Info Provider| |forsberg@cendio.se ------- Comment #4 from rhafer@novell.com 2007-02-13 03:37 MST ------- Erik, how did you come to the conclusion that nss_compat does not support initgroups() ? Looking at the code? Testing? How did you test? -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=242520

forsberg@cendio.se changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|forsberg@cendio.se          |

------- Comment #5 from forsberg@cendio.se 2007-02-13 03:48 MST -------
Testing. I tested by configuring an openSUSE host to use nss_ldap as the nss backend (configured with the YaST LDAP client tool, to make sure it was set up the SUSE way), talking to a Novell eDirectory server that contains large groups, and then watching the LDAP traffic using ethereal. Each login generates a large number of queries, especially as eDirectory uses the groupOfNames LDAP object class, which means nss_ldap has to translate from DN to uid for each member of each group.

On the same host, after changing the lines in /etc/nsswitch.conf to use 'files ldap', a much more optimized LDAP query is sent.
https://bugzilla.novell.com/show_bug.cgi?id=242520

rhafer@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|jsuchome@novell.com         |pbaudis@novell.com

------- Comment #6 from rhafer@novell.com 2007-02-13 04:54 MST -------
nss_compat does indeed seem to do something different from plain nss_ldap. In its initgroups() function, it calls getgrgid() for every group that the initgroups() call of nss_ldap returns. That can of course result in many more LDAP queries than when nss_ldap is used directly: one additional query for each group and one additional query for each member of those groups (when "groupOfNames" groups are used, which is our default). That will hurt especially when many large groups are used.

This is a comment from the initgroups() function of nss_compat:

    /* For every gid in the list we get from the NSS module,
       get the whole group entry. We need to do this, since we
       need the group name to check if it is in the blacklist.
       In worst case, this is as twice as slow as stepping with
       getgrent_r through the whole group database. But for large
       group databases this is faster, since the user can only be
       in a limited number of groups. */

I have no idea what the blacklist mentioned there is for. Petr, Thorsten, any idea how things can be improved here?
https://bugzilla.novell.com/show_bug.cgi?id=242520 ------- Comment #7 from kukuk@novell.com 2007-02-13 04:59 MST ------- I remember, this is for the "-xyz" entries in /etc/passwd or /etc/group. We can check if the blacklist is empty, but I'm afraid normally this is not the case (we use the blacklist to filter duplicate entries, too). So with "compat", we have to live with the overhead. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
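To make the blacklist discussed in comments #6 and #7 more concrete, here is a rough sketch of the idea. It is not the nss_compat code: the function names are hypothetical, `+name` includes are ignored, and treating local entries as blacklisted duplicates is an assumption based on the remark above that the blacklist also filters duplicate entries.

```python
def parse_compat_group_lines(lines):
    """Return (local_names, blacklist, include_all) for compat-style group lines."""
    local_names, blacklist, include_all = [], set(), False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name = line.split(":", 1)[0]
        if name == "+":
            include_all = True              # "+" pulls in the remaining remote groups
        elif name.startswith("-"):
            if name[1:]:
                blacklist.add(name[1:])     # "-xyz" excludes remote group xyz
        elif name.startswith("+"):
            pass                            # "+name" includes are not modelled here
        else:
            local_names.append(name)
            blacklist.add(name)             # assumption: local entries shadow remote ones
    return local_names, blacklist, include_all


def filter_remote_gids(gids, name_for_gid, blacklist):
    """Keep gids whose group name is not blacklisted (one name lookup per gid)."""
    kept = []
    for gid in gids:
        if name_for_gid(gid) not in blacklist:
            kept.append(gid)
    return kept


if __name__ == "__main__":
    _, blacklist, _ = parse_compat_group_lines(["root:x:0:", "-games", "+:::"])
    remote = {0: "root", 40: "games", 100: "users"}
    print(filter_remote_gids(list(remote), remote.__getitem__, blacklist))
```

The sketch shows why a list of gids alone is not enough: the blacklist is keyed by group name, so each gid returned by the backend needs its own name lookup, which is the per-group getgrgid() overhead described in comment #6.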
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #8 from rhafer@novell.com 2007-02-13 06:09 MST -------
(In reply to comment #7)
> So with "compat", we have to live with the overhead.

Then I would rather go for the "files ldap" approach and live with the requirement to reboot the machine. That is just a one-time thing, while the overhead of nss_compat would always be present and can get quite big in large environments.

We would also need a way to disallow shell access for non-local users on a machine while still allowing access to other services (e.g. imap). It should be possible with nss_override_attribute_value in /etc/ldap.conf, which has been available in nss_ldap for some releases. It might also be possible through pam_ldap (with a combination of pam_check_host_attr and pam_check_service_attr).
https://bugzilla.novell.com/show_bug.cgi?id=242520

pbaudis@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

------- Comment #9 from pbaudis@novell.com 2007-02-19 03:36 MST -------
Thorsten, just in case, do you remember why the compat -> files change requires a reboot? That sounds wrong to me; perhaps the best approach would be to fix that bug, unless it is a hard technical limitation (I'm not familiar with the nss code yet).

Alternatively, could YaST give you a choice, or tell you about this, when installing nss_ldap?
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #10 from kukuk@novell.com 2007-02-19 03:40 MST -------
glibc never rereads /etc/nsswitch.conf, so all already-running daemons will never see the change and will not use ldap. You have to restart everything or, easier, reboot.
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #11 from pbaudis@novell.com 2007-02-19 03:57 MST -------
Would it be reasonable for it to re-stat() the file if it last looked at it more than, say, 5 minutes ago?
https://bugzilla.novell.com/show_bug.cgi?id=242520 ------- Comment #12 from kukuk@novell.com 2007-02-19 04:04 MST ------- I think the problem is more that the interface is not designed to give all memory free and restart the subsystem. So you will have a memory leak here. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
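The re-stat() idea from comment #11 could look roughly like the sketch below. This is an illustration of the suggestion only, not how glibc behaves: the class name, the 300-second default and the injectable clock are all hypothetical, and it deliberately sidesteps the concern from comment #12 about releasing state that was already loaded from the old configuration.

```python
import os
import time


class ReloadingConfig:
    """Cache a config file and re-stat() it at most once per recheck interval."""

    def __init__(self, path, recheck_seconds=300.0, clock=time.monotonic):
        self.path = path
        self.recheck_seconds = recheck_seconds
        self._clock = clock
        self._last_check = None
        self._mtime_ns = None
        self._text = None
        self.loads = 0                      # how often the file was actually read

    def text(self):
        now = self._clock()
        due = self._last_check is None or now - self._last_check > self.recheck_seconds
        if due:
            self._last_check = now
            mtime_ns = os.stat(self.path).st_mtime_ns
            if mtime_ns != self._mtime_ns:  # reread only when the file changed
                with open(self.path, encoding="utf-8") as handle:
                    self._text = handle.read()
                self._mtime_ns = mtime_ns
                self.loads += 1
        return self._text
```

A caller would keep one `ReloadingConfig("/etc/nsswitch.conf")` around and call `text()` before each lookup; within the interval no system call is made at all, which keeps the cost of the check small.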
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #13 from forsberg@cendio.se 2007-02-19 08:48 MST -------
(In reply to comment #10)
I don't know about SUSE, but other distributions (Debian, Fedora Core) have a list of daemons they restart when glibc is upgraded. The same list could perhaps be used to let YaST restart daemons after changes to nsswitch.conf?

If you fiddle manually with /etc/nsswitch.conf, you are of course on your own. But on the other hand, people who configure their own nsswitch.conf are supposed to know what they are doing. A warning among the comments in the file should suffice.
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #14 from kukuk@novell.com 2007-02-19 09:00 MST -------
(In reply to comment #13)
> (In reply to comment #10)
> I don't know about SUSE, but other distributions (Debian, Fedora Core) have a
> list of daemons they restart when glibc is upgraded. The same list could
> perhaps be used to let YaST restart daemons after changes to nsswitch.conf?

YaST2 already has such a list, like the other distributions. And like the other distributions, you will always miss daemons or restart them in the wrong order.
https://bugzilla.novell.com/show_bug.cgi?id=242520 ------- Comment #15 from pbaudis@novell.com 2007-03-01 19:55 MST ------- In case we want to avoid these troubles, the other possible way is to ask the user whether to use the straight or the compat way during nss_ldap installation. It's a bit crude but probably the most reasonable solution...? -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=242520

rhafer@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |forsberg@cendio.se

------- Comment #16 from rhafer@novell.com 2007-03-15 07:29 MST -------
(In reply to comment #15)
> In case we want to avoid these troubles, the other possible way is to ask the
> user whether to use the straight or the compat way during nss_ldap
> installation. It's a bit crude but probably the most reasonable solution...?

Most users will have no idea about the differences between the two approaches. Such a question will only confuse them. If we can't improve the performance with "compat", we should IMO switch to "files ldap" or "compat ldap" (if that makes sense) and tell the user to reboot (similar to what the Windows Domain Membership YaST module does).

Thorsten, Petr, are there any other problems with using "files ldap" besides the required reboot and the different way to deny shell access (see comment #8)? Which would be better, "files ldap" or "compat ldap"?
https://bugzilla.novell.com/show_bug.cgi?id=242520 rhafer@novell.com changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |pgephart@novell.com ------- Comment #17 from rhafer@novell.com 2007-03-15 09:28 MST ------- *** Bug 155268 has been marked as a duplicate of this bug. *** -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=242520

cwinberg@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cwinberg@novell.com

------- Comment #18 from cwinberg@novell.com 2007-03-15 11:35 MST -------
Some interesting test data from the customer DIB. The test machine is SLES9 SP3. Tracing a user login to the NDS LDAP server:

    nsswitch.conf setting          | LDAP binds | LDAP searches | LDAP compares
    -------------------------------+------------+---------------+--------------
    "files ldap"                   |          6 |          1767 |             1
    "compat"                       |          5 |          1729 |             1
    "ldap files"                   |          4 |          1729 |             1
    "ldap"                         |          6 |          1768 |             1
    RHES server, same LDAP config  |          5 |            11 |             1
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #19 from cwinberg@novell.com 2007-03-15 12:12 MST -------
Config from my customer's SLES9 machines:

    # passwd: files nis
    # shadow: files nis
    # group: files nis
    passwd: compat
    group: compat
    hosts: files dns
    networks: files dns
    services: files
    protocols: files
    rpc: files
    ethers: files
    netmasks: files
    netgroup: files
    publickey: files
    bootparams: files
    automount: files nis
    aliases: files
    passwd_compat: ldap
    group_compat: ldap

*****From the RHES 3.0******

    passwd: files ldap
    #shadow: files ldap
    group: files ldap
    #hosts: db files nisplus nis dns
    hosts: files dns
    # Example - obey only what nisplus tells us...
    #services: nisplus [NOTFOUND=return] files
    #networks: nisplus [NOTFOUND=return] files
    #protocols: nisplus [NOTFOUND=return] files
    #rpc: nisplus [NOTFOUND=return] files
    #ethers: nisplus [NOTFOUND=return] files
    #netmasks: nisplus [NOTFOUND=return] files
    bootparams: files
    ethers: files
    netmasks: files
    networks: files dns
    protocols: files ldap
    rpc: files
    services: files ldap
    netgroup: files ldap
    publickey: files
    automount: files ldap
    aliases: files
    password_compat: ldap
    group_compat: ldap
    shadow: files ldap

The test user has a primary group assigned that contains 1600 users or so.
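Configurations like the two in comment #19 can be checked mechanically for the pattern this bug is about. The following is a minimal sketch with hypothetical helper names; it only looks for `compat` on the passwd/group lines combined with `ldap` on the matching `*_compat` lines and makes no attempt to interpret action items such as `[NOTFOUND=return]`.

```python
def parse_nsswitch(text):
    """Map each database name to its list of sources, ignoring comments."""
    databases = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        database, sources = line.split(":", 1)
        databases[database.strip()] = sources.split()
    return databases


def compat_over_ldap(databases):
    """Return the databases that reach LDAP through the compat module."""
    flagged = []
    for database in ("passwd", "group"):
        uses_compat = "compat" in databases.get(database, [])
        backend = databases.get(database + "_compat", [])
        if uses_compat and "ldap" in backend:
            flagged.append(database)
    return flagged


if __name__ == "__main__":
    sles = "passwd: compat\ngroup: compat\npasswd_compat: ldap\ngroup_compat: ldap\n"
    print(compat_over_ldap(parse_nsswitch(sles)))
```

Applied to the SLES9 file above this flags both passwd and group, while the RHES file is not flagged because its passwd and group lines name `files ldap` directly.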
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #20 from cwinberg@novell.com 2007-03-15 13:19 MST -------
Interesting data from this config:

    passwd: ldap files
    group: ldap
    hosts: files dns
    networks: files dns
    services: files
    protocols: files
    rpc: files
    ethers: files
    netmasks: files
    netgroup: files
    publickey: files
    bootparams: files
    automount: files nis
    aliases: files
    #passwd_compat: ldap
    #group_compat: ldap

***********packet trace data*****************

    LDAP binds   3
    LDAP search  2
    LDAP compare 1

Does this config present anyone with a problem? It sure is fast.
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #21 from rhafer@novell.com 2007-03-16 09:21 MST -------
(In reply to comment #20)
> Interesting data from this config:
>
> passwd: ldap files
> group: ldap
> [..]
> ***********packet trace data*****************
> LDAP binds 3
> LDAP search 2
> LDAP compare 1
>
> Does this config present anyone with a problem? It sure is fast.

This doesn't make much sense. Apart from the config being broken (it will break local users and groups), I guess that you had nscd running and it already had most things in its caches.
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #22 from rhafer@novell.com 2007-03-16 09:45 MST -------
I did some more tests myself now. The system was openSUSE 10.3alpha2; nscd was not running. The LDAP server has 1000 users, the primary group of the test user has 1000 members, and the user was a (secondary) member of another group which had 1000 members as well.

    compat:      Binds: 7   Searches: 2030
    files ldap:  Binds: 6   Searches: 1021

My Fedora Core 6 test machine gave me these results:

                 Binds: 8   Searches: 20

The difference between 10.3 with compat and 10.3 with files ldap is caused by the additional getgrgid() calls that nss_compat makes (see comment #6).

It took me a little longer to find out why 10.3 with "files ldap" still does about 1000 more searches than the FC6 system. I figured out that those queries are made while executing the /etc/profile script. One of the first things that our /etc/profile script does is "/bin/ls -l /proc/$$/exe", during which another getgrgid() call happens. After removing that "/bin/ls -l /proc/$$/exe" command from /etc/profile, the results were similar to those from the FC6 system.

When nscd is running, these results of course only hold for the first login attempt (with cold nscd caches). Subsequent logins just need some LDAP queries for pam_ldap; the getgrgid() results are provided directly by nscd.
https://bugzilla.novell.com/show_bug.cgi?id=242520

rhafer@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Info Provider|forsberg@cendio.se          |kukuk@novell.com

------- Comment #23 from rhafer@novell.com 2007-03-16 09:47 MST -------
Hm, I just saw that I set needinfo to the wrong person (with comment #16). So, next try:

Thorsten, Petr, are there any other problems with using "files ldap" besides the required reboot and the different way to deny shell access (see comment #8)?
https://bugzilla.novell.com/show_bug.cgi?id=242520 kukuk@novell.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEEDINFO |ASSIGNED Info Provider|kukuk@novell.com | ------- Comment #24 from kukuk@novell.com 2007-03-19 04:11 MST ------- No, should not. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=242520 ------- Comment #25 from cwinberg@novell.com 2007-03-19 07:39 MST ------- Another issue that the customer complains about. After the initial login, moving around the server can be very slow. Any cmd that calls getent takes a long time for the return. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=242520 ------- Comment #26 from cwinberg@novell.com 2007-03-19 07:54 MST ------- I have my customer's nds dibs configured in Provo if you would like to see how my customer has implemented PAM ldap. Tree: Transunion Admin ID: ".admin.services" passwd: "novell" TUXMLPOS 151.155.134.68 NETAPP1_PROD 151.155.134.77 Test authentication IDs cn=cwinber,o=assocaites passwd "biteme36" cn=bobo,o=assocaites passwd "biteme36" If you have a Linux system that you want to observe, modify either of the two test IDs with Console1, select the "other tab" then add the name of your local server to the "hosts" tab and you will be good to go. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #27 from rhafer@novell.com 2007-03-19 07:59 MST -------
(In reply to comment #25)
> Another issue that the customer complains about. After the initial login,
> moving around the server can be very slow.
> Any cmd that calls getent takes a long time for the return.

That's a completely different issue. getpwent/getgrent are slow by design, as they iterate over the complete user/group database. Additionally, neither call is handled by nscd, so the results are not cached (caching the results of the getXXent calls would mean holding the whole database in the cache). getpwent/getgrent should be avoided whenever possible, especially in directory environments (e.g. with nss_ldap or nss_winbind).

AFAIK nss_winbind now even has getgrent() and getpwent() disabled by default (meaning they will not return any results), as they put too much load on the servers and could get even slower than nss_ldap.
https://bugzilla.novell.com/show_bug.cgi?id=242520 rhafer@novell.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |LATER ------- Comment #28 from rhafer@novell.com 2007-03-20 08:18 MST ------- This is now tracked as a Feature Request via Fate. Fate-Id is #302064. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #29 from forsberg@cendio.se 2007-03-20 08:41 MST -------
> This is now tracked as a Feature Request via Fate. Fate-Id is #302064.

Umm.. what does that mean, exactly? Is Fate a system I can access to see further progress on this bug?
https://bugzilla.novell.com/show_bug.cgi?id=242520

------- Comment #30 from rhafer@novell.com 2007-03-20 09:09 MST -------
(In reply to comment #29)
> Umm.. what does that mean, exactly? Is Fate a system I can access to see
> further progress on this bug?

Fate is our feature tracking tool. AFAIK it is not open to the public (yet?).