[Bug 1200030] New: slurmd can't load plugins
https://bugzilla.suse.com/show_bug.cgi?id=1200030

Bug ID: 1200030
Summary: slurmd can't load plugins
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Network
Assignee: screening-team-bugs@suse.de
Reporter: cgoll@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

For some time now, the slurm rpm packages have not worked in openSUSE Tumbleweed. When a service is started, one gets the following error:

```
slurmd: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/select_linear.so): /usr/lib64/slurm/select_linear.so: undefined symbol: gres_ctld_job_build_details
slurmd: error: Couldn't load specified plugin name for select/linear: Dlopen of plugin file failed
slurmd: fatal: Can't find plugin for select/linear
```

This is not related to slurm itself: if the package is compiled on the machine itself, everything works fine. (I used the *exact* same ./configure for building, including $CFLAGS.) This must have something to do with the symbol stripping in Tumbleweed; the Leap packages work fine.

Any comments?

--
You are receiving this mail because:
You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c1
Jan Engelhardt
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c2
Christian Goll
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c3
--- Comment #3 from Jan Engelhardt
Marcus Meissner
Yiannis Bonatakis
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c4
--- Comment #4 from Christian Goll
> The error still reproduces (despite SUSE_ZNOW=0) when you invoke
>
> LD_BIND_NOW=1 /usr/sbin/slurmd
>
> for exactly the reason I have given. In detail:
>
> 1. slurmd dlopens (in my case) /usr/lib64/slurm/select_cons_tres.so
> 2. `nm -D select_cons_tres.so` yields " U gres_ctld_job_build_details".
> 3. All symbols must be resolvable upon load. This is not the case for select_cons_tres, select_linear, and probably more plugins.
> 4. If a symbol is not resolved on load, dlopen will fail.
> 5. If LD_BIND_NOW/ZNOW=0 and dlopen() was invoked with RTLD_LAZY, then the program will abort when the attempt to resolve is made and no match is found. (Which can be never if the piece of code in question is never invoked.)
>
> And slurmd relying on #5 is a bad idea.
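
The check in step 2 can be scripted. What follows is only a rough sketch, not anything from the slurm packaging: it assumes the text produced by `nm -D` has already been captured, and the function name `undefined_symbols` and the sample lines (other than the `gres_ctld_job_build_details` entry quoted above) are made up for illustration.

```python
def undefined_symbols(nm_output: str) -> list[str]:
    """Return the names marked 'U' (undefined) in `nm -D` output text."""
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries carry no address column, so they split into
        # exactly two fields: the type letter and the symbol name.
        if len(fields) == 2 and fields[0] == "U":
            # Drop a symbol-version suffix such as "@GLIBC_2.2.5" if present.
            symbols.append(fields[1].split("@", 1)[0])
    return symbols


# Hypothetical sample: only the first line is taken from the report above.
sample = """\
                 U gres_ctld_job_build_details
                 w __gmon_start__
0000000000004f10 T example_defined_function
"""
print(undefined_symbols(sample))  # ['gres_ctld_job_build_details']
```

A "U" entry is not an error in itself, since libc functions show up the same way. It only becomes a problem when neither the executable nor any library already loaded provides the symbol at the moment it has to be resolved.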
I fully agree with you that relying on #5 is a bad idea, but this is how upstream has implemented it. For example, to find out which MPI startup methods can be used, there is the command

```
srun --mpi=list
```

which tries to load all MPI-related shared libraries and lists only the ones it can load without error.
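
The "try to load everything, list what loads" pattern can be illustrated roughly as follows. This is not Slurm's code, just a minimal Python sketch of the idea; the helper name `loadable_plugins` and the example path are hypothetical. It asks for `RTLD_NOW` explicitly so that an unresolved symbol shows up at load time instead of at first call.

```python
import ctypes
import os


def loadable_plugins(paths):
    """Return the subset of paths that dlopen() accepts with RTLD_NOW."""
    usable = []
    for path in paths:
        try:
            # RTLD_NOW asks the dynamic linker to resolve every symbol
            # immediately, so a missing symbol makes the load fail here.
            ctypes.CDLL(path, mode=os.RTLD_NOW)
        except OSError:
            # Not loadable (missing file, bad format, undefined symbol):
            # leave it out of the list, as a listing command would.
            continue
        usable.append(path)
    return usable


# Hypothetical path that does not exist, so nothing is reported as usable.
print(loadable_plugins(["/nonexistent/example/mpi_example.so"]))  # []
```

With this approach a plugin that has an unresolvable symbol is simply left out of the list, which is presumably acceptable for a listing command but not for a plugin the daemon needs in order to start.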
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c5
Egbert Eich
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c6
--- Comment #6 from Jan Engelhardt
> further information on what you refer to as 'industry standard'

The info pages for GNU libtool have this to say:

"""
'-no-undefined'
Declare that OUTPUT-FILE does not depend on any libraries other than the ones listed on the command line, i.e., after linking, it will not have unresolved symbols. Some platforms require all symbols in shared libraries to be resolved at library creation (*note Inter-library dependencies::), and using this parameter allows 'libtool' to assume that this will not happen.
"""

"""
Some platforms, such as Windows, do not even allow you this flexibility. In order to build a shared library, it must be entirely self-contained or it must have dependencies known at link time (that is, have references only to symbols that are found in the '.lo' files or the specified '-l' libraries), and you need to specify the '-no-undefined' flag. By default, libtool builds only static libraries on these kinds of platforms.
"""
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c7
--- Comment #7 from Egbert Eich
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c8
Marcus Meissner
Chenzi Cao
https://bugzilla.suse.com/show_bug.cgi?id=1200030#c12
--- Comment #12 from Egbert Eich