[Bug 1065464] New: Don't allow multiple OpenCL ICD packages to be installed simultaneously
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464

Bug ID: 1065464
Summary: Don't allow multiple OpenCL ICD packages to be installed simultaneously
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Installation
Assignee: yast2-maintainers@suse.de
Reporter: sonichedgehog_hyperblast00@yahoo.com
QA Contact: jsrain@suse.com
Found By: ---
Blocker: ---

I considered this important enough to warrant a bug report. The forum thread where I discovered and further explained the issue can be found here:

https://forums.opensuse.org/showthread.php/527821-Blender-crashes-at-startup...

When running the command 'zypper dup' without additional parameters, zypper wants to install the OpenCL ICD named pocl. However, Mesa already uses another ICD, which seems to come in the package libOpenCL1. This is a problem because, from the looks of it, only one ICD may be installed on the system at a time; more than one causes a conflict, and every OpenCL application crashes with the following error:

mesa: CommandLine Error: Option 'enable-value-profiling' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

To test and confirm this, install both "libOpenCL1" and "pocl" and then attempt to run Blender 3D: the application should crash on startup with the console output quoted above. After removing the pocl package, Blender starts and works properly again.

Until this bug is solved by the X11 developers, it might be a good idea for the openSUSE repositories to be aware of it and mark multiple OpenCL ICD packages as incompatible, recommending that only one be installed at a time. Otherwise users may find that OpenCL applications have suddenly stopped working and not know how to fix the problem. A workaround for the time being is to mark pocl as "taboo / never install".
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464

Mircea Kitsune <sonichedgehog_hyperblast00@yahoo.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |sonichedgehog_hyperblast00@yahoo.com
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c1

Andreas Stieger <astieger@suse.com> changed:

What       |Removed                   |Added
----------------------------------------------------------------------------
CC         |                          |astieger@suse.com, mardnh@gmx.de
Component  |Installation              |Other
Assignee   |yast2-maintainers@suse.de |mardnh@gmx.de
QA Contact |jsrain@suse.com           |qa-bugs@suse.de

--- Comment #1 from Andreas Stieger <astieger@suse.com> ---
This needs to be done by the packages themselves, not repo/YaST/libzypp. Either via direct explicit conflict, or...

Provides: opencl-icd
Conflicts: otherproviders(opencl-icd)

Assign to maintainer of both packages.
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464

Martin Pluskal <mpluskal@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |mpluskal@suse.com
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c2

--- Comment #2 from Martin Hauke <mardnh@gmx.de> ---
When multiple ICDs are installed, libopencl1 needs to dlopen() them all to find out which one works on the available hardware. If they are dynamically linked, this leads to them sharing a libllvm, which has enough global state that this is likely to error out. (This is a known LLVM bug, https://bugs.llvm.org/show_bug.cgi?id=22952 , but currently has no real fix.)

I'm regularly using multiple ICDs (pocl, nvidia-binary, intel-binary) and have never had any issues, and I do not really like the idea of allowing only one ICD to be installed at a time.

Afaik we're actually shipping three ICDs in Tumbleweed that make use of libllvm:

* beignet
* pocl
* mesa

A workaround used by the Debian packaging team is to statically link all these packages to avoid sharing a libllvm. Imo we should do the same.

Any objections?
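To see which ICDs the loader would try to dlopen() on a given machine, one can list the vendor files it reads. This is only a minimal sketch: it assumes the common Linux convention that each ICD registers itself with a small `*.icd` file under /etc/OpenCL/vendors naming its driver library. The function name `list_icds` is made up for illustration and does not come from any package in this thread.

```python
from pathlib import Path


def list_icds(vendor_dir="/etc/OpenCL/vendors"):
    """Return {icd_file_name: library_named_inside} for every *.icd file.

    Assumption: vendor_dir follows the usual Linux ICD-loader layout, where
    each *.icd file holds the name or path of one driver library.
    """
    icds = {}
    directory = Path(vendor_dir)
    if not directory.is_dir():
        return icds  # nothing registered, or a non-standard location
    for icd_file in sorted(directory.glob("*.icd")):
        lines = icd_file.read_text(errors="replace").splitlines()
        # Use the first non-empty line as the library name.
        library = next((ln.strip() for ln in lines if ln.strip()), "")
        icds[icd_file.name] = library
    return icds


if __name__ == "__main__":
    for name, library in list_icds().items():
        print(f"{name}: {library}")
```

More than one entry means the loader has more than one driver library to open, which is the situation discussed above.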
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c3

--- Comment #3 from Mircea Kitsune <sonichedgehog_hyperblast00@yahoo.com> ---
(In reply to Martin Hauke from comment #2)

I don't have any knowledge of how the OpenCL ICD is supposed to work, so I don't know whether the problem I'm seeing is normal or the result of a bug.

From what you're saying, it is in fact a bug... therefore my proposal here would be a temporary workaround, not the proper permanent solution.

Whether or not it's a good idea would depend on how fast this bug is expected to be solved in LLVM: if it's going to take more than a year, like a lot of issues seem to nowadays, it makes sense to me that openSUSE recommends you don't install multiple ICDs. The report you linked was opened in 2015 and last modified in 2016... since 2017 is almost over, I'm definitely not getting my hopes up for a timely solution.
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c4

--- Comment #4 from Mircea Kitsune <sonichedgehog_hyperblast00@yahoo.com> ---
Sorry, forgot to write the other part of my last comment: If statically linking them solves the crash, I definitely have no objections. This feels like the cleanest solution to my limited understanding.
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464

Martin Hauke <mardnh@gmx.de> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |idonmez@suse.com
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464

Michal Srb <msrb@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |msrb@suse.com
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c7

--- Comment #7 from Michal Srb <msrb@suse.com> ---
First of all, the whole purpose of ICDs is to let multiple drivers coexist on the same system and to give applications a method of enumerating them and choosing one. So making the packages that provide ICDs conflict with each other would defeat their purpose.

Secondly, making llvm static would have a big negative impact on the distribution. It is against our guidelines (https://en.opensuse.org/openSUSE:Packaging_guidelines#Static_Libraries), which exist for good reasons. Building it statically would mean that every package that uses it would have to be rebuilt and retested every time llvm changes. It would also increase the resources used for building llvm and all packages that use it. Building llvm already takes a lot of resources and puts stress on our build service. I have spent the last two months optimizing the llvm build to get it to at least acceptable levels.

Note that we have never distributed llvm static libraries. The recent changes in the package only changed how we get rid of them during the build.

So I consider switching llvm to static a last resort, to be used only if there is no other solution.

I have reproduced this bug and analyzed it. The issue is not that libLLVM.so is loaded twice, but that Mesa loads libLLVM.so.5 while pocl loads libLLVM.so.4. That is the thing that recently changed: we introduced llvm 5 and made it the default. Mesa was rebuilt against llvm 5, but pocl was not, because we have pocl version 0.14, which supports at most llvm 4.

I have updated pocl to version 1.0 (which supports llvm 5) and rebuilt it with llvm 5. It seems to solve the issue. I will double-check it and, if it is correct, submit the updated pocl.
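One way to spot the kind of mismatch described above is to compare which libLLVM each ICD library links against. The sketch below only parses text in the format printed by `ldd <library>`; how that text is collected is left to the reader. The helper names and the soname pattern are illustrative assumptions, and library file names differ between distributions.

```python
import re

# Soname spellings vary (libLLVM.so.5, libLLVM-5.0.so, ...), so this
# pattern is a best guess rather than a fixed rule.
_LLVM_SONAME = re.compile(r"^libLLVM[^/\s]*\.so[^/\s]*$")


def llvm_sonames(ldd_output):
    """Return the set of libLLVM sonames listed in one `ldd` output."""
    sonames = set()
    for line in ldd_output.splitlines():
        fields = line.split()
        # ldd prints "<soname> => <path> (<address>)"; the soname comes first.
        if fields and _LLVM_SONAME.match(fields[0]):
            sonames.add(fields[0])
    return sonames


def mixed_llvm(ldd_outputs):
    """Map each library to its libLLVM sonames and flag disagreement.

    ldd_outputs: {library_name: text printed by `ldd <library>`}
    Returns (per_library_sonames, True if more than one soname is in use).
    """
    per_lib = {name: llvm_sonames(text) for name, text in ldd_outputs.items()}
    distinct = set().union(*per_lib.values())
    return per_lib, len(distinct) > 1
```

If the second return value is true, the ICDs would pull different libLLVM versions into the same process, as Mesa and pocl 0.14 did here.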
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c8

--- Comment #8 from Michal Srb <msrb@suse.com> ---
I have now noticed that Martin already prepared an update to version 1.0 in the devel project, and it is on its way to Factory and eventually to Tumbleweed:

https://build.opensuse.org/package/show/science/pocl

Mircea, can you try to install it and test?
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c9

--- Comment #9 from Mircea Kitsune <sonichedgehog_hyperblast00@yahoo.com> ---
(In reply to Michal Srb from comment #8)

What does testing imply? This is my desktop computer, thus I can't risk playing with the software repositories in a way that might mess anything up. However, I run openSUSE Tumbleweed: once the changes are in, I can safely remove the Taboo (Never Install) lock from the package and see how the system behaves after a new 'zypper dup'.
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c10

--- Comment #10 from Michal Srb <msrb@suse.com> ---
(In reply to Mircea Kitsune from comment #9)
> What does testing imply? This is my desktop computer, thus I can't risk playing with the software repositories in a way that might mess anything up.

If you do not want to mess with repositories, it is enough to download and install this single RPM:

https://download.opensuse.org/repositories/science/openSUSE_Tumbleweed/x86_6...

All its dependencies are in the Tumbleweed repository, and most likely you already have them installed. Try to start Blender and observe whether it crashes or works. You can uninstall the pocl package again after that.
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c11

--- Comment #11 from Mircea Kitsune <sonichedgehog_hyperblast00@yahoo.com> ---
(In reply to Michal Srb from comment #10)

Thanks for the info. I tried that package and unfortunately, it still appears to induce the exact same crash as the one currently in Tumbleweed:

mircea@linux-qz0r:~> blender
Read prefs: /home/mircea/.config/blender/2.79/config/userpref.blend
mesa: CommandLine Error: Option 'enable-value-profiling' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c13

Michal Srb <msrb@suse.com> changed:

What     |Removed       |Added
----------------------------------------------------------------------------
Status   |NEW           |IN_PROGRESS
Assignee |mardnh@gmx.de |msrb@suse.com

--- Comment #13 from Michal Srb <msrb@suse.com> ---
Ok, updating pocl to libLLVM5 did solve my initial issue, but that was a different issue from the one you are seeing. After I installed both the pocl and beignet packages, I can see the same error as you: "Option 'enable-value-profiling' registered more than once!"

'enable-value-profiling' is an option from clang, specifically from the CodeGen component. It is registered in the constructor of a static variable in clang. It registers itself by calling an llvm function, which stores it in a map that is itself held in a static variable in the llvm library.

The issue here is that both pocl and beignet are linked with the static libclangCodeGen.a and both link dynamically with libLLVM.so. So each of them has its own copy of clang's CodeGen. They both try to register the same option during initialization and end up saving it into the same map in the shared llvm library. The second attempt to register it fails.

Compiling all clang and llvm libraries as static would be a solution, true. Similarly, compiling them all as dynamic libraries would work too, but originally I thought we could not do that, because using BUILD_SHARED_LIBS=ON is not supported and buggy, and LLVM_LINK_LLVM_DYLIB=ON does not work with libclang.

I've checked what other distributions do and I would like to try Fedora's approach: combine the two, building clang with BUILD_SHARED_LIBS=ON and everything else with LLVM_LINK_LLVM_DYLIB=ON. That way we hopefully 1) solve this bug, 2) avoid the bugs that BUILD_SHARED_LIBS=ON was causing, and 3) avoid building a huge number of static libraries.

Martin, since pocl seems to be fine and this is an llvm issue, I am reassigning it to myself.
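The mechanism in comment #13 can be modelled in a few lines. This is a toy illustration in Python, not LLVM's actual code: `OptionRegistry` stands in for the option map held by the shared libLLVM, and every class and function name here is invented for the sketch. Only the option name and the error wording are taken from the thread.

```python
class OptionRegistry:
    """Stands in for the option map kept in a static variable of shared libLLVM."""

    def __init__(self):
        self._options = {}

    def register(self, name, owner):
        if name in self._options:
            raise RuntimeError(f"Option '{name}' registered more than once!")
        self._options[name] = owner


def init_codegen(registry, owner):
    """Model one copy of clang's CodeGen running its static constructors."""
    registry.register("enable-value-profiling", owner)


def load_icds(owners, shared_codegen):
    """Load the given ICDs against a single shared registry.

    shared_codegen=False: each ICD carries its own static CodeGen copy, so
    each one tries to register the option.
    shared_codegen=True: one dynamically linked CodeGen is initialised once,
    however many ICDs use it.
    """
    registry = OptionRegistry()
    if shared_codegen:
        init_codegen(registry, "libclang (shared)")
        return registry
    for owner in owners:
        init_codegen(registry, owner)
    return registry
```

With these definitions, `load_icds(["pocl", "beignet"], shared_codegen=False)` raises on the second registration, while the same call with `shared_codegen=True` succeeds. That mirrors why linking clang dynamically is expected to avoid the crash.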
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c16

Michal Srb <msrb@suse.com> changed:

What       |Removed     |Added
----------------------------------------------------------------------------
Status     |IN_PROGRESS |RESOLVED
Resolution |---         |FIXED

--- Comment #16 from Michal Srb <msrb@suse.com> ---
The above-mentioned changes have finally bubbled through to Tumbleweed and I was able to test them. The issue is fixed on my machine. Pocl, Beignet and other clang users now link to the libclang and libLLVM libraries dynamically, so only one instance is loaded and the options are registered only once.

Closing the bug as fixed.
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c17

--- Comment #17 from Mircea Kitsune <sonichedgehog_hyperblast00@yahoo.com> ---
Confirming that the issue was solved: I unblocked pocl and ran 'zypper dup', which reinstalled it. I can now open Blender without it crashing or printing any errors. Thank you!
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464
http://bugzilla.opensuse.org/show_bug.cgi?id=1065464#c31

--- Comment #31 from OBSbugzilla Bot <bwiedemann+obsbugzillabot@suse.com> ---
This is an autogenerated message for OBS integration:
This bug (1065464) was mentioned in
https://build.opensuse.org/request/show/932377 Backports:SLE-15-SP3 / llvm12