On 12/9/18 4:18 PM, Todd Rme wrote:
Hi,
I went through a few packages which have an openMPI dependency or support, and found it quite mixed up:
Currently, we have openmpi(1), openmpi2 and openmpi3 in Leap and TW. While openmpi3 is currently unused, openmpi1 and openmpi2 are both used, with similar frequency:
https://build.opensuse.org/package/binary/openSUSE:Factory/openmpi2:standard... standard/x86_64/openmpi2-libs-2.1.5-2.1.x86_64.rpm
https://build.opensuse.org/package/binary/openSUSE:Factory/openmpi:standard/ standard/x86_64/openmpi-libs-1.10.7-21.1.x86_64.rpm
Several programs will end up implicitly linking to both versions, as libnetcdf and hdf5 use openmpi1 and boost_mpi uses openmpi2. One example is vtk.
As both libraries (libmpi.so.12 and libmpi.so.20) export largely the same symbols, this is mayhem waiting to happen.
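One way to spot such a double link is to scan ldd-style output for more than one major version of libmpi.so. The following is only a minimal sketch in Python, assuming the usual "name => path (address)" line layout; the function name, the sample text and the library paths in it are hypothetical and not taken from an actual vtk build:

```python
import re

# Matches the soname at the start of an ldd-style line, e.g.
# "libmpi.so.12 => /some/path/libmpi.so.12 (0x...)".
# Group 1 is the unversioned name, group 2 the major version.
_SONAME_RE = re.compile(r"^\s*(lib[\w+.-]+?\.so)\.(\d+)\b")


def mpi_major_versions(ldd_output, library="libmpi.so"):
    """Return the sorted major versions of `library` seen in ldd-style text."""
    versions = set()
    for line in ldd_output.splitlines():
        match = _SONAME_RE.match(line)
        if match and match.group(1) == library:
            versions.add(int(match.group(2)))
    return sorted(versions)


# Illustrative input only: paths and addresses are made up.
SAMPLE_LDD_OUTPUT = """\
\tlinux-vdso.so.1 (0x00007ffd00000000)
\tlibmpi.so.12 => /hypothetical/openmpi/lib64/libmpi.so.12 (0x00007f0000000000)
\tlibmpi.so.20 => /hypothetical/openmpi2/lib64/libmpi.so.20 (0x00007f0000100000)
\tlibmpi_cxx.so.20 => /hypothetical/openmpi2/lib64/libmpi_cxx.so.20 (0x00007f0000200000)
\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000300000)
"""

found = mpi_major_versions(SAMPLE_LDD_OUTPUT)
if len(found) > 1:
    print("binary pulls in several libmpi versions:", found)
```

In practice the text would come from running ldd on the binary in question; a result with more than one entry means two Open MPI runtimes end up in the same process.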
For SLE, different MPI versions/implementations are supported using the HPC modules, but for Leap/TW, we should obviously stick with *one* single canonical version.
Question now, which version to choose?
Apparently, openmpi2 does not work on all architectures (PPC, PPC64BE) [1], and is not supported by some software packages [2].
Are there any drawbacks for using openmpi1 everywhere in TW/Leap 15.x?
I have opened a bug report: https://bugzilla.opensuse.org/show_bug.cgi?id=1118861
Kind regards,
Stefan
[1] "Stay with openmpi(1) also on PPC", boost, 2018-10-01, https://build.opensuse.org/request/show/639401
[2] "Cntk packages do not support OpenMPI 2+", https://github.com/Microsoft/CNTK/issues/3197
-- 
Stefan Brüns / Bergstraße 21 / 52062 Aachen
home: +49 241 53809034 mobile: +49 151 50412019
Todd Rme's reply to the message above (which Stefan Brüns <stefan.bruens@rwth-aachen.de> wrote on Sat, Dec 8, 2018 at 2:53 PM):
No matter what we pick, I think it would be a good idea to do what we do with, say, gcc and llvm/clang, where we have separate "openmpi1", "openmpi2", and "openmpi3" packages, and have the "openmpi" package refer to the default version. This would make it easy to change default versions in the future, or set default versions on a per-architecture basis.
As for openmpi 1 vs openmpi 2, the problem with openmpi 1 is that it is unmaintained [1]. The current version of openmpi is actually version 4. So using openmpi 1 as the default comes with all the problems associated with unmaintained software, especially network-oriented software. openmpi 2 also adds support for MPI 3.x features.
openmpi 2 is supposed to support PPC. If it doesn't, that is probably a bug that should be reported upstream. Unfortunately, the linked request doesn't explain what the problem is.
It was disabled for ppc64be in v2.1.2 but re-enabled in v2.1.4. See: https://github.com/open-mpi/ompi/issues/4349#issuecomment-374970982
My two cents on the MPI version pick:
- openmpi1 has been unmaintained for over a year now. It is also deprecated in SLES/Leap 15, although still available. We know there are some issues, especially regarding the latest RDMA hardware. IMHO this should be dropped completely from Factory.
- openmpi2 is the new "default" for SLES15. It seems to work well and is old enough to be stable.
- openmpi3 was not picked for SLES15 as it had been released very recently at the time and was still pretty unstable, even when running the testsuite it came with. We decided not to ship it. It might be mature enough by now to be a good candidate.
- openmpi4 is just barely out. I haven't gotten around to testing it yet, but my best guess is that it will be similar to openmpi3 when it came out: working, but with lots of instability and issues (usually on some non-x86_64 architectures). I think it is too early to use it, although it should be packaged and available in Factory.
TL;DR: I think openmpi2 and 3 are good candidates. openmpi2 has my preference because it means we can stay more in sync with SLES and Leap 15 (which do not have openmpi3).
Regarding the rest of the discussion, I've replied in BZ#1118861.
Nicolas