openSUSE-RU-2018:2732-1: moderate: Recommended update for slurm
openSUSE Recommended Update: Recommended update for slurm
______________________________________________________________________________

Announcement ID:    openSUSE-RU-2018:2732-1
Rating:             moderate
References:         #1084917 #1103561
Affected Products:
                    openSUSE Leap 15.0
______________________________________________________________________________

An update that has two recommended fixes can now be installed.

Description:

This update for slurm provides version 17.11.9 and fixes the following issues:

- When using a remote shared StateSaveLocation, slurmctld needs to be started
  after remote filesystems have become available. (bsc#1103561)
- Fix race in the slurmctld backup controller which prevents it from cleaning
  up allocations on nodes properly after failing over. (bsc#1084917)
- Fix segfault in slurmctld when a job's node bitmap is NULL during a
  scheduling cycle.
- Remove erroneous unlock in acct_gather_energy/ipmi.
- Enable support for hwloc version 2.0.1.
- Fix 'srun -q' (--qos) option handling.
- Fix socket communication issue that can lead to lost task completion
  messages, which will cause a permanently stuck srun process.
- Avoid node layout fragmentation if running with a fixed CPU count but
  without Sockets and CoresPerSocket defined.
- burst_buffer/cray: Fix datawarp swap default pool overriding jobdw.
- Fix incorrect job priority assignment for multi-partition job with
  different PriorityTier settings on the partitions.
- Fix sinfo to print correct node state.
- Do not allocate nodes that were marked down due to the node not responding
  by ResumeTimeout.
- task/cray plugin: Search for "mems" cgroup information in the file
  "cpuset.mems" then fall back to the file "mems".
- Fix ipmi profile debug uninitialized variable.
- PMIx: Fixed the direct connect inline msg sending.
- MySQL: Fix issue not handling all fields when loading an archive dump.
- Allow a job_submit plugin to change the admin_comment field during
  job_submit_plugin_modify().
- job_submit/lua: Fix access into reservation table.
- MySQL: Prevent deadlock caused by archive logic locking reads.
- Don't enforce MaxQueryTimeRange when requesting specific jobs.
- Modify --test-only logic to properly support jobs submitted to more than
  one partition.
- Prevent slurmctld from aborting when attempting to set a non-existing qos
  as def_qos_id.
- Add new job dependency type of "afterburstbuffer". The pending job will be
  delayed until the first job completes execution and its burst buffer
  stage-out is completed.
- Reorder proctrack/task plugin load in the slurmstepd to match that of
  slurmd and avoid the race condition that calling task before proctrack can
  introduce.
- Prevent reboot of a busy KNL node when requesting inactive features.
- Fix to reinitialize previously adjusted job members to their original value
  when validating the job memory in multi-partition requests.
- Fix _step_signal() from always returning SLURM_SUCCESS.
- Combine active and available node feature change logs on one line rather
  than one line per node for performance reasons.
- Prevent occasionally leaking freezer cgroups.
- Fix potential segfault when closing the mpi/pmi2 plugin.
- Fix issues with --exclusive=[user|mcs] to work correctly with preemption or
  when job requests a specific list of hosts.
- mpi/pmix: Fixed the collectives canceling.
- SlurmDBD: Improve error message handling on archive load failure.
- Fix incorrect locking when deleting reservations.
- Fix incorrect locking when setting up the power save module.
- Fix setting format output length for squeue when showing array jobs.
- Add xstrstr function.
- Fix printing out of --hint options in sbatch, salloc --help.
- Prevent possible divide by zero in _validate_time_limit().
- Add Delegate=yes to the slurmd.service file to prevent systemd from
  interfering with the jobs' cgroup hierarchies.
- Change the backlog argument to the listen() syscall within srun to 4096 to
  match elsewhere in the code, and avoid communication problems at scale.
- Recommend slurm-munge for slurm-slurmdbd.

This update was imported from the SUSE:SLE-15:Update update project.

Patch Instructions:

To install this openSUSE Recommended Update use the SUSE recommended
installation methods like YaST online_update or "zypper patch".

Alternatively you can run the command listed for your product:

- openSUSE Leap 15.0:

  zypper in -t patch openSUSE-2018-1009=1

Package List:

- openSUSE Leap 15.0 (x86_64):

  libpmi0-17.11.9-lp150.5.11.1
  libpmi0-debuginfo-17.11.9-lp150.5.11.1
  libslurm32-17.11.9-lp150.5.11.1
  libslurm32-debuginfo-17.11.9-lp150.5.11.1
  perl-slurm-17.11.9-lp150.5.11.1
  perl-slurm-debuginfo-17.11.9-lp150.5.11.1
  slurm-17.11.9-lp150.5.11.1
  slurm-auth-none-17.11.9-lp150.5.11.1
  slurm-auth-none-debuginfo-17.11.9-lp150.5.11.1
  slurm-config-17.11.9-lp150.5.11.1
  slurm-debuginfo-17.11.9-lp150.5.11.1
  slurm-debugsource-17.11.9-lp150.5.11.1
  slurm-devel-17.11.9-lp150.5.11.1
  slurm-doc-17.11.9-lp150.5.11.1
  slurm-lua-17.11.9-lp150.5.11.1
  slurm-lua-debuginfo-17.11.9-lp150.5.11.1
  slurm-munge-17.11.9-lp150.5.11.1
  slurm-munge-debuginfo-17.11.9-lp150.5.11.1
  slurm-node-17.11.9-lp150.5.11.1
  slurm-node-debuginfo-17.11.9-lp150.5.11.1
  slurm-openlava-17.11.9-lp150.5.11.1
  slurm-pam_slurm-17.11.9-lp150.5.11.1
  slurm-pam_slurm-debuginfo-17.11.9-lp150.5.11.1
  slurm-plugins-17.11.9-lp150.5.11.1
  slurm-plugins-debuginfo-17.11.9-lp150.5.11.1
  slurm-seff-17.11.9-lp150.5.11.1
  slurm-sjstat-17.11.9-lp150.5.11.1
  slurm-slurmdbd-17.11.9-lp150.5.11.1
  slurm-slurmdbd-debuginfo-17.11.9-lp150.5.11.1
  slurm-sql-17.11.9-lp150.5.11.1
  slurm-sql-debuginfo-17.11.9-lp150.5.11.1
  slurm-sview-17.11.9-lp150.5.11.1
  slurm-sview-debuginfo-17.11.9-lp150.5.11.1
  slurm-torque-17.11.9-lp150.5.11.1
  slurm-torque-debuginfo-17.11.9-lp150.5.11.1

References:

  https://bugzilla.suse.com/1084917
  https://bugzilla.suse.com/1103561
maintenance@opensuse.org