This update for slurm2302 fixes the following issues:
Updated to version 23.02.5 with the following changes:
Bug Fixes:
* SLURM_NTASKS was no longer set in the job's environment when
  --ntasks-per-node was requested. The method by which it is set, however,
  is different and should be more accurate in more situations.
* SrunPortRange option. This matches the new behavior of the pmix plugin in
  23.02.0. Note that neither of these plugins makes use of the
  MpiParams=ports= option, and previously they were only limited by the
  system's ephemeral port range.
* job_container/tmpfs - Avoid attempts to share BasePath between nodes.
* CR_Cpu_Memory, fix node selection for jobs that request gres and
  --mem-per-cpu.
* slurmctld segfault when a node registers with a configured CpuSpecList
  while slurmctld configuration has the node without CpuSpecList.
* POWERED_DOWN+NO_RESPOND state after not registering by ResumeTimeout.
* slurmstepd - Avoid cleanup of config.json-less containers spooldir
  getting skipped.
* bind() and listen() calls in the network stack when running with
  SrunPortRange set.
* -a/--all option for privileged users.
* --noheader option.
* node_features/helpers - Fix node selection for jobs requesting changeable
  features with the | operator, which could prevent jobs from
  running on some valid nodes.
* node_features/helpers - Fix inconsistent handling of & and |, where an
  AND'd feature was sometimes AND'd to all sets of features instead of just
  the current set. E.g. foo|bar&baz was interpreted as {foo,baz} or
  {bar,baz} instead of how it is documented: {foo} or {bar,baz}.
* AllocNodes while it is pending or if it is canceled before restarting.
* sacct - AllocCPUS now correctly shows 0 if a job has not yet received an
  allocation or if the job was canceled before getting one.
* /dev/dri/renderD[0-9]+ GPUs, and do not detect /dev/dri/card[0-9]+.
* --gpus and a number of tasks fewer than GPUs, which resulted in
  incorrectly rejecting these jobs.
* MYSQL_OPT_RECONNECT completely.
* POWERING_UP state disappearing (getting set to FUTURE) when an scontrol
  reconfigure happens.
* openapi/dbv0.0.39 - Avoid assert / segfault on missing coordinators
  list.
* slurmrestd - Correct memory leak while parsing OpenAPI specification
  templates with server overrides.
* rpc_queue is enabled.
* slurmrestd - Correct OpenAPI specification generation bug where fields
  with overlapping parent paths would not get generated.
* --gres=none sometimes not ignoring gres from the job.
* --exclusive jobs incorrectly gang-scheduling where they shouldn't.
* CR_SOCKET, gres not assigned to a specific socket, and block core
  distribution potentially allocating more sockets than required.
* PrologEpilogTimeout) is strongly encouraged to avoid Slurm waiting
  indefinitely for scripts to finish.
* slurmdbd -R not returning an error under certain conditions.
* slurmdbd - Avoid potential NULL pointer dereference in the mysql plugin.
* /etc/hosts.
* openapi/[db]v0.0.39 - Fix memory leak on parsing error.
* data_parser/v0.0.39 - Fix updating qos for associations.
* openapi/dbv0.0.39 - Fix updating values for associations with null
  users.
* --tres-per-task and licenses.
* --cpus-per-task < usable threads per core.
* slurmrestd - For GET /slurm/v0.0.39/node[s], change format of node's
  energy field current_watts to a dictionary to account for unset value
  instead of dumping 4294967294.
* slurmrestd - For GET /slurm/v0.0.39/qos, change format of QOS's field
  'priority' to a dictionary to account for unset value instead of dumping
  4294967294.
* GET /slurm/v0.0.39/job[s], the 'return code' field in
  v0.0.39_job_exit_code will be set to -127 instead of being left unset
  when a job does not have a relevant return code.
Other Changes:
* JobId to debug() messages indicating when cpus_per_task/mem_per_cpu or
  pn_min_cpus are being automatically adjusted.
* slurmstepd - Cleanup per task generated environment for containers in
  spooldir.
* slurmrestd - Reduce memory usage when printing out job CPU frequency.
* data_parser/v0.0.39 - Add required/memory_per_cpu and
  required/memory_per_node to sacct --json and sacct --yaml and
  GET /slurmdb/v0.0.39/jobs from slurmrestd.
* gpu/oneapi - Store cores correctly so CPU affinity is tracked.
* slurmdbd -R to work if the root assoc id is not 1.
* TreeWidth. Since unresolvable cloud/dynamic nodes must disable fanout by
  setting TreeWidth to a large number, this would cause all nodes to
  register at once.
{
"binaries": [
{
"slurm_23_02-rest": "23.02.5-150300.7.11.2",
"libnss_slurm2_23_02": "23.02.5-150300.7.11.2",
"slurm_23_02-config-man": "23.02.5-150300.7.11.2",
"slurm_23_02-node": "23.02.5-150300.7.11.2",
"slurm_23_02-webdoc": "23.02.5-150300.7.11.2",
"slurm_23_02-auth-none": "23.02.5-150300.7.11.2",
"perl-slurm_23_02": "23.02.5-150300.7.11.2",
"slurm_23_02-plugins": "23.02.5-150300.7.11.2",
"slurm_23_02-lua": "23.02.5-150300.7.11.2",
"libslurm39": "23.02.5-150300.7.11.2",
"slurm_23_02-cray": "23.02.5-150300.7.11.2",
"slurm_23_02-devel": "23.02.5-150300.7.11.2",
"slurm_23_02": "23.02.5-150300.7.11.2",
"libpmi0_23_02": "23.02.5-150300.7.11.2",
"slurm_23_02-doc": "23.02.5-150300.7.11.2",
"slurm_23_02-sview": "23.02.5-150300.7.11.2",
"slurm_23_02-config": "23.02.5-150300.7.11.2",
"slurm_23_02-munge": "23.02.5-150300.7.11.2",
"slurm_23_02-torque": "23.02.5-150300.7.11.2",
"slurm_23_02-pam_slurm": "23.02.5-150300.7.11.2",
"slurm_23_02-sql": "23.02.5-150300.7.11.2",
"slurm_23_02-slurmdbd": "23.02.5-150300.7.11.2",
"slurm_23_02-plugin-ext-sensors-rrd": "23.02.5-150300.7.11.2"
}
]
}
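
The node_features/helpers fix listed under Bug Fixes states that foo|bar&baz is documented to mean {foo} or {bar,baz}, that is, & binds more tightly than |. The following is a minimal Python sketch of that precedence rule only. It is not Slurm's implementation; the function name parse_feature_sets is hypothetical, and the sketch assumes plain feature names without parentheses, brackets, or counts.

```python
from __future__ import annotations


def parse_feature_sets(expr: str) -> list[list[str]]:
    """Split a feature expression into alternative feature sets.

    Illustrative only: '&' is treated as binding tighter than '|', so each
    '|'-separated alternative becomes its own list of AND'd features.
    """
    alternatives = []
    for alternative in expr.split("|"):
        # Features joined by '&' belong only to the current alternative.
        features = sorted(
            name.strip() for name in alternative.split("&") if name.strip()
        )
        if features:
            alternatives.append(features)
    return alternatives


if __name__ == "__main__":
    # Documented reading of the example from the changelog entry.
    print(parse_feature_sets("foo|bar&baz"))  # [['foo'], ['bar', 'baz']]
```

Under this reading, baz is required only together with bar, so a node offering just foo still satisfies the request.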