* Thu Apr 24 2025 Christian Goll <cgoll@suse.com>
- removed openmpi4-hpc dependency for test suite
* Fri Mar 07 2025 Atri Bhattacharya <badshah400@gmail.com>
- Update to version 24.11.3:
* Fix database cluster ID generation not being random.
* Fix a regression in which slurmd -G gave no output.
* Fix a long-standing crash in slurmctld after updating a
reservation with an empty nodelist.
* Other minor to moderate bugs.
- Sync upgrades file to reflect last updated versions.
- Pass '-DH5_USE_112_API -DDH5Oget_info_vers=1' to CFLAGS to allow
building with hdf5 1.14 as slurm does not yet support HDF5 v114
API.
* Fri Feb 07 2025 Egbert Eich <eich@suse.com>
- Update to version 24.11.1:
* With client commands `MIN_MEMORY` will show `mem_per_tres` if
specified.
* Fix errno message about bad constraint.
* `slurmctld` - Fix crash and possible split brain issue if the
backup controller handles an scontrol reconfigure while in control
before the primary resumes operation.
* Fix `stepmgr` not getting dynamic node addrs from the controller.
* `stepmgr` - avoid `Unexpected missing socket` errors.
* Fix `scontrol show steps` with dynamic stepmgr.
* Deny jobs using the `R:` option of `--signal` if `PreemptMode=OFF`
globally.
* Force jobs using the `R:` option of `--signal` to be preemptable
by requeue or cancel only. If `PreemptMode` on the partition or
QOS is off or suspend, the job will default to using
`PreemptMode=cancel`.
* If `--mem-per-cpu` exceeds `MaxMemPerCPU`, the number of CPUs
per task will always be increased even if --cpus-per-task was
specified. This is needed to ensure each task gets the expected
amount of memory.
* Fix compilation issue on openSUSE Leap 15.
* Fix jobs using more nodes than needed when not using `-N`.
* Fix issue with allocation being allocated fewer resources
than needed when using `--gres-flags=enforce-binding`.
* `select/cons_tres` - Fix errors with `MaxCpusPerSocket`
partition limit. Used CPUs/cores weren't counted properly,
nor limiting free ones to avail, when the socket was partially
allocated, or the job request went beyond this limit.
* Fix issue when jobs were preempted for licenses even if there
were enough licenses available.
* Fix `srun` `ntasks` calculation inside an allocation when nodes
are requested using a min-max range.
* Print correct number of digits for `TmpDisk` in `sdiag`.
* Fix a regression in 24.11 which caused file transfers to a job
with sbcast to not join the job container namespace.
* `data_parser/v0.0.40` - Prevent a segfault in the `slurmrestd`
when dumping data with v0.0.40+complex data parser.
* Remove logic to force lowercase GRES names.
* `data_parser/v0.0.42` - Prevent the association id from always
being dumped as NULL when parsing in complex mode. Instead it
will now dump the id. This affects the following endpoints:
- `GET slurmdb/v0.0.42/association`
- `GET slurmdb/v0.0.42/associations`
- `GET slurmdb/v0.0.42/config`
* Fixed a job requeuing issue that merged job entries into the
same SLUID when all nodes in a job failed simultaneously.
* When a job completes, try to give idle nodes to reservations with
the `REPLACE` flag before allowing them to be allocated to jobs.
* Avoid expensive lookup of all associations when dumping or
parsing for v0.0.42 endpoints.
* Avoid expensive lookup of all associations when dumping or
parsing for v0.0.41 endpoints.
* Avoid expensive lookup of all associations when dumping or
parsing for v0.0.40 endpoints.
* Fix segfault when testing jobs against nodes with invalid gres.
* Fix performance regression while packing larger RPCs.
* `job_container/tmpfs` - Fix Xauthority file being created
outside the container when `EntireStepInNS` is enabled.
* `job_container/tmpfs` - Fix `spank_task_post_fork` not always
running in the container when `EntireStepInNS` is enabled.
* Fix a job potentially getting stuck in CG on permissions
errors while setting up X11 forwarding.
* Fix error on X11 shutdown if Xauthority file was not created.
* `slurmctld` - Fix memory or fd leak if an RPC is received that
is not registered for processing.
* Inject `OMPI_MCA_orte_precondition_transports` when using PMIx.
This fixes MPI apps using Intel OPA, PSM2 and OMPI 5.x when run
through `srun`.
* Don't skip the first `partition_job_depth` jobs per partition.
* Fix gres allocation issue after controller restart.
* Fix issue where jobs requesting CPUs-per-GPU hang in queue.
* `switch/hpe_slingshot` - Treat HTTP status forbidden the same as
unauthorized, allowing for a graceful retry attempt.
- Slurmdbd no longer starts as root so that the systemd
`RuntimeDirectory` option works properly. This requires further
adjustments:
* Use /run/slurmdbd as directory for PID file. This is created
when slurmdbd starts by `RuntimeDirectory=slurmdbd` in the
systemd service file (boo#1236928).
* Upon installation, create empty `/var/log/slurmdbd.log` owned
by user slurm as slurmdbd is not able to create it (boo#1236929).
* Fri Jan 17 2025 Egbert Eich <eich@suse.com>
- Make test suite package work on SLE-12.
* Thu Jan 09 2025 Egbert Eich <eich@suse.com>
- Fix testsuite:
Cater for erroneous: `#include </src/[slurm_internal_header]>`
statements.
* Mon Jan 06 2025 Egbert Eich <eich@suse.com>
- Update to version 24.11
* `slurmctld` - Reject arbitrary distribution jobs that do not
specify a task count.
* Fix backwards compatibility of the `RESPONSE_JOB_INFO` RPC
(used by `squeue`, `scontrol show job`, etc.) with Slurm clients
version 24.05 and below. This was a regression in 24.11.0rc1.
* Do not let `slurmctld`/`slurmd` start if there are more nodes
defined in `slurm.conf` than the maximum supported amount
(64k nodes).
* `slurmctld` - Set job's exit code to 1 when a job fails with
state `JOB_NODE_FAIL`. This fixes `sbatch --wait` not being able
to exit with error code when a job fails for this reason in
some cases.
* Fix certain reservation updates requested from 23.02 clients.
* `slurmrestd` - Fix populating non-required object fields of
objects as `{}` in JSON/YAML instead of `null` causing compiled
OpenAPI clients to reject the response to
`GET /slurm/v0.0.40/jobs` due to validation failure of
`.jobs[].job_resources`.
* Fix issue where older versions of Slurm talking to a 24.11 dbd
could lose step accounting.
* Fix minor memory leaks.
* Fix bad memory reference when `xstrchr` fails to find char.
* Remove duplicate checks for a data structure.
* Fix race condition in `stepmgr` step completion handling.
* `slurm.spec` - add ability to specify patches to apply on the
command line.
* `slurm.spec` - add ability to supply extra version information.
* Fix 24.11 HA issues.
* Fix requeued jobs keeping their priority until the decay thread
happens.
* Fix potential memory corruption in `select/cons_tres` plugin.
* Avoid cache coherency issue on non-x86 platforms that could
result in a POSIX signal being ignored or an abort().
* `slurmctld` - Remove assertion in development builds that would
trigger if an outdated client attempted to connect.
* `slurmd` - Wait for `PrologEpilogTimeout` on reconfigure for
prologs to finish. This avoids a situation where the slurmd
never detects that the prolog completed.
* `job_container/tmpfs` - Setup x11 forwarding within the namespace.
* `slurmctld` - fix memory leak when sending a `DBD_JOB_START`
message.
* Fix issue with accounting rollup dealing with association tables.
* Fix minor memory leaks.
* Fix potential thread safety issues.
* Init mutex in burst_buffer plugins.
* `slurmdbd` - don't log errors when no changes occur from db
requests.
* `slurmctld`,`slurmd` - Avoid deadlock during reconfigure if too
many POSIX signals are received.
* Improve error type logged from partial or incomplete reading
from socket or pipe to avoid potentially logging an error from
a previous syscall.
* `slurmrestd` - Improve the handling of queries when unable to
connect to slurmdbd by providing responses when possible.
* `slurmrestd`,`sackd`,`scrun` - Avoid rare hangs related to I/O.
* `scrun` - Add support for `--all` argument for kill subcommand.
* Remove `srun --cpu-bind=rank`.
* Add `resource_spec/cpus` and `resource_spec/memory` entry
points in data_parser to print the `CpuSpecList` and
`MemSpecLimit` in `sinfo --json`.
* `sinfo` - Add `.sinfo[].resource_spec.cpus` and
`.sinfo[].resource_spec.memory` fields to print the `CpuSpecList`
and `MemSpecLimit` dumped by `sinfo --{json|yaml}`.
* Increase efficiency of sending logs to syslog.
* Switch to new official YAML mime type `application/yaml` in
compliance with RFC9512 as primary mime type for YAML formatting.
* `slurmrestd` - Remove deprecated fields from the following
endpoints:
`.result` from `POST /slurm/v0.0.42/job/submit`.
`.job_id`, `.step_id`, `.job_submit_user_msg` from `POST /slurm/v0.0.42/job/{job_id}`.
`.job.exclusive`, `.jobs[].exclusive` to `POST /slurm/v0.0.42/job/submit`.
`.jobs[].exclusive` from `GET /slurm/v0.0.42/job/{job_id}`.
`.jobs[].exclusive` from `GET /slurm/v0.0.42/jobs`.
`.job.oversubscribe`, `.jobs[].oversubscribe` to `POST /slurm/v0.0.42/job/submit`.
`.jobs[].oversubscribe` from `GET /slurm/v0.0.42/job/{job_id}`.
`.jobs[].oversubscribe` from `GET /slurm/v0.0.42/jobs`.
* `scontrol` - Removed deprecated fields `.jobs[].exclusive` and
`.jobs[].oversubscribe` from `scontrol show jobs --{json|yaml}`.
* `squeue` - Removed deprecated fields `.jobs[].exclusive` and
`.jobs[].oversubscribe` from `squeue --{json|yaml}`.
* Improve the way to run external commands and fork processes to
avoid non-async-signal safe calls between a fork and an exec.
We fork ourselves now and execute the commands in a safe
environment. This includes spank prolog/epilog executions.
* Improve `MaxMemPerCPU` enforcement when exclusive jobs request
per node memory and the partition has heterogeneous nodes.
* Remove a `TOCTOU` where multiple steps requesting an energy
reading at the same time could cause too frequent accesses
to the drivers.
* Limit `SwitchName` to `HOST_NAME_MAX` chars length.
* For `scancel --ctld` and the following rest api endpoints:
`DELETE /slurm/v0.0.40/jobs`
`DELETE /slurm/v0.0.41/jobs`
`DELETE /slurm/v0.0.42/jobs`
Support array expressions in the responses to the client.
* `salloc` - Always output node names to the user when an
allocation is granted.
* `slurmrestd` - Removed all v0.0.39 endpoints.
* `select/linear` - Reject jobs asking for GRES per
`job|socket|task` or `cpus|mem` per GRES.
* Add `/nodes` POST endpoint to REST API, supports multiple
node update whereas previously only single nodes could be
updated through `/node/<nodename>` endpoint:
`POST /slurm/v0.0.42/nodes`
* Do not allow changing or setting `PreemptMode=GANG` to a
partition as this is a cluster-wide option.
* Add `%b` as a file name pattern for the array task id modulo 10.
* Skip packing empty nodes when they are hidden during
`REQUEST_NODE_INFO` RPC.
* `accounting_storage/mysql` - Avoid a fatal condition when
the db server is not reachable.
* Always lay out steps cyclically on nodes in an allocation.
* `squeue` - add priority by partition
(`.jobs[].priority_by_partition`) to JSON and YAML output.
* `slurmrestd` - Add clarification to `failed to open slurmdbd
connection` error if the error was the result of an
authentication failure.
* Make it so `slurmctld` responds to RPCs that have authentication
errors with the `SLURM_PROTOCOL_AUTHENTICATION_ERROR` error
code.
* `openapi/slurmctld` - Display the correct error code instead
of `Unspecified error` if querying the following endpoints
fails:
`GET /slurm/v0.0.40/diag/`
`GET /slurm/v0.0.41/diag/`
`GET /slurm/v0.0.42/diag/`
`GET /slurm/v0.0.40/licenses/`
`GET /slurm/v0.0.41/licenses/`
`GET /slurm/v0.0.42/licenses/`
`GET /slurm/v0.0.40/reconfigure`
`GET /slurm/v0.0.41/reconfigure`
`GET /slurm/v0.0.42/reconfigure`
* Fix how used CPUs are tracked in a job allocation to allow the
max number of concurrent steps to run at a time if threads per
core is greater than 1.
* In existing allocations SLURM_GPUS_PER_NODE environment
variable will be ignored by srun if `--gpus` is specified.
* When using `--get-user-env` explicitly or implicitly, check
if PID or mnt namespaces are disabled and fall back to old
logic that does not rely on them when they are not available.
* Removed non-functional option `SLURM_PROLOG_CPU_MASK` from
`TaskProlog` which was used to reset the affinity of a task
based on the mask given.
* `slurmrestd` - Support passing of `-d latest` to load latest
version of `data_parser` plugin.
* `sacct`,`sacctmgr`,`scontrol`,`sdiag`,`sinfo`,`squeue`,`sshare`
- Change response to `--json=list` or `--yaml=list` to send
list of plugins to stdout and descriptive header to stderr to
allow for easier parsing.
* `slurmrestd` - Change response to `-d list`, `-a list` or
`-s list` to send list of plugins to stdout and descriptive
header to stderr to allow for easier parsing.
* `sacct`,`sacctmgr`,`scontrol`,`sdiag`,`sinfo`,`squeue`,
`sshare`,`slurmrestd` - Avoid crash when loading `data_parser`
plugins fail due to NULL dereference.
* Add autodetected GPUs to the output of `slurmd -C`.
* Remove `burst_buffer/lua` call `slurm.job_info_to_string()`.
* Add `SchedulerParameters=bf_allow_magnetic_slot` option. It
allows jobs in magnetic reservations to be planned by backfill
scheduler.
* `slurmrestd` - Refuse to run as root, `SlurmUser`, and
`nobody(99)`.
* `openapi/slurmctld` - Revert regression that caused signaling
jobs to cancel entire job arrays instead of job array tasks:
`DELETE /slurm/v0.0.40/{job_id}`
`DELETE /slurm/v0.0.41/{job_id}`
`DELETE /slurm/v0.0.42/{job_id}`
* `openapi/slurmctld` - Support more formats for `{job_id}`
including job steps:
`DELETE /slurm/v0.0.40/{job_id}`
`DELETE /slurm/v0.0.41/{job_id}`
`DELETE /slurm/v0.0.42/{job_id}`
* Alter scheduling of jobs at submission time to consider job
submission time and job id. This makes it so that
interactive jobs aren't allocated resources before batch jobs
when they have the same priority at submit time.
* Fix multi-cluster submissions with differing Switch plugins.
* `slurmrestd` - Change `+prefer_refs` flag to default in
`data_parser/v0.0.42` plugin. Add `+minimize_refs` flag to
inline single referenced schemas in the OpenAPI schema. This
sets the default OpenAPI schema generation behavior of
`data_parser/v0.0.42` to match v0.0.41 `+prefer_refs` and
v0.0.40 (without flags).
* Fix `LaunchParameters=batch_step_set_cpu_freq`.
* Clearer `seff` warning message for running jobs.
* `data_parser/v0.0.42` - Rename `JOB_INFO` field
`minimum_switches` to `required_switches` to reflect the
actual behavior.
* `data_parser/v0.0.42` - Rename `ACCOUNT_CONDITION` field
`assocation` to `association` to fix typo.
* `cgroup/v2` - fix cgroup cleanup when running inside a
container without write permissions to `/sys/fs/cgroup`.
* `cgroup/v2` - fix accounting of swap events detection.
* Fix gathering MaxRSS for jobs that run shorter than two
`jobacctgather` intervals. Get the metrics from cgroups
`memory.peak` or `memory.max_usage_in_bytes` where available.
* `openapi/slurmctld` - Set complex number support for the
following fields:
`.shares[][].fairshare.factor`
`.shares[][].fairshare.level`
for endpoints:
`GET /slurm/v0.0.42/shares`
and for commands:
`sshare --json`
`sshare --yaml`
* `data_parser/v0.0.42` - Avoid dumping `Infinity` for `NO_VAL`
tagged `number` fields.
* Add `TopologyParam=TopoMaxSizeUnroll=#` to allow
`--nodes=<min>-<max>` for `topology/block`.
* `sacct` - Respect `--noheader` for `--batch-script` and
`--env-vars`.
* `sacct` - Remove extra newline in output from `--batch-script`
and `--env-vars`.
* Add `sacctmgr ping` command to query status of `slurmdbd`.
* Generate an error message when a `NodeSet` name conflicts with
a `NodeName`, and prevent the controller from starting if such
a conflict exists.
* `slurmd` - properly detect slurmd restarts in the energy
gathering logic which caused bad numbers in accounting.
* `sackd` - retry fetching slurm configs indefinitely in
configless mode.
* `job_submit/lua` - Add `assoc_qos` attribute to `job_desc`
to display all potential QOS's for a job's association.
* `job_submit/lua` - Add `slurm.get_qos_priority()` function
to retrieve the given QOS's priority.
* `sbcast` - Add `--nodelist` option to specify where files are
transmitted to.
* `sbcast` - Add `--no-allocation` option to transmit files to
nodes outside of a job allocation.
* Add `DataParserParameters` `slurm.conf` parameter to allow
setting default value for CLI `--json` and `--yaml` arguments.
* `seff` - improve step's max memory consumption report by using
`TresUsageInTot` and `TresUsageInAve` instead of overestimating
the values.
* Enable RPC queueing for `REQUEST_KILL_JOBS`, which is used when
`scancel` is executed with `--ctld` flag.
* `slurmdbd` - Add `-u` option. This is used to determine if
restarting the DBD will result in database conversion.
* Fix `srun` inside an `salloc` in a federated cluster when using
IPv6.
* Calculate the forwarding timeouts according to tree depth
rather than node count / tree width for each level. Fixes race
conditions with same timeouts between two consecutive node
levels.
* Add ability to submit jobs with multiple QOS.
* Fix difference in behavior when swapping partition order in job
submission.
* Improve `PLANNED` state detection for mixed nodes and updating
state before yielding backfill locks.
* Always consider partition priority tiers when deciding to try
scheduling jobs on submit.
* Prevent starting jobs without reservations on submit when there
are pending jobs with reservations that have flags `FLEX` or
`ANY_NODES` that can be scheduled on overlapping nodes.
* Prevent jobs that request both high and low priority tier
partitions from starting on submit in lower priority tier
partitions if it could delay pending jobs in higher priority
tier partitions.
* `scontrol` - Wait for `slurmctld` to start reconfigure in
foreground mode before returning.
* Improve reconfigure handling on Linux to only close open file
descriptors to avoid long delays on systems with large
`RLIMIT_NOFILE` settings.
* `salloc` - Removed `--get-user-env` option.
* Removed the instant on feature from `switch/hpe_slingshot`.
* Hardware collectives in `switch/hpe_slingshot` now require
`enable_stepmgr`.
* Allow backfill to plan jobs on nodes currently being used by
exclusive user or mcs jobs.
* Avoid miscaching IPv6 address to hostname lookups that could
have caused logs to have the incorrect hostname.
* `scontrol` - Add `--json`/`--yaml` support to `listpids`
* `scontrol` - Add `liststeps`
* `scontrol` - Add `listjobs`
* `slurmrestd` - Avoid connection to slurmdbd for the following
endpoints:
`GET /slurm/v0.0.42/jobs`
`GET /slurm/v0.0.42/job/{job_id}`
* `slurmctld` - Changed incoming RPC handling to dedicated thread
pool.
* `job_container/tmpfs` - Add `EntireStepInNS` option that will
place the `slurmstepd` process within the constructed namespace
directly.
* `scontrol show topo` - Show aggregated block sizes when using
`topology/block`.
* `slurmrestd` - Add more descriptive HTTP status for
authentication failure and connectivity errors with controller.
* `slurmrestd` - Improve reporting errors from `slurmctld` for
job queries:
`GET /slurm/v0.0.41/{job_id}`
`GET /slurm/v0.0.41/jobs/`
* Avoid rejecting a step request that needs fewer GRES than nodes
in the job allocation.
* `slurmrestd` - Tag the never populated `.jobs[].pid` field as
deprecated for the following endpoints:
`GET /slurm/v0.0.42/{job_id}`
`GET /slurm/v0.0.42/jobs/`
* `scontrol`,`squeue` - Tag the never populated `.jobs[].pid` field
as deprecated for the following:
`scontrol show jobs --json`
`scontrol show jobs --yaml`
`scontrol show job ${JOB_ID} --json`
`scontrol show job ${JOB_ID} --yaml`
`squeue --json`
`squeue --yaml`
* `data_parser` v0.0.42 - fix timestamp parsing regression
introduced in v0.0.40 (eaf3b6631f) when parsing non-ISO 8601
style timestamps.
* `cgroup/v2` will detect some special container and namespaced
setups and will work with them.
* Support IPv6 in configless mode.
* Add `SlurmctldParameters=ignore_constraint_validation` to ignore
`constraint/feature` validation at submission.
* `slurmrestd` - Set `.pings[].mode` field as deprecated in the
following endpoints:
`GET /slurm/v0.0.42/ping`
* `scontrol` - Set `.pings[].mode` field as deprecated in the
following commands:
`scontrol ping --json`
`scontrol ping --yaml`
* `slurmrestd` - Set `.pings[].pinged` field as deprecated in
the following endpoints:
`GET /slurm/v0.0.42/ping`
* `scontrol` - Set `.pings[].pinged` field as deprecated in the
following commands:
`scontrol ping --json`
`scontrol ping --yaml`
* `slurmrestd` - Add `.pings[].primary` field to the following
endpoints:
`GET /slurm/v0.0.42/ping`
* `scontrol` - Add `.pings[].primary` field to the following
commands:
`scontrol ping --json`
`scontrol ping --yaml`
* `slurmrestd` - Add `.pings[].responding` field to the following
endpoints:
`GET /slurm/v0.0.42/ping`
* `scontrol` - Add `.pings[].responding` field to the following
commands:
`scontrol ping --json`
`scontrol ping --yaml`
* Prevent jobs without reservations from delaying jobs in
reservations with flags `FLEX` or `ANY_NODES` in the main
scheduler.
* Fix allowing to ask for multiple different types of TRES
when one of them has a value of 0.
* `slurmctld` - Add a grace period to ensure the agent retry
queue is properly flushed during shutdown.
* Don't ship `src/slurmrestd/plugins/openapi/slurmdbd/openapi.json`.
`slurmrestd` should always be used to generate a new OpenAPI
schema (aka openapi.json or openapi.yaml).
* `mpi/pmix` - Fix potential deadlock and races with het jobs,
and fix potential memory and FDs leaks.
* Fix jobs with `--gpus` being rejected in some edge cases for
partitions where not all nodes have the same amount of GPUs
and CPUs configured.
* In an extra constraints expression in a job request, do not
allow an empty string for a key or value.
* In an extra constraints expression in a job request, fix
validation that requests are separated by boolean operators.
* Add `TaskPluginParam=OOMKillStep` to kill the step as a whole
when one task OOMs.
* Fix `scontrol show conf` not showing all `TaskPluginParam`
elements.
* `slurmrestd` - Add fields `.job.oom_kill_step`,
`.jobs[].oom_kill_step` to `POST /slurm/v0.0.42/job/submit`
and `POST /slurm/v0.0.42/job/allocate`.
* Improve performance for `_will_run_test()`.
* Add `SchedulerParameters=bf_topopt_enable` option to enable
experimental hook to control backfill.
* If a step fails to launch under certain conditions, set the
step's state to `NODE_FAIL`.
* `sched/backfill` - Fix certain situations where a job would
not get a planned time, which could lead to it being delayed
by lower priority jobs.
* `slurmrestd` - Dump JSON `null` instead of `{}` (empty object)
for non-required fields in objects to avoid client
compatibility issues for v0.0.42 version tagged endpoints.
* `sacct`,`sacctmgr`,`scontrol`,`sdiag`,`sinfo`,`squeue`,
`sshare` - Dump `null` instead of `{}` (empty object) for
non-required fields in objects to avoid client compatibility
issues when run with `--json` or `--yaml`.
* Fri Nov 01 2024 Egbert Eich <eich@suse.com>
- Update to version 24.05.4 & fix for CVE-2024-48936.
* Fix generic int sort functions.
* Fix user look up using possible unrealized uid in the dbd.
* `slurmrestd` - Fix regressions that allowed `slurmrestd` to
be run as SlurmUser when `SlurmUser` was not root.
* `mpi/pmix` - Fix race conditions with het jobs at step start/end
which could make `srun` hang.
* Fix not showing some `SelectTypeParameters` in `scontrol show
config`.
* Avoid assert when dumping certain removed fields in JSON/YAML.
* Improve how shards are scheduled with affinity in mind.
* Fix `MaxJobsAccruePU` not being respected when `MaxJobsAccruePA`
is set in the same QOS.
* Prevent backfill from planning jobs that use overlapping
resources for the same time slot if the job's time limit is
less than `bf_resolution`.
* Fix memory leak when requesting typed gres and
`--[cpus|mem]-per-gpu`.
* Prevent backfill from breaking out due to "system state
changed" every 30 seconds if reservations use `REPLACE` or
`REPLACE_DOWN` flags.
* `slurmrestd` - Make sure that scheduler_unset parameter defaults
to true even when the following flags are also set:
`show_duplicates`, `skip_steps`, `disable_truncate_usage_time`,
`run_away_jobs`, `whole_hetjob`, `disable_whole_hetjob`,
`disable_wait_for_result`, `usage_time_as_submit_time`,
`show_batch_script`, and/or `show_job_environment`. Additionally,
always make sure `show_duplicates` and
`disable_truncate_usage_time` default to true when the following
flags are also set: `scheduler_unset`, `scheduled_on_submit`,
`scheduled_by_main`, `scheduled_by_backfill`, and/or `job_started`.
This affects the following endpoints:
`GET /slurmdb/v0.0.40/jobs`
`GET /slurmdb/v0.0.41/jobs`
* Ignore `--json` and `--yaml` options for `scontrol show config`
to prevent mixing output types.
* Fix not considering nodes in reservations with Maintenance or
Overlap flags when creating new reservations with `nodecnt` or
when they replace down nodes.
* Fix suspending/resuming steps running under a 23.02 `slurmstepd`
process.
* Fix options like `sprio --me` and `squeue --me` for users with
a uid greater than 2147483647.
* `fatal()` if `BlockSizes=0`. This value is invalid and would
otherwise cause the `slurmctld` to crash.
* `sacctmgr` - Fix issue where clearing out a preemption list using
`preempt=''` would cause the given qos to no longer be preempt-able
until set again.
* Fix `stepmgr` creating job steps concurrently.
* `data_parser/v0.0.40` - Avoid dumping "Infinity" for `NO_VAL` tagged
"number" fields.
* `data_parser/v0.0.41` - Avoid dumping "Infinity" for `NO_VAL` tagged
"number" fields.
* `slurmctld` - Fix a potential leak while updating a reservation.
* `slurmctld` - Fix state save with reservation flags when an update
fails.
* Fix reservation update issues with parameters Accounts and Users, when
using +/- signs.
* `slurmrestd` - Don't dump warning on empty wckeys in:
`GET /slurmdb/v0.0.40/config`
`GET /slurmdb/v0.0.41/config`
* Fix slurmd possibly leaving zombie processes on start up in configless
when the initial attempt to fetch the config fails.
* Fix crash when trying to drain a non-existing node (possibly deleted
before).
* `slurmctld` - fix segfault when calculating limit decay for jobs with
an invalid association.
* Fix IPMI energy gathering with multiple sensors.
* `data_parser/v0.0.39` - Remove xassert requiring errors and warnings
to have a source string.
* `slurmrestd` - Prevent potential segfault when there is an error
parsing an array field which could lead to a double xfree. This
applies to several endpoints in `data_parser` v0.0.39, v0.0.40 and
v0.0.41.
* `scancel` - Fix a regression from 23.11.6 where using both the
`--ctld` and `--sibling` options would cancel the federated job on
all clusters instead of only the cluster(s) specified by `--sibling`.
* `accounting_storage/mysql` - Fix bug when removing an association
specified with an empty partition.
* Fix setting multiple partition state restore on a job correctly.
* Fix difference in behavior when swapping partition order in job
submission.
* Fix security issue in stepmgr that could permit an attacker to
execute processes under other users' jobs. CVE-2024-48936.
* Wed Oct 23 2024 Egbert Eich <eich@suse.com>
- Add %{?sysusers_requires} to slurm-config.
This fixes issues when building against Slurm.
* Mon Oct 14 2024 Egbert Eich <eich@suse.com>
- Update to version 24.05.3
* `data_parser/v0.0.40` - Added field descriptions.
* `slurmrestd` - Avoid creating new slurmdbd connection per request
to `* /slurm/slurmctld/*/*` endpoints.
* Fix compilation issue with `switch/hpe_slingshot` plugin.
* Fix gres per task allocation with threads-per-core.
* `data_parser/v0.0.41` - Added field descriptions.
* `slurmrestd` - Change back generated OpenAPI schema for
`DELETE /slurm/v0.0.40/jobs/` to `RequestBody` instead of using
parameters for request. `slurmrestd` will continue to accept endpoint
requests via `RequestBody` or HTTP query.
* `topology/tree` - Fix issues with switch distance optimization.
* Fix potential segfault of secondary `slurmctld` when falling back
to the primary when running with a `JobComp` plugin.
* Enable `--json`/`--yaml=v0.0.39` options on client commands to
dump data using data_parser/v0.0.39 instead of outputting nothing.
* `switch/hpe_slingshot` - Fix issue that could result in a 0 length
state file.
* Fix unnecessary message protocol downgrade for unregistered nodes.
* Fix unnecessarily packing alias addrs when terminating jobs with
a mix of non-cloud/dynamic nodes and powered down cloud/dynamic
nodes.
* `accounting_storage/mysql` - Fix issue when deleting a qos that
could remove too many commas from the qos and/or delta_qos fields
of the assoc table.
* `slurmctld` - Fix memory leak when using RestrictedCoresPerGPU.
* Fix allowing access to reservations without `MaxStartDelay` set.
* Fix regression introduced in 24.05.0rc1 breaking
`srun --send-libs` parsing.
* Fix slurmd vsize memory leak when using job submission/allocation
commands that implicitly or explicitly use --get-user-env.
* `slurmd` - Fix node going into invalid state when using
`CPUSpecList` and setting CPUs to the # of cores on a
multithreaded node.
* Fix reboot asap nodes being considered in backfill after a restart.
* Fix `--clusters`/`-M` queries for clusters outside of a
federation when `fed_display` is configured.
* Fix `scontrol` allowing updating job with bad `cpus-per-task` value.
* `sattach` - Fix regression from 24.05.2 security fix leading to
crash.
* `mpi/pmix` - Fix assertion when built under `--enable-debug`.
- Changes from Slurm 24.05.2
* Fix energy gathering rpc counter underflow in
`_rpc_acct_gather_energy` when more than 10 threads try to get
energy at the same time. This prevented any step from getting
energy from slurmd until slurmd was restarted, losing energy
accounting metrics on the node.
* `accounting_storage/mysql` - Fix issue where new user with `wckey`
did not have a default wckey sent to the slurmctld.
* `slurmrestd` - Prevent slurmrestd segfault when handling the
following endpoints when none of the optional parameters are
specified:
`DELETE /slurm/v0.0.40/jobs`
`DELETE /slurm/v0.0.41/jobs`
`GET /slurm/v0.0.40/shares`
`GET /slurm/v0.0.41/shares`
`GET /slurmdb/v0.0.40/instance`
`GET /slurmdb/v0.0.41/instance`
`GET /slurmdb/v0.0.40/instances`
`GET /slurmdb/v0.0.41/instances`
`POST /slurm/v0.0.40/job/{job_id}`
`POST /slurm/v0.0.41/job/{job_id}`
* Fix IPMI energy gathering when no IPMIPowerSensors are specified
in `acct_gather.conf`. This situation resulted in an accounted
energy of 0 for job steps.
* Fix a minor memory leak in slurmctld when updating a job dependency.
* `scontrol`,`squeue` - Fix regression that caused incorrect values
for multisocket nodes at `.jobs[].job_resources.nodes.allocation`
for `scontrol show jobs --(json|yaml)` and `squeue --(json|yaml)`.
* `slurmrestd` - Fix regression that caused incorrect values for
multisocket nodes at `.jobs[].job_resources.nodes.allocation` to
be dumped with endpoints:
`GET /slurm/v0.0.41/job/{job_id}`
`GET /slurm/v0.0.41/jobs`
* `jobcomp/filetxt` - Fix truncation of job record lines > 1024
characters.
* `switch/hpe_slingshot` - Drain node on failure to delete CXI
services.
* Fix a performance regression from 23.11.0 in CPU frequency
handling when no `CpuFreqDef` is defined.
* Fix one-task-per-sharing not working across multiple nodes.
* Fix inconsistent number of CPUs when creating a reservation
using the TRESPerNode option.
* `data_parser/v0.0.40+` - Fix job state parsing which could
break filtering.
* Prevent `cpus-per-task` from being modified in jobs where a `-c`
value has been explicitly specified and the requested memory
constraints implicitly increase the number of CPUs to allocate.
* `slurmrestd` - Fix regression where args `-s v0.0.39,dbv0.0.39`
and `-d v0.0.39` would result in `GET /openapi/v3` not
registering as a valid possible query resulting in 404 errors.
* `slurmrestd` - Fix memory leak for dbv0.0.39 jobs query which
occurred if the query parameters specified account, association,
cluster, constraints, format, groups, job_name, partition, qos,
reason, reservation, state, users, or wckey. This affects the
following endpoints:
`GET /slurmdb/v0.0.39/jobs`
* `slurmrestd` - In the case the slurmdbd does not respond to a
persistent connection init message, prevent the closed fd from
being used, and instead emit an error or warning depending on
if the connection was required.
* Fix 24.05.0 regression that caused the slurmdbd not to send back
an error message if there is an error initializing a persistent
connection.
* Reduce latency of forwarded x11 packets.
* Add `curr_dependency` (representing the current dependency of
the job) and `orig_dependency` (representing the original
requested dependency of the job) fields to the job record in
`job_submit.lua` (for job update) and `jobcomp.lua`.
* Fix potential segfault of slurmctld configured with
`SlurmctldParameters=enable_rpc_queue` from happening on
reconfigure.
* Fix potential segfault of slurmctld on its shutdown when rate
limiting is enabled.
* `slurmrestd` - Fix missing job environment for `SLURM_JOB_NAME`,
`SLURM_OPEN_MODE`, `SLURM_JOB_DEPENDENCY`, `SLURM_PROFILE`,
`SLURM_ACCTG_FREQ`, `SLURM_NETWORK` and `SLURM_CPU_FREQ_REQ` to
match sbatch.
* Fix GRES environment variable indices being incorrect when only
using a subset of all GPUs on a node and the
`--gres-flags=allow-task-sharing` option.
* Prevent `scontrol` from segfaulting when requesting scontrol
show reservation `--json` or `--yaml` if there is an error
retrieving reservations from the `slurmctld`.
* `switch/hpe_slingshot` - Fix security issue around managing VNI
access. CVE-2024-42511.
* `switch/nvidia_imex` - Fix security issue managing IMEX channel
access. CVE-2024-42511.
* `switch/nvidia_imex` - Allow for compatibility with
`job_container/tmpfs`.
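A minimal sketch of reading the field affected by the multisocket
regression above from `scontrol show jobs --json` output. Only the path
`.jobs[].job_resources.nodes.allocation` comes from the notes; the
sample document is hypothetical and heavily trimmed, and the shape of
each allocation entry is an assumption.

```python
import json

def node_allocations(doc):
    """Yield (job_id, allocation) for every job that carries the field."""
    for job in doc.get("jobs", []):
        # job_resources may be missing or null for jobs without resources.
        nodes = (job.get("job_resources") or {}).get("nodes") or {}
        if "allocation" in nodes:
            yield job.get("job_id"), nodes["allocation"]

# Hypothetical sample; real output has many more fields per job.
sample = json.loads("""
{"jobs": [
  {"job_id": 1, "job_resources": {"nodes": {"allocation": [{"name": "n1"}]}}},
  {"job_id": 2, "job_resources": null}
]}
""")
result = list(node_allocations(sample))
```

In practice the document would come from the command's standard output
rather than an inline string.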
- Changes in Slurm 24.05.1
* Fix `slurmctld` and `slurmdbd` potentially stopping instead of
performing a logrotate when receiving `SIGUSR2` when using
`auth/slurm`.
* `switch/hpe_slingshot` - Fix slurmctld crash when upgrading
from 23.02.
* Fix "Could not find group" errors from `validate_group()` when
using `AllowGroups` with large `/etc/group` files.
* Add `AccountingStoreFlags=no_stdio` which, when set, prevents the
stdio paths of the job from being recorded.
* `slurmrestd` - Prevent a slurmrestd segfault when parsing the
`crontab` field, which was never usable. Now it explicitly
ignores the value and emits a warning if it is used for the
following endpoints:
`POST /slurm/v0.0.39/job/{job_id}`
`POST /slurm/v0.0.39/job/submit`
`POST /slurm/v0.0.40/job/{job_id}`
`POST /slurm/v0.0.40/job/submit`
`POST /slurm/v0.0.41/job/{job_id}`
`POST /slurm/v0.0.41/job/submit`
`POST /slurm/v0.0.41/job/allocate`
* `mpi/pmi2` - Fix communication issue leading to task launch
failure with "`invalid kvs seq from node`".
* Fix getting user environment when using sbatch with
`--get-user-env` or `--export=` when there is a user profile
script that reads `/proc`.
* Prevent slurmd from crashing if `acct_gather_energy/gpu` is
configured but `GresTypes` is not configured.
* Do not log the following errors when `AcctGatherEnergyType`
plugins are used but a node does not have or cannot find sensors:
"`error: _get_joules_task: can't get info from slurmd`"
"`error: slurm_get_node_energy: Zero Bytes were transmitted or
received`"
However, the following error will continue to be logged:
"`error: Can't get energy data. No power sensors are available.
Try later`"
* `sbatch`, `srun` - Set `SLURM_NETWORK` environment variable if
`--network` is set.
* Fix cloud nodes not being able to forward to nodes that restarted
with new IP addresses.
* Fix cwd not being set correctly when running a SPANK plugin with a
`spank_user_init()` hook and the new "`contain_spank`" option set.
* `slurmctld` - Avoid deadlock during shutdown when `auth/slurm`
is active.
* Fix segfault in `slurmctld` with `topology/block`.
* `sacct` - Fix printing of job group for job steps.
* `scrun` - Log when an invalid environment variable causes the
job submission to be rejected.
* `accounting_storage/mysql` - Fix problem where listing or
modifying an association when specifying a qos list could hang
or take a very long time.
* `gpu/nvml` - Fix `gpuutil/gpumem` only tracking last GPU in step.
Now, `gpuutil/gpumem` will record sums of all GPUs in the step.
* Fix error in `scrontab` jobs when using
`slurm.conf:PropagatePrioProcess=1`.
* Fix `slurmctld` crash on a batch job submission with
`--nodes 0,...`.
* Fix dynamic IP address fanout forwarding when using `auth/slurm`.
* Restrict listening sockets in the `mpi/pmix` plugin and `sattach`
to the `SrunPortRange`.
* `slurmrestd` - Limit mime types returned from query to
`GET /openapi/v3` to only return one mime type per serializer
plugin to fix issues with OpenAPI client generators that are
unable to handle multiple mime type aliases.
* Fix many commands possibly reporting an "`Unexpected Message
Received`" when in reality the connection timed out.
* Prevent slurmctld from starting if there is not a json
serializer present and the `extra_constraints` feature is enabled.
* Fix heterogeneous job components not being signaled with
`scancel --ctld` and `DELETE slurm/v0.0.40/jobs` if the job ids
are not explicitly given, the heterogeneous job components match
the given filters, and the heterogeneous job leader does not
match the given filters.
* Fix regression from 23.02 impeding job licenses from being cleared.
* Move the `_get_joules_task` error to a `log_flag`; it was being
logged to the user when too many rpcs were queued in slurmd
for gathering energy.
* For `scancel --ctld` and the associated rest api endpoints:
`DELETE /slurm/v0.0.40/jobs`
`DELETE /slurm/v0.0.41/jobs`
Fix canceling the final array task in a job array when the task
is pending and all array tasks have been split into separate job
records. Previously this task was not canceled.
* Fix `power_save` operation after recovering from a failed
reconfigure.
* `slurmctld` - Skip removing the pidfile when running under
systemd. In that situation it is never created in the first place.
* Fix issue where, after altering the flags on a Slurm account
(`UsersAreCoords`), several limits on the account's association
would be set to 0 in Slurm's internal cache.
* Fix memory leak in the controller when relaying `stepmgr` step
accounting to the dbd.
* Fix segfault when submitting stepmgr jobs within an existing
allocation.
* Added `disable_slurm_hydra_bootstrap` as a possible `MpiParams`
parameter in `slurm.conf`. Using this will disable env variable
injection to allocations for the following variables:
`I_MPI_HYDRA_BOOTSTRAP`, `I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS`,
`HYDRA_BOOTSTRAP`, `HYDRA_LAUNCHER_EXTRA_ARGS`.
* `scrun` - Delay shutdown until after start requested.
This caused `scrun` to never start or shutdown and hung forever
when using `--tty`.
* Fix backup `slurmctld` potentially not running the agent when
taking over as the primary controller.
* Fix primary controller not running the agent when a reconfigure
of the `slurmctld` fails.
* `slurmd` - fix premature timeout waiting for
`REQUEST_LAUNCH_PROLOG` with large array jobs causing node to
drain.
* `jobcomp/{elasticsearch,kafka}` - Avoid sending fields with
invalid date/time.
* `jobcomp/elasticsearch` - Fix `slurmctld` memory leak from
curl usage.
* `acct_gather_profile/influxdb` - Fix slurmstepd memory leak from
curl usage.
* Fix 24.05.0 regression not deleting job hash dirs after
`MinJobAge`.
* Fix filtering arguments being ignored when using squeue `--json`.
* `switch/nvidia_imex` - Move setup call after `spank_init()` to
allow namespace manipulation within the SPANK plugin.
* `switch/nvidia_imex` - Skip plugin operation if
`nvidia-caps-imex-channels` device is not present rather than
preventing slurmd from starting.
* `switch/nvidia_imex` - Skip plugin operation if
`job_container/tmpfs` is configured due to incompatibility.
* `switch/nvidia_imex` - Remove any pre-existing channels when
`slurmd` starts.
* `rpc_queue` - Add support for an optional `rpc_queue.yaml`
configuration file.
* `slurmrestd` - Add new +prefer_refs flag to `data_parser/v0.0.41`
plugin. This flag will avoid inlining single referenced schemas
in the OpenAPI schema.
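One way to check, from inside an allocation, whether
`MpiParams=disable_slurm_hydra_bootstrap` took effect is to look for
the four variables listed above. This is a sketch only: the helper name
is hypothetical, and a variable that is present may also have been set
by the user rather than injected by Slurm.

```python
import os

# Variable names as listed in the 24.05.1 notes.
HYDRA_VARS = (
    "I_MPI_HYDRA_BOOTSTRAP",
    "I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS",
    "HYDRA_BOOTSTRAP",
    "HYDRA_LAUNCHER_EXTRA_ARGS",
)

def present_hydra_vars(env=None):
    """Return the Hydra bootstrap variables found in `env`, in list order."""
    env = os.environ if env is None else env
    return [name for name in HYDRA_VARS if name in env]

if __name__ == "__main__":
    found = present_hydra_vars()
    print("Hydra bootstrap variables present:", found or "none")
```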
* Tue Jun 04 2024 Christian Goll <cgoll@suse.com>
- Updated to new release 24.05.0 with following major changes
* Important Notes:
If using the slurmdbd (Slurm DataBase Daemon) you must update
this first. NOTE: If using a backup DBD you must start the
primary first to do any database conversion, the backup will not
start until this has happened. The 24.05 slurmdbd will work
with Slurm daemons of version 23.02 and above. You will not
need to update all clusters at the same time, but it is very
important to update slurmdbd first and have it running before
updating any other clusters making use of it.
* Highlights
+ Federation - allow client command operation when slurmdbd is
unavailable.
+ `burst_buffer/lua` - Added two new hooks: `slurm_bb_test_data_in`
and `slurm_bb_test_data_out`. The syntax and use of the new hooks
are documented in `etc/burst_buffer.lua.example`. These are
required to exist. slurmctld now checks on startup if the
`burst_buffer.lua` script loads and contains all required hooks;
`slurmctld` will exit with a fatal error if this is not
successful. Added `PollInterval` to `burst_buffer.conf`. Removed
the arbitrary limit of 512 copies of the script running
simultaneously.
+ Add QOS limit `MaxTRESRunMinsPerAccount`.
+ Add QOS limit `MaxTRESRunMinsPerUser`.
+ Add `ELIGIBLE` environment variable to `jobcomp/script` plugin.
+ Always use the QOS name for `SLURM_JOB_QOS` environment variables.
Previously the batch environment would use the description field,
which was usually equivalent to the name.
+ `cgroup/v2` - Require dbus-1 version >= 1.11.16.
+ Allow `NodeSet` names to be used in SuspendExcNodes.
+ `SuspendExcNodes=<nodes>:N` now counts allocated nodes in `N`.
The first `N` powered up nodes in <nodes> are protected from
being suspended.
+ Store job output, input and error paths in `SlurmDBD`.
+ Add `USER_DELETE` reservation flag to allow users with access
to a reservation to delete it.
+ Add `SlurmctldParameters=enable_stepmgr` to enable step
management through the `slurmstepd` instead of the controller.
+ Added `PrologFlags=RunInJob` to make prolog and epilog run
inside the job extern step to include it in the job's cgroup.
+ Add ability to reserve MPI ports at the job level for stepmgr
jobs and subdivide them at the step level.
+ `slurmrestd` - Add `--generate-openapi-spec` argument.
* Configuration File Changes (see appropriate man page for details)
+ `CoreSpecPlugin` has been removed.
+ Removed `TopologyPlugin` tree and dragonfly support from
`select/linear`. If those topology plugins are desired please
switch to `select/cons_tres`.
+ Changed the default value for `UnkillableStepTimeout` to 60
seconds or five times the value of `MessageTimeout`, whichever
is greater.
+ An error log has been added if `JobAcctGatherParams` '`UsePss`'
or '`NoShare`' are configured with a plugin other than
`jobacct_gather/linux`. In such case these parameters are ignored.
+ `helpers.conf` - Added `Flags=rebootless` parameter allowing
feature changes without rebooting compute nodes.
+ `topology/block` - Replaced the `BlockLevels` with `BlockSizes`
in `topology.conf`.
+ Add `contain_spank` option to `SlurmdParameters`. When set,
`spank_user_init()`, `spank_task_post_fork()`, and
`spank_task_exit()` will execute within the
`job_container/tmpfs` plugin namespace.
+ Add `SlurmctldParameters=max_powered_nodes=N`, which prevents
powering up nodes after the max is reached.
+ Add `ExclusiveTopo` to a partition definition in `slurm.conf`.
+ Add `AccountingStorageParameters=max_step_records` to limit how
many steps are recorded in the database for each job - excluding
batch.
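The new `UnkillableStepTimeout` default described above reduces to a
single `max()`. A minimal sketch, assuming both values are expressed in
seconds; the function name is illustrative and not part of Slurm.

```python
def default_unkillable_step_timeout(message_timeout):
    """60 seconds or five times MessageTimeout, whichever is greater."""
    return max(60, 5 * message_timeout)

# Example values only; the MessageTimeout inputs here are arbitrary.
for message_timeout in (10, 12, 30):
    print(message_timeout, default_unkillable_step_timeout(message_timeout))
```

With a `MessageTimeout` of 12 seconds or less the default stays at 60
seconds; above that it scales with `MessageTimeout`.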
* Command Changes (see man pages for details)
+ Add support for "elevenses" as an additional time specification.
+ Add support for `sbcast --preserve` when `job_container/tmpfs`
configured (previously documented as unsupported).
+ `scontrol` - Add new subcommand `power` for node power control.
+ `squeue` - Adjust `StdErr`, `StdOut`, and `StdIn` output formats.
These will now consistently print "`(null)`" if a value is
unavailable. `StdErr` will no longer display `StdOut` if it is
not distinctly set. `StdOut` will now correctly display the
default filename pattern for job arrays, and no longer show it
for non-batch jobs. However, the expansion patterns will
no longer be substituted by default.
+ Add `--segment` to job allocation to be used in topology/block.
+ Add `--exclusive=topo` for use with topology/block.
+ `squeue` - Add `--expand-patterns` option to expand `StdErr`,
`StdOut`, `StdIn` filename patterns as best as possible.
+ `sacct` - Add `--expand-patterns` option to expand `StdErr`,
`StdOut`, `StdIn` filename patterns as best as possible.
+ `sreport` - Requesting `format=Planned` will now return the
expected `Planned` time as documented, instead of `PlannedDown`.
To request `Planned Down`, one must now use `format=PLNDDown`
or `format=PlannedDown` explicitly. The abbreviations
"`Pl`" or "`Pla`" will now make reference to Planned instead
of `PlannedDown`.
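To illustrate what expanding `StdOut` filename patterns means for the
`squeue` and `sacct` changes above, here is a rough sketch. It handles
only `%j`, `%A`, `%a` and `%%` as documented for sbatch, and leaves
every other sequence untouched; it is not the logic Slurm itself uses,
and the function names are made up for this example.

```python
import re

def default_stdout_pattern(is_array_job):
    """Default batch output pattern, per the sbatch documentation."""
    return "slurm-%A_%a.out" if is_array_job else "slurm-%j.out"

def expand_pattern(pattern, job_id, array_job_id=None, array_task_id=None):
    """Expand %j, %A, %a and %%; leave anything else as written."""
    values = {"%": "%", "j": str(job_id)}
    if array_job_id is not None:
        values["A"] = str(array_job_id)
    if array_task_id is not None:
        values["a"] = str(array_task_id)
    # Unknown specifiers fall back to the matched text itself.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), m.group(0)), pattern)

print(expand_pattern(default_stdout_pattern(True), 105, 100, 5))
```

Leaving unknown specifiers in place mirrors the "as best as possible"
wording: a pattern that cannot be resolved is shown unexpanded rather
than guessed.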
* API Changes
+ Removed `ListIterator` type from `<slurm/slurm.h>`.
+ Removed `slurm_xlate_job_id()` from `<slurm/slurm.h>`
* SLURMRESTD Changes
+ `openapi/dbv0.0.38` and `openapi/v0.0.38` plugins have been
removed.
+ `openapi/dbv0.0.39` and `openapi/v0.0.39` plugins have been
tagged as deprecated to warn of their removal in the next release.
+ Changed `slurmrestd.service` to only listen on TCP socket by
default. Environments with existing drop-in units for the
service may need further adjustments to work after upgrading.
+ `slurmrestd` - Tagged `script` field as deprecated in
`POST /slurm/v0.0.41/job/submit` in anticipation of removal in
future OpenAPI plugin versions. Job submissions should set the
`job.script` (or `jobs[0].script` for HetJobs) fields instead.
+ `slurmrestd` - Attempt to automatically convert enumerated
string arrays with incoming non-string values into strings.
Add warning when incoming value for enumerated string arrays
can not be converted to string and silently ignore instead of
rejecting entire request. This change affects any endpoint that
uses an enumerated string as given in the OpenAPI specification.
An example of this conversion would be to
`POST /slurm/v0.0.41/job/submit` with `.job.exclusive = true`.
While the JSON (boolean) true value matches a possible
enumeration, it is not the expected "true" string. This change
automatically converts the (boolean) `true` to (string) "`true`"
avoiding a parsing failure.
+ `slurmrestd` - Add `POST /slurm/v0.0.41/job/allocate` endpoint.
This endpoint will create a new job allocation without any steps.
The allocation will need to be ended via signaling the job or
it will run to the timelimit.
+ `slurmrestd` - Allow startup when `slurmdbd` is not configured
and avoid loading `slurmdbd` specific plugins.
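The enumerated-string conversion described above can be pictured
roughly as follows. This is a sketch of the behaviour as worded in the
notes, not slurmrestd's parser: the function name is hypothetical, and
treating only JSON scalars as convertible is an assumption.

```python
import json

def coerce_enum_strings(values):
    """Return (strings, warnings) for an incoming enumerated string array."""
    strings, warnings = [], []
    for value in values:
        if isinstance(value, str):
            strings.append(value)
        elif isinstance(value, (bool, int, float)):
            # JSON spelling: True -> "true", 3 -> "3".
            strings.append(json.dumps(value))
        else:
            # Warn and skip instead of rejecting the entire request.
            warnings.append(f"ignoring non-convertible value: {value!r}")
    return strings, warnings

strings, warnings = coerce_enum_strings([True, "user", {"x": 1}])
print(strings, len(warnings))
```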
* MPI/PMI2 Changes
+ Jobs submitted with the `SLURM_HOSTFILE` environment variable
set implies using an arbitrary distribution. Nevertheless, the
logic used in PMI2 when generating their associated
`PMI_process_mapping` values has been changed and will now be
the same used for the plane distribution, as if `-m plane` were
used. This has been changed because the original arbitrary
distribution implementation did not account for multiple
instances of the same host being present in `SLURM_HOSTFILE`,
providing an incorrect process mapping in such case. This
change also enables distributing tasks in blocks when using
arbitrary distribution, which was not the case before. This
only affects the `mpi/pmi2` plugin.
- Removed Fix-test-21.41.patch as upstream test changed.
- Dropped package plugin-ext-sensors-rrd as the plugin module no
longer exists.
Version: 20.11.5-bp153.2.1
* Mon May 03 2021 Egbert Eich <eich@suse.com>
- Ship REST API version and auth plugins with slurmrestd.
- Add YAML support for REST API to build (bsc#1185603).
* Wed Mar 17 2021 Christian Goll <cgoll@suse.com>
- Update to 20.11.5:
- New features:
* New job_container/tmpfs plugin developed by NERSC that can be used to
create per-job filesystem namespaces. Documentation and configuration
can be found in the respective man page.
- Bug fixes:
* Fix main scheduler bug where bf_hetjob_prio truncates SchedulerParameters.
* Fix sacct not displaying UserCPU, SystemCPU and TotalCPU for large times.
* scrontab - fix to return the correct index for a bad #SCRON option.
* scrontab - fix memory leak when invalid option found in #SCRON line.
* Add errno for when a user requests multiple partitions and they are using
partition based associations.
* Fix issue where a job could run in a wrong partition when using
EnforcePartLimits=any and partition based associations.
* Remove possible deadlock when adding associations/wckeys in multiple
threads.
* When using PrologFlags=alloc make sure the correct Slurm version is set
in the credential.
* When sending a job a warning signal make sure we always send SIGCONT
beforehand.
* Fix issue where a batch job would continue running if a prolog failed on a
node that wasn't the batch host and requeuing was disabled.
* Fix issue where sometimes salloc/srun wouldn't get a message about a prolog
failure in the job's stdout.
* Requeue or kill job on a prolog failure when PrologFlags is not set.
* Fix race condition causing node reboots to get requeued before
ResumeTimeout expires.
* Preserve node boot_req_time on reconfigure.
* Preserve node power_save_req_time on reconfigure.
* Fix node reboots being queued and issued multiple times and preventing the
reboot from timing out.
* Fix run_command to exit correctly if track_script kills the calling thread.
* Only requeue a job when the PrologSlurmctld returns nonzero.
* When a job is signaled with SIGKILL make sure we flush all
prologs/setup scripts.
* Handle burst buffer scripts if the job is canceled while stage_in is
happening.
* When shutting down the slurmctld make note to ignore error message when
we have to kill a prolog/setup script we are tracking.
* scrontab - add support for the --open-mode option.
* acct_gather_profile/influxdb - avoid segfault on plugin shutdown if setup
has not completed successfully.
* Reduce delay in starting salloc allocations when running with prologs.
* Alter AllocNodes check to work if the allocating node's domain doesn't
match the slurmctld's. This restores the pre-20.11 behavior.
* Fix slurmctld segfault if jobs from a prior version had the now-removed
INVALID_DEPEND state flag set and were allowed to run in 20.11.
* Add job_container/tmpfs plugin to give a method to provide a private /tmp
per job.
* Set the correct core affinity when using AutoDetect.
* slurmrestd - mark "environment" as required for job submissions in schema.
* Tue Feb 23 2021 Christian Goll <cgoll@suse.com>
- Update to 20.11.04
* Fix node selection for advanced reservations with features.
* mpi/pmix: Handle pipe failure better when using ucx.
* mpi/pmix: include PMIX_NODEID for each process entry.
* Fix job getting rejected after being requeued on same node that died.
* job_submit/lua - add "network" field.
* Fix situations when a reoccurring reservation could erroneously skip a
period.
* Ensure that a reservation's [pro|epi]log are run on reoccurring reservations.
* Fix threads-per-core memory allocation issue when using CR_CPU_MEMORY.
* Fix scheduling issue with --gpus.
* Fix gpu allocations that request --cpus-per-task.
* mpi/pmix: fixed print messages for all PMIXP_* macros
* Add mapping for XCPU to --signal option.
* Fix regression in 20.11 that prevented a full pass of the main scheduler
from ever executing.
* Work around a glibc bug in which "0" is incorrectly printed as "nan"
which will result in corrupted association state on restart.
* Fix regression in 20.11 which made slurmd incorrectly attempt to find the
parent slurmd address when not applicable and send incorrect reverse-tree
info to the slurmstepd.
* Fix cgroup ns detection when using containers (e.g. LXC or Docker).
* scrontab - change temporary file handling to work with emacs.
- Removed check-for-lipmix.so.MAJOR.patch
- Added: load-pmix-major-version.patch
* Wed Jan 20 2021 Ana Guerrero Lopez <aguerrero@suse.com>
- Update to 20.11.03
- This release includes a major functional change to how job step launch is
handled compared to the previous 20.11 releases. This affects srun as
well as MPI stacks - such as Open MPI - which may use srun internally as
part of the process launch.
One of the changes made in the Slurm 20.11 release was to the semantics
for job steps launched through the 'srun' command. This also
inadvertently impacts many MPI releases that use srun underneath their
own mpiexec/mpirun command.
For 20.11.{0,1,2} releases, the default behavior for srun was changed
such that each step was allocated exactly what was requested by the
options given to srun, and did not have access to all resources assigned
to the job on the node by default. This change was equivalent to Slurm
setting the --exclusive option by default on all job steps. Job steps
desiring all resources on the node needed to explicitly request them
through the new '--whole' option.
In the 20.11.3 release, we have reverted to the 20.02 and older behavior
of assigning all resources on a node to the job step by default.
This reversion is a major behavioral change which we would not generally
do on a maintenance release, but is being done in the interest of
restoring compatibility with the large number of existing Open MPI (and
other MPI flavors) and job scripts that exist in production, and to
remove what has proven to be a significant hurdle in moving to the new
release.
Please note that one change to step launch remains - by default, in
20.11 steps are no longer permitted to overlap on the resources they
have been assigned. If that behavior is desired, all steps must
explicitly opt-in through the newly added '--overlap' option.
Further details and a full explanation of the issue can be found at:
https://bugs.schedmd.com/show_bug.cgi?id=10383#c63
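As a rough summary of the step launch semantics described above, the
following sketch picks the srun flags for a step that wants every
resource the job holds on a node. The function name and the version
tuple are illustrative assumptions; only `--whole` and `--overlap` come
from the notes, and whether srun accepts a given flag combination is not
checked here.

```python
def srun_step_flags(version, overlap=False):
    """Flags for a step that should see all of the job's node resources.

    `version` is a (major, minor, patch) tuple; only 20.11.x is modelled.
    """
    if tuple(version[:2]) != (20, 11):
        raise ValueError("this sketch only models the 20.11 series")
    flags = []
    if tuple(version) < (20, 11, 3):
        # 20.11.0 to 20.11.2: a step got exactly what it requested.
        flags.append("--whole")
    if overlap:
        # Still required in 20.11.3 for steps that share resources.
        flags.append("--overlap")
    return flags

print(srun_step_flags((20, 11, 2)))
print(srun_step_flags((20, 11, 3), overlap=True))
```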
- Other changes from 20.11.03
* Fix segfault when parsing bad "#SBATCH hetjob" directive.
* Allow countless gpu:<type> node GRES specifications in slurm.conf.
* PMIx - Don't set UCX_MEM_MMAP_RELOC for older version of UCX (pre 1.5).
* Don't green-light any GPU validation when core conversion fails.
* Allow updates to a reservation in the database that starts in the future.
* Better check/handling of primary key collision in reservation table.
* Improve reported error and logging in _build_node_list().
* Fix uninitialized variable in _rpc_file_bcast() which could lead to an
incorrect error return from sbcast / srun --bcast.
* mpi/cray_shasta - fix use-after-free on error in _multi_prog_parse().
* Cray - Handle setting correct prefix for cpuset cgroup with respect to
expected_usage_in_bytes. This fixes Cray's OOM killer.
* mpi/pmix: Fix PMIx_Abort support.
* Don't reject jobs allocating more cores than tasks with MaxMemPerCPU.
* Fix false error message complaining about oversubscribe in cons_tres.
* scrontab - fix parsing of empty lines.
* Fix regression causing spank_process_option errors to be ignored.
* Avoid making multiple interactive steps.
* Fix corner case issues where step creation should fail.
* Fix job rejection when --gres is less than --gpus.
* Fix regression causing spank prolog/epilog not to be called unless the
spank plugin was loaded in slurmd context.
* Fix regression preventing SLURM_HINT=nomultithread from being used
to set defaults for salloc->srun, sbatch->srun sequence.
* Reject job credential if non-superuser sets the LAUNCH_NO_ALLOC flag.
* Make it so srun --no-allocate works again.
* jobacct_gather/linux - Don't count memory on tasks that have already
finished.
* Fix 19.05/20.02 batch steps talking with a 20.11 slurmctld.
* jobacct_gather/common - Do not process jobacct's with same taskid when
calling prec_extra.
* Cleanup all tracked jobacct tasks when extern step child process finishes.
* slurmrestd/dbv0.0.36 - Correct structure of dbv0.0.36_tres_list.
* Fix regression causing task/affinity and task/cgroup to be out of sync when
configured ThreadsPerCore is different than the physical threads per core.
* Fix situation when --gpus is given but not max nodes (-N1-1) in a job
allocation.
* Interactive step - ignore cpu bind and mem bind options, and do not set
the associated environment variables which lead to unexpected behavior
from srun commands launched within the interactive step.
* Handle exit code from pipe when using UCX with PMIx.
* Fri Jan 08 2021 Egbert Eich <eich@suse.com>
- Fix fallout introduced by:
"Replace '%service_del_postun -n' with '%service_del_postun_without_restart'"
for older Leap/SLE versions.
* Fri Jan 08 2021 Egbert Eich <eich@suse.com>
- Fix Provides:/Conflicts: for libnss_slurm.
* Tue Jan 05 2021 Ana Guerrero Lopez <aguerrero@suse.com>
- Add support for configuration files from external plugins.
While built-in plugins have their configuration added in slurm.conf,
external SPANK plugins add their configuration to plugstack.conf.
To make packaging SPANK plugins easy, their configuration files
should be added independently at /etc/spack/plugstack.conf.d and
plugstack.conf should be left with a one-liner including all the
files under /etc/spack/plugstack.conf.d.
* Mon Dec 28 2020 Ana Guerrero Lopez <aguerrero@suse.com>
- Update to 20.11.02
* Fix older versions of sacct not working with 20.11.
* Fix slurmctld crash when using a pre-20.11 srun in a job allocation.
* Correct logic problem in _validate_user_access.
* Fix libpmi to initialize Slurm configuration correctly.
- Update to 20.11.01
* Fix spelling of "overcomited" to "overcomitted" in sreport's cluster
utilization report.
* Silence debug message about shutting down backup controllers if none are
configured.
* Don't create interactive srun until PrologSlurmctld is done.
* Fix fd symlink path resolution.
* Fix slurmctld segfault on subnode reservation restore after node
configuration change.
* Fix resource allocation response message environment allocation size.
* Ensure that details->env_sup is NULL terminated.
* select/cray_aries - Correctly remove jobs/steps from blades using NPC.
* cons_tres - Avoid max_node_gres when entire node is allocated
with --ntasks-per-gpu.
* Allow NULL arg to data_get_type().
* In sreport have usage for a reservation contain all jobs that ran in the
reservation instead of just the ones that ran in the time specified. This
matches the fact that the report for the reservation is not truncated for a
time period.
* Fix issue with sending wrong batch step id to a < 20.11 slurmd.
* Add a job's alloc_node to lua for job modification and completion.
* Fix regression getting a slurmdbd connection through the perl API.
* Stop the extern step terminate monitor right after proctrack_g_wait().
* Fix removing the normalized priority of assocs.
* slurmrestd/v0.0.36 - Use correct name for partition field:
"min nodes per job" -> "min_nodes_per_job".
* slurmrestd/v0.0.36 - Add node comment field.
* Fix regression marking cloud nodes as "unexpectedly rebooted" after
multiple boots.
* Fix slurmctld segfault in _slurm_rpc_job_step_create().
* slurmrestd/v0.0.36 - Filter node states against NODE_STATE_BASE to avoid
the extended states all being reported as "invalid".
* Fix race that can prevent the prolog for a requeued job from running.
* cli_filter - add "type" to readily distinguish between the CLI command in
use.
* smail - reduce sleep before seff to 5 seconds.
* Ensure SPANK prolog and epilog run without an explicit PlugStackConfig.
* Disable MySQL automatic reconnection.
* Fix allowing "b" after memory unit suffixes.
* Fix slurmctld segfault with reservations without licenses.
* Due to internal restructuring ahead of the 20.11 release, applications
calling libslurm MUST call slurm_init(NULL) before any API calls.
Otherwise the API call is likely to fail due to libslurm's internal
configuration not being available.
* slurm.spec - allow custom paths for PMIx and UCX install locations.
* Use rpath if enabled when testing for Mellanox's UCX libraries.
* slurmrestd/dbv0.0.36 - Change user query for associations to optional.
* slurmrestd/dbv0.0.36 - Change account query for associations to optional.
* mpi/pmix - change the error handler error message to be more useful.
* Add missing connection in acct_storage_p_{clear_stats, reconfig, shutdown}.
* Perl API - fix issue when running in configless mode.
* nss_slurm - avoid deadlock when stray sockets are found.
* Display correct value for ScronParameters in 'scontrol show config'
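The note above about allowing "b" after memory unit suffixes can be
illustrated with a small parser. This is a sketch, not Slurm's own
parsing code: the function name is hypothetical, and the choice of
1024-based units with megabytes as the default unit is an assumption
based on the usual `--mem` conventions.

```python
import re

# Multipliers relative to one kilobyte.
_UNIT_KB = {"K": 1, "M": 1024, "G": 1024 ** 2, "T": 1024 ** 3}

def parse_mem_kb(text):
    """Parse '4G', '4GB' or '512' (megabytes assumed) into kilobytes."""
    match = re.fullmatch(r"(\d+)(?:([KMGT])B?)?", text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised memory specification: {text!r}")
    unit = (match.group(2) or "M").upper()
    return int(match.group(1)) * _UNIT_KB[unit]

print(parse_mem_kb("4G") == parse_mem_kb("4GB"))
```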
* Mon Nov 30 2020 Egbert Eich <eich@suse.com>
- Update to version 20.11.0
Slurm 20.11 includes a number of new features including:
* Overhaul of the job step management and launch code, alongside improved
GPU task placement support.
* A new "Interactive Step" mode of operation for salloc.
* A new "scrontab" command that can be used to submit and manage
periodically repeating jobs.
* IPv6 support.
* Changes to the reservation logic, with new options allowing users
to delete reservations, allowing admins to skip the next occurrence of a
repeated reservation, and allowing for a job to be submitted and eligible
to run within multiple reservations.
* Dynamic Future Nodes - automatically associate a dynamically
provisioned (or "cloud") node against a NodeName definition with matching
hardware.
* An experimental new RPC queuing mode for slurmctld to reduce thread
contention on heavily loaded clusters.
* SlurmDBD integration with the Slurm REST API.
Also check
https://github.com/SchedMD/slurm/blob/slurm-20-11-0-1/RELEASE_NOTES
* Wed Nov 18 2020 Ana Guerrero Lopez <aguerrero@suse.com>
- Updated to 20.02.6, addresses two security fixes:
* PMIx - fix potential buffer overflows from use of unpackmem().
CVE-2020-27745 (bsc#1178890)
* X11 forwarding - fix potential leak of the magic cookie when sent as an
argument to the xauth command. CVE-2020-27746 (bsc#1178891)
- And many other bugfixes, full log and details available at:
* https://lists.schedmd.com/pipermail/slurm-announce/2020/000045.html
* Tue Nov 03 2020 Franck Bui <fbui@suse.com>
- Replace '%service_del_postun -n' with '%service_del_postun_without_restart'
'-n' is deprecated and will be removed in the future.
* Thu Oct 29 2020 Ana Guerrero Lopez <aguerrero@suse.com>
- Updated to 20.02.5, changes:
* Fix leak of TRESRunMins when job time is changed with --time-min
* pam_slurm - explicitly initialize slurm config to support configless mode.
* scontrol - Fix exit code when creating/updating reservations with wrong
Flags.
* When a GRES has a no_consume flag, report 0 for allocated.
* Fix cgroup cleanup by jobacct_gather/cgroup.
* When creating reservations/jobs don't allow counts on a feature unless
using an XOR.
* Improve number of boards discovery
* Fix updating a reservation NodeCnt on a zero-count reservation.
* slurmrestd - provide an explicit error message when PSK auth fails.
* cons_tres - fix job requesting single gres per-node getting two or more
nodes with less CPUs than requested per-task.
* cons_tres - fix calculation of cores when using gres and cpus-per-task.
* cons_tres - fix job not getting access to socket without GPU or with less
than --gpus-per-socket when not enough cpus available on required socket
and not using --gres-flags=enforce-binding.
* Fix HDF5 type version build error.
* Fix creation of CoreCnt only reservations when the first node isn't
available.
* Fix wrong DBD Agent queue size in sdiag when using accounting_storage/none.
* Improve job constraints XOR option logic.
* Fix preemption of hetjobs when needed nodes not in leader component.
* Fix wrong bit_or() messing potential preemptor jobs node bitmap, causing
bad node deallocations and even allocation of nodes from other partitions.
* Fix double-deallocation of preempted non-leader hetjob components.
* slurmdbd - prevent truncation of the step nodelists over 4095.
* Fix nodes remaining in drain state after rebooting with ASAP option.
- changes from 20.02.4:
* srun - suppress job step creation warning message when waiting on
PrologSlurmctld.
* slurmrestd - fix incorrect return values in data_list_for_each() functions.
* mpi/pmix - fix issue where HetJobs could fail to launch.
* slurmrestd - set content-type header in responses.
* Fix cons_res GRES overallocation for --gres-flags=disable-binding.
* Fix cons_res incorrectly filtering cores with respect to GRES locality for
--gres-flags=disable-binding requests.
* Fix regression where a dependency on multiple jobs in a single array using
underscores would only add the first job.
* slurmrestd - fix corrupted output due to incorrect use of memcpy().
* slurmrestd - address a number of minor Coverity warnings.
* Handle retry failure when slurmstepd is communicating with srun correctly.
* Fix jobacct_gather possibly duplicate stats when _is_a_lwp error shows up.
* Fix tasks binding to GRES which are closest to the allocated CPUs.
* Fix AMD GPU ROCM 3.5 support.
* Fix handling of job arrays in sacct when querying specific steps.
* slurmrestd - avoid fallback to local socket authentication if JWT
authentication is ill-formed.
* slurmrestd - restrict ability of requests to use different authentication
plugins.
* slurmrestd - unlink named unix sockets before closing.
* slurmrestd - fix invalid formatting in openapi.json.
* Fix batch jobs stuck in CF state on FrontEnd mode.
* Add a separate explicit error message when rejecting changes to active node
features.
* cons_common/job_test - fix slurmctld SIGABRT due to double-free.
* Fix updating reservations to set the duration correctly if updating the
start time.
* Fix update reservation to promiscuous mode.
* Fix override of job tasks count to max when ntasks-per-node present.
* Fix min CPUs per node not being at least CPUs per task requested.
* Fix CPUs allocated to match CPUs requested when requesting GRES and
threads per core equal to one.
* Fix NodeName config parsing with Boards and without CPUs.
* Ensure SLURM_JOB_USER and SLURM_JOB_UID are set in SrunProlog/Epilog.
* Fix error messages for certain invalid salloc/sbatch/srun options.
* pmi2 - clean up sockets at step termination.
* Fix 'scontrol hold' to work with 'JobName'.
* sbatch - handle --uid/--gid in #SBATCH directives properly.
* Fix race condition in job termination on slurmd.
* Print specific error messages if trying to use certain
priority/multifactor factors that cannot work without SlurmDBD.
* Avoid partial GRES allocation when --gpus-per-job is not satisfied.
* Cray - Avoid referencing a variable outside of its correct scope when
dealing with creating steps within a het job.
* slurmrestd - correctly handle larger addresses from accept().
* Avoid freeing wrong pointer with SlurmctldParameters=max_dbd_msg_action
with another option after that.
* Restore MCS label when suspended job is resumed.
* Fix insufficient lock levels.
* slurmrestd - use errno from job submission.
* Fix "user" filter for sacctmgr show transactions.
* Fix preemption logic.
* Fix no_consume GRES for exclusive (whole node) requests.
* Fix regression in 20.02 that caused an infinite loop in slurmctld when
requesting --distribution=plane for the job.
* Fix parsing of the --distribution option.
* Add CONF READ_LOCK to _handle_fed_send_job_sync.
* prep/script - always call slurmctld PrEp callback in _run_script().
* Fix node estimation for jobs that use GPUs or --cpus-per-task.
* Fix jobcomp, job_submit and cli_filter Lua implementation plugins causing
slurmctld and/or job submission CLI tools segfaults due to bad return
handling when the respective Lua script failed to load.
* Fix propagation of gpu options through hetjob components.
* Add SLURM_CLUSTERS environment variable to scancel.
* Fix packing/unpacking of "unlinked" jobs.
* Connect slurmstepd's stderr to srun for steps launched with --pty.
* Handle MPS correctly when doing exclusive allocations.
* slurmrestd - fix compiling against libhttpparser in a non-default path.
* slurmrestd - avoid compilation issues with libhttpparser < 2.6.
* Fix compile issues when compiling slurmrestd without --enable-debug.
* Reset idle time on a reservation that is getting purged.
* Fix reoccurring reservations that have Purge_comp= to keep correct
duration if they are purged.
* scontrol - changed the "PROMISCUOUS" flag to "MAGNETIC"
* Early return from epilog_set_env in case of no_consume.
* Fix cons_common/job_test start time discovery logic to prevent skewed
results between "will run test" executions.
* Ensure TRESRunMins limits are maintained during "scontrol reconfigure".
* Improve error message when host lookup fails.
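The array-dependency regression listed above (a dependency on multiple jobs in a single array using underscores only adding the first job) can be illustrated with a small sketch. This is not Slurm's parser: the function name parse_dependency and the simplified "type:jobid[_task][:jobid[_task]...]" syntax are assumptions made for illustration only. The point is that every colon-separated entry must be kept, not just the first.

```python
from typing import List, Optional, Tuple


def parse_dependency(spec: str) -> Tuple[str, List[Tuple[int, Optional[int]]]]:
    """Parse e.g. 'afterok:100_1:100_2' into ('afterok', [(100, 1), (100, 2)]).

    Hypothetical helper; the accepted syntax is a simplification.
    """
    dep_type, _, rest = spec.partition(":")
    if not dep_type or not rest:
        raise ValueError(f"malformed dependency: {spec!r}")
    jobs: List[Tuple[int, Optional[int]]] = []
    for item in rest.split(":"):
        # 'job_task' names one array task; a bare 'job' has no task index.
        job, sep, task = item.partition("_")
        jobs.append((int(job), int(task) if sep else None))
    return dep_type, jobs
```

Under these assumptions, parse_dependency("afterok:100_1:100_2") keeps both array tasks of job 100.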
- Refresh patch: pam_slurm-Initialize-arrays-and-pass-sizes.patch
* Tue Jul 07 2020 Egbert Eich <eich@suse.com>
- Add support for openPMIx also for Leap/SLE 15.0/1 (bsc#1173805).
- Do not run %check on SLE-12-SP2: Some incompatibility in tcl
makes this fail.
- Remove unneeded build dependency to postgresql-devel.
- Disable build on s390 (requires 64bit).
* Wed Jun 03 2020 Egbert Eich <eich@suse.com>
- Bring QA to the package build: add %%check stage.
- Remove cruft that isn't needed any longer.
- Add 'ghosted' run-file.
- Add rpmlint filter to handle issues with library packages
for Leap and enterprise upgrade versions.
* Wed May 06 2020 Egbert Eich <eich@suse.com>
- Treat libnss_slurm like any other package: add version string to
upgrade package.
Version: 20.02.3-bp152.1.1
* Fri May 22 2020 Christian Goll <cgoll@suse.com>
- Updated to 20.02.3 which fixes CVE-2020-12693 (bsc#1172004).
- Other changes are:
* Factor in ntasks-per-core=1 with cons_tres.
* Fix formatting in error message in cons_tres.
* Fix calling stat on a NULL variable.
* Fix minor memory leak when using reservations with flags=first_cores.
* Fix gpu bind issue when CPUs=Cores and ThreadsPerCore > 1 on a node.
* Fix --mem-per-gpu for heterogeneous --gres requests.
* Fix slurmctld load order in load_all_part_state().
* Fix race condition not finding jobacct gather task cgroup entry.
* Suppress error message when selecting nodes on disjoint topologies.
* Improve performance of _pack_default_job_details() with large number of job
arguments.
* Fix archive loading previous to 17.11 jobs per-node req_mem.
* Fix regression validating that --gpus-per-socket requires --sockets-per-node
for steps. Should only validate allocation requests.
* error() instead of fatal() when parsing an invalid hostlist.
* nss_slurm - fix potential deadlock in slurmstepd on overloaded systems.
* cons_tres - fix --gres-flags=enforce-binding and related --cpus-per-gres.
* cons_tres - Allocate lowest numbered cores when filtering cores with gres.
* Fix getting system counts for named GRES/TRES.
* MySQL - Fix for handling typed GRES for association rollups.
* Fix step allocations when tasks_per_core > 1.
* Fix allocating more GRES than requested when asking for multiple GRES types.
* Fri Mar 27 2020 Christian Goll <cgoll@suse.com>
- Updated to 20.02.1 with the following changes:
* Improve job state reason for jobs hitting partition_job_depth.
* Speed up testing of singleton dependencies.
* Fix negative loop bound in cons_tres.
* srun - capture the MPI plugin return code from mpi_hook_client_fini() and
use as final return code for step failure.
* Fix segfault in cli_filter/lua.
* Fix --gpu-bind=map_gpu reusability if tasks > elements.
* Make sure config_flags on a gres are sent to the slurmctld on node
registration.
* Prolog/Epilog - Fix missing GPU information.
* Fix segfault when using config parser for expanded lines.
* Fix bit overlap test function.
* Don't accrue time if job begin time is in the future.
* Remove accrue time when updating a job start/eligible time to the future.
* Fix regression in 20.02.0 that broke --depend=expand.
* Reset begin time on job release if it's not in the future.
* Fix for recovering burst buffers when using high-availability.
* Fix invalid read due to freeing an incorrectly allocated env array.
* Update slurmctld -i message to warn about losing data.
* Fix scontrol cancel_reboot so it clears the DRAIN flag and node reason for a
pending ASAP reboot.
* Sun Mar 08 2020 Egbert Eich <eich@suse.com>
- Remove legacy_cray: with 20.02 the special treatment for
cray-specific plugins on SLE versions prior to 15SP2 is
no longer required.
* Wed Mar 04 2020 Christian Goll <cgoll@suse.com>
- slurm-plugins will now also require pmix not only libpmix
(bsc#1164326)
* Fri Feb 28 2020 Egbert Eich <eich@suse.com>
- Removed autopatch as it doesn't work for the SLE-11-SP4 build.
* Thu Feb 27 2020 Kasimir _ <kasimir_@outlook.de>
- Disable %arm builds as this is no longer supported.
* Thu Feb 27 2020 Christian Goll <cgoll@suse.com>
- pmix now also searches for libpmix.so.2 so that there is no dependency
on the devel package (bsc#1164386)
* added patch file check-for-lipmix.so.MAJOR.patch
* reworded patch file Remove-rpath-from-build.patch to use %autopatch
* Wed Feb 26 2020 Egbert Eich <eich@suse.com>
- Update to version 20.02.0 (jsc#SLE-8491)
* Fix minor memory leak in slurmd on reconfig.
* Fix invalid ptr reference when rolling up data in the database.
* Change shtml2html.py to require python3 for RHEL8 support, and match
man2html.py.
* slurm.spec - override "hardening" linker flags to ensure RHEL8 builds
in a usable manner.
* Fix type mismatches in the perl API.
* Prevent use of uninitialized slurmctld_diag_stats.
* Fixed various Coverity issues.
* Only show warning about root-less topology in daemons.
* Fix accounting of jobs in IGNORE_JOBS reservations.
* Fix issue with batch steps state not loading correctly when upgrading from
19.05.
* Deprecate max_depend_depth in SchedulerParameters and move it to
DependencyParameters.
* Silence erroneous error on slurmctld upgrade when loading federation state.
* Break infinite loop in cons_tres dealing with incorrect tasks per tres
request resulting in slurmctld hang.
* Improve handling of --gpus-per-task to make sure appropriate number of GPUs
is assigned to job.
* Fix seg fault on cons_res when requesting --spread-job.
- Move to python3 for everything but SLE-11-SP4
* For SLE-11-SP4 add a workaround to handle a python3 script (python2.7
compliant).
* Wed Feb 19 2020 Egbert Eich <eich@suse.com>
- Add explicit version dependency to libpmix as well.
'slurm-devel' has a tight version dependency on libpmix -
allowing multiple libpmix versions in one package repository
is therefore essential.
* Thu Feb 13 2020 Egbert Eich <eich@suse.com>
- Update to version 20.02.0-rc1
* sbatch - fix segfault when no newline at the end of a burst buffer file.
* Change scancel to only check job's base state when matching -t options.
* Save job dependency list in state files.
* cons_tres - allow jobs to be run on systems with root-less topologies.
* Restore pre-20.02pre1 PrologSlurmctld synchronization behavior to avoid
various race conditions, and ensure proper batch job launch.
* Add new slurmrestd command/daemon which implements the Slurm REST API.
* Tue Feb 11 2020 Christian Goll <cgoll@suse.com>
- Update to version 20.02.0-0pre1.
Highlights:
* Exclusive behavior of a node includes all GRES on a node as well
as the cpus.
* Use python3 instead of python for internal build/test scripts.
The slurm.spec file has been updated to depend on python3 as well.
* Added new NodeSet configuration option to help simplify partition
configuration sections for heterogeneous / condo-style clusters.
* Added slurm.conf option MaxDBDMsgs to control how many messages will be
stored in the slurmctld before throwing them away when the slurmdbd is down.
* The checkpoint plugin interface and all associated API calls have been
removed.
* slurm_init_job_desc_msg() initializes mail_type as uint16_t. This allows
mail_type to be set to NONE with scontrol.
* Add new slurm_spank_log() function to print messages back to the user from
within a SPANK plugin without prepending "error: " from slurm_error().
* Enforce having partition name and nodelist=ALL when creating reservations
with flags=PART_NODES.
* SPANK - removed never-implemented slurm_spank_slurmd_init() interface. This
hook has always been accessible through slurm_spank_init() in the
S_CTX_SLURMD context instead.
* sbcast - add new BcastAddr option to NodeName lines to allow sbcast traffic
to flow over an alternate network path.
* Added auth/jwt plugin, and 'scontrol token' subcommand.
* PMIx - improve performance of proc map generation.
* Deprecate kill_invalid_depend in SchedulerParameters and move it to a new
option called DependencyParameters.
* Enable job dependencies for any job on any cluster in the same federation.
* Allow clusters to be added automatically to db at startup of ctld.
* Add AccountingStorageExternalHost slurm.conf parameter.
* The "ConditionPathExists" condition in slurmd.service has been disabled by
default to permit simpler installation of a "configless" Slurm cluster.
* In SchedulerParameters remove deprecated max_job_bf and replace with
bf_max_job_test.
* Disable sbatch, salloc, srun --reboot for non-admins.
* SPANK - added support for S_JOB_GID in the job script context with
spank_get_item().
* Prolog/Epilog - add SLURM_JOB_GID environment variable.
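The SchedulerParameters changes in the highlights above (max_job_bf replaced by bf_max_job_test, kill_invalid_depend moved to DependencyParameters) imply a small edit to existing slurm.conf files. A minimal sketch of that rewrite follows. The helper name migrate_scheduler_parameters is hypothetical, option names are matched case-sensitively for simplicity, and this is not a tool shipped with Slurm or with this package.

```python
from typing import List, Tuple

# Rename and move taken from the changelog entries above; everything else
# about this helper is an assumption made for illustration.
RENAMED = {"max_job_bf": "bf_max_job_test"}
MOVED_TO_DEPENDENCY = {"kill_invalid_depend"}


def migrate_scheduler_parameters(value: str) -> Tuple[str, str]:
    """Split an old SchedulerParameters value into
    (new SchedulerParameters, DependencyParameters)."""
    sched: List[str] = []
    dep: List[str] = []
    for opt in filter(None, (o.strip() for o in value.split(","))):
        name, sep, arg = opt.partition("=")
        if name in MOVED_TO_DEPENDENCY:
            dep.append(opt)
        else:
            # Keep unknown options untouched; only apply the known rename.
            sched.append(RENAMED.get(name, name) + sep + arg)
    return ",".join(sched), ",".join(dep)
```

For example, "max_job_bf=50,kill_invalid_depend,bf_interval=30" would be split into a SchedulerParameters value of "bf_max_job_test=50,bf_interval=30" and a DependencyParameters value of "kill_invalid_depend".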
configuration file changes:
* The mpi/openmpi plugin has been removed as it does nothing.
MpiDefault=openmpi will be translated to the functionally-equivalent
MpiDefault=none.
command changes (see man pages for details)
* Display StepId=<jobid>.batch instead of StepId=<jobid>.4294967294 in output
of "scontrol show step". (slurm_sprint_job_step_info())
* MPMD in srun will now defer PATH resolution for the commands to launch to
slurmstepd. Previously it would handle resolution client-side, but with
a non-standard approach that walked PATH in reverse.
* squeue - added "--me" option, equivalent to --user=$USER.
* The LicensesUsed line has been removed from 'scontrol show config'.
Please see the 'scontrol show licenses' command as an alternative.
* sbatch - adjusted backoff times for "--wait" option to reduce load on
slurmctld. This results in a steady-state delay of 32s between queries,
instead of the prior 10s delay.
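The StepId display change in the command changes above can be sketched as follows. The number 4294967294 quoted in the entry equals 2**32 - 2, which suggests a reserved 32-bit step ID for the batch step. The constant and function names here are hypothetical and only mimic the output format described; they are not the Slurm API.

```python
BATCH_STEP_ID = 4294967294  # value quoted in the changelog entry; equals 2**32 - 2


def format_step_id(job_id: int, step_id: int) -> str:
    """Render a step as 'StepId=<jobid>.<step>', naming the batch step."""
    step = "batch" if step_id == BATCH_STEP_ID else str(step_id)
    return f"StepId={job_id}.{step}"
```

With this sketch, a batch step of job 1234 renders as StepId=1234.batch, while ordinary steps keep their numeric ID.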
- Removed following deprecated patches:
* removed patch slurmctld-rerun-agent_init-when-backup-controller-takes-over.patch
* removed patch split-xdaemon-in-xdaemon_init-and-xdaemon_finish-for.patch
* removed patch slurmctld-uses-xdaemon_-for-systemd.patch
* removed patch slurmd-uses-xdaemon_-for-systemd.patch
* removed patch slurmdbd-uses-xdaemon_-for-systemd.patch
* removed patch slurmsmwd-uses-xdaemon_-for-systemd.patch
* removed patch removed-deprecated-xdaemon.patch
* Wed Feb 05 2020 Christian Goll <cgoll@suse.com>
- The standard slurm.conf now also uses SlurmctldHost on all build
targets (bsc#1162377)
* Mon Jan 27 2020 Egbert Eich <eich@suse.com>
- Fix a missed systemd_requires -> systemd_ordering conversion.
* Fri Jan 24 2020 Egbert Eich <eich@suse.com>
- Remove special OHPC compatibility macro: these settings should
be applied universally.
- Add a Recommends for mariadb to slurm-slurmdbd: it is recommended
to run the database on the same machine as the daemon.
* Fri Jan 24 2020 Dominique Leuenberger <dimstar@opensuse.org>
- BuildRequire pkgconfig(systemd) instead of systemd: allow OBS to
shortcut through the -mini flavors.
- Use systemd_ordering instead of systemd_requires: systemd is
never a strict requirement; but in case the system is scheduled
for installation together with systemd, we want systemd to be
installed prior to slurm.
* Thu Jan 23 2020 Christian Goll <cgoll@suse.com>
- start slurmdbd after mariadb (bsc#1161716)
* Mon Jan 13 2020 Egbert Eich <eich@suse.com>
- Fix base_ver for SLE 15 SP2.
* Wed Jan 08 2020 Egbert Eich <eich@suse.com>
- Update to version 19.05.5 (jsc#SLE-8491)
* Check %docdir/NEWS for details.
* Includes security fixes CVE-2019-19727, CVE-2019-19728,
CVE-2019-12838.
* Disable i586 builds as this is no longer supported.
* Create libnss_slurm package to support user and group resolution
thru slurmstepd.
* slurm-2.4.4-rpath.patch -> Remove-rpath-from-build.patch
Obsoleted:
- pam_slurm_adopt-avoid-running-outside-of-the-sshd-PA.patch
- pam_slurm_adopt-send_user_msg-don-t-copy-undefined-d.patch
- pam_slurm_adopt-use-uid-to-determine-whether-root-is.patch
* Thu Jan 02 2020 Egbert Eich <eich@suse.com>
- Deprecate "ControlMachine" only for SLURM version upgrades and
products newer than 1501. This ensures that the original setting
is retained for the SLURM version shipped originally with SLE-15-SP1
or Leap 15.1.
* Sat Dec 21 2019 Egbert Eich <eich@suse.com>
- Update to v18.08.9 for fixing CVE-2019-19728 (bsc#1159692).
* Wrap END_TIMER{,2,3} macro definition in "do {} while (0)" block.
* Make sview work with glib2 v2.62.
* Make Slurm compile on linux after sys/sysctl.h was deprecated.
* Install slurmdbd.conf.example with 0600 permissions to encourage secure
use. CVE-2019-19727.
* srun - do not continue with job launch if --uid fails. CVE-2019-19728.