-
Notifications
You must be signed in to change notification settings - Fork 40
fix: partition cluster selection #385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe PR enhances the SLURM executor plugin's partition selection logic to support cluster-aware handling. The Changes
Sequence DiagramsequenceDiagram
participant Job as Job Resources
participant PSel as Partition Selection<br/>(get_best_partition)
participant Score as Scoring Logic<br/>(score_job_fit)
participant Partitions as Partition Config
Job->>PSel: Submit with cluster requirement
PSel->>Partitions: Load partition config
loop For each partition
PSel->>Score: Evaluate fit
alt Job has cluster & Partition has partition_cluster
Score->>Score: Check: clusters match?
alt Clusters match
Score-->>PSel: Score > 0 ✓
else Clusters don't match
Score-->>PSel: Score ≤ 0 (ineligible)
end
else Job has cluster & Partition lacks partition_cluster
Score-->>PSel: Score ≤ 0 (ineligible)
else Job lacks cluster & Partition has partition_cluster
Score-->>PSel: Score ≤ 0 (ineligible)
else Neither has cluster constraint
Score-->>PSel: Score > 0 ✓
end
end
PSel->>PSel: Select best partition (score > 0)
PSel-->>Job: Return partition or fallback
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
snakemake_executor_plugin_slurm/__init__.py(1 hunks)snakemake_executor_plugin_slurm/partitions.py(3 hunks)snakemake_executor_plugin_slurm/submit_string.py(1 hunks)snakemake_executor_plugin_slurm/validation.py(1 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.
Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.
📚 Learning: 2024-12-11T14:17:08.749Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.pysnakemake_executor_plugin_slurm/__init__.pysnakemake_executor_plugin_slurm/validation.py
📚 Learning: 2025-03-10T13:34:59.270Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:34:59.270Z
Learning: In the Snakemake executor plugin for SLURM, all GPU resources (even simple numeric ones) should be translated to the `--gpus` flag (plural) to match SLURM's expectations, not `--gpu` (singular).
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-03-10T15:20:51.829Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.pysnakemake_executor_plugin_slurm/partitions.pysnakemake_executor_plugin_slurm/__init__.py
📚 Learning: 2025-03-10T13:34:59.270Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:34:59.270Z
Learning: In the Snakemake executor plugin for SLURM, all GPU resources (both simple numeric values and model:number specifications) should be translated to the `--gpus` flag (plural form) to match SLURM's command line interface expectations.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-03-31T11:52:05.373Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.pysnakemake_executor_plugin_slurm/partitions.pysnakemake_executor_plugin_slurm/__init__.py
📚 Learning: 2025-02-18T14:40:27.064Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2025-02-18T14:40:27.064Z
Learning: In the Snakemake executor plugin for SLURM, the GPU resource is specified using the "gpus" (plural) resource name, not "gpu" (singular).
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-01-13T09:54:22.950Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-01-20T09:13:26.443Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.pysnakemake_executor_plugin_slurm/__init__.py
📚 Learning: 2025-03-10T13:22:54.081Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:22:54.081Z
Learning: In the Snakemake executor plugin for SLURM, when using GPU resources, the function returns `--gpu=n` (singular) when only a number is provided, but `--gpus=model:n` (plural) when a model is specified in the implementation.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-09-09T15:55:55.669Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-09-09T15:55:55.669Z
Learning: In the Snakemake SLURM executor plugin, Snakemake submits itself as a job using the --wrap parameter in the sbatch command, rather than creating or submitting a separate job.sh script file. The actual implementation uses `call += f' --wrap="{exec_job}"'` where exec_job is the formatted snakemake execution command.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
🧬 Code graph analysis (1)
snakemake_executor_plugin_slurm/__init__.py (1)
snakemake_executor_plugin_slurm/partitions.py (1)
get_best_partition(58-83)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: testing
🔇 Additional comments (7)
snakemake_executor_plugin_slurm/validation.py (1)
132-139: Good bug fix for the validation logic.The previous condition
> 0was inverted - it would reject all positive integers. The fix to< 1now correctly accepts integers ≥ 1 and rejects 0 or negative values, which matches the error message stating "must be a positive integer."snakemake_executor_plugin_slurm/partitions.py (2)
42-46: LGTM - cluster key extraction from partition config.The logic correctly extracts cluster from multiple possible keys and removes it from the config before passing to
PartitionLimits. The priority order (slurm_cluster→cluster→clusters) is appropriate for partition configuration files.
248-248: Good naming choice.Renaming from
clustertopartition_clusteravoids ambiguity with the job's cluster resource specification.snakemake_executor_plugin_slurm/submit_string.py (1)
45-53: LGTM - cluster argument construction.The logic correctly resolves cluster from multiple resource keys and safely quotes the value. The comment documents the backward-compatible priority order.
Note: As mentioned in the
partitions.pyreview, the key resolution order here differs fromscore_job_fit. Consider aligning them for consistency.snakemake_executor_plugin_slurm/__init__.py (3)
875-879: Consistent cluster key order withpartitions.py.The resolution order (
slurm_cluster→cluster→clusters) matches the order used inscore_job_fit, ensuring partition selection and cluster compatibility checks use the same cluster value.
881-894: Well-structured cluster-aware partition validation.The logic correctly handles the key scenarios:
- If partition is in config and belongs to a different cluster → auto-select
- If partition is in config and matches cluster (or has no cluster) → use it
- If partition isn't in config → trust user's specification
The short-circuit evaluation on line 888 safely handles the case where the requested partition isn't defined in the configuration.
896-898: LGTM - auto-selection fallback.This correctly triggers partition auto-selection when no partition was specified by the user and a partition configuration is available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
snakemake_executor_plugin_slurm/partitions.py (1)
63-68: Confirm intentionality ofscore > 0filter for unconstrained-but-compatible partitionsThe condition
if score is not None and score > 0:at line 65 excludes partitions where the job has no specific resource requirements and all partition limits are effectively unconstrained (all limits areinf). Such partitions would score0.0because the scoring algorithm only accumulates points whenjob_requirement > 0ANDlimit is not inf.This change was introduced in commit 4ca2f13 and represents a behavior shift from the prior version (commit 77902fb), which used
if score is not None:and allowed score-0.0 partitions to be auto-selectable.With the current
> 0filter:
- Truly unconstrained-but-compatible partitions (all limits=
inf) are dropped and trigger the "No suitable partition found" fallback path- The system returns
Noneand relies on SLURM default partition selection insteadIf this exclusion is intentional (i.e., you want auto-selection to require meaningful differentiation among partitions), please document it. If unconstrained partitions should remain auto-selectable, remove the
> 0check or ensure unconstrained-but-compatible partitions receive a small positive baseline score.Please confirm this behavior change is intentional for deployments using mostly-unbounded partition definitions.
[Also applies to: 290-374]
♻️ Duplicate comments (1)
snakemake_executor_plugin_slurm/partitions.py (1)
41-52: Align cluster key resolution order with submit_string.py to avoid mismatchesPartition and job cluster resolution here uses
("slurm_cluster", "cluster", "clusters"), whilesubmit_string.py(per learnings and current behavior) resolves cluster keys ascluster → clusters → slurm_cluster. If a job or partition config ever sets more than one of these keys with differing values, the partition-selection logic could use a different cluster than the one ultimately passed tosbatch, leading to subtle wrong-cluster scheduling.To keep behavior consistent across the plugin, consider matching the priority order used in
submit_string.pyin both places that resolve clusters here:- # Extract optional cluster name from partition config - cluster = None - for key in ("slurm_cluster", "cluster", "clusters"): + # Extract optional cluster name from partition config + cluster = None + # Match submit_string.py: prefer `cluster`, then `clusters`, fallback `slurm_cluster` + for key in ("cluster", "clusters", "slurm_cluster"): if key in partition_config: cluster = partition_config.pop(key) break @@ - job_cluster = ( - job.resources.get("slurm_cluster") - or job.resources.get("cluster") - or job.resources.get("clusters") - ) + # Match submit_string.py key priority for job cluster resolution + job_cluster = ( + job.resources.get("cluster") + or job.resources.get("clusters") + or job.resources.get("slurm_cluster") + )This keeps
Partition.partition_clusterand job-side cluster resolution in sync with the rest of the codebase.Also applies to: 248-248, 272-278
🧹 Nitpick comments (1)
tests/test_partition_selection.py (1)
667-709: Tighten multicluster “no-cluster” tests and avoid mutating the fixture config in placeThe new tests correctly exercise the intended semantics around jobs with and without cluster requirements, but a couple of small adjustments would make them more robust and future-proof:
- Avoid mutating the nested
partitionsdict from the fixture
config = dict(multicluster_partition_config)only shallow-copies the outer dict; adding"no-cluster"mutates the shared innerpartitionsmapping. This is safe with the current function-scoped fixture, but could become brittle if the fixture scope changes or is reused elsewhere.Consider building a fresh config instead:
- config = dict(multicluster_partition_config) - config["partitions"]["no-cluster"] = { + config = { + "partitions": { + **multicluster_partition_config["partitions"], + "no-cluster": { "max_runtime": 360, "max_threads": 32, "max_mem_mb": 64000, - } + }, + } + }Apply the same pattern in
test_job_without_cluster_can_select_partition_without_cluster.
- Strengthen the expectation for jobs without a cluster
Given the strict cluster eligibility implemented in
Partition.score_job_fit(jobs without a cluster requirement should only see partitions without a cluster assignment),test_job_without_cluster_can_select_partition_without_clustercan assert that the job lands specifically on theno-clusterpartition instead of using a broaderin [...]check:- # Should be able to select 'no-cluster' partition - assert selected_partition in ["normal-small", "normal-large", "no-cluster"] + # With strict cluster eligibility, only the unclustered partition is eligible + assert selected_partition == "no-cluster"This will better guard against future regressions that accidentally re-allow clustered partitions for jobs that omit a cluster.
Also applies to: 711-725
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
snakemake_executor_plugin_slurm/partitions.py(4 hunks)tests/test_partition_selection.py(3 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.
Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.
📚 Learning: 2025-03-31T11:52:05.373Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.
Applied to files:
snakemake_executor_plugin_slurm/partitions.py
📚 Learning: 2025-03-10T15:20:51.829Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.
Applied to files:
snakemake_executor_plugin_slurm/partitions.py
📚 Learning: 2024-12-11T14:17:08.749Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.
Applied to files:
snakemake_executor_plugin_slurm/partitions.py
🧬 Code graph analysis (1)
tests/test_partition_selection.py (2)
tests/tests.py (3)
mock_job(226-247)mock_job(457-478)mock_job(795-816)snakemake_executor_plugin_slurm/partitions.py (2)
read_partition_file(14-55)get_best_partition(58-83)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: testing
🔇 Additional comments (1)
tests/test_partition_selection.py (1)
591-596: Good coverage ofpartition_clusterparsingAsserting
partition_clusterfor named partitions via a lookup map avoids order assumptions and directly validates that the partition file parsing honors the configured cluster values. This is a solid sanity check for the new field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
snakemake_executor_plugin_slurm/partitions.py (1)
274-290: Inconsistent cluster key priority order with submit_string.py.The resolution order here (
slurm_cluster→cluster→clustersat lines 274-277) differs fromsubmit_string.py(lines 48-51) which uses (cluster→clusters→slurm_cluster). This was flagged in the previous review and marked as addressed, but the order still differs.The strict eligibility logic (lines 280-290) correctly returns
Nonefor cluster mismatches in both directions, addressing the previous concern about returning0.job_cluster = ( - job.resources.get("slurm_cluster") - or job.resources.get("cluster") + job.resources.get("cluster") or job.resources.get("clusters") + or job.resources.get("slurm_cluster") )
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
snakemake_executor_plugin_slurm/__init__.py(1 hunks)snakemake_executor_plugin_slurm/partitions.py(4 hunks)snakemake_executor_plugin_slurm/submit_string.py(1 hunks)tests/test_partition_selection.py(3 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.
Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.
📚 Learning: 2024-12-11T14:17:08.749Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.pysnakemake_executor_plugin_slurm/__init__.pysnakemake_executor_plugin_slurm/partitions.py
📚 Learning: 2025-03-10T13:34:59.270Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:34:59.270Z
Learning: In the Snakemake executor plugin for SLURM, all GPU resources (even simple numeric ones) should be translated to the `--gpus` flag (plural) to match SLURM's expectations, not `--gpu` (singular).
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-03-10T15:20:51.829Z
Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.pysnakemake_executor_plugin_slurm/__init__.py
📚 Learning: 2025-03-10T13:34:59.270Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:34:59.270Z
Learning: In the Snakemake executor plugin for SLURM, all GPU resources (both simple numeric values and model:number specifications) should be translated to the `--gpus` flag (plural form) to match SLURM's command line interface expectations.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-03-31T11:52:05.373Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.pysnakemake_executor_plugin_slurm/__init__.pysnakemake_executor_plugin_slurm/partitions.py
📚 Learning: 2025-02-18T14:40:27.064Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2025-02-18T14:40:27.064Z
Learning: In the Snakemake executor plugin for SLURM, the GPU resource is specified using the "gpus" (plural) resource name, not "gpu" (singular).
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-01-13T09:54:22.950Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-03-10T13:22:54.081Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:22:54.081Z
Learning: In the Snakemake executor plugin for SLURM, when using GPU resources, the function returns `--gpu=n` (singular) when only a number is provided, but `--gpus=model:n` (plural) when a model is specified in the implementation.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-01-20T09:13:26.443Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
📚 Learning: 2025-05-26T12:22:07.005Z
Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 221
File: snakemake_executor_plugin_slurm/efficiency_report.py:34-49
Timestamp: 2025-05-26T12:22:07.005Z
Learning: In the parse_reqmem function in snakemake_executor_plugin_slurm/efficiency_report.py, the conditional check for per_unit == "/node" is necessary for proper conversion of memory values in multi-node jobs, not redundant as it might initially appear.
Applied to files:
snakemake_executor_plugin_slurm/submit_string.py
🧬 Code graph analysis (2)
snakemake_executor_plugin_slurm/__init__.py (1)
snakemake_executor_plugin_slurm/partitions.py (1)
get_best_partition(58-83)
tests/test_partition_selection.py (2)
tests/tests.py (3)
mock_job(226-247)mock_job(457-478)mock_job(795-816)snakemake_executor_plugin_slurm/partitions.py (2)
read_partition_file(14-55)get_best_partition(58-83)
🔇 Additional comments (8)
snakemake_executor_plugin_slurm/submit_string.py (1)
45-54: Cluster resolution order is now consistent.The key priority order (
cluster→clusters→slurm_cluster) here matches the order used inpartitions.py(lines 274-277), resolving the inconsistency flagged in the previous review. The backwards compatibility comment accurately describes the behavior.snakemake_executor_plugin_slurm/__init__.py (1)
881-907: Cluster-aware partition validation logic looks correct.The logic correctly handles the case where an explicit
slurm_partitionis specified but belongs to a different cluster than requested—it falls back to auto-selection viaget_best_partition. The condition at line 894 also correctly triggers auto-selection when the partition has no cluster defined (partition_cluster is None) but the job requires a specific cluster.tests/test_partition_selection.py (3)
594-595: LGTM!Attribute access correctly updated from
.clusterto.partition_clusterto match the renamed field.
667-711: Good test coverage for cluster eligibility rules.These two new tests effectively validate the strict cluster eligibility logic:
- Jobs with a cluster requirement cannot select partitions without a cluster.
- Jobs without a cluster requirement can select partitions without a cluster.
This aligns well with the implementation in
partitions.pylines 285-290.
713-732: Updated expectation correctly reflects new behavior.The test now expects
Nonewhen all partitions have cluster assignments and the job has no cluster specification. This matches the strict eligibility rule: jobs without a cluster can only select partitions without a cluster.snakemake_executor_plugin_slurm/partitions.py (3)
42-51: Partition config parsing correctly handles cluster keys.The extraction of cluster from partition configuration uses a reasonable priority order. Note that this order (
slurm_cluster→cluster→clusters) is for parsing the YAML config file, which is a separate concern from job resource resolution.
67-68: Score threshold change is correct.Adding
score > 0ensures only partitions with positive scores are considered. Combined with the cluster eligibility changes (returningNonefor mismatches), this properly excludes ineligible partitions.
248-248: LGTM!Field rename from
clustertopartition_clusteris clear and distinguishes the partition's cluster assignment from job cluster requirements.
| job_cluster = ( | ||
| job.resources.get("slurm_cluster") | ||
| or job.resources.get("cluster") | ||
| or job.resources.get("clusters") | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent cluster key priority order with submit_string.py.
The resolution order here (slurm_cluster → cluster → clusters) differs from submit_string.py (lines 48-51) which uses (cluster → clusters → slurm_cluster). If a job specifies multiple cluster keys with different values, partition validation may use a different cluster than what's passed to sbatch.
job_cluster = (
- job.resources.get("slurm_cluster")
- or job.resources.get("cluster")
+ job.resources.get("cluster")
or job.resources.get("clusters")
+ or job.resources.get("slurm_cluster")
)🤖 Prompt for AI Agents
In snakemake_executor_plugin_slurm/__init__.py around lines 875 to 879, the
cluster key lookup currently prefers slurm_cluster before cluster and clusters,
which is inconsistent with submit_string.py; change the resolution order to
match submit_string.py by checking job.resources for "cluster" then "clusters"
then "slurm_cluster" so partition validation uses the same cluster value passed
to sbatch, and update any related comments or docstrings to reflect the
canonical priority.
🤖 I have created a release *beep* *boop* --- ## [2.0.2](v2.0.1...v2.0.2) (2025-12-08) ### Bug Fixes * partition cluster selection ([#385](#385)) ([2863f78](2863f78)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Version 2.0.2 * **Bug Fixes** * Fixed an issue with partition cluster selection <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->
The automated partition selection did not consider a multicluster setup. This might require a partition selection which considers the cluster keyword. Now, the plugin considers
cluster,clustersandslurm_clusterSummary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.