fix: partition cluster selection #385

cmeesters · 2025-12-08T11:40:15Z

The automated partition selection did not consider a multicluster setup. This might require a partition selection which considers the cluster keyword. Now, the plugin considers cluster , clusters and slurm_cluster

Summary by CodeRabbit

New Features
- Added support for multiple cluster specification formats (cluster, clusters, slurm_cluster) treated equivalently across configurations.
Bug Fixes
- Improved partition selection to enforce strict cluster matching when a cluster is explicitly specified.
Documentation
- Clarified that partition selection can target specific clusters using cluster resource keys.
Tests
- Updated tests to verify cluster constraint handling in partition selection logic.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-08T11:40:25Z

Walkthrough

The PR enhances the SLURM executor plugin's partition selection logic to support cluster-aware handling. The Partition class field cluster is renamed to partition_cluster. Config parsing now supports multiple cluster keys (slurm_cluster, cluster, clusters). Partition scoring enforces strict cluster matching—jobs with a cluster requirement only select partitions with a matching cluster. Cluster resolution is unified in submission string generation for backward compatibility.

Changes

Cohort / File(s)	Summary
Core partition selection logic `snakemake_executor_plugin_slurm/partitions.py`, `snakemake_executor_plugin_slurm/__init__.py`	Partition class field renamed from `cluster` to `partition_cluster`. Config parsing updated to support multiple cluster keys (`slurm_cluster`, `cluster`, `clusters`). Scoring logic enforces strict cluster matching: jobs with a cluster requirement only select partitions with matching `partition_cluster`; jobs without cluster requirement exclude partitions with `partition_cluster`. Partition selection fallback enhanced to validate cluster compatibility.
Submission string generation `snakemake_executor_plugin_slurm/submit_string.py`	Unified cluster key resolution replaces singular/plural handling. Resolves `cluster_val` from `job.resources.cluster`, `job.resources.clusters`, or `job.resources.slurm_cluster` in priority order; appends `--cluster <value>` if resolved. Maintains backward compatibility.
Documentation `docs/further.md`	Added clarification note that partition selection can target specific cluster using any of `cluster`, `clusters`, or `slurm_cluster` resource keys.
Tests `tests/test_partition_selection.py`	Updated attribute access from `.cluster` to `.partition_cluster` in existing tests. Added `test_job_with_cluster_does_not_select_partition_without_cluster` and `test_job_without_cluster_can_select_partition_without_cluster` to verify strict cluster matching behavior. Updated test expectations to reflect new ineligibility rules.

Sequence Diagram

sequenceDiagram
    participant Job as Job Resources
    participant PSel as Partition Selection<br/>(get_best_partition)
    participant Score as Scoring Logic<br/>(score_job_fit)
    participant Partitions as Partition Config

    Job->>PSel: Submit with cluster requirement
    PSel->>Partitions: Load partition config
    
    loop For each partition
        PSel->>Score: Evaluate fit
        alt Job has cluster & Partition has partition_cluster
            Score->>Score: Check: clusters match?
            alt Clusters match
                Score-->>PSel: Score > 0 ✓
            else Clusters don't match
                Score-->>PSel: Score ≤ 0 (ineligible)
            end
        else Job has cluster & Partition lacks partition_cluster
            Score-->>PSel: Score ≤ 0 (ineligible)
        else Job lacks cluster & Partition has partition_cluster
            Score-->>PSel: Score ≤ 0 (ineligible)
        else Neither has cluster constraint
            Score-->>PSel: Score > 0 ✓
        end
    end
    
    PSel->>PSel: Select best partition (score > 0)
    PSel-->>Job: Return partition or fallback

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Cluster matching logic in score_job_fit: Verify strict enforcement of cluster compatibility rules across all code paths and edge cases
Field renaming propagation: Confirm partition_cluster usage is consistent across Partition construction, scoring, and test assertions
Backward compatibility: Validate that all three cluster key variants (cluster, clusters, slurm_cluster) are handled correctly in config parsing and submission string generation
Partition fallback behavior: Ensure cluster-aware validation in get_partition_arg correctly handles explicit partitions vs. auto-selection vs. fallback scenarios
Test coverage: Review new test assertions for cluster mismatch scenarios and ineligibility logic

Possibly related PRs

fix: account lookup / test in multicluster environment #350: Modifies get_partition_arg with partition quoting and safer return behavior, directly related to main PR's partition selection refinements.
feat!: proposal for dynamic partition selection #321: Updates the same partition-selection subsystem, including read_partition_file, get_best_partition, Partition data structure renaming (cluster → partition_cluster), and get_partition_arg behavior.

Suggested reviewers

johanneskoester
cademirch

Poem

🐰 A tale of clusters, partitions, and kin,
Where rabbits now match jobs to their kin,
Cluster-aware hops through scoring so fine,
Three keys in harmony—all will align!
No mismatch shall pass, no stray job shall win. 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix: partition cluster selection' directly and concisely describes the main change: fixing partition selection to properly handle cluster constraints in multicluster setups.
Docstring Coverage	✅ Passed	Docstring coverage is 90.91% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/partition_cluster_selection

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a3e91cd and 77902fb.

📒 Files selected for processing (4)

snakemake_executor_plugin_slurm/__init__.py (1 hunks)
snakemake_executor_plugin_slurm/partitions.py (3 hunks)
snakemake_executor_plugin_slurm/submit_string.py (1 hunks)
snakemake_executor_plugin_slurm/validation.py (1 hunks)

🧰 Additional context used

🧠 Learnings (11)

📓 Common learnings

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.

Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.

📚 Learning: 2024-12-11T14:17:08.749Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py
snakemake_executor_plugin_slurm/__init__.py
snakemake_executor_plugin_slurm/validation.py

📚 Learning: 2025-03-10T13:34:59.270Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:34:59.270Z
Learning: In the Snakemake executor plugin for SLURM, all GPU resources (even simple numeric ones) should be translated to the `--gpus` flag (plural) to match SLURM's expectations, not `--gpu` (singular).

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-03-10T15:20:51.829Z

Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py
snakemake_executor_plugin_slurm/partitions.py
snakemake_executor_plugin_slurm/__init__.py

📚 Learning: 2025-03-10T13:34:59.270Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:34:59.270Z
Learning: In the Snakemake executor plugin for SLURM, all GPU resources (both simple numeric values and model:number specifications) should be translated to the `--gpus` flag (plural form) to match SLURM's command line interface expectations.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-03-31T11:52:05.373Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py
snakemake_executor_plugin_slurm/partitions.py
snakemake_executor_plugin_slurm/__init__.py

📚 Learning: 2025-02-18T14:40:27.064Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2025-02-18T14:40:27.064Z
Learning: In the Snakemake executor plugin for SLURM, the GPU resource is specified using the "gpus" (plural) resource name, not "gpu" (singular).

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-01-13T09:54:22.950Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-01-20T09:13:26.443Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py
snakemake_executor_plugin_slurm/__init__.py

📚 Learning: 2025-03-10T13:22:54.081Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:22:54.081Z
Learning: In the Snakemake executor plugin for SLURM, when using GPU resources, the function returns `--gpu=n` (singular) when only a number is provided, but `--gpus=model:n` (plural) when a model is specified in the implementation.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-09-09T15:55:55.669Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-09-09T15:55:55.669Z
Learning: In the Snakemake SLURM executor plugin, Snakemake submits itself as a job using the --wrap parameter in the sbatch command, rather than creating or submitting a separate job.sh script file. The actual implementation uses `call += f' --wrap="{exec_job}"'` where exec_job is the formatted snakemake execution command.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

🧬 Code graph analysis (1)

snakemake_executor_plugin_slurm/__init__.py (1)

snakemake_executor_plugin_slurm/partitions.py (1)

get_best_partition (58-83)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: testing

🔇 Additional comments (7)

snakemake_executor_plugin_slurm/validation.py (1)

132-139: Good bug fix for the validation logic.

The previous condition > 0 was inverted - it would reject all positive integers. The fix to < 1 now correctly accepts integers ≥ 1 and rejects 0 or negative values, which matches the error message stating "must be a positive integer."

snakemake_executor_plugin_slurm/partitions.py (2)

42-46: LGTM - cluster key extraction from partition config.

The logic correctly extracts cluster from multiple possible keys and removes it from the config before passing to PartitionLimits. The priority order (slurm_cluster → cluster → clusters) is appropriate for partition configuration files.

248-248: Good naming choice.

Renaming from cluster to partition_cluster avoids ambiguity with the job's cluster resource specification.

snakemake_executor_plugin_slurm/submit_string.py (1)

45-53: LGTM - cluster argument construction.

The logic correctly resolves cluster from multiple resource keys and safely quotes the value. The comment documents the backward-compatible priority order.

Note: As mentioned in the partitions.py review, the key resolution order here differs from score_job_fit. Consider aligning them for consistency.

snakemake_executor_plugin_slurm/__init__.py (3)

875-879: Consistent cluster key order with partitions.py.

The resolution order (slurm_cluster → cluster → clusters) matches the order used in score_job_fit, ensuring partition selection and cluster compatibility checks use the same cluster value.

881-894: Well-structured cluster-aware partition validation.

The logic correctly handles the key scenarios:

If partition is in config and belongs to a different cluster → auto-select

If partition is in config and matches cluster (or has no cluster) → use it

If partition isn't in config → trust user's specification

The short-circuit evaluation on line 888 safely handles the case where the requested partition isn't defined in the configuration.

896-898: LGTM - auto-selection fallback.

This correctly triggers partition auto-selection when no partition was specified by the user and a partition configuration is available.

snakemake_executor_plugin_slurm/partitions.py

…luster equal treating

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

snakemake_executor_plugin_slurm/partitions.py (1)

63-68: Confirm intentionality of score > 0 filter for unconstrained-but-compatible partitions

The condition if score is not None and score > 0: at line 65 excludes partitions where the job has no specific resource requirements and all partition limits are effectively unconstrained (all limits are inf). Such partitions would score 0.0 because the scoring algorithm only accumulates points when job_requirement > 0 AND limit is not inf.

This change was introduced in commit 4ca2f13 and represents a behavior shift from the prior version (commit 77902fb), which used if score is not None: and allowed score-0.0 partitions to be auto-selectable.

With the current > 0 filter:

Truly unconstrained-but-compatible partitions (all limits=inf) are dropped and trigger the "No suitable partition found" fallback path

The system returns None and relies on SLURM default partition selection instead

If this exclusion is intentional (i.e., you want auto-selection to require meaningful differentiation among partitions), please document it. If unconstrained partitions should remain auto-selectable, remove the > 0 check or ensure unconstrained-but-compatible partitions receive a small positive baseline score.

Please confirm this behavior change is intentional for deployments using mostly-unbounded partition definitions.

[Also applies to: 290-374]

♻️ Duplicate comments (1)

snakemake_executor_plugin_slurm/partitions.py (1)
41-52: Align cluster key resolution order with submit_string.py to avoid mismatches

Partition and job cluster resolution here uses ("slurm_cluster", "cluster", "clusters"), while submit_string.py (per learnings and current behavior) resolves cluster keys as cluster → clusters → slurm_cluster. If a job or partition config ever sets more than one of these keys with differing values, the partition-selection logic could use a different cluster than the one ultimately passed to sbatch, leading to subtle wrong-cluster scheduling.

To keep behavior consistent across the plugin, consider matching the priority order used in submit_string.py in both places that resolve clusters here:
-        # Extract optional cluster name from partition config
-        cluster = None
-        for key in ("slurm_cluster", "cluster", "clusters"):
+        # Extract optional cluster name from partition config
+        cluster = None
+        # Match submit_string.py: prefer `cluster`, then `clusters`, fallback `slurm_cluster`
+        for key in ("cluster", "clusters", "slurm_cluster"):
             if key in partition_config:
                 cluster = partition_config.pop(key)
                 break
@@
-        job_cluster = (
-            job.resources.get("slurm_cluster")
-            or job.resources.get("cluster")
-            or job.resources.get("clusters")
-        )
+        # Match submit_string.py key priority for job cluster resolution
+        job_cluster = (
+            job.resources.get("cluster")
+            or job.resources.get("clusters")
+            or job.resources.get("slurm_cluster")
+        )
This keeps Partition.partition_cluster and job-side cluster resolution in sync with the rest of the codebase.

Also applies to: 248-248, 272-278

🧹 Nitpick comments (1)

tests/test_partition_selection.py (1)
667-709: Tighten multicluster “no-cluster” tests and avoid mutating the fixture config in place

The new tests correctly exercise the intended semantics around jobs with and without cluster requirements, but a couple of small adjustments would make them more robust and future-proof:

Avoid mutating the nested partitions dict from the fixture

config = dict(multicluster_partition_config) only shallow-copies the outer dict; adding "no-cluster" mutates the shared inner partitions mapping. This is safe with the current function-scoped fixture, but could become brittle if the fixture scope changes or is reused elsewhere.

Consider building a fresh config instead:
-        config = dict(multicluster_partition_config)
-        config["partitions"]["no-cluster"] = {
+        config = {
+            "partitions": {
+                **multicluster_partition_config["partitions"],
+                "no-cluster": {
                     "max_runtime": 360,
                     "max_threads": 32,
                     "max_mem_mb": 64000,
-        }
+                },
+            }
+        }
Apply the same pattern in test_job_without_cluster_can_select_partition_without_cluster.

Strengthen the expectation for jobs without a cluster

Given the strict cluster eligibility implemented in Partition.score_job_fit (jobs without a cluster requirement should only see partitions without a cluster assignment), test_job_without_cluster_can_select_partition_without_cluster can assert that the job lands specifically on the no-cluster partition instead of using a broader in [...] check:
-            # Should be able to select 'no-cluster' partition
-            assert selected_partition in ["normal-small", "normal-large", "no-cluster"]
+            # With strict cluster eligibility, only the unclustered partition is eligible
+            assert selected_partition == "no-cluster"
This will better guard against future regressions that accidentally re-allow clustered partitions for jobs that omit a cluster.

Also applies to: 711-725

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 588f020 and 8000780.

📒 Files selected for processing (2)

snakemake_executor_plugin_slurm/partitions.py (4 hunks)
tests/test_partition_selection.py (3 hunks)

🧰 Additional context used

🧠 Learnings (4)

📓 Common learnings

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.

Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.

📚 Learning: 2025-03-31T11:52:05.373Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.

Applied to files:

snakemake_executor_plugin_slurm/partitions.py

📚 Learning: 2025-03-10T15:20:51.829Z

Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.

Applied to files:

snakemake_executor_plugin_slurm/partitions.py

📚 Learning: 2024-12-11T14:17:08.749Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.

Applied to files:

snakemake_executor_plugin_slurm/partitions.py

🧬 Code graph analysis (1)

tests/test_partition_selection.py (2)

tests/tests.py (3)

mock_job (226-247)

mock_job (457-478)

mock_job (795-816)

snakemake_executor_plugin_slurm/partitions.py (2)

read_partition_file (14-55)

get_best_partition (58-83)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: testing

🔇 Additional comments (1)

tests/test_partition_selection.py (1)

591-596: Good coverage of partition_cluster parsing

Asserting partition_cluster for named partitions via a lookup map avoids order assumptions and directly validates that the partition file parsing honors the configured cluster values. This is a solid sanity check for the new field.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

snakemake_executor_plugin_slurm/partitions.py (1)
274-290: Inconsistent cluster key priority order with submit_string.py.

The resolution order here (slurm_cluster → cluster → clusters at lines 274-277) differs from submit_string.py (lines 48-51) which uses (cluster → clusters → slurm_cluster). This was flagged in the previous review and marked as addressed, but the order still differs.

The strict eligibility logic (lines 280-290) correctly returns None for cluster mismatches in both directions, addressing the previous concern about returning 0.
         job_cluster = (
-            job.resources.get("slurm_cluster")
-            or job.resources.get("cluster")
+            job.resources.get("cluster")
             or job.resources.get("clusters")
+            or job.resources.get("slurm_cluster")
         )

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8000780 and 63961a7.

📒 Files selected for processing (4)

snakemake_executor_plugin_slurm/__init__.py (1 hunks)
snakemake_executor_plugin_slurm/partitions.py (4 hunks)
snakemake_executor_plugin_slurm/submit_string.py (1 hunks)
tests/test_partition_selection.py (3 hunks)

🧰 Additional context used

🧠 Learnings (11)

📓 Common learnings

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.

Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.

📚 Learning: 2024-12-11T14:17:08.749Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 178
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2024-12-11T14:17:08.749Z
Learning: In the `snakemake-executor-plugin-slurm` project, when handling exceptions in `snakemake_executor_plugin_slurm/__init__.py`, prefer concise error messages and avoid unnecessary verbosity or exception chaining when it's not necessary.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py
snakemake_executor_plugin_slurm/__init__.py
snakemake_executor_plugin_slurm/partitions.py

📚 Learning: 2025-03-10T13:34:59.270Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:34:59.270Z
Learning: In the Snakemake executor plugin for SLURM, all GPU resources (even simple numeric ones) should be translated to the `--gpus` flag (plural) to match SLURM's expectations, not `--gpu` (singular).

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-03-10T15:20:51.829Z

Learnt from: johanneskoester
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: docs/further.md:96-96
Timestamp: 2025-03-10T15:20:51.829Z
Learning: PR #173 in snakemake-executor-plugin-slurm implements GPU job support by adding resources: `gres` for generic resource specifications (e.g., 'gpu:1'), `gpu`/`gpus` for specifying GPU counts, and `gpu_model`/`gpu_manufacturer` for specifying GPU types, allowing users to request GPU resources directly rather than having to use slurm_extra.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py
snakemake_executor_plugin_slurm/__init__.py

📚 Learning: 2025-03-10T13:34:59.270Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:34:59.270Z
Learning: In the Snakemake executor plugin for SLURM, all GPU resources (both simple numeric values and model:number specifications) should be translated to the `--gpus` flag (plural form) to match SLURM's command line interface expectations.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-03-31T11:52:05.373Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 249
File: snakemake_executor_plugin_slurm/submit_string.py:21-29
Timestamp: 2025-03-31T11:52:05.373Z
Learning: In the snakemake-executor-plugin-slurm project, account and partition parameters are validated in the submit function before reaching the get_submit_command function. By design, these parameters are either empty strings or already properly formatted with SLURM parameter tags, making additional validation in get_submit_command unnecessary.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py
snakemake_executor_plugin_slurm/__init__.py
snakemake_executor_plugin_slurm/partitions.py

📚 Learning: 2025-02-18T14:40:27.064Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: snakemake_executor_plugin_slurm/__init__.py:0-0
Timestamp: 2025-02-18T14:40:27.064Z
Learning: In the Snakemake executor plugin for SLURM, the GPU resource is specified using the "gpus" (plural) resource name, not "gpu" (singular).

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-01-13T09:54:22.950Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 0
File: :0-0
Timestamp: 2025-01-13T09:54:22.950Z
Learning: PR #173 (adding gres resource specification) depends on PR #28 in snakemake-executor-plugin-slurm-jobstep repository, as changes were required in the cpu-settings function of the jobstep-Executor module.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-03-10T13:22:54.081Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 173
File: tests/tests.py:79-86
Timestamp: 2025-03-10T13:22:54.081Z
Learning: In the Snakemake executor plugin for SLURM, when using GPU resources, the function returns `--gpu=n` (singular) when only a number is provided, but `--gpus=model:n` (plural) when a model is specified in the implementation.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-01-20T09:13:26.443Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 193
File: .github/workflows/post_to_mastodon.yml:26-26
Timestamp: 2025-01-20T09:13:26.443Z
Learning: In the snakemake-executor-plugin-slurm repository, release PRs follow the naming pattern "chore(main): release X.Y.Z" where X.Y.Z is the version number.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

📚 Learning: 2025-05-26T12:22:07.005Z

Learnt from: cmeesters
Repo: snakemake/snakemake-executor-plugin-slurm PR: 221
File: snakemake_executor_plugin_slurm/efficiency_report.py:34-49
Timestamp: 2025-05-26T12:22:07.005Z
Learning: In the parse_reqmem function in snakemake_executor_plugin_slurm/efficiency_report.py, the conditional check for per_unit == "/node" is necessary for proper conversion of memory values in multi-node jobs, not redundant as it might initially appear.

Applied to files:

snakemake_executor_plugin_slurm/submit_string.py

🧬 Code graph analysis (2)

snakemake_executor_plugin_slurm/__init__.py (1)

snakemake_executor_plugin_slurm/partitions.py (1)

get_best_partition (58-83)

tests/test_partition_selection.py (2)

tests/tests.py (3)

mock_job (226-247)

mock_job (457-478)

mock_job (795-816)

snakemake_executor_plugin_slurm/partitions.py (2)

read_partition_file (14-55)

get_best_partition (58-83)

🔇 Additional comments (8)

snakemake_executor_plugin_slurm/submit_string.py (1)

45-54: Cluster resolution order is now consistent.

The key priority order (cluster → clusters → slurm_cluster) here matches the order used in partitions.py (lines 274-277), resolving the inconsistency flagged in the previous review. The backwards compatibility comment accurately describes the behavior.

snakemake_executor_plugin_slurm/__init__.py (1)

881-907: Cluster-aware partition validation logic looks correct.

The logic correctly handles the case where an explicit slurm_partition is specified but belongs to a different cluster than requested—it falls back to auto-selection via get_best_partition. The condition at line 894 also correctly triggers auto-selection when the partition has no cluster defined (partition_cluster is None) but the job requires a specific cluster.

tests/test_partition_selection.py (3)

594-595: LGTM!

Attribute access correctly updated from .cluster to .partition_cluster to match the renamed field.

667-711: Good test coverage for cluster eligibility rules.

These two new tests effectively validate the strict cluster eligibility logic:

Jobs with a cluster requirement cannot select partitions without a cluster.

Jobs without a cluster requirement can select partitions without a cluster.

This aligns well with the implementation in partitions.py lines 285-290.

713-732: Updated expectation correctly reflects new behavior.

The test now expects None when all partitions have cluster assignments and the job has no cluster specification. This matches the strict eligibility rule: jobs without a cluster can only select partitions without a cluster.

snakemake_executor_plugin_slurm/partitions.py (3)

42-51: Partition config parsing correctly handles cluster keys.

The extraction of cluster from partition configuration uses a reasonable priority order. Note that this order (slurm_cluster → cluster → clusters) is for parsing the YAML config file, which is a separate concern from job resource resolution.

67-68: Score threshold change is correct.

Adding score > 0 ensures only partitions with positive scores are considered. Combined with the cluster eligibility changes (returning None for mismatches), this properly excludes ineligible partitions.

248-248: LGTM!

Field rename from cluster to partition_cluster is clear and distinguishes the partition's cluster assignment from job cluster requirements.

coderabbitai · 2025-12-08T15:05:27Z

snakemake_executor_plugin_slurm/__init__.py

+            job_cluster = (
+                job.resources.get("slurm_cluster")
+                or job.resources.get("cluster")
+                or job.resources.get("clusters")
+            )


⚠️ Potential issue | 🟡 Minor

Inconsistent cluster key priority order with submit_string.py.

The resolution order here (slurm_cluster → cluster → clusters) differs from submit_string.py (lines 48-51) which uses (cluster → clusters → slurm_cluster). If a job specifies multiple cluster keys with different values, partition validation may use a different cluster than what's passed to sbatch.

job_cluster = ( - job.resources.get("slurm_cluster") - or job.resources.get("cluster") + job.resources.get("cluster") or job.resources.get("clusters") + or job.resources.get("slurm_cluster") )

🤖 Prompt for AI Agents

In snakemake_executor_plugin_slurm/__init__.py around lines 875 to 879, the cluster key lookup currently prefers slurm_cluster before cluster and clusters, which is inconsistent with submit_string.py; change the resolution order to match submit_string.py by checking job.resources for "cluster" then "clusters" then "slurm_cluster" so partition validation uses the same cluster value passed to sbatch, and update any related comments or docstrings to reflect the canonical priority.

🤖 I have created a release *beep* *boop* --- ## [2.0.2](v2.0.1...v2.0.2) (2025-12-08) ### Bug Fixes * partition cluster selection ([#385](#385)) ([2863f78](2863f78)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).  ## Summary by CodeRabbit ## Version 2.0.2 * **Bug Fixes** * Fixed an issue with partition cluster selection <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub>

cmeesters added 3 commits December 1, 2025 10:27

fix: validator check for integers was broken.

a2273aa

fix: stricter checking

47a5d7c

feat: partition selection with cluster confinment working

77902fb

fix: linting

b7cce6d

coderabbitai bot reviewed Dec 8, 2025

View reviewed changes

snakemake_executor_plugin_slurm/partitions.py Outdated Show resolved Hide resolved

snakemake_executor_plugin_slurm/partitions.py Outdated Show resolved Hide resolved

cmeesters added 3 commits December 8, 2025 14:51

docs: updated documentation with a hint on cluster, clusters, slurm_c…

588f020

…luster equal treating

fix: testing

8000780

fix: formatting

4ca2f13

coderabbitai bot reviewed Dec 8, 2025

View reviewed changes

Merge branch 'main' into fix/partition_cluster_selection

63961a7

coderabbitai bot reviewed Dec 8, 2025

View reviewed changes

cmeesters merged commit 2863f78 into main Dec 8, 2025
6 checks passed

cmeesters deleted the fix/partition_cluster_selection branch December 8, 2025 15:10

snakemake-bot mentioned this pull request Dec 8, 2025

chore(main): release 2.0.2 #386

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: partition cluster selection #385

fix: partition cluster selection #385

cmeesters commented Dec 8, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 8, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Dec 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix: partition cluster selection #385

fix: partition cluster selection #385

Conversation

cmeesters commented Dec 8, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cmeesters commented Dec 8, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 8, 2025 •

edited

Loading