@@ -106,7 +106,8 @@ The `Workload` object will allow kube-scheduler to be aware that pods are part o
 - Implement the first version of `Workload` API necessary for defining a Gang
 - Ensuring that we can extend `Workload` API in backward compatible way toward north-star API
 - Ensuring that `Workload` API will be usable for both built-in and third-party workload controllers and APIs
-- Implement first version of gang-scheduling in kube-scheduler
+- Implement first version of gang-scheduling in kube-scheduler supporting (potentially in non-optimal way)
+all existing scheduling features.
 - Provide full backward compatibility for all existing scheduling features
 
 ### Non-Goals
@@ -117,6 +118,7 @@ The `Workload` object will allow kube-scheduler to be aware that pods are part o
 
 The following are non-goals for this KEP but will probably soon appear to be goals for follow-up KEPs:
 
+- Integrate cluster autoscaling with gang scheduling.
 - Introduce a concept of `Reservation` that can be later consumed by pods.
 - Workload-level preemption.
 - Address resource contention between different schedulers (including possible deadlocks).
@@ -177,12 +179,11 @@ metadata:
   namespace: ns-1
   name: job-1
 spec:
-  podGroups: # or gangGroups -- TBD
+  podGroups:
     - name: "pg1"
-      gangMode: Single
-      gangSchedulingPolicy:
-        minCount: 100
-        schedulingTimeoutSeconds: 60
+      policy:
+        gang:
+          minCount: 100
 ```
 
 
@@ -223,29 +224,26 @@ usecases. You can read more about it in the [extended proposal] document.
 * `Workload` is the resource Kind.
 * `scheduling` is the ApiGroup.
 * `spec.workload` is the name of the new field in pod.
-* Within a Workload there is a list of groups of pods. Each group represents a top-level division of pods within a Workload. Each group can be independently gang scheduled (or not use gang scheduling). This group is named
-<<[UNRESOLVED community feedback requested]>> `PodGroup` or `GangGroup` for the top level. <<[/UNRESOLVED]>>.
-* In a future , we expect that this group can optionally specify further subdivision into sub groups. Each sub-group can have an index. The indexes go from 0 to N, without repeats or gaps. These subgroups are called
-<<[UNRESOLVED depending on previous unresolved item]>> `PodSubGroup` if `PodGroup` is chosen, or else `RankedGroup` if `GangGroup` is chosen<<[/UNRESOLVED]>>.
-* In subsequent KEPs, we expect that a sub-group can optionally specify further subdivision into pod equivalence classes. All pods in a pod equivalence class have the same values for all fields that affect scheduling feasibility. These pod equivalence classes are called
-<<[UNRESOLVED depending on a previous unresolved item]>> `PodSet` if `PodGroup` is chosen, or else `EqGroup` if `GangGroup` is chosen<<[/UNRESOLVED]>>.
+* Within a Workload there is a list of groups of pods. Each group represents a top-level division of pods within a Workload. Each group can be independently gang scheduled (or not use gang scheduling). This group is named `PodGroup`.
+* In a future , we expect that this group can optionally specify further subdivision into sub groups. Each sub-group can have an index. The indexes go from 0 to N, without repeats or gaps. These subgroups are called `PodSubGroup`.
+* In subsequent KEPs, we expect that a sub-group can optionally specify further subdivision into pod equivalence classes. All pods in a pod equivalence class have the same values for all fields that affect scheduling feasibility. These pod equivalence classes are called `PodSet`.
 
 ### Associating Pod into PodGroups
 
 When a `Workload` consists of a single group of pods needing Gang Scheduling, it is clear which pods belong to the group from the `spec.workload.name` field of the pod. However `Workload` supports listing multiple list items, and a list item can represent a single group, or a set of identical replica groups.
 In these cases, there needs to be additional information to indicate which group a pod belongs to.
 
 We proposed to extend the newly introduced `pod.spec.workload` field with additional information
-to include that information. More specifically, the `pod.spec.workload` field is of type `PodWorkload`
+to include that information. More specifically, the `pod.spec.workload` field is of type `WorkloadReference`
 and is defined as following:
 
 ```go
 // WorkloadReference identifies the Workload object and PodGroup membership
 // that a Pod belongs to. The scheduler uses this information to enforce
 // gang scheduling semantics.
 type WorkloadReference struct {
-	// Workload defines the name of the Workload object this pod belongs to.
-	Workload string
+	// Name defines the name of the Workload object this pod belongs to.
+	Name string
 
 	// PodGroup defines the name of the PodGroup within a Workload this pod belongs to.
 	PodGroup string
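Reviewer sketch, not part of the change: the hunk above renames `Workload` to `Name` in `WorkloadReference`, so it may help to see how a consumer could bucket pods by the group they reference. The code below is a minimal illustration under stated assumptions. `Pod`, `GroupKey`, `GroupPods`, and `GangReady` are hypothetical names that do not appear in the KEP, only the two `WorkloadReference` fields visible in this hunk are modeled, and `minCount` is assumed to mean the minimum number of pods that must be present before a gang is considered.

```go
package main

import "fmt"

// WorkloadReference models only the two fields visible in the hunk above.
type WorkloadReference struct {
	Name     string // name of the Workload object the pod belongs to
	PodGroup string // name of the PodGroup within that Workload
}

// Pod is a hypothetical, stripped-down stand-in for a real Pod object.
type Pod struct {
	Namespace string
	Name      string
	Workload  *WorkloadReference // nil when the pod is not part of a Workload
}

// GroupKey is a hypothetical key identifying one PodGroup in a namespace.
type GroupKey struct {
	Namespace string
	Workload  string
	PodGroup  string
}

// GroupPods buckets pod names by the PodGroup they reference.
// Pods without a WorkloadReference are skipped.
func GroupPods(pods []Pod) map[GroupKey][]string {
	groups := make(map[GroupKey][]string)
	for _, p := range pods {
		if p.Workload == nil {
			continue
		}
		k := GroupKey{
			Namespace: p.Namespace,
			Workload:  p.Workload.Name,
			PodGroup:  p.Workload.PodGroup,
		}
		groups[k] = append(groups[k], p.Name)
	}
	return groups
}

// GangReady reports whether a group has at least minCount members.
// Assumption: minCount is a lower bound on the number of pods in the gang.
func GangReady(members []string, minCount int) bool {
	return len(members) >= minCount
}

func main() {
	pods := []Pod{
		{Namespace: "ns-1", Name: "job-1-0", Workload: &WorkloadReference{Name: "job-1", PodGroup: "pg1"}},
		{Namespace: "ns-1", Name: "job-1-1", Workload: &WorkloadReference{Name: "job-1", PodGroup: "pg1"}},
		{Namespace: "ns-1", Name: "standalone"},
	}
	for key, members := range GroupPods(pods) {
		fmt.Printf("%s/%s/%s: %d pods, ready for minCount=2: %v\n",
			key.Namespace, key.Workload, key.PodGroup, len(members), GangReady(members, 2))
	}
}
```

Keying on namespace, Workload name, and PodGroup name together keeps groups from different Workloads apart even when they reuse a PodGroup name such as `pg1`.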
@@ -259,10 +257,10 @@ type WorkloadReference struct {
 
 At least for Alpha, we start with `WorkloadReference` to be immutable field in the Pod.
 In further phases, we may decide to relax validation and allow for setting some of the fields later.
-Moreover, the visibility into issues (debuggability) will depend on [#5510], but we don't
+Moreover, the visibility into issues (debuggability) will depend on [#5501], but we don't