Commit 9f33cc0

BF: CS-1095 Better explain PE_HOSTFILE (#55)
1 parent e828c79 commit 9f33cc0

1 file changed: +21 -3 lines changed

doc/markdown/man/man5/sge_pe.md

Lines changed: 21 additions & 3 deletions
@@ -64,7 +64,7 @@ xxqs_name_sxx_shepherd(8) prior to executing the job script. Its purpose is to s
 correspondingly to its needs. An optional prefix "user@" specifies the user under which this procedure is to be
 started. The standard output of the start-up procedure is redirected to the file \<REQUEST>.po\<JID> in the job's
 working directory (see qsub(1)), with \<REQUEST> being the name of the job as displayed by qstat(1) and \<JID>
-being the job's identification number. Likewise, the standard error output is redirected to \<REQUEST>.pe\<JID.
+being the job's identification number. Likewise, the standard error output is redirected to \<REQUEST>.pe\<JID>.
 The following special variables being expanded at runtime can be used (besides any other strings which have to
 be interpreted by the start and stop procedures) to constitute a command line:
 
@@ -79,7 +79,25 @@ be interpreted by the start and stop procedures) to constitute a command line:
 the start-up procedure. Each line of the file refers to a host on which parallel processes are to be run. The first
 entry of each line denotes the hostname, the second entry the number of parallel processes to be run on the host,
 the third entry the name of the queue, and the fourth entry a processor range to be used in case of a multiprocessor
-machine.
+machine. The first line in the PE hostfile always refers to the master task host.
+
+Example PE hostfile contents:
+
+```text
+execution-3.us-central1-a.c.internal 32 all.q@execution-3.us-central1-a.c.internal UNDEFINED
+execution-1.us-central1-a.c.internal 32 all.q@execution-1.us-central1-a.c.internal UNDEFINED
+execution-0.us-central1-a.c.internal 32 all.q@execution-0.us-central1-a.c.internal UNDEFINED
+execution-2.us-central1-a.c.internal 32 all.q@execution-2.us-central1-a.c.internal UNDEFINED
+```
+
+Or with 1 slot per host:
+
+```text
+execution-3.us-central1-a.c.internal 1 all.q@execution-3.us-central1-a.c.internal UNDEFINED
+execution-1.us-central1-a.c.internal 1 all.q@execution-1.us-central1-a.c.internal UNDEFINED
+execution-2.us-central1-a.c.internal 1 all.q@execution-2.us-central1-a.c.internal UNDEFINED
+execution-0.us-central1-a.c.internal 1 all.q@execution-0.us-central1-a.c.internal UNDEFINED
+```
 
 * $host
 The name of the host on which the start-up or stop procedures are started.
@@ -184,7 +202,7 @@ The following methods are supported:
 assumed.
 
 * max:
-The of the slot range maximum is used as prospective slot amount. If no upper bound is specified with the range
+The slot range maximum is used as prospective slot amount. If no upper bound is specified with the range
 the absolute maximum possible due to the PE's *slots* setting is assumed.
 
 * avg:
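
The PE hostfile layout documented in this commit (hostname, number of parallel processes, queue name, processor range, with the first line referring to the master task host) lends itself to a small parser. The following is a minimal sketch in Python, not part of the commit: the names `HostfileEntry` and `parse_pe_hostfile` are hypothetical, and the sketch assumes every non-empty line carries exactly four whitespace-separated entries, as in the example contents added by the diff.

```python
from typing import List, NamedTuple


class HostfileEntry(NamedTuple):
    """One line of a PE hostfile (hypothetical helper type)."""
    hostname: str
    slots: int             # number of parallel processes to run on the host
    queue: str             # queue name, e.g. all.q@<hostname>
    processor_range: str   # kept as a raw string, e.g. "UNDEFINED"


def parse_pe_hostfile(text: str) -> List[HostfileEntry]:
    """Parse PE hostfile text into entries, preserving line order.

    Assumption: each non-empty line has exactly four whitespace-separated
    entries, matching the format described in the man page text.
    """
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != 4:
            raise ValueError(f"line {lineno}: expected 4 entries, got {len(fields)}")
        hostname, slots, queue, processor_range = fields
        entries.append(HostfileEntry(hostname, int(slots), queue, processor_range))
    return entries


# Sample taken from the example hostfile contents in the diff.
sample = """\
execution-3.us-central1-a.c.internal 32 all.q@execution-3.us-central1-a.c.internal UNDEFINED
execution-1.us-central1-a.c.internal 32 all.q@execution-1.us-central1-a.c.internal UNDEFINED
execution-0.us-central1-a.c.internal 32 all.q@execution-0.us-central1-a.c.internal UNDEFINED
execution-2.us-central1-a.c.internal 32 all.q@execution-2.us-central1-a.c.internal UNDEFINED
"""

entries = parse_pe_hostfile(sample)
master = entries[0]  # the first line refers to the master task host
total_slots = sum(entry.slots for entry in entries)

print(master.hostname)  # execution-3.us-central1-a.c.internal
print(total_slots)      # 128
```

In a real start-up procedure the text would come from the file whose path is passed via `$pe_hostfile` rather than from an inline sample. Keeping the processor range as an uninterpreted string is a deliberate simplification, since the diff only shows the value `UNDEFINED` for that entry.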
