| Command01_MountHeadNodeNfs | Mounts the Slurm cluster's shared file system at /opt/slurm/{{ClusterName}}. This provides access to the configuration script used in the next step.
| Command02_CreateUsersGroupsJsonConfigure | Creates /opt/slurm/{{ClusterName}}/config/users_groups.json and creates a cron job to refresh it hourly. Also updates /etc/fstab with the mount from the previous step.
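
The stack outputs contain the exact commands to run, with the real head node address and paths filled in. As a rough, hypothetical sketch only (every value below is a placeholder, not the actual output), the mount step amounts to something like:

```
# Hypothetical sketch; run the exact command from the Command01_MountHeadNodeNfs output.
# <head-node-dns> and the exported path are placeholders that the real output fills in.
sudo mkdir -p /opt/slurm/{{ClusterName}}
sudo mount -t nfs <head-node-dns>:/opt/slurm /opt/slurm/{{ClusterName}}
```
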
Before deleting the cluster, you can undo the configuration by running the commands in the following outputs.
| command10_CreateUsersGroupsJsonDeconfigure | Removes the crontab that refreshes users_groups.json.
Now the cluster is ready to be used by sshing into the head node or a login node, if you configured one.
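
For example (both the user name and the address below are placeholders for your own values):

```
# Placeholders only; use your cluster's actual login user and head node address.
ssh <user>@<head-node-address>
sinfo     # list the cluster's partitions and nodes
squeue    # list queued and running jobs
```
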
## Configure submission hosts to use the cluster
**NOTE**: If you are using RES and specify RESEnvironmentName in your configuration, these steps will automatically be done for you on all running DCV desktops.
ParallelCluster was built assuming that users would ssh into the head node or login nodes to execute Slurm commands.
This can be undesirable for a number of reasons.
First, users shouldn't be given ssh access to critical infrastructure like the cluster head node.
Run them in the following order:
| Command01_MountHeadNodeNfs | Mounts the Slurm cluster's shared file system at /opt/slurm/{{ClusterName}}. This provides access to the configuration script used in the next step.
| Command03_SubmitterConfigure | Configures the submission host so it can directly access the Slurm cluster. Also updates /etc/fstab with the mount from the previous step.

The first command simply mounts the head node's NFS file system so you have access to the Slurm commands and configuration.
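
For orientation, the /etc/fstab entry that gets added looks roughly like the following (the values shown are placeholders; the real entry comes from the stack output):

```
# Hypothetical entry; the actual source path and mount options come from the stack output.
<head-node-dns>:/opt/slurm  /opt/slurm/{{ClusterName}}  nfs  defaults  0  0
```
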
The second command runs an Ansible playbook that configures the submission host so that it can run the Slurm commands for the cluster.
It will also compile the Slurm binaries for the OS distribution and CPU architecture of your host.
It also configures the modulefile that sets up the environment to use the Slurm cluster.
**NOTE**: When the new modulefile is created, you need to refresh your shell environment before the modulefile can be used. You can do this by opening a new shell or by sourcing your .profile: `source ~/.profile`.
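
For example, after the playbook finishes (sinfo is just one way to check that the Slurm commands are on your PATH):

```
# Refresh the environment so the new modulefile is found, then load it.
source ~/.profile
module load {{ClusterName}}
sinfo --version    # confirms the Slurm commands are on your PATH
```
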
The clusters have been configured so that a submission host can use more than one cluster by simply changing the modulefile that is loaded.
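
For example, switching a shell from one cluster to another (the cluster names are placeholders for your own {{ClusterName}} values):

```
# Hypothetical cluster names; substitute your own.
module load cluster-a
squeue                   # commands go to cluster-a
module unload cluster-a
module load cluster-b
squeue                   # commands now go to cluster-b
```
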
On the submission host, simply open a new shell and load the modulefile for your cluster; you can then access Slurm.
Then update your aws-eda-slurm-cluster stack by running the install script again.
Run the following command in a shell to configure your environment to use your Slurm cluster.
**NOTE**: When the new modulefile is created, you need to refresh your shell environment before the modulefile can be used. You can do this by opening a new shell or by sourcing your profile: `source ~/.bash_profile`.
```
module load {{ClusterName}}
```
If you want to get a list of all of the clusters that are available, execute the following command.
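
The command itself is in a part of the document not shown here; on hosts that use environment modules, listing the available modulefiles is typically done with:

```
# Assumption: the document's actual command may differ.
module avail
```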