In this section, you will modify the cluster created in Lab I to enable Slurm Accounting resource limits.
Start this lab from the Cloud9 terminal that you use to maintain your AWS ParallelCluster clusters. If you have closed the terminal, return to the Cloud9 console and re-open it by following the instructions under Connect to AWS Cloud9 Instance in the Access Cloud9 Environment section.
The cost control solution requires that you apply a CPU minutes Resource Limit to the cluster’s Slurm scheduler. Resource Limits are used in Slurm to restrict job execution after a resource (CPU, RAM, etc.) usage limit has been reached.
Run the command below to apply the PriorityType and AccountingStorageEnforce settings to the cluster configuration file. yq is used to automate the update of the YAML cluster configuration file.
yq -i '(.Scheduling.SlurmSettings.CustomSlurmSettings[0].PriorityType="priority/multifactor") |
(.Scheduling.SlurmSettings.CustomSlurmSettings[1].AccountingStorageEnforce="limits")' \
~/environment/cluster-config.yaml
If you receive the error bash: yq: command not found, you can install yq on the Cloud9 instance with the command pip3 install yq.
For ParallelCluster versions >= 3.6.0, you can define slurm.conf customizations as part of an AWS ParallelCluster configuration. See the AWS ParallelCluster documentation for instructions.
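Before updating the cluster, it can be worth confirming that both settings actually landed in the configuration file. The following is a minimal sketch and not part of the workshop itself. The helper name check_slurm_settings is hypothetical, and it relies only on grep, so it makes no assumption about which yq variant is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workshop): report whether each expected
# Slurm setting appears, with its expected value, in the given config file.
# Returns 0 if both settings are found, 1 otherwise.
check_slurm_settings() {
  local config_file="$1"
  local missing=0
  local setting
  for setting in 'PriorityType:.*priority/multifactor' 'AccountingStorageEnforce:.*limits'; do
    if grep -Eq "$setting" "$config_file"; then
      # ${setting%%:*} keeps only the key name in front of the first colon.
      echo "found: ${setting%%:*}"
    else
      echo "missing: ${setting%%:*}"
      missing=1
    fi
  done
  return "$missing"
}

# Example call, using the path from the steps above:
# check_slurm_settings ~/environment/cluster-config.yaml
```

If either line reports missing, re-run the yq command above and inspect the file before continuing.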
Run the following command to update the cluster configuration so that an additional IAM policy, which grants access to the AWS Price List service, is applied to the head node.
yq -i '(.HeadNode.Iam.AdditionalIamPolicies[1].Policy="arn:aws:iam::aws:policy/AWSPriceListServiceFullAccess")' \
~/environment/cluster-config.yaml
You can define additional IAM policies for both your head and compute nodes by using the “AdditionalIamPolicies” option within your ParallelCluster configuration file. See the AWS ParallelCluster documentation for details.
The previous steps made the required changes to the configuration file. However, these changes won’t be applied until the cluster is updated. The pcluster update-cluster command below applies the changes in the configuration file to the cluster using the AWS CloudFormation service.
source ~/environment/env_vars
pcluster update-cluster -n hpc --region ${AWS_REGION} -c ~/environment/cluster-config.yaml
You can check the cluster update status using the pcluster describe-cluster command below.
pcluster describe-cluster -n hpc --query clusterStatus --region ${AWS_REGION}
The cluster update will take about 3 minutes. You will know the cluster update is complete when you see an UPDATE_COMPLETE status.
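Rather than re-running the status command by hand, you could poll it in a loop. The sketch below is one possible approach, not part of the workshop. The function name wait_for_cluster_update, the 30-second interval, and the 40-poll cap are all assumptions chosen for illustration, and it treats any status ending in _FAILED as a failure.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workshop): poll the cluster status
# until the update finishes, fails, or the poll limit is reached.
# Arguments: cluster name, region, seconds between polls (default 30),
#            maximum number of polls (default 40).
# Returns 0 on UPDATE_COMPLETE, 1 on a *_FAILED status, 2 on timeout.
wait_for_cluster_update() {
  local cluster_name="$1"
  local region="$2"
  local interval="${3:-30}"
  local max_polls="${4:-40}"
  local status
  local i=1
  while [ "$i" -le "$max_polls" ]; do
    # Same status command as above; tr strips the JSON quotes around the value.
    status=$(pcluster describe-cluster -n "$cluster_name" \
      --query clusterStatus --region "$region" | tr -d '"')
    echo "poll ${i}: ${status}"
    case "$status" in
      UPDATE_COMPLETE) return 0 ;;
      *_FAILED) return 1 ;;
    esac
    sleep "$interval"
    i=$((i + 1))
  done
  echo "gave up after ${max_polls} polls" >&2
  return 2
}
```

After sourcing ~/environment/env_vars, a call such as wait_for_cluster_update hpc "${AWS_REGION}" would block until the status changes, and the return code tells a script whether it is safe to continue.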
You have successfully updated the cluster with the required configuration changes. In the next section, you will create cost controls on the cluster using Slurm Accounting and Resource Limits.