How can I optimize HPC performance using Elastic Fabric Adapter for cloud nodes?

The Amazon EC2 Elastic Fabric Adapter (EFA) is an Elastic Network Adapter (ENA) with additional capabilities that let applications bypass the OS network layer, providing low-latency, reliable transport between nodes in the cloud.

Requirements

  1. Choose a supported instance type (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa.html#efa-instance-types)
  2. Choose a Bright image with CM 9.0-11 or newer for one of the supported Linux distributions (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa.html#efa-amis)

Prepare cloud nodes security group

To use EFA, the security group for cloud nodes must allow all inbound and outbound traffic to and from the security group itself.

To find the ID of the security group that you will need to modify, you can use the following sequence of cmsh commands:

[head42]% cloud
[head42->cloud]% ls
Name     Type  Username  Account ID  Default region  Default type  Default AMI
-------- ----- --------- ----------- --------------- ------------- -------------
amazon   ec2   userfoo   123456789   eu-central-1    m3.medium     brigt-inst...
amazon2  ec2   userbar   321343256   eu-west-1       m3.medium     brigt-inst...
...

[head42->cloud]% use amazon  # Choose amazon provider
[head42->cloud[amazon]]% vpcs

[head42->cloud[amazon]->vpcs]% ls
Name (key)         Region        baseAddress  vpcID        subnets
-----------------  ------------- ------------ ------------ ---------------------
vpc-eu-central-1   eu-central-1  10.42.0.0    vpc-131fb5e+ vpc-eu-central-1-p...
vpc-eu-west-2      eu-west-2     10.42.0.0    vpc-32a5fa0+ vpc-eu-west-2-publ...
...

[head42->cloud[amazon]->vpcs]% get vpc-eu-west-2 secgroupn  # Choose VPC
sg-0b525465c7fe64b7d

Use this ID to locate the security group in the AWS Console, then follow the instructions at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html#efa-start-security to configure it. You will probably only need to add an outbound rule.
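As an alternative to the AWS Console, the same self-referencing rules can be added with the AWS CLI. The following is a minimal sketch and not part of the original procedure: it assumes that the AWS CLI is installed and configured for the account that owns the VPC, and the helper name print_efa_sg_rules is hypothetical. It only prints the commands, so that they can be reviewed before being run by hand.

```shell
#!/bin/bash
# Print (do not execute) AWS CLI commands that allow all traffic
# between members of one security group, as EFA requires.
# print_efa_sg_rules is a hypothetical helper name.
print_efa_sg_rules() {
    local sg_id="$1" region="$2"
    # IpProtocol=-1 means all protocols; the group references itself.
    local perm="IpProtocol=-1,UserIdGroupPairs=[{GroupId=${sg_id}}]"
    local direction
    for direction in ingress egress; do
        echo "aws ec2 authorize-security-group-${direction} --region ${region} --group-id ${sg_id} --ip-permissions '${perm}'"
    done
}

# Example values taken from the cmsh session above.
print_efa_sg_rules sg-0b525465c7fe64b7d eu-west-2
```

If the inbound rule already exists, only the egress command should be needed.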

Prepare software image

Run the following command to open a chroot session in the software image (e.g. default-image) that you are going to use for EFA-enabled cloud nodes:

cm-chroot-sw-img /cm/images/default-image

Then follow steps 3, 4 and 5 of the official documentation at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html#efa-start-enable to:

  1. install the EFA software
  2. disable ptrace protection (if it is enabled by default)
  3. optionally install Intel MPI, if the HPC software you are going to use needs it.

You can keep the aws-efa-installer directory created in step 3 of the official documentation, so that the efa_test.sh script can be used later to test whether EFA works. After the steps are carried out, remember to leave the chroot using the exit command.
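A runtime sysctl change made from within a chroot acts on the running kernel of the head node rather than on the image, so for the ptrace step it is probably the persistent setting that matters. The following is a minimal sketch, assuming the kernel.yama.ptrace_scope setting and the /etc/sysctl.d/10-ptrace.conf file name described in the AWS documentation. The helper name disable_ptrace_protection and its image-root argument are hypothetical.

```shell
#!/bin/bash
# Persistently disable ptrace protection inside a software image.
# disable_ptrace_protection is a hypothetical helper. Its argument is the
# image root, for example /cm/images/default-image when run on the head
# node, or / when already inside the chroot.
disable_ptrace_protection() {
    local root="${1%/}"
    local conf="${root}/etc/sysctl.d/10-ptrace.conf"
    mkdir -p "${root}/etc/sysctl.d"
    # Read by nodes booted from the image, not applied to this machine.
    echo "kernel.yama.ptrace_scope = 0" > "${conf}"
    echo "wrote ${conf}"
}

# Intended use on the head node (left commented out so nothing is changed here):
#   disable_ptrace_protection /cm/images/default-image
```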

NOTE: On Ubuntu, the AWS efa_installer.sh script will try to install the distribution's environment-modules package. This package conflicts with Bright's cm-modules-init-client package. Before running the script, it is advised to patch it by running:

sed -i 's/environment-modules//g' efa_installer.sh
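To see what the substitution does before touching the real script, it can first be tried on a throwaway file. The sketch below uses a made-up one-line sample, because the actual contents of efa_installer.sh differ between releases; the package names pkg-a and pkg-b are placeholders. The -i.bak form keeps a backup copy next to the patched file, which may also be worth doing for the real script.

```shell
#!/bin/bash
# Demonstrate the substitution on a made-up sample, not the real installer.
workdir="$(mktemp -d)"
sample="${workdir}/sample.sh"
# Hypothetical package list line, for illustration only.
echo 'install pkg-a environment-modules pkg-b' > "${sample}"

# Same expression as in the note; -i.bak keeps the original as sample.sh.bak.
sed -i.bak 's/environment-modules//g' "${sample}"

patched="$(cat "${sample}")"
echo "${patched}"
```

The expression removes every occurrence of the string, so it is worth comparing the patched script with its backup (for example with diff) to confirm that nothing unintended was changed.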

Set card type for cloud nodes

For Bright 9.2 and newer, set the cardtype property of the cloud nodes' network interface to specify that instances must use the EFA network interface. This can be done with cmsh commands similar to the following:

[head42]% device
[head42->device]% foreach -c eu-west-2-cloud-node ( interfaces; set eth0 cardtype efa )
[head42->device]% commit

For older Bright versions, set the NetworkInterfaceType=efa tag on the cloud nodes instead. This can be done with cmsh commands similar to the following:

[head42]% device
[head42->device]% foreach -c eu-west-2-cloud-node ( cloudsettings; append tags NetworkInterfaceType=efa )
[head42->device]% commit
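For scripted setups, either variant can be expressed as a single non-interactive cmsh -c invocation. The sketch below is an illustration under assumptions: the helper name efa_cmsh_command is hypothetical, the version is passed in by hand rather than detected, and the interface name eth0 and the category name are taken from the sessions above. It only prints the one-liner for review and does not run cmsh.

```shell
#!/bin/bash
# Print the cmsh one-liner that marks eth0 of every node in a category as EFA.
# efa_cmsh_command is a hypothetical helper; nothing is executed here.
efa_cmsh_command() {
    local version="$1" category="$2"
    local oldest
    # sort -V orders version strings; if 9.2 sorts first, version >= 9.2.
    oldest="$(printf '%s\n' "9.2" "${version}" | sort -V | head -n 1)"
    if [ "${oldest}" = "9.2" ]; then
        echo "cmsh -c \"device; foreach -c ${category} (interfaces; set eth0 cardtype efa); commit\""
    else
        echo "cmsh -c \"device; foreach -c ${category} (cloudsettings; append tags NetworkInterfaceType=efa); commit\""
    fi
}

efa_cmsh_command 9.2 eu-west-2-cloud-node
efa_cmsh_command 9.0 eu-west-2-cloud-node
```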

Verify EFA works

To make sure that EFA works, start a cloud node that you configured to use EFA.

Running the command fi_info -p efa on that node confirms that the EFA software components were installed successfully (details can be found in step 3 of the official documentation at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html#efa-start-enable).
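When this check is to be scripted, the output of fi_info can be tested for the EFA provider. This is a minimal sketch, assuming that the output contains a line of the form "provider: efa" when the EFA provider is available; the helper name has_efa_provider is hypothetical.

```shell
#!/bin/bash
# Succeed if the fi_info output read from stdin lists the EFA provider.
# has_efa_provider is a hypothetical helper name.
has_efa_provider() {
    grep -Eq '^[[:space:]]*provider:[[:space:]]*efa'
}

# Intended use on a cloud node (left commented out so nothing runs here):
#   fi_info -p efa | has_efa_provider && echo "EFA provider found"
```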

To test EFA itself, you can run the test script efa_test.sh from the aws-efa-installer directory (see “Prepare software image”).

If you prefer not to keep the EFA installer in the software image, you can repeat step 3 on the node to download and extract it there, and then run the test.

If EFA has been configured properly, then the output of the efa_test.sh command will look something like this:

[root@eu-west-2-cnode001 ~]# cd aws-efa-installer/
[root@eu-west-2-cnode001 aws-efa-installer]# ./efa_test.sh
Starting server...
Starting client...
bytes   #sent   #ack     total       time     MB/sec    usec/xfer   Mxfers/sec
64      10      =10      1.2k        0.03s      0.05    1251.20       0.00
256     10      =10      5k          0.00s     12.96      19.75       0.05
1k      10      =10      20k         0.00s     59.36      17.25       0.06
4k      10      =10      80k         0.00s    218.45      18.75       0.05
64k     10      =10      1.2m        0.00s    869.75      75.35       0.01
1m      10      =10      20m         0.01s   2001.29     523.95       0.00
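The table can also be checked automatically. The sketch below assumes, based on the sample output above, that the #ack column is printed as =N when all N sent transfers were acknowledged. The helper name check_efa_acks is hypothetical. It reads the output on stdin and returns a non-zero exit status if no data rows are found or if any row has a mismatch.

```shell
#!/bin/bash
# Check efa_test.sh output read from stdin: every data row must show the
# acknowledged count (third column, written as =N) equal to the sent count.
# check_efa_acks is a hypothetical helper name.
check_efa_acks() {
    awk '
        # Data rows have a purely numeric sent count in column 2.
        $2 ~ /^[0-9]+$/ {
            rows++
            if ($3 != ("=" $2)) bad++
        }
        END {
            printf "%d rows checked, %d mismatched\n", rows, bad
            exit (rows == 0 || bad > 0)
        }
    '
}

# Intended use on a cloud node (left commented out so nothing runs here):
#   ./efa_test.sh | check_efa_acks
```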

Updated on November 15, 2022
