This article is being updated. Please be aware the content herein, not limited to version numbers and slight syntax changes, may not match the output from the most recent versions of Bright. This notation will be removed when the content has been updated.
This article covers Bright versions before v7.3. For v7.3 and later, consult https://kb.brightcomputing.com/knowledge-base/how-to-configure-cluster-extension-to-aws-without-openvpn-7-3/ instead.
This article describes how to configure a system managed with Bright Cluster Manager for bursting to EC2 using so-called “third-party” connectivity methods (e.g. hardware VPN, Amazon Direct Connect). This is an alternative way of performing cloudbursting to VPCs.
The default way to do cloudbursting — described in great detail in the Administrator Manual — is to use OpenVPN over the internet. Using OpenVPN means that no hardware VPN or Amazon Direct Connect is required.
By default, cloudbursting in Bright Cluster Manager establishes an OpenVPN connection between the headnode and the cloud director. This connection is used as a secure communication channel from the headnode to EC2. In addition, OpenVPN connections are established between the cloud compute nodes and the cloud director node. Those connections are used for managing the cloud compute nodes (but not for data transfer between jobs).
Making use of the EC2-VPC platform gives users additional ways of connecting to their resources in EC2. Besides the existing over-the-Internet TCP/IP connection, it is possible to establish a hardware VPN connection with an Amazon VPC gateway, or to have a dedicated communication channel leading to the VPC (Amazon Direct Connect).
When third-party connectivity methods are used, there is typically no need to run an OpenVPN connection on top of them between the headnode and the cloud director. Likewise, with VPCs there is typically no need for OpenVPN communication between the cloud compute nodes and the cloud director, because VPC subnet traffic is isolated from other users, unlike traffic inside the EC2-Classic platform.
What follows are instructions for setting up cloudbursting to EC2 with no OpenVPN set up, i.e. with no tunnel interfaces and no netmap network.
Prerequisites
Bright Cluster Manager 6.1, and CMDaemon binary version 17553, or higher. You can figure out if you have these by running:
[headnode ~]# rpm -q cmdaemon
cmdaemon-6.1-17553_cm6.1.x86_64
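The version check can also be scripted. The following is a minimal sketch, assuming the package string always has the form shown in the example output above. The function names are hypothetical and not part of Bright Cluster Manager, and treating any version newer than 6.1 as sufficient regardless of build number is an assumption.

```python
import re

# Minimum requirement stated in the prerequisites: version 6.1, build 17553.
MIN_VERSION = (6, 1)
MIN_BUILD = 17553


def parse_cmdaemon_package(package):
    """Return ((major, minor), build) parsed from an `rpm -q cmdaemon` line.

    Assumes the format seen in the example output, e.g.
    cmdaemon-6.1-17553_cm6.1.x86_64
    """
    match = re.match(r"^cmdaemon-(\d+)\.(\d+)-(\d+)_", package.strip())
    if match is None:
        raise ValueError("unrecognized package string: %r" % package)
    major, minor, build = (int(group) for group in match.groups())
    return (major, minor), build


def meets_requirement(package):
    """True for 6.1 with build >= 17553, or for any newer version (assumption)."""
    version, build = parse_cmdaemon_package(package)
    if version != MIN_VERSION:
        return version > MIN_VERSION
    return build >= MIN_BUILD


if __name__ == "__main__":
    print(meets_requirement("cmdaemon-6.1-17553_cm6.1.x86_64"))
```

The string passed in the last line is the sample output from above. On a real headnode the output of `rpm -q cmdaemon` would be passed in instead.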
You start with a cluster managed by Bright Cluster Manager, with no cloudbursting facilities configured. That is, no cloud provider account is defined, no cloud nodes are configured, and no tunnel networks or netmap network are defined.
The instructions assume a pre-existing, pre-configured communication channel between the headnode and instances started inside the subnets of the private cloud (e.g. via “Direct Connect”, or an IPsec-based hardware VPN). That is, the following must be true:
- an AWS EC2 account exists for the cloud,
- a Virtual Private Cloud (VPC) has been configured inside the EC2-VPC platform,
- at least one subnet is defined in the VPC,
- VPC routing tables and gateways are configured properly, and their setup allows for communication between the cloud instances and the local cluster,
- at least one VPC security group is configured,
- the existing security groups and network ACL configuration do not restrict any traffic coming from the cluster,
- the product key used is registered with the Bright Computing Customer Portal: http://www.brightcomputing.com/Customer-Login.php
Manually Adding The Cloud Provider Account
Do not use the cmgui wizard, and do not use the cloud-setup script. These would create the netmap network and tunnel interfaces on the existing nodes, which use an OpenVPN-based setup.
Instead, create the cloud provider from scratch, because leftover settings from previous configurations can interfere in odd ways.
Example:
[head]% cloud
[head->cloud]% add ec2provider amazon
[head->cloud*[amazon*]]% set accesskeyid AKIDSCVDJC73DSCASD
[head->cloud*[amazon*]]% set accountid 123456789
[head->cloud*[amazon*]]% set secretaccesskey adF/ds238cvj4/g
[head->cloud*[amazon*]]% set username some.name@example.com
[head->cloud*[amazon*]]% commit
Wait a few minutes for the cluster to pull the information about the region and instance types.
The regions command should produce a list of available regions after a few seconds:
[head->cloud]% regions
Name (key)       Url                                      Zones
---------------- ---------------------------------------- ----------------------------------------
ap-northeast-1   ec2.ap-northeast-1.amazonaws.com         ap-northeast-1a, ap-northeast-1c
ap-southeast-1   ec2.ap-southeast-1.amazonaws.com         ap-southeast-1a, ap-southeast-1b
ap-southeast-2   ec2.ap-southeast-2.amazonaws.com         ap-southeast-2a, ap-southeast-2b
eu-west-1        ec2.eu-west-1.amazonaws.com              eu-west-1a, eu-west-1b, eu-west-1c
sa-east-1        ec2.sa-east-1.amazonaws.com              sa-east-1a, sa-east-1b
us-east-1        ec2.us-east-1.amazonaws.com              us-east-1a, us-east-1b, us-east-1c
us-west-1        ec2.us-west-1.amazonaws.com              us-west-1a, us-west-1b
us-west-2        ec2.us-west-2.amazonaws.com              us-west-2a, us-west-2b, us-west-2c
Getting hold of the available instance types can take longer, perhaps 2 minutes.
The setup of the cloud provider account can then be done:
[head->cloud[amazon]]% set defaultdirectortype m1.medium
[head->cloud*[amazon*]]% set defaulttype t1.micro
[head->cloud*[amazon*]]% set defaultregion us-west-1
[head->cloud*[amazon*]]% commit
The default region does not have to be the region in which the existing VPC is located.
Configure Subnets of the Private Cloud
The next step is to create network objects to represent the existing subnets of your VPC.
E.g., adding a network for one of the subnets:
[head->network]% add vpc-sn1
[head->network*[vpc-sn1*]]% set type cloud
[head->network*[vpc-sn1*]]% set baseaddress 10.220.1.0
[head->network*[vpc-sn1*]]% set netmaskbits 24
[head->network*[vpc-sn1*]]% set broadcastaddress 10.220.1.255
[head->network*[vpc-sn1*]]% set managementallowed yes
[head->network*[vpc-sn1*]]% set domainname cloud.cluster
[head->network*[vpc-sn1*]]% set ec2subnetid subnet-abcd1234
[head->network*[vpc-sn1*]]% set gateway <gateway-ip>
[head->network*[vpc-sn1*]]% commit
The gateway IP address should be the customer VPN gateway. If that is not present, then a headnode-facing IP address of a router standing on the route to EC2 can be used.
The baseaddress is the base address of the VPC subnet. This arbitrary example assumes a VPC with a CIDR address of 10.220.0.0/16, and a subnet of 10.220.1.0/24 inside it.
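The baseaddress, netmaskbits, and broadcastaddress values all follow from the subnet's CIDR notation. As a minimal sketch using Python's standard `ipaddress` module (the helper name `network_settings` is hypothetical), the values for the example subnet can be derived like this:

```python
import ipaddress


def network_settings(subnet_cidr):
    """Return the network object values that follow from a subnet CIDR."""
    # strict=True rejects input with host bits set, e.g. 10.220.1.5/24.
    subnet = ipaddress.ip_network(subnet_cidr, strict=True)
    return {
        "baseaddress": str(subnet.network_address),
        "netmaskbits": subnet.prefixlen,
        "broadcastaddress": str(subnet.broadcast_address),
    }


if __name__ == "__main__":
    # Print the values as cmsh-style "set" lines for the example subnet.
    for key, value in network_settings("10.220.1.0/24").items():
        print("set %s %s" % (key, value))
```

For 10.220.1.0/24 this gives the same three values as the cmsh session above.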
The ‘ec2subnetid’ field is the subnet identifier assigned to the subnet by Amazon, and the administrator should set it. If the field is empty, CMDaemon will attempt to create a new subnet inside your VPC, and it will likely fail because its range will conflict with an existing subnet.
It is only required to represent subnets with network objects for those existing subnets in which you want to be able to start an instance using Bright Cluster Manager. However, it is recommended to define network objects for all subnets actually existing in the private cloud.
If you have multiple subnets in which you want to start instances, you can clone the first network. In the clone, set the ec2subnetid and, where they differ, the base address, the broadcast address, and the number of netmask bits:
[head->network]% clone vpc-sn1 vpc-sn2
[head->network*[vpc-sn2*]]% set baseaddress 10.220.2.0
[head->network*[vpc-sn2*]]% set broadcastaddress 10.220.2.255
[head->network*[vpc-sn2*]]% set ec2subnetid <ec2-subnet-id>
[head->network*[vpc-sn2*]]% commit
Configure the Private Cloud Object
The goal of a privatecloud object is to represent a VPC which has already been created inside your EC2 account. E.g.:
[head->cloud]% use amazon
[head->cloud[amazon]]% privateclouds
[head->cloud[amazon]->privateclouds]% add ec2privatecloud vpc
[head->cloud*[amazon*]->privateclouds*[vpc*]]% set vpcid vpc-da3Ht343s
[head->cloud*[amazon*]->privateclouds*[vpc*]]% set region eu-west-1
[head->cloud*[amazon*]->privateclouds*[vpc*]]% set secgroupd sg-abc1234
[head->cloud*[amazon*]->privateclouds*[vpc*]]% set secgroupn sg-def5678
[head->cloud*[amazon*]->privateclouds*[vpc*]]% set baseaddress 10.220.0.0
[head->cloud*[amazon*]->privateclouds*[vpc*]]% set netmaskbits 16
[head->cloud*[amazon*]->privateclouds*[vpc*]]% set options skipVpcRouteTableSetup dontUpdateSecurityGroups dontCreateSubnetWhenIDIsSet dontRemoveEC2Entities
[head->cloud*[amazon*]->privateclouds*[vpc*]]% set subnets vpc-sn1 vpc-sn2
[head->cloud*[amazon*]->privateclouds*[vpc*]]% commit
The ‘vpcid’ is the AWS-assigned id of the existing VPC which is to be managed via Bright.
The ‘baseaddress’ should be the base address of the entire VPC (i.e. the network IP part of its CIDR).
The ‘secgroupd’ and ‘secgroupn’ properties are the security group IDs of the existing security groups which are to be used for the newly created cloud director and cloud compute node instances, respectively. In principle those two can be the exact same security group, but both fields must be filled in.
As shown in the example above, some special options must be configured for the VPC. It is important that these options are set *before* the private cloud object is committed for the first time.
skipVpcRouteTableSetup: ensures that CMDaemon does not attempt to alter the existing routing tables.
dontUpdateSecurityGroups: ensures that CMDaemon does not attempt to alter the existing security groups.
dontCreateSubnetWhenIDIsSet: new subnets will not be created if an already attached subnet has ‘ec2subnetid’ set.
dontRemoveEC2Entities: when removing the subnets or a VPC from the Bright Cluster Manager configuration, CMDaemon will not attempt to remove the actual VPC, and it will not remove the subnets inside EC2.
The ‘region’ should be the actual EC2 region inside which the existing VPC is located.
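Before committing the private cloud object, it can help to confirm that every network object's range lies inside the VPC range and that no two ranges overlap, since conflicting ranges are a likely cause of failure. The following is a minimal sketch using Python's standard `ipaddress` module. The function name `check_subnets` is hypothetical, and the check only compares the values you pass in; it does not query EC2 or CMDaemon.

```python
import ipaddress
from itertools import combinations


def check_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of problems; an empty list means the layout is consistent.

    subnet_cidrs maps a network object name to its CIDR string.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = {name: ipaddress.ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    problems = []
    # Every subnet must fall inside the VPC range.
    for name, subnet in subnets.items():
        if not subnet.subnet_of(vpc):
            problems.append("%s (%s) is outside the VPC range %s" % (name, subnet, vpc))
    # No two subnets may share addresses.
    for (name_a, net_a), (name_b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append("%s and %s overlap" % (name_a, name_b))
    return problems


if __name__ == "__main__":
    layout = {"vpc-sn1": "10.220.1.0/24", "vpc-sn2": "10.220.2.0/24"}
    for problem in check_subnets("10.220.0.0/16", layout):
        print(problem)
```

With the example values from this article the function returns an empty list, so nothing is printed. `subnet_of` requires Python 3.7 or later.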
Create Cloud Director Category
[head]% category
[head->category]% add director
[head->category*[director*]]% roles
[head->category*[director*]->roles]% list
Name (key)
----------------------------
sgeclient
# Remove the pre-existing role
[head->category*[director*]->roles]% unassign sgeclient
[head->category*[director*]->roles*]% assign storage
[head->category*[director*]->roles[storage*]]% assign provisioning
[head->category*[director*]->roles[provisioning*]]% set localimages default-image
# use the software image of your choice, the one which will be used for cloud compute nodes
[head->category*[director*]->roles[provisioning*]]% commit
[head->category*[director*]->roles[provisioning*]]% ..
[head->category*[director*]->roles]% ..
[head->category*[director*]]% fsexports
[head->category*[director*]->fsexports]% list # if there are /home or /cm/shared exports defined here, remove them
[head->category*[director*]->fsexports]% ..
[head->category*[director*]]% commit
Create The Cloud Director Node
[head]% device
[head->device]% add cloudnode vpc-director
[head->device[vpc-director]]% fsexports
[head->device[vpc-director]->fsexports]% add /home@vpc
[head->device*[vpc-director*]->fsexports*[/home@vpc*]]% set write 1
[head->device*[vpc-director*]->fsexports*[/home@vpc*]]% set hosts 10.220.0.0/16
# the base IP of the whole VPC
[head->device*[vpc-director*]->fsexports*[/home@vpc*]]% set path /home
[head->device*[vpc-director*]->fsexports*[/home@vpc*]]% ..
[head->device*[vpc-director*]->fsexports*]% clone /home@vpc /cm/shared@vpc
[head->device*[vpc-director*]->fsexports*[/cm/shared@vpc*]]%
[head->device*[vpc-director*]->fsexports*[/cm/shared@vpc*]]% set path /cm/shared
[head->device*[vpc-director*]->fsexports*[/cm/shared@vpc*]]% ..
[head->device*[vpc-director*]->fsexports*]% ..
[head->device*[vpc-director*]]% cloudsettings
[head->device*[vpc-director*]->cloudsettings]% set region eu-west-1
# The region of the existing VPC
[head->device*[vpc-director*]]% get disksetup
# validate that the disk setup fits your requirements
# if required, change it. The defaults are sufficient to start
# the cloud director.
[head->device*[vpc-director*]->cloudsettings*]%
[head->device*[vpc-director*]->cloudsettings*]% storage
[head->device*[vpc-director*]->cloudsettings*->storage]% list
Type         Name (key)   Drive        Size             Volume ID
------------ ------------ ------------ ---------------- ----------------
ebs          ebs0         sdb          40GB
# validate that this space is enough relative to the disk setup
[head->device*[vpc-director*]->cloudsettings*]% ..
[head->device*[vpc-director*]]% set managementnetwork vpc-sn1
[head->device*[vpc-director*]]% interfaces
[head->device*[vpc-director*]->interfaces]% list
Type         Network device name  IP               Network
------------ -------------------- ---------------- ----------------
physical     eth0 [prov,dhcp]     0.0.0.0          cloud-ec2classic
tunnel       tun0                 0.0.0.0
[head->device*[vpc-director*]->interfaces]% remove tun0
[head->device*[vpc-director*]->interfaces*]% set eth0 network vpc-sn1
[head->device*[vpc-director*]->interfaces*]% ..
# Note, that the network you assign to the eth0 interface determines
# inside which subnet the cloud node will be created
[head->device*[vpc-director*]]% roles
[head->device*[vpc-director*]->roles]% assign clouddirector
[head->device*[vpc-director*]->roles*[clouddirector*]]% ..
[head->device*[vpc-director*]->roles*]% ..
[head->device*[vpc-director*]]% set category director
[head->device*[vpc-director*]]% commit
[head->device*[vpc-director]->roles]% use clouddirector
[head->device*[vpc-director]->roles[clouddirector]]% show
...
Dependents vpc-director-dependents
...
# The special dependents group visible in the output of the show command
# must be assigned as the node group of the provisioning role
[head->device*[vpc-director]->roles]% assign provisioning
[head->device*[vpc-director*]->roles*[provisioning*]]% set nodegroups vpc-director-dependents
[head->device*[vpc-director*]->roles*[provisioning*]]% ..
[head->device*[vpc-director*]]% commit
# commit the cloud director again
Power On The Cloud Director
[head->device[vpc-director]]% power on
Once the cloud director boots and requests a certificate, you will have to issue the certificate (this step will be automated in the future).
[head->device[vpc-director]]% cert
[head->cert]% listrequests
<request-id>...<instance-id>
[head->cert]% issue <request-id>
Create Cloud Nodes
[head->device]% add cloudnode cnode001
[head->device*[cnode001*]]% interfaces
[head->device*[cnode001*]->interfaces]% remove tun0
[head->device*[cnode001*]->interfaces*]% set eth0 network vpc-sn2
[head->device*[cnode001*]->interfaces*]% ..
[head->device*[cnode001*]]% set managementnetwork vpc-sn2
[head->device*[cnode001*]]% commit
Note that the network you assign to the eth0 interface determines inside which subnet the cloud node will be created.
Once the cloud director has booted and is in the UP state, the cloud node can be powered on.
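When several cloud nodes are needed, the same cmsh commands can be generated rather than typed by hand. The following is a minimal sketch that only produces text. It repeats the commands from the session above for a range of node names, and the function name and the `cnodeNNN` numbering scheme are assumptions. Review the output before entering it in cmsh.

```python
def cloud_node_commands(first, count, network, prefix="cnode"):
    """Return cmsh command lines that create `count` cloud nodes.

    The per-node commands mirror the cnode001 session: add the node,
    remove the tunnel interface, attach eth0 and the management network
    to the given VPC subnet network, then commit.
    """
    lines = ["device"]
    for index in range(first, first + count):
        name = "%s%03d" % (prefix, index)
        lines += [
            "add cloudnode %s" % name,
            "interfaces",
            "remove tun0",
            "set eth0 network %s" % network,
            "..",
            "set managementnetwork %s" % network,
            "commit",
        ]
    return lines


if __name__ == "__main__":
    print("\n".join(cloud_node_commands(1, 3, "vpc-sn2")))
```

All generated nodes use one network here. Nodes that belong in a different subnet need a separate call with that subnet's network name.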
Finalizing setup
Preparing /etc/hosts
It might be necessary to alter the /etc/hosts on the provisioned node’s localdisk and change the IP address of the headnode from the IP on the local network, to the headnode’s IP on the external network. This step is necessary if the cloud nodes get provisioned, but then “fail when switching to local root”.
This can be done by storing a specially prepared copy of /etc/hosts inside the cloud node’s software image, and then copying it to the proper location of /etc/hosts from the node’s finalize script, e.g. via this command:
cp -f /localdisk/etc/hosts.mycopy /localdisk/etc/hosts
The copy should have the headnode’s external IP address (as visible from within the VPC) resolving to the headnode’s hostnames in the first line of the file (i.e. before the CMDaemon-generated section).
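The prepared copy can be produced from an existing hosts file by putting the headnode entry on the first line. The following is a minimal sketch. The function name is hypothetical, and the IP address 192.0.2.10 and the hostnames are placeholders, not values from this article.

```python
def with_headnode_first(hosts_text, external_ip, hostnames):
    """Return hosts_text with the headnode entry as the first line.

    If the first line is already the wanted entry, the text is returned
    unchanged, so running the function twice does not duplicate the line.
    """
    entry = "%s %s" % (external_ip, " ".join(hostnames))
    lines = hosts_text.splitlines()
    if lines and lines[0].split() == entry.split():
        return hosts_text
    return entry + "\n" + hosts_text


if __name__ == "__main__":
    # Placeholder values: replace with the headnode's external IP address
    # (as visible from within the VPC) and its real hostnames.
    sample = "127.0.0.1 localhost\n"
    print(with_headnode_first(sample, "192.0.2.10", ["master", "master.cm.cluster"]), end="")
```

Because the first matching entry in /etc/hosts is typically the one used for resolution, placing the line first lets it take precedence over a later entry for the same hostname.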