
ID #1311

How do I install Mesos on a Bright Cluster?

The installation of Mesos with the Marathon and Chronos frameworks on a Bright Cluster is described in this article.

 

Upstream documentation for these applications can be found at:

 

The master node has two physical interfaces connected to it, for example:

  • one on an internal network, eth0

  • one on an external network, eth1.

 

The same is true for each compute node that is part of the Mesos cluster. Each compute node has two interfaces connected to it, for example:

  • one on the internal network, bootif

  • one on the external network, eth1

 

The DNS service that serves the external domain must contain an A record for each node (head node and compute nodes) pointing to the IP address of the interface connected to the external network (eth1).
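
Before continuing, it can be worth confirming that every node resolves in the external domain. The following is a minimal sketch rather than part of the original procedure. The domain example.com, the node names, the missing_records helper, and the RESOLVE_CMD override are placeholders chosen for illustration. Note that getent hosts also consults /etc/hosts, so a stricter check would query the external DNS server directly.

```shell
#!/bin/sh
# Sketch: report nodes that cannot be resolved in the external domain.
# Assumptions: example.com and the node names are placeholders;
# RESOLVE_CMD is a hypothetical override so the lookup can be swapped out.
RESOLVE_CMD="${RESOLVE_CMD:-getent hosts}"

missing_records() {
    domain="$1"; shift
    for node in "$@"; do
        # getent hosts exits non-zero when the name cannot be resolved
        if ! $RESOLVE_CMD "$node.$domain" >/dev/null 2>&1; then
            echo "$node.$domain"
        fi
    done
}

# Example invocation (prints one line per unresolvable name):
#   missing_records example.com cluster1 node001 node002
```

An empty result suggests that all listed names resolve from the machine where the check is run.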

 

To create a software image, we can start from a clean “default-image”:

 

cmsh

softwareimage

clone default-image mesos-image

commit

 

We can also create a new category called “mesos”. This category will contain some customizations that are carried out later in this KB article. We set its software image to the one created previously:

 

cmsh
category

clone default mesos

set softwareimage mesos-image

commit

 

Install the required rpms so that the Mesosphere repository can be set up:

 

rpm -Uvh http://repos.mesosphere.com/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm
rpm -Uvh http://repos.mesosphere.com/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm --root=/cm/images/mesos-image

 

Now install the required rpms on the head node and on the software image:

 

yum -y install mesos marathon mesosphere-zookeeper mod_ldap chronos
yum -y install mesos docker --installroot=/cm/images/mesos-image

 

Configure zookeeper:

 

echo 1 > /var/lib/zookeeper/myid
echo zk://10.141.255.254:2181/mesos > /etc/mesos/zk
echo zk://10.141.255.254:2181/mesos > /cm/images/mesos-image/etc/mesos/zk


cat >> /etc/zookeeper/conf/zoo.cfg << __EOF__
server.1=10.141.255.254:2888:3888
__EOF__

 

Configure the Mesos master, replacing <HEAD_NODE_FQDN> with the external fully-qualified domain name of the head node (e.g. cluster1.brightcomputing.com). Also replace <CLUSTER_NAME> with the name of the Mesos cluster (e.g. cluster1):

echo <HEAD_NODE_FQDN>  > /etc/mesos-master/hostname

echo <CLUSTER_NAME> > /etc/mesos-master/cluster

echo 1 > /etc/mesos-master/quorum
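
The quorum value of 1 fits the single master used in this article. As a sketch of the underlying rule, assuming the usual Mesos guidance that the quorum must be a majority of the masters, the hypothetical helper below computes the value for a given number of masters:

```shell
#!/bin/sh
# Sketch: majority quorum for a given number of Mesos masters.
# quorum_for is a hypothetical helper name, not a Mesos command.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

quorum_for 1   # single head node, as in this article; prints 1
```

For example, quorum_for 3 prints 2 and quorum_for 5 prints 3.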

 

Configure the Mesos slave:

 

echo ports:[1-65000] > /cm/images/mesos-image/etc/mesos-slave/resources

echo docker,mesos > /cm/images/mesos-image/etc/mesos-slave/containerizers
echo 5mins > /cm/images/mesos-image/etc/mesos-slave/executor_registration_timeout

 

Set up a finalize script so that the nodes use the public FQDN instead of the private one. Replace <DOMAIN_NAME> with the public domain name managed by the external DNS, e.g. brightcomputing.com:

 

echo 'echo "$CMD_HOSTNAME.<DOMAIN_NAME>" > /localdisk/etc/mesos-slave/hostname' > /tmp/mesos_hostname

cmsh -c "category use mesos; set finalizescript /tmp/mesos_hostname; commit"
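
To see what the finalize script writes without provisioning a node, its single command can be dry-run against a scratch directory. This is only a sketch: node001 and example.com are placeholders, and the temporary directory stands in for /localdisk. On a real node, CMD_HOSTNAME is provided by the environment in which the finalize script runs.

```shell
#!/bin/sh
# Sketch: dry run of the finalize script against a scratch directory.
# Assumptions: node001 and example.com are placeholders; $target stands
# in for /localdisk, where the node's filesystem is mounted during install.
CMD_HOSTNAME=node001
target=$(mktemp -d)
mkdir -p "$target/etc/mesos-slave"
echo "$CMD_HOSTNAME.example.com" > "$target/etc/mesos-slave/hostname"
cat "$target/etc/mesos-slave/hostname"
```

The file then holds the public FQDN that the Mesos slave advertises, here node001.example.com.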

 

Update the exclude list by inserting an entry for the “/etc/mesos-slave/hostname” file:

 

cmsh

category use mesos

set excludelistupdate

 

The above command opens a text editor. Append the following line at the end of the file, then save it, exit the editor, and commit the change:

 

- /etc/mesos-slave/hostname

 

Configure the Marathon service:

 

cat > /etc/systemd/system/marathon.service << __EOF__
[Unit]
Description=Marathon
After=network.target
Wants=network.target

[Service]
EnvironmentFile=-/etc/sysconfig/marathon
ExecStart=/usr/bin/marathon --http_port 9080 --task_launch_timeout 300000 --http_address 127.0.0.1
Restart=always
RestartSec=20

[Install]
WantedBy=multi-user.target
__EOF__

 

Configure the Chronos service:

 

cat > /etc/systemd/system/chronos.service << __EOF__
[Unit]
Description=Chronos
After=network.target
Wants=network.target

[Service]
EnvironmentFile=-/etc/sysconfig/chronos
ExecStart=/usr/bin/chronos --http_address 127.0.0.1
Restart=always
RestartSec=20

[Install]
WantedBy=multi-user.target
__EOF__

 

Reload systemd:

 

systemctl daemon-reload

 

Configure the services for the slave nodes, disabling the Mesos master. Let CMDaemon manage the Mesos slave:

 

chroot /cm/images/mesos-image

systemctl disable mesos-master
systemctl disable mesos-slave

exit
cmsh -c "category use mesos; services; add mesos-slave; set autostart yes; set monitored yes; commit"
cmsh -c "category use mesos; services; add docker; set autostart yes; set monitored yes; commit"

 

Configure the services for the master node:

 

cmsh -c "device use master; services; add mesos-master; set autostart yes; set monitored yes; commit"
cmsh -c "device use master; services; add marathon; set autostart yes; set monitored yes; commit"
cmsh -c "device use master; services; add zookeeper; set autostart yes; set monitored yes; commit"

cmsh -c "device use master; services; add chronos; set autostart yes; set monitored yes; commit"

 

Configure Apache httpd:

 

cat > /etc/httpd/conf.d/marathon.conf << __EOF__
RewriteEngine  on
RewriteRule ^/marathon$ http://%{SERVER_NAME}/marathon/ [R,L]

ProxyRequests Off
ProxyPass /marathon/ http://127.0.0.1:9080/
ProxyPassReverse /marathon/ http://127.0.0.1:9080/

<Location /marathon>
   Order Allow,Deny
   Allow from all
   AuthName "Mesos LDAP Authentication"
   AuthType Basic
   AuthBasicProvider ldap
   AuthLDAPURL ldap://localhost/dc=cm,dc=cluster?uid?sub?(objectClass=*)
   Require ldap-group cn=mesos,ou=Group,dc=cm,dc=cluster
   AuthLDAPGroupAttributeIsDN off
   AuthLDAPGroupAttribute memberUid
</Location>
__EOF__

cat > /etc/httpd/conf.d/mesos.conf << __EOF__
RewriteEngine  on
RewriteRule ^/mesos$ http://%{SERVER_NAME}/mesos/ [R,L]

ProxyRequests Off
ProxyPreserveHost On
ProxyPass /static/ http://127.0.0.1:5050/static/
ProxyPassReverse /static/ http://127.0.0.1:5050/
ProxyPass /mesos/ http://127.0.0.1:5050/
ProxyPassReverse /mesos/ http://127.0.0.1:5050/

<Location /mesos>
   Order Allow,Deny
   Allow from all
   AuthName "Mesos LDAP Authentication"
   AuthType Basic
   AuthBasicProvider ldap
   AuthLDAPURL ldap://localhost/dc=cm,dc=cluster?uid?sub?(objectClass=*)
   Require ldap-group cn=mesos,ou=Group,dc=cm,dc=cluster
   AuthLDAPGroupAttributeIsDN off
   AuthLDAPGroupAttribute memberUid
</Location>
__EOF__

cat > /etc/httpd/conf.d/chronos.conf << __EOF__
RewriteEngine  on
RewriteRule ^/chronos$ http://%{SERVER_NAME}/chronos/ [R,L]

ProxyRequests Off
ProxyPreserveHost On
ProxyPass /chronos/ http://127.0.0.1:4400/
ProxyPassReverse /chronos/ http://127.0.0.1:4400/

<Location /chronos>
   Order Allow,Deny
   Allow from all
   AuthName "Mesos LDAP Authentication"
   AuthType Basic
   AuthBasicProvider ldap
   AuthLDAPURL ldap://localhost/dc=cm,dc=cluster?uid?sub?(objectClass=*)
   Require ldap-group cn=mesos,ou=Group,dc=cm,dc=cluster
   AuthLDAPGroupAttributeIsDN off
   AuthLDAPGroupAttribute memberUid
</Location>
__EOF__


systemctl restart httpd
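
The three <Location> blocks above differ only in their path. As a sketch of how that repetition could be generated, the hypothetical helper ldap_location below prints the block for a given application name. It reproduces the directives used in this article and is not a replacement for the full configuration files, which also contain the rewrite and proxy rules.

```shell
#!/bin/sh
# Sketch: print the LDAP-protected <Location> block for one application.
# ldap_location is a hypothetical helper; the directives are copied from
# the configuration files shown above.
ldap_location() {
    cat << __EOF__
<Location /$1>
   Order Allow,Deny
   Allow from all
   AuthName "Mesos LDAP Authentication"
   AuthType Basic
   AuthBasicProvider ldap
   AuthLDAPURL ldap://localhost/dc=cm,dc=cluster?uid?sub?(objectClass=*)
   Require ldap-group cn=mesos,ou=Group,dc=cm,dc=cluster
   AuthLDAPGroupAttributeIsDN off
   AuthLDAPGroupAttribute memberUid
</Location>
__EOF__
}

ldap_location chronos
```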

 

Set the category “mesos” (for example using a foreach command) on the compute nodes and reboot the mesos nodes:

 

cmsh

device

foreach -n nodeXXX..nodeXXX (set category mesos)

commit

device reboot -c mesos

 

Access the Marathon, Chronos and Mesos web applications using the following addresses (replace <HEADNODE_PUB_FQDN> with the public FQDN of the head node). A basic authentication form will ask for a username and password.

 

http://<HEADNODE_PUB_FQDN>/marathon

http://<HEADNODE_PUB_FQDN>/mesos

http://<HEADNODE_PUB_FQDN>/chronos

 

Create the “mesos” group with the following commands:

 

cmsh

group

add mesos

commit

 

A user can be created and added to the mesos group with the following commands (replace <USERNAME> and <PASSWORD> with a valid username and password):

 

cmsh

user

add <USERNAME>

set password <PASSWORD>

commit

group

append mesos groupmembers <USERNAME>

commit
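
With the group and user in place, the proxied web applications can also be probed from the command line. The sketch below only builds the three URLs. The helper name app_urls and the host name cluster1.example.com are placeholders, and the curl invocation in the comment is a suggestion rather than part of the original procedure.

```shell
#!/bin/sh
# Sketch: list the proxied application URLs for a given head node FQDN.
# app_urls is a hypothetical helper; cluster1.example.com is a placeholder.
app_urls() {
    for app in marathon mesos chronos; do
        echo "http://$1/$app/"
    done
}

# On the cluster, each URL could be probed with, for example:
#   curl -s -o /dev/null -w '%{http_code}\n' -u USERNAME:PASSWORD URL
app_urls cluster1.example.com
```

A 401 response would be expected without credentials, since each location requires membership of the mesos LDAP group.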

 

The following are JSON examples to test the deployment of the container using Marathon:

 

Bridged networking with an HTTP healthcheck exposing a random host port that maps to port 8000 in the container:

 

{  
 "id": "webapp-bridge",
 "cmd": "python3 -m http.server 8000",
 "cpus": 0.5,
 "mem": 64.0,
 "instances": 2,
 "requirePorts": true,
 "container": {
   "type": "DOCKER",
   "docker": {
     "image": "python:3",
     "network": "BRIDGE",
     "portMappings": [
       { "containerPort": 8000, "hostPort": 0, "servicePort": 9001, "protocol": "tcp" }
     ]
   }
 },
 "healthChecks": [
   {   
     "protocol": "HTTP",
     "portIndex": 0,
     "path": "/",
     "gracePeriodSeconds": 5,
     "intervalSeconds": 20,
     "maxConsecutiveFailures": 3
   }
 ]
}

 

Bridged networking with an HTTP healthcheck exposing a fixed host port 8000 that maps to port 8000 in the container:

 

{  
 "id": "webapp-bridge-port-8000",
 "cmd": "python3 -m http.server 8000",
 "cpus": 0.5,
 "mem": 64.0,
 "instances": 2,
 "requirePorts": true,
 "container": {
   "type": "DOCKER",
   "docker": {
     "image": "python:3",
     "network": "BRIDGE",
     "portMappings": [
       { "containerPort": 8000, "hostPort": 8000, "servicePort": 9002, "protocol": "tcp" }
     ]
   }
 },
 "healthChecks": [
   {
     "protocol": "HTTP",
     "portIndex": 0,
     "path": "/",
     "gracePeriodSeconds": 5,
     "intervalSeconds": 20,
     "maxConsecutiveFailures": 3
   }
 ]
}

 

Host-based networking that exposes container port 9090 on the host:

 

{
"id": "test-host-port-9090",
"cmd": "python3 -m http.server 9090",
"cpus": 0.1,
"mem": 128,
"disk": 0,
"instances": 1,
"ports": [9090],
"requirePorts" : true,
"container": {
  "docker": {
    "image": "python:3",
    "network": "HOST"
  },
  "type": "DOCKER",
  "volumes": []
}
}
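
A definition such as the ones above can be submitted to Marathon's REST API from the head node by posting it to /v2/apps. The following is a minimal sketch, assuming that the definition was saved to a file and that Marathon listens on 127.0.0.1:9080 as configured earlier. The file name webapp-bridge.json, the helper submit_app, and the DRY_RUN switch are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: submit an application definition to Marathon's REST API.
# Assumptions: the JSON was saved to a file; Marathon listens on
# 127.0.0.1:9080; DRY_RUN is a hypothetical switch that prints the
# command instead of running it.
MARATHON_URL="${MARATHON_URL:-http://127.0.0.1:9080}"

submit_app() {
    # Build the curl command line from the file name given as $1
    set -- curl -s -X POST "$MARATHON_URL/v2/apps" \
        -H "Content-Type: application/json" -d "@$1"
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command that would be run, without contacting Marathon
DRY_RUN=1 submit_app webapp-bridge.json
```

Running submit_app without DRY_RUN set sends the request. Because Marathon is bound to 127.0.0.1 in this setup, the call has to be made on the head node itself.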

 

The task definitions can be submitted to Marathon in the raw JSON form shown above. It is also possible to fill in a form using a GUI, which is easier and more intuitive. A GUI is available from:

 

http://<HEADNODE_PUB_FQDN>/marathon

 

[Screenshots of the Marathon GUI in action]

 

Configure mesos DNS

 

Configure the BIND DNS server on the head node to forward requests that resolve names in the “mesos” domain to the Mesos DNS listening on port 8053:

 

cat >> /etc/named.conf.include << __EOF__
include "/etc/named.conf.mesos";
__EOF__

cat > /etc/named.conf.mesos << __EOF__
zone "mesos" {
 type forward;
 forward only;
 forwarders { 10.141.255.254 port 8053; };
};
__EOF__

 

Install the mesos DNS:

 

curl -L https://github.com/mesosphere/mesos-dns/releases/download/v0.5.2/mesos-dns-v0.5.2-linux-amd64 -o /usr/bin/mesos-dns

chmod 755 /usr/bin/mesos-dns

 

A mesos DNS configuration file is created:

 

cat > /etc/mesos/mesos-dns-config.json << __EOF__
{
 "zk": "zk://10.141.255.254:2181/mesos",
 "masters": ["10.141.255.254:5050"],
 "refreshSeconds": 60,
 "ttl": 60,
 "domain": "mesos",
 "port": 8053,
 "resolvers": ["10.141.255.254"],
 "timeout": 5,
 "httpon": true,
 "dnson": true,
 "httpport": 8123,
 "externalon": true,
 "listener": "10.141.255.254",
 "SOAMname": "ns1.mesos",
 "SOARname": "root.ns1.mesos",
 "SOARefresh": 60,
 "SOARetry":   600,
 "SOAExpire":  86400,
 "SOAMinttl": 60,
 "IPSources": ["netinfo", "mesos", "host"]
}
__EOF__

 

A systemd unit file is created, and CMDaemon is configured with cmsh to manage the service:

 

cat > /etc/systemd/system/mesos-dns.service << __EOF__
[Unit]
Description=Mesos DNS
After=network.target
Wants=network.target

[Service]
ExecStart=/usr/bin/mesos-dns --config=/etc/mesos/mesos-dns-config.json
Restart=always
RestartSec=20

[Install]
WantedBy=multi-user.target
__EOF__


systemctl daemon-reload

cmsh -c "device use master; services; add mesos-dns; set autostart yes; set monitored yes; commit"

 

Test the DNS:

 

dig leader.mesos

dig _leader._tcp.mesos SRV

 

The following example has an application named “test” deployed using Marathon:

 

{
 "id": "test",
 "cmd": "python3 -m http.server",
 "cpus": 0.1,
 "mem": 128,
 "disk": 0,
 "instances": 1,
 "container": {
   "docker": {
     "image": "python:3",
     "network": "BRIDGE",
     "portMappings": [
       {
         "containerPort": 8000,
         "protocol": "tcp",
         "name": null
       }
     ]
   },
   "type": "DOCKER",
   "volumes": []
 },
 "env": {},
 "labels": {},
 "healthChecks": [
   {
     "protocol": "HTTP",
     "path": "/",
     "portIndex": 0,
     "gracePeriodSeconds": 300,
     "intervalSeconds": 60,
     "timeoutSeconds": 20,
     "maxConsecutiveFailures": 3
   }
 ]
}

 

Several records are added to the DNS automatically:

 

# dig test.marathon.mesos +short
172.17.0.2


# dig test.marathon.slave.mesos +short
10.141.0.3


# dig _test._tcp.marathon.mesos SRV +short
0 0 39205 test-tfc85-s3.marathon.mesos.


# dig test-tfc85-s3.marathon.mesos A +short
172.17.0.2
# dig test-tfc85-s3.marathon.slave.mesos A +short
10.141.0.3
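
The SRV record is the one that carries the host port assigned to the task. As a sketch of how a script might consume it, the hypothetical helper srv_endpoint below turns the short SRV output (priority, weight, port, target) into host:port pairs. The sample line is taken from the output above.

```shell
#!/bin/sh
# Sketch: convert `dig +short SRV` output (priority weight port target)
# into host:port pairs. srv_endpoint is a hypothetical helper name.
srv_endpoint() {
    # Strip the trailing dot from the target, then print target:port
    awk '{ sub(/\.$/, "", $4); print $4 ":" $3 }'
}

# On the cluster this would be:
#   dig _test._tcp.marathon.mesos SRV +short | srv_endpoint
echo "0 0 39205 test-tfc85-s3.marathon.mesos." | srv_endpoint
# prints: test-tfc85-s3.marathon.mesos:39205
```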




Tags: marathon, mesos
