How do I install HTCondor on top of a Bright Cluster?
HTCondor can be installed on top of a Bright Cluster as follows:
On the head node
1. Install the Condor RPM package and its dependencies:
# yum install setools-console-3.3.7-4.el6.x86_64 policycoreutils-python-2.0.83-19.39.el6.x86_64 perl-Date-Manip.noarch
# rpm -ivh condor-8.2.1-256063.rhel6.5.x86_64.rpm
2. Configure the head node as the manager and submit host:
# condor_configure --type=manager,submit --verbose
3. Copy the Condor environment variable scripts to /etc/profile.d/:
# cp /usr/condor.* /etc/profile.d/
4. Edit the configuration file so that the Condor pool can extend beyond a single host. Set ALLOW_WRITE to match all of the hosts in the pool (the value * used here matches any host), then verify the setting:
# cat /etc/condor/condor_config | grep ALLOW_WRITE
ALLOW_WRITE = *
5. Start the Condor service:
# service condor start
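The setting ALLOW_WRITE = * in step 4 accepts write access from any host. A narrower pattern may be preferable on a shared network. The following is a minimal sketch, assuming that all pool members resolve to names in the cm.cluster domain (as the hostnames in this article suggest). The helper name set_allow_write is hypothetical and is not part of Condor.

```shell
#!/bin/bash
# Hypothetical helper: set ALLOW_WRITE in a Condor configuration file.
# Usage: set_allow_write <config-file> <pattern>
# The pattern must not contain the characters | & or \ (they are special to sed).
set_allow_write() {
    local cfg="$1" pattern="$2"
    if grep -q '^[[:space:]]*ALLOW_WRITE[[:space:]]*=' "$cfg"; then
        # An uncommented setting exists: replace it in place (GNU sed).
        sed -i "s|^[[:space:]]*ALLOW_WRITE[[:space:]]*=.*|ALLOW_WRITE = ${pattern}|" "$cfg"
    else
        # No setting yet: append one at the end of the file.
        printf 'ALLOW_WRITE = %s\n' "$pattern" >> "$cfg"
    fi
}

# Example, run as root on the head node (domain name is an assumption):
#   set_allow_write /etc/condor/condor_config '*.cm.cluster'
#   service condor restart
```

The same helper can be pointed at /cm/images/default-image/etc/condor/condor_config so that the head node and the software image stay consistent.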
In the software image
These steps assume that default-image is the software image currently used by the compute nodes.
1. Install the Condor RPM package and its dependencies:
# yum --installroot=/cm/images/default-image/ install setools-console-3.3.7-4.el6.x86_64 policycoreutils-python-2.0.83-19.39.el6.x86_64 perl-Date-Manip.noarch libvirt-client-0.10.2-29.el6_5.2.x86_64
# rpm --root=/cm/images/default-image/ -ivh condor-8.2.1-256063.rhel6.5.x86_64.rpm
2. Configure the nodes as execute hosts by setting CONDOR_HOST and DAEMON_LIST in condor_config.local. After editing, the file should contain:
# cat /cm/images/default-image/etc/condor/condor_config.local
[...]
CONDOR_HOST = master.cm.cluster
[...]
DAEMON_LIST = MASTER, STARTD
3. Edit the configuration file so that the Condor pool can extend beyond a single host. After editing, the file should contain:
# cat /cm/images/default-image/etc/condor/condor_config
[...]
ALLOW_WRITE = *
4. Reboot the compute nodes so that they are provisioned with the modified software image.
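Step 2 above shows the resulting condor_config.local but not how to produce it. One possible way is sketched below, assuming the image path and the head node name master.cm.cluster used in this article. The helper name write_execute_config is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: append execute-host settings to condor_config.local
# inside a software image.
# Usage: write_execute_config <image-root> <condor-host>
write_execute_config() {
    local image="$1" condor_host="$2"
    local cfg="${image}/etc/condor/condor_config.local"
    mkdir -p "$(dirname "$cfg")"
    # Settings are appended; when a macro is defined more than once,
    # Condor uses the last definition it reads.
    cat >> "$cfg" <<EOF
CONDOR_HOST = ${condor_host}
DAEMON_LIST = MASTER, STARTD
EOF
}

# Example, run as root on the head node:
#   write_execute_config /cm/images/default-image master.cm.cluster
```

Appending rather than overwriting keeps any settings that the RPM already placed in the file.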
After the nodes are up, check the pool status from the head node:
# condor_status
Name               OpSys  Arch   State     Activity LoadAv Mem  ActvtyTime
node001.cm.cluster LINUX  X86_64 Unclaimed Idle     0.530  490  0+00:00:02
node002.cm.cluster LINUX  X86_64 Unclaimed Idle     0.210  490  0+00:00:04

             Total Owner Claimed Unclaimed Matched Preempting Backfill
X86_64/LINUX     2     0       0         2       0          0        0
       Total     2     0       0         2       0          0        0
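When checking the status from a script, the totals above can be reproduced by counting entries per state. The following is a minimal sketch, assuming that the installed condor_status supports the -autoformat option. The helper name count_state is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "Name State" lines on stdin and print how many
# entries are in the requested state.
# Usage: ... | count_state <State>
count_state() {
    awk -v want="$1" '$2 == want { n++ } END { print n + 0 }'
}

# Example, run on the head node (assumes condor_status accepts -autoformat):
#   condor_status -autoformat Name State | count_state Unclaimed
```

For the two-node pool shown above, the example would be expected to report the same Unclaimed count as the Total line of condor_status.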
Submitting a job
As with most other workload managers, Condor does not allow jobs to be submitted as root. Switch to any non-root user before submitting a job:
# su - cmsupport
[cmsupport@adel70-c6 ~]$ cat hostname.sh
#!/bin/bash
hostname -f
sleep 20
date
echo "exit"
[cmsupport@adel70-c6 ~]$ cat hostname.condor
############
#
# Example job file
#
############
Universe = vanilla
Executable = hostname.sh
input = /dev/null
output = hostname.out
error = hostname.error
Queue
[cmsupport@adel70-c6 ~]$ condor_submit hostname.condor
[cmsupport@adel70-c6 ~]$ condor_q
-- Submitter: adel70-c6 : <10.150.8.241:40768> : adel70-c6
 ID   OWNER      SUBMITTED   RUN_TIME    ST PRI SIZE CMD
 6.0  cmsupport  8/1 05:08   0+00:00:24  R  0   17.1 hostname.sh
1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
[cmsupport@adel70-c6 ~]$ cat hostname.out
node002.cm.cluster
Fri Aug 1 05:09:22 PDT 2014
exit
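The job file above does not define a log file, so the only way to see when the job has finished is to poll condor_q. The following sketch generates an equivalent job file with a log entry, which allows condor_wait to block until the job completes. The helper name make_submit_file and the derived file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print a vanilla-universe job file for a script,
# including a log file so that condor_wait can follow the job.
# Usage: make_submit_file <script.sh>
make_submit_file() {
    local script="$1" base
    base="$(basename "$script" .sh)"
    cat <<EOF
Universe   = vanilla
Executable = ${script}
input      = /dev/null
output     = ${base}.out
error      = ${base}.error
log        = ${base}.log
Queue
EOF
}

# Example, run as a non-root user:
#   make_submit_file hostname.sh > hostname.condor
#   condor_submit hostname.condor
#   condor_wait hostname.log    # returns once the job in the log has completed
#   cat hostname.out
```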