This article is being updated. Please be aware that the content herein, including but not limited to version numbers and slight syntax changes, may not match the output from the most recent versions of Bright. This notice will be removed when the content has been updated.
HTCondor can be installed on top of a Bright Cluster as follows:
On the head node
1. Install the Condor RPM package and its dependencies:
# yum install setools-console-3.3.7-4.el6.x86_64 policycoreutils-python-2.0.83-19.39.el6.x86_64 perl-Date-Manip.noarch
# rpm -ivh condor-8.2.1-256063.rhel6.5.x86_64.rpm
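To confirm that the package registered correctly, the RPM database can be queried (an optional sanity check); the output should resemble the package name of the installed RPM:
# rpm -q condor
condor-8.2.1-256063.rhel6.5.x86_64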
2. Configure the head node to be the manager and submit host:
# condor_configure --type=manager,submit --verbose
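The resulting daemon configuration can be checked with condor_config_val; on a manager/submit host the daemon list would typically include MASTER, COLLECTOR, NEGOTIATOR and SCHEDD (an optional verification step):
# condor_config_val DAEMON_LIST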
3. Copy the Condor environment variable scripts to the system profile directory:
# cp /usr/condor.* /etc/profile.d/
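To make the Condor environment available in the current shell without logging out and back in, the bash profile script can be sourced directly (this assumes the copied file is named condor.sh):
# . /etc/profile.d/condor.sh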
4. Modify the configuration file to expand the Condor pool beyond a single host by setting ALLOW_WRITE to match all of the hosts:
# cat /etc/condor/condor_config | grep ALLOW_WRITE
ALLOW_WRITE = *
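If the line still contains its default value, one way to change it non-interactively is with sed (a sketch only; it assumes an uncommented ALLOW_WRITE line is already present in the file):
# sed -i 's/^ALLOW_WRITE = .*/ALLOW_WRITE = */' /etc/condor/condor_config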
5. Start the Condor service:
# service condor start
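Optionally, the service can also be enabled so that it comes back up automatically after a head node reboot:
# chkconfig condor on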
In the software image
This assumes that default-image is the image currently used by the compute nodes.
1. Install the Condor RPM package and its dependencies:
# yum --installroot=/cm/images/default-image/ install setools-console-3.3.7-4.el6.x86_64 policycoreutils-python-2.0.83-19.39.el6.x86_64 perl-Date-Manip.noarch libvirt-client-0.10.2-29.el6_5.2.x86_64
# rpm --root=/cm/images/default-image/ -ivh condor-8.2.1-256063.rhel6.5.x86_64.rpm
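As on the head node, the installation can be verified against the RPM database inside the image:
# rpm --root=/cm/images/default-image/ -q condor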
2. Configure the nodes to be execute hosts:
# cat /cm/images/default-image/etc/condor/condor_config.local
[...]
CONDOR_HOST = master.cm.cluster
[...]
DAEMON_LIST = MASTER, STARTD
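If these two lines are not yet present, they can be appended to the local configuration file in the image, for example with a here-document (adjust CONDOR_HOST if the head node has a different hostname):
# cat >> /cm/images/default-image/etc/condor/condor_config.local <<EOF
CONDOR_HOST = master.cm.cluster
DAEMON_LIST = MASTER, STARTD
EOF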
3. Modify the configuration file in the image to expand the Condor pool beyond a single host:
# cat /cm/images/default-image/etc/condor/condor_config
[...]
ALLOW_WRITE = *
4. Reboot the compute nodes so that they are provisioned with the modified software image (one way to do this is shown below).
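The nodes can be rebooted from the head node through cmsh. The following is only a sketch; it assumes the compute nodes are in the default node category, and the exact cmsh syntax can vary between Bright versions:
# cmsh -c "device; reboot -c default"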
Check the status from the head node after the nodes are up:
# condor_status
Name OpSys Arch State Activity LoadAv Mem ActvtyTime
node001.cm.cluster LINUX X86_64 Unclaimed Idle 0.530 490 0+00:00:02
node002.cm.cluster LINUX X86_64 Unclaimed Idle 0.210 490 0+00:00:04
Total Owner Claimed Unclaimed Matched Preempting Backfill
X86_64/LINUX 2 0 0 2 0 0 0
Total 2 0 0 2 0 0 0
Submitting a job
As with most other workload managers, Condor does not allow jobs to be submitted as root, so switch to a non-root user before submitting:
# su - cmsupport
[cmsupport@adel70-c6 ~]$ cat hostname.sh
#!/bin/bash
hostname -f
sleep 20
date
echo "exit"
[cmsupport@adel70-c6 ~]$ cat hostname.condor
############
#
# Example job file
#
############
Universe = vanilla
Executable = hostname.sh
input = /dev/null
output = hostname.out
error = hostname.error
Queue
[cmsupport@adel70-c6 ~]$ condor_submit hostname.condor
[cmsupport@adel70-c6 ~]$ condor_q
-- Submitter: adel70-c6 : <10.150.8.241:40768> : adel70-c6
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
6.0 cmsupport 8/1 05:08 0+00:00:24 R 0 17.1 hostname.sh
1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
[cmsupport@adel70-c6 ~]$ cat hostname.out
node002.cm.cluster
Fri Aug 1 05:09:22 PDT 2014
exit
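After the job finishes it disappears from the condor_q listing, but its record can still be looked up with condor_history (an optional follow-up, using the job ID from the condor_q output above); a job that is still queued or running could instead be cancelled with condor_rm:
[cmsupport@adel70-c6 ~]$ condor_history 6.0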