On the head node
1. Install the HTCondor packages:
# yum install condor
2. Modify /etc/condor/condor_config so that the following configuration settings are in place:
CONDOR_HOST = master.cm.cluster
ALLOW_READ = *.cm.cluster
ALLOW_WRITE = *.cm.cluster
DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD
3. Remove 00personal_condor.config from /etc/condor/config.d, because it is unnecessary and may conflict with the settings in condor_config:
# rm -f /etc/condor/config.d/00personal_condor.config
4. Enable and start the condor service:
# systemctl enable condor.service
# systemctl start condor.service
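The settings from step 2 can also be applied with a small script instead of editing the file by hand. The following is a minimal sketch and not part of the original procedure: set_condor_setting is a hypothetical helper, it assumes each setting sits on a single line, and it works on a scratch file so that nothing under /etc/condor is changed until the result has been reviewed.

```shell
#!/bin/bash
# Hypothetical helper (not an HTCondor tool): set KEY = VALUE in a config file.
# An existing uncommented assignment is rewritten; otherwise a line is appended.
set_condor_setting() {
    local file=$1 key=$2 value=$3
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        sed -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Scratch file standing in for /etc/condor/condor_config (assumed starting content).
cfg=$(mktemp)
printf 'CONDOR_HOST = $(FULL_HOSTNAME)\nDAEMON_LIST = MASTER\n' > "$cfg"

# Head node values from step 2.
set_condor_setting "$cfg" CONDOR_HOST 'master.cm.cluster'
set_condor_setting "$cfg" ALLOW_READ '*.cm.cluster'
set_condor_setting "$cfg" ALLOW_WRITE '*.cm.cluster'
set_condor_setting "$cfg" DAEMON_LIST 'COLLECTOR, MASTER, NEGOTIATOR, SCHEDD'

cat "$cfg"
```

For the software image the same helper could be pointed at a copy of the image's condor_config, with DAEMON_LIST set to MASTER, STARTD.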
In the software image
This assumes that default-image is the software image currently used by the compute nodes.
1. Install the HTCondor packages:
# yum --installroot=/cm/images/default-image install condor
2. Modify /cm/images/default-image/etc/condor/condor_config so that the following configuration settings are in place:
CONDOR_HOST = master.cm.cluster
ALLOW_READ = *.cm.cluster
ALLOW_WRITE = *.cm.cluster
DAEMON_LIST = MASTER, STARTD
3. Remove 00personal_condor.config from /cm/images/default-image/etc/condor/config.d, because it is unnecessary and may conflict with the settings in condor_config:
# rm -f /cm/images/default-image/etc/condor/config.d/00personal_condor.config
4. Reboot the compute nodes so that they are provisioned with the modified software image.
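The compute nodes locate the central manager through CONDOR_HOST, so the value in the image should match the one on the head node. A rough consistency check might look like the sketch below. get_condor_setting is a hypothetical helper that reads only plain KEY = value lines; it does not expand macros or read config.d, and the two scratch files stand in for the real configuration paths.

```shell
#!/bin/bash
# Hypothetical helper: print the value of the last uncommented "KEY = value" line.
get_condor_setting() {
    local file=$1 key=$2
    sed -n -E "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Scratch files standing in for /etc/condor/condor_config and
# /cm/images/default-image/etc/condor/condor_config.
head_cfg=$(mktemp)
image_cfg=$(mktemp)
printf 'CONDOR_HOST = master.cm.cluster\nDAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD\n' > "$head_cfg"
printf 'CONDOR_HOST = master.cm.cluster\nDAEMON_LIST = MASTER, STARTD\n' > "$image_cfg"

# Compare the central manager name seen by the head node and by the image.
if [ "$(get_condor_setting "$head_cfg" CONDOR_HOST)" = "$(get_condor_setting "$image_cfg" CONDOR_HOST)" ]; then
    result="CONDOR_HOST matches"
else
    result="CONDOR_HOST differs"
fi
echo "$result"
echo "image daemons: $(get_condor_setting "$image_cfg" DAEMON_LIST)"
```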
Add condor service to compute nodes in Bright
# cmsh -c "device foreach -n node001..node003 (services; add condor; set autostart yes; set monitored yes; commit)"
Check status from the head node after the nodes are up
# condor_status
Name               OpSys  Arch   State     Activity LoadAv Mem  ActvtyTime
slot1@node001.cm.c LINUX  X86_64 Unclaimed Idle     0.000  1976 0+00:24:41
slot2@node001.cm.c LINUX  X86_64 Unclaimed Idle     0.000  1976 0+00:25:05
slot1@node002.cm.c LINUX  X86_64 Unclaimed Idle     0.000  1976 0+04:22:46
slot2@node002.cm.c LINUX  X86_64 Unclaimed Idle     0.100  1976 0+15:10:09
slot1@node003.cm.c LINUX  X86_64 Unclaimed Idle     0.010  1976 0+04:45:06
slot2@node003.cm.c LINUX  X86_64 Unclaimed Idle     0.000  1976 0+15:10:10

             Machines Owner Claimed Unclaimed Matched Preempting
X86_64/LINUX        6     0       0         6       0          0
       Total        6     0       0         6       0          0
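For a quick sanity check it can help to count the slots that each node advertises. The sketch below does this by filtering condor_status output with awk. count_slots is a hypothetical helper, and the sample text is a copy of the slot lines from the listing above, so the example runs without a pool.

```shell
#!/bin/bash
# Hypothetical helper: read condor_status-style output on stdin and
# print "<host> <slot count>" for every host that advertises slots.
count_slots() {
    awk '/^slot[0-9_]+@/ { split($1, part, "@"); n[part[2]]++ }
         END { for (host in n) print host, n[host] }' | sort
}

# Slot lines copied from the condor_status listing.
sample='slot1@node001.cm.c LINUX X86_64 Unclaimed Idle 0.000 1976 0+00:24:41
slot2@node001.cm.c LINUX X86_64 Unclaimed Idle 0.000 1976 0+00:25:05
slot1@node002.cm.c LINUX X86_64 Unclaimed Idle 0.000 1976 0+04:22:46
slot2@node002.cm.c LINUX X86_64 Unclaimed Idle 0.100 1976 0+15:10:09
slot1@node003.cm.c LINUX X86_64 Unclaimed Idle 0.010 1976 0+04:45:06
slot2@node003.cm.c LINUX X86_64 Unclaimed Idle 0.000 1976 0+15:10:10'

summary=$(printf '%s\n' "$sample" | count_slots)
echo "$summary"

# On the head node the live output would be piped in instead:
#   condor_status | count_slots
```

A node that is missing from the summary, or that reports fewer slots than expected, is a hint that the condor service did not start correctly on that node.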
Submitting a job
Like most other workload managers, HTCondor does not allow jobs to be submitted as root. Switch to any non-root user (cmsupport in this example) before submitting jobs:
# su - cmsupport
$ cat hostname.sh
#!/bin/bash
hostname -f
sleep 20
date
echo "exit"
$ cat hostname.condor
############
#
# Example job file
#
############
Universe = vanilla
Executable = hostname.sh
input = /dev/null
output = hostname.out
error = hostname.error
Queue
$ condor_submit hostname.condor
$ condor_q
-- Schedd: sme-b80devc7u3.cm.cluster : <10.141.255.254:23723?...
 ID   OWNER      SUBMITTED    RUN_TIME   ST PRI SIZE CMD
 2.0  cmsupport  5/31 17:13   0+00:00:03 R  0   0.0  hostname.sh
1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
$ cat hostname.out
node002.cm.cluster
Wed May 31 17:14:13 CEST 2017
exit
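Polling condor_q works, but if a log entry is added to the submit description file, condor_wait can block until the job has finished. The sketch below is a variation on the example above under some assumptions: the log = hostname.log line and the working directory name are additions, home directories are assumed to be shared with the compute nodes, and the submit step runs only when condor_submit is found on the PATH.

```shell
#!/bin/bash
# Build the example job in a fresh directory under $HOME (assumed to be shared
# with the compute nodes); fall back to the system temp directory otherwise.
workdir=$(mktemp -d "${HOME:-/tmp}/condor-example.XXXXXX" 2>/dev/null) || workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Job script, as in the example.
cat > hostname.sh <<'EOF'
#!/bin/bash
hostname -f
sleep 20
date
echo "exit"
EOF
chmod +x hostname.sh

# Submit description file; the log line is an addition so condor_wait can be used.
cat > hostname.condor <<'EOF'
Universe   = vanilla
Executable = hostname.sh
input      = /dev/null
output     = hostname.out
error      = hostname.error
log        = hostname.log
Queue
EOF

# Submit and wait only where HTCondor is installed; run this as a non-root user.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit hostname.condor && condor_wait hostname.log && cat hostname.out
else
    echo "condor_submit not found; job files written to $workdir"
fi
```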