How do I install HTCondor from EPEL repository on Bright?

On the head node

1. Install the HTCondor packages:

# yum install condor

2. Modify /etc/condor/condor_config so that the following configuration settings are in place:

CONDOR_HOST = master.cm.cluster
ALLOW_READ = *.cm.cluster
ALLOW_WRITE = *.cm.cluster
DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD

3. Remove 00personal_condor.config from /etc/condor/config.d. It is unnecessary and may conflict with the settings in condor_config:

# rm -f /etc/condor/config.d/00personal_condor.config

4. Enable and start the condor service:

# systemctl enable condor.service
# systemctl start condor.service
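
The edits from step 2 can also be scripted. The following is a minimal sketch and not part of the original procedure: set_knob and configure_head_node are hypothetical helper names, and the sketch assumes that each setting appears as a plain "NAME = value" line. The configuration file is passed as an argument, so the helper can be tried on a copy before it is pointed at /etc/condor/condor_config.

```shell
#!/bin/bash
# set_knob FILE NAME VALUE
# Hypothetical helper: make FILE assign VALUE to NAME.
# Existing uncommented "NAME = ..." lines are rewritten; if there is none,
# a new line is appended to the end of the file.
set_knob() {
    local file=$1 name=$2 value=$3
    local escaped tmp
    if grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$file"; then
        # Escape characters that are special in the sed replacement text.
        escaped=$(printf '%s' "$value" | sed -e 's/[\\&|]/\\&/g')
        tmp=$(mktemp)
        sed -E "s|^[[:space:]]*${name}[[:space:]]*=.*|${name} = ${escaped}|" "$file" > "$tmp"
        # Write back through cat so the file keeps its owner and permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s = %s\n' "$name" "$value" >> "$file"
    fi
}

# Apply the head node settings from step 2 to the given file.
configure_head_node() {
    local file=$1
    set_knob "$file" CONDOR_HOST "master.cm.cluster"
    set_knob "$file" ALLOW_READ  "*.cm.cluster"
    set_knob "$file" ALLOW_WRITE "*.cm.cluster"
    set_knob "$file" DAEMON_LIST "COLLECTOR, MASTER, NEGOTIATOR, SCHEDD"
}
```

Running configure_head_node /etc/condor/condor_config as root would apply the four settings. Because existing lines are rewritten rather than duplicated, running it a second time leaves the file unchanged.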

In the software image

This section assumes that default-image is the software image currently used by the compute nodes.

1. Install the HTCondor packages:

# yum --installroot=/cm/images/default-image install condor

2. Modify /cm/images/default-image/etc/condor/condor_config so that the following configuration settings are in place:

CONDOR_HOST = master.cm.cluster
ALLOW_READ = *.cm.cluster
ALLOW_WRITE = *.cm.cluster
DAEMON_LIST = MASTER, STARTD

3. Remove 00personal_condor.config from /cm/images/default-image/etc/condor/config.d. It is unnecessary and may conflict with the settings in condor_config:

# rm -f /cm/images/default-image/etc/condor/config.d/00personal_condor.config

4. Reboot the compute nodes so that they are provisioned with the modified software image.
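
Before rebooting in step 4, it can be worth checking that the image contains what steps 2 and 3 describe. The following is a rough sketch under stated assumptions: check_image is a hypothetical helper, and it only inspects condor_config itself, so it would not notice a setting overridden by another file in config.d.

```shell
#!/bin/bash
# check_image IMAGE_ROOT
# Hypothetical pre-reboot check: reports problems with the HTCondor
# configuration under IMAGE_ROOT and returns non-zero if any are found.
check_image() {
    local root=$1 rc=0
    local conf="$root/etc/condor/condor_config"
    local personal="$root/etc/condor/config.d/00personal_condor.config"
    local sp='[[:space:]]*'

    # Step 3: the personal condor configuration must be gone.
    if [ -e "$personal" ]; then
        echo "still present: $personal"
        rc=1
    fi
    # Step 2: compute nodes run only the MASTER and STARTD daemons.
    if ! grep -Eq "^${sp}DAEMON_LIST${sp}=${sp}MASTER,${sp}STARTD${sp}\$" "$conf" 2>/dev/null; then
        echo "DAEMON_LIST is not 'MASTER, STARTD' in $conf"
        rc=1
    fi
    # Step 2: the nodes must point at the head node.
    if ! grep -Eq "^${sp}CONDOR_HOST${sp}=${sp}master\.cm\.cluster${sp}\$" "$conf" 2>/dev/null; then
        echo "CONDOR_HOST is not master.cm.cluster in $conf"
        rc=1
    fi
    return "$rc"
}
```

For the image used in this article, the call would be check_image /cm/images/default-image. No output and a zero exit status mean the three checks passed.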

Add the condor service to the compute nodes in Bright

# cmsh -c "device foreach -n node001..node003 (services; add condor; set autostart yes; set monitored yes; commit)"

Check the status from the head node after the nodes are up

# condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@node001.cm.c LINUX      X86_64 Unclaimed Idle      0.000 1976  0+00:24:41
slot2@node001.cm.c LINUX      X86_64 Unclaimed Idle      0.000 1976  0+00:25:05
slot1@node002.cm.c LINUX      X86_64 Unclaimed Idle      0.000 1976  0+04:22:46
slot2@node002.cm.c LINUX      X86_64 Unclaimed Idle      0.100 1976  0+15:10:09
slot1@node003.cm.c LINUX      X86_64 Unclaimed Idle      0.010 1976  0+04:45:06
slot2@node003.cm.c LINUX      X86_64 Unclaimed Idle      0.000 1976  0+15:10:10
                     Machines Owner Claimed Unclaimed Matched Preempting

        X86_64/LINUX        6     0       0         6       0          0

               Total        6     0       0         6       0          0
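
For a quick scripted check, the slot lines in this output can be counted by state. The following is a minimal sketch, assuming the default column layout shown above (state in the fourth column) and slot names of the form slotN@host. count_slots is a hypothetical helper name.

```shell
#!/bin/bash
# count_slots STATE
# Reads condor_status output on stdin and prints how many slot lines
# report the given state in the 4th column, e.g. Unclaimed or Claimed.
count_slots() {
    awk -v state="$1" '$1 ~ /^slot[0-9]+@/ && $4 == state { n++ } END { print n + 0 }'
}

# Usage on the head node (requires a running pool):
#   condor_status | count_slots Unclaimed
```

The header and summary lines are skipped because their first field does not start with a slot name.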

Submitting a job

Like most other workload managers, HTCondor does not allow jobs to be submitted as root, so switch to a regular user before submitting a job.

# su - cmsupport
$ cat hostname.sh
#!/bin/bash
hostname -f
sleep 20
date
echo "exit"
$ cat hostname.condor
############
#
# Example job file
#
############
Universe       = vanilla
Executable     = hostname.sh
input   = /dev/null
output  = hostname.out
error   = hostname.error
Queue
$ condor_submit hostname.condor
$ condor_q
-- Schedd: sme-b80devc7u3.cm.cluster : <10.141.255.254:23723?...
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
   2.0   cmsupport       5/31 17:13   0+00:00:03 R  0   0.0  hostname.sh
1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
$ cat hostname.out
node002.cm.cluster
Wed May 31 17:14:13 CEST 2017
exit
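
The session above checks the queue by hand and then reads the output file. As a variation, the submit description can include a log file, which lets condor_wait block until the job has finished. The following is a minimal sketch: write_submit_file is a hypothetical helper, and the log line is an addition that is not in the example job file above.

```shell
#!/bin/bash
# write_submit_file SCRIPT
# Hypothetical helper: prints a vanilla-universe submit description for
# SCRIPT, naming the output, error and log files after the script's base name.
write_submit_file() {
    local script=$1
    local base
    base=$(basename "$script" .sh)
    cat <<EOF
Universe   = vanilla
Executable = $script
input      = /dev/null
output     = $base.out
error      = $base.error
log        = $base.log
Queue
EOF
}

# Usage as a regular user on the head node:
#   write_submit_file hostname.sh > hostname.condor
#   condor_submit hostname.condor
#   condor_wait hostname.log     # blocks until the job has finished
#   cat hostname.out
```
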
Updated on November 2, 2020
