ID #1359

How do I install HTCondor from EPEL repository on Bright?


On the head node


1. Install the HTCondor packages (this assumes the EPEL repository is already enabled on the head node):

# yum install condor

2. Modify /etc/condor/condor_config so that the following configuration settings are in place:

CONDOR_HOST = master.cm.cluster

ALLOW_READ = *.cm.cluster

ALLOW_WRITE = *.cm.cluster

DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD


3. Remove 00personal_condor.config from /etc/condor/config.d, because it is unnecessary and may conflict with the settings in condor_config:


# rm -f /etc/condor/config.d/00personal_condor.config


4. Enable and start the condor service:

# systemctl enable condor.service

# systemctl start condor.service
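Step 2 above can also be scripted rather than edited by hand. The following is a minimal sketch, not part of the original procedure: set_knob and configure_head_node are hypothetical helper names, and the sketch assumes GNU sed and that each setting sits on a single line of the file.

```shell
#!/bin/bash
# Set KEY = VALUE in an HTCondor configuration file: rewrite an existing
# uncommented assignment of KEY, or append one if none is present.
# set_knob is a hypothetical helper, not an HTCondor tool. Values that
# contain '|', '&' or '\' would need escaping before being passed to sed.
set_knob() {
    local file=$1 key=$2 value=$3
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Rewrite every existing assignment of this knob in place.
        sed -i -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        # No assignment yet: append one at the end of the file.
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Apply the head node settings from step 2 to the given file.
configure_head_node() {
    local file=$1
    set_knob "$file" CONDOR_HOST "master.cm.cluster"
    set_knob "$file" ALLOW_READ  "*.cm.cluster"
    set_knob "$file" ALLOW_WRITE "*.cm.cluster"
    set_knob "$file" DAEMON_LIST "COLLECTOR, MASTER, NEGOTIATOR, SCHEDD"
}

# Example (as root): configure_head_node /etc/condor/condor_config
```

Running the function twice leaves one assignment per setting, so it is safe to re-run after a package update has replaced the file.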


In the software image

These steps assume that default-image is the software image currently used by the compute nodes.


1. Install the HTCondor packages:

# yum --installroot=/cm/images/default-image install condor


2. Modify /cm/images/default-image/etc/condor/condor_config so that the following configuration settings are in place:

CONDOR_HOST = master.cm.cluster

ALLOW_READ = *.cm.cluster

ALLOW_WRITE = *.cm.cluster

DAEMON_LIST = MASTER, STARTD


3. Remove 00personal_condor.config from /cm/images/default-image/etc/condor/config.d, because it is unnecessary and may conflict with the settings in condor_config:

# rm -f /cm/images/default-image/etc/condor/config.d/00personal_condor.config


4. Reboot the compute nodes so that they are provisioned with the modified software image.
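CONDOR_HOST, ALLOW_READ and ALLOW_WRITE are given the same values on the head node and in the software image, and only DAEMON_LIST differs. Before rebooting, a quick text comparison can catch a typo in one of the two files. The sketch below is an illustration only: knob_value and check_shared_knobs are hypothetical helper names, and the comparison reads a single file literally. It does not follow includes or expand macros, which is what HTCondor's own condor_config_val tool does for the effective configuration.

```shell
#!/bin/bash
# Print the value of the last uncommented "KEY = VALUE" line in a file.
# knob_value is a hypothetical helper, not an HTCondor tool.
knob_value() {
    local file=$1 key=$2
    awk -v key="$key" '
        /^[ \t]*#/ { next }                 # skip comment lines
        {
            eq = index($0, "=")
            if (eq == 0) next               # not an assignment
            name = substr($0, 1, eq - 1)
            gsub(/[ \t]/, "", name)
            if (name == key) {
                value = substr($0, eq + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                found = 1
            }
        }
        END { if (found) print value }
    ' "$file"
}

# Report every shared setting whose value differs between the two files.
# Returns 0 if all three settings agree, 1 otherwise.
check_shared_knobs() {
    local head_cfg=$1 image_cfg=$2 key status=0
    for key in CONDOR_HOST ALLOW_READ ALLOW_WRITE; do
        if [ "$(knob_value "$head_cfg" "$key")" != "$(knob_value "$image_cfg" "$key")" ]; then
            echo "mismatch: $key"
            status=1
        fi
    done
    return $status
}

# Example:
# check_shared_knobs /etc/condor/condor_config \
#     /cm/images/default-image/etc/condor/condor_config
```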

 

 

 

Add the condor service to the compute nodes in Bright

# cmsh -c "device foreach -n node001..node003 (services; add condor; set autostart yes; set monitored yes; commit)"


Check the status from the head node once the nodes are up


# condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@node001.cm.c LINUX      X86_64 Unclaimed Idle      0.000 1976  0+00:24:41
slot2@node001.cm.c LINUX      X86_64 Unclaimed Idle      0.000 1976  0+00:25:05
slot1@node002.cm.c LINUX      X86_64 Unclaimed Idle      0.000 1976  0+04:22:46
slot2@node002.cm.c LINUX      X86_64 Unclaimed Idle      0.100 1976  0+15:10:09
slot1@node003.cm.c LINUX      X86_64 Unclaimed Idle      0.010 1976  0+04:45:06
slot2@node003.cm.c LINUX      X86_64 Unclaimed Idle      0.000 1976  0+15:10:10
                     Machines Owner Claimed Unclaimed Matched Preempting

        X86_64/LINUX        6     0       0         6       0          0

               Total        6     0       0         6       0          0
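Reading the table by eye works for three nodes, but for a larger range it may be easier to list only the nodes that have not yet reported to the collector. The sketch below assumes that each slot advertises its host in the Machine attribute (as in node001.cm.cluster); missing_nodes is a hypothetical helper name.

```shell
#!/bin/bash
# Print each expected node that has no slot in the pool yet.
# missing_nodes is a hypothetical helper, not an HTCondor tool.
# Usage: missing_nodes node001 node002 node003
missing_nodes() {
    local reported node
    # One line per slot with the host name, e.g. node001.cm.cluster;
    # sort -u collapses multiple slots of the same machine.
    reported=$(condor_status -autoformat Machine | sort -u)
    for node in "$@"; do
        # Accept both the short name and the fully qualified name.
        if ! printf '%s\n' "$reported" | grep -Eq "^${node}(\.|\$)"; then
            echo "$node"
        fi
    done
}
```

An empty result means every listed node has at least one slot in the pool.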


Submitting a job


As with most other workload managers, HTCondor does not allow jobs to be submitted as root, so switch to a regular user before submitting:


# su - cmsupport


$ cat hostname.sh

#!/bin/bash


hostname -f

sleep 20

date

echo "exit"


$ cat hostname.condor

############

#

# Example job file

#

############


Universe       = vanilla

Executable     = hostname.sh


input   = /dev/null

output  = hostname.out

error   = hostname.error

Queue


$ condor_submit hostname.condor


$ condor_q


-- Schedd: sme-b80devc7u3.cm.cluster : <10.141.255.254:23723?...
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
   2.0   cmsupport       5/31 17:13   0+00:00:03 R  0   0.0  hostname.sh

1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended


$ cat hostname.out

node002.cm.cluster
Wed May 31 17:14:13 CEST 2017
exit
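The job file above queues a single job. As a possible variation, the sketch below generates a submit description that queues several copies of hostname.sh, gives each job its own output and error file through the $(Process) macro, and adds a job event log. The write_submit_file helper and the file name hostname-multi.condor are hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# Write a submit description that queues COUNT copies of hostname.sh.
# write_submit_file is a hypothetical helper, not an HTCondor tool.
# Usage: write_submit_file hostname-multi.condor 3
write_submit_file() {
    local file=$1 count=$2
    # \$ keeps $(Process) literal so the shell does not expand it.
    cat > "$file" <<EOF
Universe   = vanilla
Executable = hostname.sh

input  = /dev/null
# \$(Process) expands to 0, 1, ... so each job writes its own files.
output = hostname.\$(Process).out
error  = hostname.\$(Process).error
# Job event log, shared by all jobs in this submission.
log    = hostname.log

Queue $count
EOF
}
```

After `condor_submit hostname-multi.condor`, running `condor_wait hostname.log` blocks until the jobs recorded in the log have finished, which avoids polling condor_q.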


