Lab Exercise: Condor-G

Purpose:

During this lab, the user will become familiar with using Condor to run jobs.

  1. Configuring and Starting Condor
  2. Displaying Condor Information
  3. Single Job Submission
  4. Multiple Job Submission
  5. Multiple Job Submission using Separate Directories
  6. Diagnosing & Releasing Held Jobs
  7. Shutting Down Condor

Configuring and Starting Condor

  1. All of the following commands will be executed on your laptop.  Please be sure to exit if you still happen to be SSHed into the server.

    When submitting jobs in the grid environment, whether they are globus-based commands or Condor-based jobs, the server needs to be able to open TCP/IP network connections back to the client.  During this workshop, the local DNS system used by the server will not contain the correct IP addresses for our laptop hostnames, so we need to tell the server what our client IP address is.  When submitting jobs via Condor, three configuration parameters need to be set: GLOBUS_HOSTNAME, NETWORK_INTERFACE, and CONDOR_HOST.  The steps below set each of them.
  2. Confirm that globus-hostname is still set to your laptop's IP address:
$ globus-hostname
192.168.0.203
$

If it is not, export it as follows:

$ export GLOBUS_HOSTNAME=192.168.0.203

$ echo $GLOBUS_HOSTNAME
192.168.0.203
$
  3. Edit your condor_config.local file.  It is located at:

    /home/<userid>/ldg-3.0/condor/local.<hostname>/condor_config.local

Update the values for NETWORK_INTERFACE and CONDOR_HOST to your IP address.  The other lines should remain unchanged.

.
.
NETWORK_INTERFACE = 192.168.0.203
.
.
CONDOR_HOST = 192.168.0.203
.
.
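
If you prefer not to edit the file by hand, a sed one-liner can rewrite both values.  This is only a convenience sketch, assuming GNU sed and the example IP address used above; adjust the path for your userid and hostname:

$ cd /home/<userid>/ldg-3.0/condor/local.<hostname>
$ sed -i -e 's/^NETWORK_INTERFACE.*/NETWORK_INTERFACE = 192.168.0.203/' \
         -e 's/^CONDOR_HOST.*/CONDOR_HOST = 192.168.0.203/' condor_config.local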
  4. Start Condor with the condor_master command.  You should then see a number of condor-related processes.
$ condor_master

$ ps -ef | grep condor

mfreemon 24678 1 0 22:34 ? 00:00:00 condor_master
mfreemon 24679 24678 0 22:34 ? 00:00:00 condor_collector -f
mfreemon 24680 24678 0 22:34 ? 00:00:00 condor_negotiator -f
mfreemon 24681 24678 8 22:34 ? 00:00:05 condor_startd -f
mfreemon 24682 24678 0 22:34 ? 00:00:00 condor_schedd -f
mfreemon 24742 23491 0 22:36 pts/3 00:00:00 grep condor
$
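
If the daemons do not appear, the master log usually explains why.  Its location varies with configuration, but condor_config_val can find it for you:

$ tail `condor_config_val LOG`/MasterLog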
  5. (Optional) Here is a script that you can put in your .bash_profile as a simple reminder of these configuration values:
echo "*********************************************"
my_ip_address=`/sbin/ifconfig eth0 | grep "inet addr" | \
    cut -d: -f2 | cut -d ' ' -f1`
echo "my local hostname        is " `hostname`
echo "my local IP address      is " $my_ip_address
echo "globus-hostname          is " `globus-hostname`
echo "condor host              is " `condor_config_val condor_host`
echo "condor network interface is " `condor_config_val network_interface`  
echo "*********************************************"

(Optional) You may also want to update your local /etc/hosts file so that local name lookups for your local hostname resolve correctly.  Otherwise, Condor will not be able to send job notification emails.
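
For example, if your laptop's hostname were mylaptop (a hypothetical name; substitute the output of the hostname command), the /etc/hosts entry would look like this:

192.168.0.203   mylaptop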

Displaying Condor Information

  1. The condor_version command is a good starting point.
$ condor_version

$CondorVersion: 6.6.6 Jul 26 2004 $
$CondorPlatform: I386-LINUX_RH9 $
$
  2. The condor_status command will show the status of the nodes in the Condor pool.
$ condor_status

Name          OpSys       Arch   State      Activity   LoadAv Mem   ActvtyTime
ldas-grid.n   LINUX       INTEL  Unclaimed  Idle       0.000   501  0+00:05:04
                     Machines Owner Claimed Unclaimed Matched Preempting
         INTEL/LINUX        1     0       0         1        0         0
               Total        1     0       0         1        0         0
$ 
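
On a one-node pool the full listing is short, but on a larger pool you may want only the summary.  The -total option restricts the output to the totals:

$ condor_status -total

                     Machines Owner Claimed Unclaimed Matched Preempting
         INTEL/LINUX        1     0       0         1        0         0
               Total        1     0       0         1        0         0
$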
  3. The condor_q command will display the job queues.
$ condor_q

-- Submitter: ldas-grid.ligo-la.caltech.edu : <141.142.96.171:35056> : ldas-grid.ligo-la.caltech.edu  
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD 
0 jobs; 0 idle, 0 running, 0 held
$ 

If you're logged into the server and want to see just your jobs, you can specify your userid as follows:

$ condor_q mfreemon

-- Submitter: mfreemon@ligo : <10.13.0.12:32772> : ldas-grid.ligo-la.caltech.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD 
28098.0   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.1   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.2   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.3   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.4   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.5   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.6   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.7   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.8   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
28098.9   mfreemon       2/24 15:50    0+00:00:00 I  0    0.0 hostname 
29105.0   mfreemon       2/25 15:44    0+00:00:00 I  0    0.0 condor_simple.sh E
11 jobs; 11 idle, 0 running, 0 held
$ 

The ST column shows each job's state: I (idle), R (running), H (held), or C (completed).  See the Condor manual for complete documentation on the condor_q command.

Single Job Submission

  1. Two files need to be created on your local client machine.  The first is the program that will be submitted to Condor for execution.  The second is the Condor submission script, which tells Condor how the executable is to be run.

Start by creating a directory in your home directory called lab7 and cd into it:

$ cd

$ mkdir lab7

$ cd lab7

This new lab7 directory should be used to contain any files we create during the remainder of this lab exercise.

  2. Create a file called lab7.sh and copy the following code into the file:
#! /bin/sh

echo "I'm process id $$ on" `hostname`
echo "This is sent to standard error" 1>&2
date
echo "Running as binary $0" "$@"
echo "My name (argument 1) is $1"
echo "My sleep duration (argument 2) is $2"
sleep $2
echo "Sleep of $2 seconds finished. Exiting"  
exit 42
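
Before handing the script to Condor, you can run it locally to confirm it behaves as expected.  The arguments below are arbitrary test values, and your process id and hostname will differ; note the deliberately distinctive exit status:

$ chmod +x lab7.sh
$ ./lab7.sh LocalTest.0.0 2
I'm process id 12345 on mylaptop
This is sent to standard error
...
Sleep of 2 seconds finished. Exiting
$ echo $?
42
$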
  3. Create a file called lab7.submit and copy the following into it:
executable=lab7.sh
globusscheduler = ldas-grid.ligo-la.caltech.edu/jobmanager-condor  
universe=globus
arguments=Example.$(Cluster).$(Process) 100
output=z.lab7.output.$(Process)
error=z.lab7.error.$(Process)
log=z.lab7.log
notification=never
should_transfer_files=YES
when_to_transfer_output = ON_EXIT
queue

Looking at the submit file, you should note several tags.  The executable tag tells Condor the name of the program to run; in this case, it is the shell script we just created.  The arguments tag lists the arguments that will be passed to the running executable.  Our shell script takes two arguments.  The first is a string built from two predefined macros: $(Cluster) is the cluster number that condor_submit assigns to this submission, and $(Process) is the job's index within that cluster, starting at 0.  The second argument is the value passed to the sleep command, which tells the program how long to sleep before continuing; in this case, it is set to 100 seconds.
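
For example, when condor_submit assigns this submission to cluster 13, the single queued job is 13.0, and the shell script is invoked on the execution host as:

lab7.sh Example.13.0 100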

  4. Now submit the job to Condor.  This is done using the condor_submit command.
$ condor_submit lab7.submit 

Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 13.

$ 
  5. Once the job has been submitted, we can look at its status with the condor_q utility.  This application gives us information about the Condor job queue: which jobs are queued and what their status is.  Run condor_q several times to follow the progress of the submitted job; you should see output similar to what is shown below.  First the job enters the queue, then it begins to run, and finally it completes and is removed from the queue.
$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.171:36350> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  13.0   mfreemon        3/9  22:21   0+00:00:00 I  0   0.0  lab7.sh Example.13

1 jobs; 1 idle, 0 running, 0 held

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.171:36350> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  13.0   mfreemon        3/9  22:21   0+00:00:36 R  0   0.0  lab7.sh Example.13

1 jobs; 0 idle, 1 running, 0 held

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.171:36350> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 idle, 0 running, 0 held

$
  6. Looking back at our submission script, you will note that several files were defined:
output=z.lab7.output.$(Process)  
error=z.lab7.error.$(Process)
log=z.lab7.log

The output file will contain the standard output of the executable.  The error file will contain any output that the program directs to stderr.  The log file is Condor's log of the job.  Look at each file in turn.

$ cat z.lab7.error.0

This is sent to standard error
$

$ cat z.lab7.log

000 (015.000.000) 12/15 10:38:06 Job submitted from host: <141.142.96.174:33149>
...
017 (015.000.000) 12/15 10:38:19 Job submitted to Globus
    RM-Contact: ligo-server.ncsa.uiuc.edu/jobmanager-condor
    JM-Contact: https://ligo-server.ncsa.uiuc.edu:38307/24309/1103128689/
    Can-Restart-JM: 1
...
001 (015.000.000) 12/15 10:38:19 Job executing on host: ligo-server.ncsa.uiuc.edu
...
005 (015.000.000) 12/15 10:40:11 Job terminated.
        (1) Normal termination (return value 0)
                Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
                Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
        0 - Run Bytes Sent By Job
        0 - Run Bytes Received By Job
        0 - Total Bytes Sent By Job
        0 - Total Bytes Received By Job
...
$

$ cat z.lab7.output.0

I'm process id 25399 on node3
Wed Mar 9 23:18:32 CST 2005
Running as binary /data2/mfreemon/.globus/.gass_cache/local/md5/25/c7a5f6954e29c32437d6f95efdd3bd/md5/aa/fba59e77460a1ee1668459ceeb5fb0/data Example.13.0 100
My name (argument 1) is Example.13.0
My sleep duration (argument 2) is 100
Sleep of 100 seconds finished. Exiting
$

Multiple Job Submission

  1. Create a new submission file called condor_lots.submit.

Copy the following information into the file:

executable=lab7.sh
globusscheduler = ldas-grid.ligo-la.caltech.edu/jobmanager-condor  
universe=globus
arguments=Example.$(Cluster).$(Process) 5
output=z.lots.output.$(Process)
error=z.lots.error.$(Process)
log=z.lots.log
notification=never
should_transfer_files=YES
when_to_transfer_output = ON_EXIT
Queue 15

Looking at this file, you should see that it is almost exactly the same as the previous submission file.  The only changes are the names of the files that will be written and the last tag, Queue.  The Queue tag tells Condor how many instances of the executable to submit; in this case, 15 instances of lab7.sh are queued at once.  One thing to keep in mind when telling Condor to run multiple instances of an executable is what happens to the output.  In the submission file above, we appended the process number to the end of each file name, so Condor creates 15 different files, each made unique by that number.  Had we not done this, Condor would have used the same file for all 15 jobs.

  2. Submit the job to Condor.
$ condor_submit condor_lots.submit

Submitting job(s)...............
Logging submit event(s)...............
15 job(s) submitted to cluster 9.
$
  3. Now take a look at the Condor queue.  You should see all of our jobs queued up.
$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.171:35056> : ligo-client.ncsa.uiuc.edu  
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD              
   9.0   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.1   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.2   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.3   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.4   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.5   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.6   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.7   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.8   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.9   mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.10  mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.11  mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.12  mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.13  mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
   9.14  mfreemon        3/7 00:02    0+00:00:00 I  0   0.0 lab7.sh E
15 jobs; 15 idle, 0 running, 0 held

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.171:35056> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD              
   9.2   mfreemon        3/7 00:02    0+00:00:12 R  0   0.0 lab7.sh E
   9.9   mfreemon        3/7 00:02    0+00:00:10 C  0   0.0 lab7.sh E
2 jobs; 0 idle, 1 running, 0 held
$
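
Rather than rerunning condor_q by hand, you can keep an eye on the queue with the standard watch utility, assuming it is installed on your laptop:

$ watch -n 5 condor_q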
  4. Once the jobs complete, do a listing on the directory and see if all of our files were created.
$ ls -la

drwxrwxr-x   2 mfreemon mfreemon  4096 Mar 7 00:02 .
drwx------  21 mfreemon mfreemon  4096 Mar 6 23:41 ..
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.0
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.1
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:02 z.lots.error.10
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.11
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.12
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.13
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.14
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.2
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.3
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.4
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.5
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.6
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.7
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:02 z.lots.error.8
-rw-rw-r--   1 mfreemon mfreemon    31 Mar 7 00:03 z.lots.error.9
-rw-rw-r--   1 mfreemon mfreemon 12585 Mar 7 00:04 z.lots.log
-rw-rw-r--   1 mfreemon mfreemon   324 Mar 7 00:03 z.lots.output.0
-rw-rw-r--   1 mfreemon mfreemon   324 Mar 7 00:03 z.lots.output.1
-rw-rw-r--   1 mfreemon mfreemon   326 Mar 7 00:02 z.lots.output.10
-rw-rw-r--   1 mfreemon mfreemon   326 Mar 7 00:03 z.lots.output.11
-rw-rw-r--   1 mfreemon mfreemon   325 Mar 7 00:03 z.lots.output.12
-rw-rw-r--   1 mfreemon mfreemon   325 Mar 7 00:03 z.lots.output.13
-rw-rw-r--   1 mfreemon mfreemon   326 Mar 7 00:03 z.lots.output.14
-rw-rw-r--   1 mfreemon mfreemon   323 Mar 7 00:03 z.lots.output.2
-rw-rw-r--   1 mfreemon mfreemon   324 Mar 7 00:03 z.lots.output.3
-rw-rw-r--   1 mfreemon mfreemon   324 Mar 7 00:03 z.lots.output.4
-rw-rw-r--   1 mfreemon mfreemon   324 Mar 7 00:03 z.lots.output.5
-rw-rw-r--   1 mfreemon mfreemon   324 Mar 7 00:03 z.lots.output.6
-rw-rw-r--   1 mfreemon mfreemon   324 Mar 7 00:03 z.lots.output.7
-rw-rw-r--   1 mfreemon mfreemon   324 Mar 7 00:02 z.lots.output.8
-rw-rw-r--   1 mfreemon mfreemon   323 Mar 7 00:03 z.lots.output.9
$

Multiple Job Submission using Separate Directories

  1. This exercise submits two copies of the same job from a single submission file.  The first copy will run in directory run_1, and the second will run in directory run_2.  Two sets of files will be written, each to its own directory.  This is a convenient way to organize data if you have a large group of Condor jobs to run.

Copy lab7.submit to a new file called lab7a.submit (or create it from scratch), and edit it as follows:

executable=lab7.sh
globusscheduler = ldas-grid.ligo-la.caltech.edu/jobmanager-condor  
universe=globus
arguments=Example.$(Cluster).$(Process) 10
output=z.lab7a.output.$(Process)
error=z.lab7a.error.$(Process)
log=z.lab7a.log
notification=never
should_transfer_files=YES
when_to_transfer_output = ON_EXIT

Initialdir = run_1
queue

Initialdir = run_2
queue

The Initialdir tag is "Used to give jobs a directory with respect to file input and output. Also provides a directory (on the machine from which the job is submitted) for the user log, when a full path is not specified."

For this and other tags used by condor_submit, see the Condor manual:

http://www.cs.wisc.edu/condor/manual/v6.6/condor_submit.html

  2. In order for this to run correctly, we will also need to create two new directories.
$ mkdir run_1

$ mkdir run_2

$

  3. Now submit the job to Condor and watch it run.
$ condor_submit lab7a.submit

Submitting job(s)..
Logging submit event(s)..
2 job(s) submitted to cluster 28.

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.171:36350> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  15.0   mfreemon        3/9  22:52   0+00:00:00 I  0   0.0  lab7.sh Example.15
  15.1   mfreemon        3/9  22:52   0+00:00:00 I  0   0.0  lab7.sh Example.15

2 jobs; 2 idle, 0 running, 0 held

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.171:36350> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  15.0   mfreemon        3/9  22:52   0+00:01:14 R  0   0.0  lab7.sh Example.15
  15.1   mfreemon        3/9  22:52   0+00:01:24 R  0   0.0  lab7.sh Example.15

2 jobs; 0 idle, 2 running, 0 held

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.171:36350> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 idle, 0 running, 0 held
$

Look at the contents of both directories run_1 and run_2.
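
Since $(Process) is 0 for the first queue statement and 1 for the second, you should see something like this:

$ ls run_1 run_2
run_1:
z.lab7a.error.0  z.lab7a.log  z.lab7a.output.0

run_2:
z.lab7a.error.1  z.lab7a.log  z.lab7a.output.1
$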

Diagnosing & Releasing Held Jobs

  1. When a problem occurs in the middleware, Condor will place your job on "Hold".  Held jobs remain in the queue, but wait for user intervention.  When you resolve the problem, you can use the condor_release utility to free the job to continue.

For this example, we'll make the output file non-writable.  The job will be unable to copy its results back and will be placed on hold.

  2. Create a new file called condor_hold.submit and edit it as follows:
executable=lab7.sh
globusscheduler = ldas-grid.ligo-la.caltech.edu/jobmanager-condor  
universe=globus
arguments=Example.$(Cluster).$(Process) 5
output=z.hold.output.$(Process)
error=z.hold.error.$(Process)
log=z.hold.log
notification=never
should_transfer_files=YES
when_to_transfer_output = ON_EXIT
queue
  3. Now submit the job.  Be sure to include the chmod command on the same line, after a semicolon.  Removing write permission from the output file will cause the job to be held when Condor tries to copy the results back.
$ condor_submit condor_hold.submit ; chmod -w z.hold.output.0

Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 18.
$
  4. Use condor_q to track the status of the job.  You should see the job move into the held state.
$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.174:33149> : ligo-client.ncsa.uiuc.edu
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  18.0   btest          12/15 14:16   0+00:00:00 I  0   0.0  lab7.sh Exa
1 jobs; 1 idle, 0 running, 0 held

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.174:33149> : ligo-client.ncsa.uiuc.edu
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  18.0   btest          12/15 14:16   0+00:00:16 R  0   0.0  lab7.sh Exa
1 jobs; 0 idle, 1 running, 0 held

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.174:33149> : ligo-client.ncsa.uiuc.edu
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  18.0   btest          12/15 14:16   0+00:00:53 H  0   0.0  lab7.sh Exa
1 jobs; 0 idle, 0 running, 1 held
$

  5. Once the job has moved into the held state, you can use condor_q -held or examine the job log file to see why the job was held.

Using condor_q -held:

$ condor_q -held

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.174:33149> : ligo-client.ncsa.uiuc.edu
 ID      OWNER           HELD_SINCE HOLD_REASON
 18.0    btest       12/15 14:17:20 Globus error 129: the standard output/error size is different

1 jobs; 0 idle, 0 running, 1 held
$

Using the log file:

$ cat z.hold.log

000 (018.000.000) 12/15 14:16:14 Job submitted from host: <141.142.96.174:33149>
...
017 (018.000.000) 12/15 14:16:27 Job submitted to Globus
    RM-Contact: ldas-grid.ligo-la.caltech.edu/jobmanager-condor
    JM-Contact: https://ldas-grid.ligo-la.caltech.edu:40046/13167/1110434858/
    Can-Restart-JM: 1
...
001 (018.000.000) 12/15 14:16:27 Job executing on host: ldas-grid.ligo-la.caltech.edu
...
012 (018.000.000) 12/15 14:17:20 Job was held.
        Globus error 129: the standard output/error size is different
        Code 2 Subcode 129
...
  6. Change the permissions on the output file:
$ chmod +w z.hold.output.0

$
  7. Release the job and have Condor rerun it.
$ condor_release -all

All jobs released.

$
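
Note that condor_release -all releases every held job you own.  To leave other held jobs alone, you can instead release a single job by its ID:

$ condor_release 18.0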
  8. Monitor the running program and then examine the log file when the job has finished.
$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.174:33149> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  18.0   btest          12/15 14:16   0+00:00:53 I  0   0.0  lab7.sh Exa
1 jobs; 1 idle, 0 running, 0 held

$ condor_q

-- Submitter: ligo-client.ncsa.uiuc.edu : <141.142.96.174:33149> : ligo-client.ncsa.uiuc.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
0 jobs; 0 idle, 0 running, 0 held

$
  9. Check the log file:
$ cat z.hold.log

013 (018.000.000) 12/15 14:30:23 Job was released.
             via condor_release (by user mfreemon)
...
001 (018.000.000) 12/15 14:30:38 Job executing on host: ldas-grid.ligo-la.caltech.edu
...
005 (018.000.000) 12/15 14:30:43 Job terminated.
        (1) Normal termination (return value 0)
                Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
                Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
        0 - Run Bytes Sent By Job
        0 - Run Bytes Received By Job
        0 - Total Bytes Sent By Job
        0 - Total Bytes Received By Job
...

Shutting Down Condor

  1. To shut down the Condor-G services on your workstation, use condor_off as follows:
$ condor_off -master

Sent "Kill-Daemon" command for "master" to local master

$
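
To confirm that the daemons have exited, check the process list again.  Only the grep command itself should remain (process ids are illustrative):

$ ps -ef | grep condor

mfreemon 24901 23491 0 22:50 pts/3 00:00:00 grep condor
$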