SGE 6 is significantly different from SGE 5.3. With SGE 6 come cluster-queues and host-groups — SGE 5.3's queues still exist and are now called queue-instances.
In SGE 5.3 a queue lived on a particular machine/node. In SGE 6, by contrast, a cluster-queue can be thought of in two ways: as a front-end to a cluster of nodes, to which jobs are submitted for execution on one of those nodes; and as a type/class whose instances (the queue-instances) live on particular machines/nodes.
With SGE 6, unless you are actually dealing with MPI, PVM, etc., there is no need to handle parallel-environments at all. Good.
To aid in setting up cluster-queues there are host-groups, each of which is simply a named group (list) of machines/nodes.
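To make the idea of a host-group concrete, here is a minimal sketch that renders one in the two-field layout (`group_name`, `hostlist`) used for host-group definitions. The helper `render_hostgroup` and the node names are hypothetical and exist only for this illustration; they are not part of SGE.

```python
def render_hostgroup(name, hosts):
    """Return host-group text in the group_name/hostlist layout."""
    if not name.startswith("@"):
        raise ValueError("host-group names start with '@'")
    # An empty group is written as NONE rather than as a blank field.
    hostlist = " ".join(hosts) if hosts else "NONE"
    return f"group_name {name}\nhostlist {hostlist}\n"


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    print(render_hostgroup("@allhosts", ["node01", "node02", "node03"]), end="")
```

Text of this shape, saved to a file, should be loadable with `qconf -Ahgrp <file>`; check the option against the qconf help on your installation before relying on it.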
...for a user to be able to submit a bunch (e.g., a dozen) of single-processor jobs to a "queue" which is a front-end to a number of machines/nodes, each with, say, 1 or 2 CPUs; to have, say, 3 or 4 jobs run at once on each node (think hyperthreading, multicore), so that 3 × (number of nodes) or 4 × (number of nodes) jobs run at once in total; and to have the remaining jobs wait their turn.
N.B. Ignore parallel-environments a la SGE 5.3; use a cluster-queue and instances of it.
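The arithmetic behind that aim can be sketched as follows. This is only a back-of-the-envelope model, assuming every job takes exactly one slot; the function name `schedule_counts` and the figures (12 jobs, 3 nodes, 3 slots per node) are illustrative, not taken from a real cluster.

```python
def schedule_counts(jobs, nodes, slots_per_node):
    """Return (running, waiting) for single-slot jobs on one cluster-queue."""
    capacity = nodes * slots_per_node  # one queue-instance per node
    running = min(jobs, capacity)
    return running, jobs - running


if __name__ == "__main__":
    # A dozen jobs, 3 nodes, 3 slots each (all figures are illustrative).
    running, waiting = schedule_counts(jobs=12, nodes=3, slots_per_node=3)
    print(f"running={running} waiting={waiting}")
```

The model ignores everything else that can hold a job back, such as load_thresholds on the queue or the per-user maxujobs limit in the scheduler configuration, so treat the result as an upper bound.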
qname simonh.q
hostlist @allhosts ## instances of this queue will
## exist on these hosts
seq_no 0
load_thresholds np_load_avg=1.75
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
processors 1
qtype BATCH INTERACTIVE
ckpt_list NONE
pe_list NONE
rerun FALSE
slots 1 ## number of jobs expected to run in
## an instance (i.e., on a compute
## node); usually the number of CPUs
## or cores (on a node)
tmpdir /tmp
shell /bin/bash
.
.
algorithm default
schedule_interval 0:0:5
maxujobs 10 ## IMPORTANT! Max num
## of total jobs per user
## across all queues
queue_sort_method load
job_load_adjustments np_load_avg=0.50
load_adjustment_decay_time 0:7:30
load_formula np_load_avg
schedd_job_info true
flush_submit_sec 1 ## default is 0
flush_finish_sec 1 ## default is 0
params none
reprioritize_interval 0:1:0
halftime 168
usage_weight_list cpu=1.000000,mem=0.000000,io=0.000000
.
.
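Both listings above use the same "name value" layout, with this page's own notes added after `##`. When comparing settings between queues it can help to load such a listing into a dictionary. The following is a minimal sketch, assuming one setting per line; `parse_sge_config` is a hypothetical helper, and it does not handle long values that are wrapped over several lines.

```python
def parse_sge_config(text):
    """Parse 'name value' lines into a dict, dropping '##' annotations."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("##", 1)[0].strip()
        if not line or line == ".":
            continue  # blank, annotation-only, or elision line
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


# A shortened copy of the listings above, annotations included.
SAMPLE = """\
qname simonh.q
hostlist @allhosts ## instances of this queue will
## exist on these hosts
slots 1
.
.
schedule_interval 0:0:5
maxujobs 10 ## IMPORTANT!
"""

if __name__ == "__main__":
    cfg = parse_sge_config(SAMPLE)
    print(cfg["hostlist"], cfg["slots"], cfg["maxujobs"])
```

Multi-word values such as `qtype BATCH INTERACTIVE` are kept as a single string; splitting them further is left to the caller.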
Configure, monitor, and modify the state of queues using a GUI. The most significant windows:
Show or modify the configuration of the given queue.
-mq <queuename> modify a queue config
-sq <queuename> show queue config
-sql list all queues
-ssconf show scheduler config
Modify the state of the given queue, e.g., disable it, clear an error state...
-cq <queuename> clear error state (E) of given queue (instance)
It seems that the scheduler that comes with SGE is quite limited:
Use either qmon -> Scheduler Configuration or
qconf | grep -i schedul
[-k{m|s}] shutdown master|scheduling daemon
[-msconf] modify scheduler configuration
[-Msconf fname] modify scheduler configuration from file
[-sss] show scheduler state
[-ssconf] show scheduler configuration
[-tsm] trigger scheduler monitoring
Have a look in
$SGE_ROOT/default/spool/qmaster/messages
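That file can grow long, so a keyword filter is a convenient first pass. The sketch below is a plain case-insensitive substring match and makes no assumption about the layout of the log lines; `matching_lines` is a hypothetical helper and the keyword "error" is only an example.

```python
import os


def matching_lines(lines, keyword):
    """Return the lines containing keyword, ignoring case."""
    needle = keyword.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


if __name__ == "__main__":
    # Path as given above; SGE_ROOT must be set in the environment.
    path = os.path.join(os.environ.get("SGE_ROOT", ""),
                        "default/spool/qmaster/messages")
    if os.path.exists(path):
        with open(path, errors="replace") as fh:
            for line in matching_lines(fh, "error"):
                print(line)
    else:
        print(f"no messages file at {path}")
```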