Job Scheduling

Scheduling Overview

Queue Sorting
Queues are sorted into the order in which they should be filled: a waiting job is considered for execution in each queue in turn until resources are found (or the job has to wait). The order can be fixed, based on seq_no (e.g., qconf -sq <queue> | grep seq_no), or based on load.
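The two orderings can be sketched in a few lines of Python. This is only an illustration: the queue names, the load figures and the tie-breaking rule (the other field is used to break ties) are assumptions, not taken from a real configuration.

```python
def sort_queues(queues, method="seqno"):
    """Return queues in the order in which they would be filled.

    method="seqno": fixed order, lowest seq_no first.
    method="load":  least-loaded queue first.
    Tie-breaking on the other field is an assumption made for this sketch.
    """
    if method == "seqno":
        return sorted(queues, key=lambda q: (q["seq_no"], q["load"]))
    if method == "load":
        return sorted(queues, key=lambda q: (q["load"], q["seq_no"]))
    raise ValueError(f"unknown sort method: {method}")

# Hypothetical queues; names and numbers are made up.
queues = [
    {"name": "long.q", "seq_no": 20, "load": 0.10},
    {"name": "short.q", "seq_no": 10, "load": 0.75},
    {"name": "himem.q", "seq_no": 30, "load": 0.40},
]

print([q["name"] for q in sort_queues(queues, "seqno")])  # ['short.q', 'long.q', 'himem.q']
print([q["name"] for q in sort_queues(queues, "load")])   # ['long.q', 'himem.q', 'short.q']
```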
Job Sorting
Waiting jobs are sorted by priority, highest priority first. Jobs are considered for execution in that order: if resources can be found for a job, it is run; if not, the scheduler moves on to the next job. N.B. This means that jobs will often NOT start in priority order (see, for example, Resource Reservation).
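A toy model shows why start order can differ from priority order. It is a simplification, assuming a single pool of slots and no resource reservation; the job names and numbers are made up.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    priority: float
    slots: int  # slots the job needs in order to start

def dispatch(jobs, free_slots):
    """Consider jobs highest priority first; start each one that fits right now."""
    started = []
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        if job.slots <= free_slots:
            free_slots -= job.slots
            started.append(job.name)
        # Otherwise the job keeps waiting and the next job is considered.
    return started

jobs = [Job("big", 0.9, 16), Job("medium", 0.5, 4), Job("small", 0.1, 1)]

# Only 8 slots are free, so the highest-priority job cannot start,
# while the two lower-priority jobs can.
print(dispatch(jobs, free_slots=8))  # ['medium', 'small']
```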
Job Priority
Ticket-based Job Priority: Dynamic Resource Management/Scheduler Policy
  • Ticket-based job priority depends on a weighted combination of three policies: Share-based/Fairshare, Functional (aka Priority) and Override.
  • SGE can be configured to use an exclusively share-based policy, a combination of share-based and functional/priority, or exclusively functional/priority. (Think of these as routine policies.)
  • SGE admins can override share-based and functional policies via the override policy, either temporarily (without reconfiguring/editing them) or permanently, e.g., in "fast" queues.
  • All three of these policies are ticket-based.
Tickets
  • Each ticket-based policy has a pool of tickets.
  • Each policy allocates a number of tickets to each job. (This number may change with time.)
  • The relative size of each policy's pool corresponds to its weight relative to the other policies.
  • An override is triggered by a temporary injection of a number of extra tickets into the system and allocation of these to certain jobs.
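As a rough worked example of pool size acting as weight, the sketch below turns pool sizes into relative weights. The pool names and ticket counts are illustrative assumptions, and the real scheduler's allocation of tickets to individual jobs is more involved than this.

```python
def policy_weights(pools):
    """Map each policy's ticket pool size to its weight relative to the others."""
    total = sum(pools.values())
    if total == 0:
        # No tickets anywhere: no ticket-based policy has any influence.
        return {name: 0.0 for name in pools}
    return {name: size / total for name, size in pools.items()}

# Hypothetical pool sizes, chosen only for the arithmetic.
pools = {"share_tree": 8000, "functional": 2000}
print(policy_weights(pools))  # {'share_tree': 0.8, 'functional': 0.2}

# Injecting override tickets changes the balance without editing the other pools.
pools_with_override = dict(pools, override=10000)
print(policy_weights(pools_with_override))
# {'share_tree': 0.4, 'functional': 0.1, 'override': 0.5}
```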
Override Policy
  • Override scheduling enables the SGE admin to change the priority of a job, or of all jobs belonging to a user/owner, department or project, by adding tickets to the appropriate job, user/owner, department or project.
  • Question: the Oracle docs, in addition to jobs, users, departments and projects, mention job classes, i.e., queues. Is this in SGE v8?
  • Override tickets can be used to temporarily override share-based and/or functional policy without the need to change those policies (and, of course, to undo the changes afterwards, assuming this is not unwittingly forgotten).
  • Override tickets can also be used to establish a (Hector-like) fixed/capped resource entitlement (e.g., number of CPU-hours) for a user, department and/or project: set both Total Share Tree Tickets and Total Functional Tickets to zero, then give each user, department or project its granted override tickets.
  • Override tickets assigned to a job vanish when the job ends; other tickets are inflated back to their original value.
  • Override tickets can be added to a pending job by using qalter -ot <number-of-tickets>.
  • Override tickets assigned to a department or project remain until explicitly removed by the SGE admin.
Urgency Policy
The urgency policy defines an urgency value for each job. Contributions to this value are:
  • resource requirement — the sum of all hard-resource-requests (e.g., -l highmem)
  • waiting time
  • deadline — jobs may have a deadline (date, time) associated with them.
Deadline Contribution/Jobs
  • Only SGE admins and specifically-configured users can specify deadlines for jobs.
  • Such a job is submitted with -dl [[CC]YY]MMDDhhmm[.SS], e.g., -dl 201202291900.00.
  • The deadline contribution for jobs with no deadline is zero.
  • For those with a specified deadline, the contribution is the quotient of weight_deadline (e.g., qconf -msconf) and the time in seconds until the deadline.
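The quotient described above can be worked through in a short Python sketch. The weight_deadline value used here is only an illustrative figure (check the scheduler configuration for the real one), and clamping the remaining time to one second for a deadline that has already passed is an assumption made to avoid dividing by zero.

```python
from datetime import datetime

def deadline_contribution(weight_deadline, now, deadline=None):
    """Deadline contribution: weight_deadline divided by seconds until the deadline."""
    if deadline is None:
        return 0.0  # jobs without a deadline contribute nothing
    seconds_left = (deadline - now).total_seconds()
    # Assumption: treat a reached or passed deadline as one second away.
    return weight_deadline / max(seconds_left, 1.0)

def format_dl(deadline):
    """Format a datetime as the CCYYMMDDhhmm.SS form accepted by -dl."""
    return deadline.strftime("%Y%m%d%H%M.%S")

now = datetime(2012, 2, 29, 18, 0)
deadline = datetime(2012, 2, 29, 19, 0)

print(format_dl(deadline))                            # 201202291900.00
print(deadline_contribution(3600000, now, deadline))  # 1000.0 (one hour left)
print(deadline_contribution(3600000, now))            # 0.0 (no deadline)
```

Because the time remaining is in the denominator, the contribution grows as the deadline approaches: with the same weight, a job one minute from its deadline would contribute 60000.0.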
POSIX Priority
  • The default value is zero.
  • Users can specify a POSIX priority when submitting a job (e.g., qsub -p <value>). Possible values are -1023 to zero, i.e., users can only decrease the POSIX priority of their jobs.
  • Users can use POSIX priority to change the relative priority of their jobs.
  • SGE admins can specify values in the range -1023 to 1024, i.e., can both increase and decrease the POSIX priority of a job.
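The two ranges above can be captured in a small check, for example in a submission wrapper script. The function name and the is_admin flag are hypothetical; SGE itself enforces these limits when the job is submitted.

```python
USER_RANGE = (-1023, 0)      # ordinary users can only lower priority
ADMIN_RANGE = (-1023, 1024)  # SGE admins can raise or lower it

def allowed_priority(value, is_admin=False):
    """Return True if value is a POSIX priority this kind of user may request."""
    low, high = ADMIN_RANGE if is_admin else USER_RANGE
    return low <= value <= high

print(allowed_priority(-100))                # True
print(allowed_priority(10))                  # False
print(allowed_priority(10, is_admin=True))   # True
```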
Monitoring Job Priority
  qstat -pri     overall, including POSIX priority.
  qstat -ext     ticket-based stuff
  qstat -urg     urgency policy.
  qstat -prito   diagnose job priority issues when urgency, ticket-based and POSIX are all in use!
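When all three are in use, the overall priority is a weighted sum of the normalised POSIX priority, urgency and ticket values. The sketch below assumes that form and uses illustrative weights; the actual weights are set in the scheduler configuration and are not stated on this page, and the input values are made up.

```python
def overall_priority(npprior, nurg, ntckts,
                     weight_priority=1.0, weight_urgency=0.1, weight_ticket=0.01):
    """Combine normalised (0..1) POSIX priority, urgency and ticket values.

    The default weights are illustrative assumptions, not values from this page.
    """
    return (weight_priority * npprior
            + weight_urgency * nurg
            + weight_ticket * ntckts)

# With these weights the POSIX priority dominates:
# a large ticket share barely moves the result.
low_tickets = overall_priority(npprior=0.5, nurg=0.2, ntckts=0.1)
high_tickets = overall_priority(npprior=0.5, nurg=0.2, ntckts=0.9)
print(low_tickets, high_tickets)
```

Comparing the per-policy values against the weighted result in this way can help to explain why a job sits where it does in the pending list.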
Resource Reservation and Backfilling
More. . .