# queue manager

A compute cluster queue manager: a wrapper for [Son of Grid Engine](https://wiki.archlinux.org/title/Son_of_Grid_Engine)'s `qmod` application.

The goals of this tool are:

- to let the person managing the job scheduler give a reason when disabling a queue
- to provide a reference counting mechanism so that the same queue can be disabled for several reasons at once. The queue is re-enabled only when all disabling reasons have been removed. For example, a queue can be disabled for both of the following reasons at the same time:
  1. an automatic update is in progress
  2. the sys admin has disabled the queue to replace faulty RAM

  When the automatic update completes, it asks for the queue to be enabled, but because the queue is still disabled for reason 2, the queue is not actually enabled.
- to provide an agnostic abstraction layer able to interface with any job manager (SGE, Slurm, etc.)

As a result, this tool can report why a queue is disabled, and it also helps the sys admin remember why a queue was disabled.

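The reference counting behaviour described in the goals can be illustrated with a minimal Python sketch. This is not quman's actual implementation: the names `JobManager`, `RecordingBackend`, `DisableReason` and `QueueManager` are hypothetical, and keying each reason by its requester is an assumption made for the example. The backend here only records calls; a real one would presumably invoke the scheduler's own command (such as `qmod` for SGE).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from typing import Protocol


class JobManager(Protocol):
    """Hypothetical scheduler-agnostic backend interface (SGE, Slurm, ...)."""

    def disable(self, queue: str) -> None: ...
    def enable(self, queue: str) -> None: ...


class RecordingBackend:
    """Stand-in backend that records calls instead of running a real command."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def disable(self, queue: str) -> None:
        self.calls.append(("disable", queue))

    def enable(self, queue: str) -> None:
        self.calls.append(("enable", queue))


@dataclass
class DisableReason:
    requester: str
    message: str
    timestamp: datetime = field(default_factory=datetime.now)


class QueueManager:
    """Keeps a queue disabled for as long as at least one reason remains."""

    def __init__(self, backend: JobManager) -> None:
        self.backend = backend
        # queue name -> {requester: reason}; keying by requester is an assumption
        self.reasons: dict[str, dict[str, DisableReason]] = {}

    def disable_queue(self, queue: str, requester: str, message: str) -> None:
        pending = self.reasons.setdefault(queue, {})
        if not pending:
            # First reason for this queue: actually disable it in the scheduler.
            self.backend.disable(queue)
        pending[requester] = DisableReason(requester, message)

    def enable_queue(self, queue: str, requester: str) -> None:
        pending = self.reasons.get(queue, {})
        removed = pending.pop(requester, None)
        if removed is not None and not pending:
            # Last reason removed: only now is the queue really enabled.
            self.backend.enable(queue)

    def get_disable_reasons(self, queue: str) -> list[DisableReason]:
        return list(self.reasons.get(queue, {}).values())
```

With this sketch, disabling a queue once for an automatic update and once for a RAM replacement, then enabling it on behalf of the update only, leaves the queue disabled with the RAM replacement as the single remaining reason. The backend's `enable` is called only after the second reason is removed as well.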
## example

```sh
bob@bobland~> quman --get-disable-reasons main.q@alambix42.ipr.univ-rennes.fr
bob@bobland~> quman --disable-queue main.q@alambix42.ipr.univ-rennes.fr --message 'requires maintenance for ram replacement'
maco@alambix42~> quman --disable-queue main.q@alambix42.ipr.univ-rennes.fr --message 'requires a security update'
bob@bobland~> quman --get-disable-reasons main.q@alambix42.ipr.univ-rennes.fr
2024-03-13 17:54:18 bob@bobland requires maintenance for ram replacement
2024-03-14 08:42:23 maco@alambix42 requires a security update
```