# queue manager
A compute cluster queue manager: a wrapper for [Son of Grid Engine](https://wiki.archlinux.org/title/Son_of_Grid_Engine)'s `qmod` application.

The goals of this tool are:
- to let the job scheduler manager state a reason when disabling a queue
- to provide a reference counting mechanism that allows the same queue to be disabled for multiple reasons. The queue is re-enabled only once all disabling reasons have been removed. For example, a queue can be disabled for both of the following reasons at the same time:
  1. an automatic update is in progress
  2. the sys admin has disabled the queue to replace faulty RAM

  When the automatic update completes, the update system requests that the queue be enabled, but because the queue is still disabled for reason 2, it is not actually enabled.
- to provide a scheduler-agnostic abstraction layer that can interface with any job manager (SGE, Slurm, etc.)

As a result, this tool can report why a queue is disabled, and it also helps the sys admin remember why a queue was disabled.
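
The reference counting behaviour described above can be illustrated with a minimal Python sketch. It is not quman's actual implementation: the `QueueState` class, its method names, and the requester identifiers are hypothetical and chosen only for illustration. It assumes each requester holds at most one disable reason per queue.

```python
from dataclasses import dataclass, field


@dataclass
class QueueState:
    """Hypothetical model of one queue's disable reasons (not quman's real code)."""

    # Maps a requester id (e.g. "bob@bobland") to the reason it gave.
    reasons: dict[str, str] = field(default_factory=dict)

    def disable(self, requester: str, message: str) -> None:
        """Record a disable reason; the queue stays disabled while any reason remains."""
        self.reasons[requester] = message

    def enable(self, requester: str) -> bool:
        """Withdraw one requester's reason; return True if the queue is now enabled."""
        self.reasons.pop(requester, None)
        return self.is_enabled

    @property
    def is_enabled(self) -> bool:
        # The queue is enabled only when no disable reason is left.
        return not self.reasons


# Scenario from the list above: two independent reasons on the same queue.
queue = QueueState()
queue.disable("auto-update", "automatic update in progress")
queue.disable("sysadmin", "replacing faulty RAM")

# The update finishes first: the queue must stay disabled because of the RAM reason.
still_disabled = not queue.enable("auto-update")

# Only once the sys admin also withdraws their reason is the queue enabled again.
enabled_after_both = queue.enable("sysadmin")
```

In this sketch, a real tool would call the underlying scheduler to enable the queue only when `enable` returns `True`.
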
## example
```sh
bob@bobland~> quman --get-disable-reasons main.q@alambix42.ipr.univ-rennes.fr
bob@bobland~> quman --disable-queue main.q@alambix42.ipr.univ-rennes.fr --message 'requires maintenance for ram replacement'
maco@alambix42~> quman --disable-queue main.q@alambix42.ipr.univ-rennes.fr --message 'requires a security update'
bob@bobland~> quman --get-disable-reasons main.q@alambix42.ipr.univ-rennes.fr
2024-03-13 17:54:18 bob@bobland requires maintenance for ram replacement
2024-03-14 08:42:23 maco@alambix42 requires a security update
```