Commit Graph

154 Commits

Author SHA1 Message Date
Guillaume Raffy a6b4ccd81d improved documentation of retry mode which wasn't clear, while trying to work out if croconaus is supposed to restart maco automatically or not (well it is) 2024-09-19 18:03:08 +02:00
Guillaume Raffy 3f371e27c1 fixed bug in the cluster node automatic update system that caused the apt security updates to be ignored by the mechanism that checks if an update is required.
This malfunction was picked up by xymon/apt, which complained that the last apt update was too old.

This regression was introduced in commit [29ce88975f], which replaced the unattended upgrade mechanism with the cluster cron based autoupdate mechanism (but only for the alambix cluster, for some reason).

The bug in the cluster cron based autoupdate system was caused by the code not updating the package list before calling apt list --upgradable. As a result, the package list was never updated on alambix, and therefore security updates were never seen. This problem was not present on physix, which still has the unattended upgrade mechanism along with the cluster cron based autoupdate system.

fixes [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3810]
2024-04-10 17:04:02 +02:00
Jeremy Gardais 3986af300d
Use new store.ipr.univ-rennes.fr 2023-10-20 10:14:08 +02:00
Jeremy Gardais 01a248e374
Update to @univ-rennes.fr domain 2023-08-16 05:58:32 +02:00
Jeremy Gardais 0a65d45c63
Use a real sender for email
See bugzilla 3582
https://bugzilla.ipr.univ-rennes1.fr/show_bug.cgi?id=3582
2023-06-23 11:14:31 +02:00
Jeremy Gardais 87b26d6c6d
Don't need --force option to disable SGE queue
Before restarting the SGE service, "--force" will only prevent SGE from being
re-enabled after unwanted reboots…
2022-08-17 17:22:37 +02:00
Jeremy Gardais 0b0f678837
Manage return of disable function 2022-08-17 17:14:22 +02:00
Jeremy Gardais a4c49e9f85
Accept options 2022-07-11 11:34:47 +02:00
Guillaume Raffy 93d8a5a395 fixed typo 2022-05-09 18:51:52 +02:00
Guillaume Raffy ce4d2af94f reverted last change because it caused more problems than good 2022-05-09 18:50:35 +02:00
Guillaume Raffy 36cbdf6ba8 fixed bug that caused the displayed disk not to be the path that the users see 2022-05-09 18:39:06 +02:00
Guillaume Raffy 9282bd5ac5 made disk-watchdog executable 2022-05-09 18:16:20 +02:00
Guillaume Raffy 9c8b3933d9 improved disk-watchdog : the e-mail now includes the user triggering the script 2022-05-09 18:14:14 +02:00
Guillaume Raffy 3c0c41d142 added a script to send a report on /opt/ipr/cluster/work.global usage when it's full
This is to address https://bugzilla.ipr.univ-rennes1.fr/show_bug.cgi?id=3193 but this script will need to be triggered by cron.daily on work.ipr.univ-rennes1.fr
2022-05-09 17:48:58 +02:00
Jeremy Gardais f2f4bf82aa
Check full path of directories 2022-03-28 14:22:49 +02:00
Jeremy Gardais d50bb47358
New script to list dir without owner 2022-03-28 14:18:47 +02:00
Jeremy Gardais 700d6d2c5c
Exit if SGE is not configured on hosts 2021-12-22 11:03:26 +01:00
Jeremy Gardais 15992d9393
Ignore error message for qconf result 2021-12-22 10:53:33 +01:00
Jeremy Gardais 1cd20da9b6
Use localhost as default master to allow host lookup 2021-12-15 05:59:02 +01:00
Jeremy Gardais 1d1bf88bd6
Get SGE master from config file 2021-12-09 15:33:27 +01:00
Jeremy Gardais c7d3012d0c
Test sge_execd in two steps 2021-11-24 07:36:32 +01:00
Jeremy Gardais 1279e28e83
Disable SGE queue only if sge_execd is absent 2021-11-24 07:35:44 +01:00
Jeremy Gardais 2340ced9b8
Ensure queue is disabled before starting sge_execd 2021-11-23 17:14:48 +01:00
Jeremy Gardais 9d31d4ab02
FORCE_MODE: Don't manage host as a localhost 2021-11-23 17:07:02 +01:00
Jeremy Gardais 4b0ee6da93
Add FORCE_MODE 2021-11-23 16:35:43 +01:00
Jeremy Gardais f56e2b067c
Add a delay after the start of sge_execd 2021-11-18 12:11:04 +01:00
Jeremy Gardais 900cc0611d
Start sge_execd earlier 2021-11-18 11:58:30 +01:00
Jeremy Gardais 123f5afaa2
Also verify that node is a SGE submit host 2021-11-16 10:18:59 +01:00
Jeremy Gardais 350cb2941f
Add TODO: replace curl --silent option 2021-06-09 08:08:27 +02:00
Jeremy Gardais 5f8ffa900f
Add web_mode to get maco's version with curl 2021-06-09 08:06:30 +02:00
Jeremy Gardais 36621b4f62
New mode that checks for updates if the last update failed 2021-05-21 14:27:57 +02:00
Jeremy Gardais 6c23714fee
Fix last exit status
If no urgent is required, the script can reach the end. This doesn't
mean it's an error.
2021-04-20 15:32:30 +02:00
Jeremy Gardais 194d488340
Test earlier if SGE Master is reachable 2021-04-20 07:41:38 +02:00
Jeremy Gardais 903143a00e
Check APT upgrade if SGE queue is already disabled 2021-04-12 09:26:55 +02:00
Jeremy Gardais ecea1311cc
Exit earlier if APT temp file exists 2021-04-12 09:13:00 +02:00
Jeremy Gardais ba359c4f75
Exit if sge_qmaster host is not reachable 2021-03-11 14:50:25 +01:00
Jeremy Gardais fd63db37d2
No longer remove benchmarks! 2021-02-15 12:06:16 +01:00
Jeremy Gardais 3126864255
Use `find … -delete` to remove files matching a pattern 2021-01-28 11:27:19 +01:00
Jeremy Gardais 9617abc968
Test is_sge_host earlier 2021-01-26 20:02:22 +01:00
Jeremy Gardais 259fbd8531
Exit if SGE isn't available on a host 2021-01-26 18:23:01 +01:00
Jeremy Gardais e02b666049
Test if MACO_TMP_FILE exists… 2021-01-06 13:35:52 +01:00
Jeremy Gardais d8c34f6fcc
Clean temp files before running maco_upgrade 2021-01-06 12:12:55 +01:00
Jeremy Gardais 5222bdf3e5
Maco upgrade temp file is now stored in /opt/maco 2021-01-06 12:07:31 +01:00
Jeremy Gardais 85c4950d6b
Check maco status before re-enabling SGE queue 2021-01-06 11:52:24 +01:00
Jeremy Gardais 3795fcbeea
Reboot after successful upgrade 2021-01-06 11:46:19 +01:00
Jeremy Gardais c7093dd177
Re-add APT_TMP_FILE…
Due to a previous unwanted remove…
Otherwise the SGE queue is re-enabled within the hour.
2020-12-29 11:40:13 +01:00
Jeremy Gardais 1595b625cd
Let Maco reboot the host with atd 2020-12-22 15:42:16 +01:00
Jeremy Gardais 036975b8fc
Fix EMPTY_ONLY_MODE definition 2020-12-04 08:36:04 +01:00
Jeremy Gardais 4e87760e43
Add EMPTY_ONLY_MODE (exit if slots not empty) 2020-12-03 12:49:55 +01:00
Jeremy Gardais 06f7dadcd8
Remove too complicated tests (attempts, % used slots) 2020-12-03 12:38:50 +01:00