56 Commits

Author SHA1 Message Date
Guillaume Raffy 011d4eddf9 refactored iprbench to separate ipr benchmark framework from the actual benchmarks
This decoupling allows benchmarks to be written as modules that can be used in various situations (from a benchmark job or directly by a user), and this design will also allow automatic registration of the benchmark results in a user-selectable form (sql database, stdout, etc.)

- separated `hibenchonphysix.py` into `clusterbench.py` (tool to run a benchmark on a cluster) and `hibench.py` (hibridon benchmark module) so that `clusterbench.py` no longer has any knowledge of hibridon.
- there are currently 2 ways to run a benchmark:
  1. as a simple run through the `clusterbench-run` command (which will eventually be renamed to iprbench-run since it might be completely independent from the concept of cluster)
  2. as cluster jobs through the `clusterbench-submit` command
- added unit test
- added another benchmark `mamul1` that is used as a unittest because it has 2 benefits over `hibench` benchmark:
   1. it's standalone (no external resources needed)
   2. it's quicker to execute

note: this refactoring work is not complete yet, but the proof of concept is complete (the 2 unittests pass):
- still need to provide the user a way to switch between IpRCluster and DummyCluster (which is only intended to be used for testing clusterbench)
- still need to run multiple configs of the same benchmark in one run (as hibenchonphysix did)

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3958] and [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-22 09:16:41 +02:00
Guillaume Raffy fe4a07a67e refactored all iprbench code found in `usecases/ipr/hibench` into an `iprbench` python package
The main motivation for this is to allow the code executed by jobs to benefit from multiple packages (eg iprbench, [stargemm](https://github.com/g-raffy/starbench), cocluto) to perform common missing tasks such as registering the results output in the iprbench database.

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3958] and [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-10 18:06:09 +02:00
Guillaume Raffy fb6f926cb1 improvements to hibenchonphysix:
- although still hardcoded, made it easier to switch between alambix and physix clusters
- although still hardcoded, made it easier to switch to test mode (quick test)
- removed hardcoded value for all_hosts_groups as it is retrieved from the cluster node database

nb: changes made on 08/10/2024

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-09 15:16:02 +02:00
Guillaume Raffy 3ca66d0063 adapted code to cope with recent change in univ rennes policy that caused github access to require the use of a proxy
As a result, I finally managed to get hibench working on alambix

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-08 17:44:22 +02:00
Guillaume Raffy cf1235f62c added the option --cmake-path to allow the user to choose the cmake executable to use.
As a result, the cmake path is no longer hardcoded (the hardcoded one was not suitable for alambix)

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-08 17:41:18 +02:00
Guillaume Raffy 5d59650e84 added the --arch-regexp option to allow the user to restrict benchmarks to some architectures
As an example, this allowed me to run the benchmark on alambix50 only for testing purposes.

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-08 17:27:05 +02:00
Guillaume Raffy c534d7e135 improved the hosts table to ease adaptation to alambix instead of physix
This hosts description table is still hardcoded, though... at some point it will be better to use external data to make the code more generic (an attempt was made with pandas, but this introduced complexity in the setup so I decided to keep it hardcoded for the moment).

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-08 17:22:26 +02:00
Guillaume Raffy 350785bfee fixed regressions introduced in commit [b8c8a1b0e6]
since starbench is now an installable package, there is no starbench.py script anymore; the existing code that assumed starbench was a simple python script needed to be adapted.

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-08 17:12:22 +02:00
Guillaume Raffy 2ba66a498d fixed bugs introduced in commit [4e0e3b60bc]
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-08 16:58:40 +02:00
Guillaume Raffy f2ceeb2cdb added a mechanism to prevent starbench from hanging in case the executed command fails
I had the case where on_exit() was never called because proc had no value and therefore the attempt to pass proc.pid to on_exit caused an exception before on_exit was called. As a result, the master thread was waiting for its child threads forever, as these children never signaled that they had finished.

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-08 16:46:53 +02:00
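The failure mode described in the commit above (an exception raised before `on_exit` is reached, leaving the waiting thread blocked forever) can be illustrated with a minimal sketch. This is not starbench's actual code: `run_command`, `run_in_thread` and the `on_exit(pid, exit_code)` signature are hypothetical names chosen for the illustration. The idea shown is simply to guard against `proc` being unset and to call the callback from a `finally` clause.

```python
import subprocess
import threading


def run_command(command, on_exit):
    """Run `command`, then call on_exit(pid, exit_code) whatever happens.

    Hypothetical helper: names and signature are assumptions for this sketch.
    """
    proc = None
    exit_code = None
    try:
        proc = subprocess.Popen(command)
        exit_code = proc.wait()
    except OSError:
        # Popen failed (e.g. the executable does not exist): proc is still None
        pass
    finally:
        # never dereference proc blindly: it has no value if Popen failed
        pid = proc.pid if proc is not None else None
        on_exit(pid, exit_code)


def run_in_thread(command, timeout=10.0):
    """Run `command` in a child thread and wait for its completion signal."""
    done = threading.Event()
    result = {}

    def on_exit(pid, exit_code):
        result['pid'] = pid
        result['exit_code'] = exit_code
        done.set()  # tell the waiting thread that this child has finished

    worker = threading.Thread(target=run_command, args=(command, on_exit))
    worker.start()
    finished = done.wait(timeout=timeout)
    worker.join(timeout=timeout)
    return finished, result
```

With this structure, a command that cannot even be started still triggers `on_exit`, so the waiting thread is released instead of blocking indefinitely.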
Guillaume Raffy d71bf3f67f added the --results-dir option
- choosing a different results dir allows running the same benchmark more than once to have more than one measure
- this makes `hibenchonphysix.py` more decoupled from hibench, in the hope that it will be completely unaware of hibench at some point (therefore reusable with other benchmarks)
2024-10-07 12:16:47 +02:00
Guillaume Raffy 49aebf38a5 updated documentation regarding last commit's change (rewriting in python)
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-07 12:06:45 +02:00
Guillaume Raffy 4e0e3b60bc rewrote hibench-on-physix.sh in python
this is preliminary work to add options that make it more generic (eg run something other than hibridon)

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-04 18:27:21 +02:00
Guillaume Raffy 46d7fd1fa7 improved showresults.py:
- added the ability to show the performance per clock cycle (eg to see if avx512 has an effect on the computation)
- the cpus are now sorted on x axis depending on their generation

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-04 18:24:02 +02:00
Guillaume Raffy 1423090328 showresult.py now displays a graphic comparing the performance of hibridon on different cpus
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-01 14:25:56 +02:00
Guillaume Raffy ccb0acd817 improved hibench results parser to retrieve details of configuration (eg mkl version)
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-09-30 17:47:15 +02:00
Guillaume Raffy 1d66fc1edb added a tool to parse benchmark output files to summarize it into a table, so that it can be exploited to create graphs
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-09-27 23:58:14 +02:00
Guillaume Raffy c4732fee87 improved usecase file hierarchy to accommodate multiple use cases 2024-09-27 23:53:24 +02:00
Guillaume Raffy 0693b2c948 refactoring: documented mamul1 source code related to clocks to clarify the measured times 2024-07-01 09:30:26 +02:00
Guillaume Raffy fb7608ecdd removed hardcoded path 2024-06-30 17:16:39 +02:00
Guillaume Raffy de7c1fb2dd fixed styling errors 2024-06-30 17:15:45 +02:00
Guillaume Raffy 249ef1f3e7 refactored by isolating all core functions and classes into a core.py source file 2024-06-30 16:20:34 +02:00
Guillaume Raffy 68cb7169c7 fixed dangerous-default-value / W0102 potential problem 2024-06-30 16:08:10 +02:00
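For readers unfamiliar with pylint's W0102 (dangerous-default-value) warning mentioned in the commit above, here is a minimal sketch of the pitfall and its usual fix. The function names are hypothetical and are not taken from the repository: a mutable default argument is created once, when the function is defined, and is then shared by every call that relies on it.

```python
def add_result_shared(result, results=[]):  # pylint would report W0102 here
    # the default list is created once at definition time, so it keeps
    # growing across calls that do not pass `results` explicitly
    results.append(result)
    return results


def add_result(result, results=None):
    # usual fix: use None as a sentinel and create a fresh list per call
    if results is None:
        results = []
    results.append(result)
    return results
```

Calling `add_result_shared` twice without a second argument returns the same, growing list, whereas `add_result` returns a new list each time.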
Guillaume Raffy c05ff89d29 added a complete standalone usage example (mamul1: multiplication of matrices) in the form of a unit test
note: [matmul] is a copy of [https://github.com/g-raffy/flobe/tree/main/benchmarks/mamul1]
2024-06-30 15:53:03 +02:00
Guillaume Raffy 3dc0d12307 decoupled starbench_cmake_app from git repos information, so that starbench_cmake_app can now be used with any source code provider, not only from git repositories (eg an existing directory tree)
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3878]
2024-06-28 15:07:37 +02:00
Guillaume Raffy b8c8a1b0e6 turned starbench into an installable package
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3878]
2024-06-21 14:48:00 +02:00
Guillaume Raffy a43eb68db5 fixed all styling warnings and comments, and documented the code
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3872] as I'm planning to reuse starbench to add new automatic benchmarks
2024-06-21 08:50:36 +02:00
Guillaume Raffy dc897e9225 fixed styling errors detected by pylint
work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3878]
2024-06-20 11:51:53 +02:00
Guillaume Raffy f87e564528 reorganized files and documented 2022-06-10 08:39:22 +02:00
g-raffy 165da43619 Create README.md 2022-06-09 15:59:52 +02:00
Guillaume Raffy 2c349315cd the job script is now saved to a location that should prevent it from being written by multiple jobs at the same time, thus avoiding race condition issues. 2022-06-09 11:16:36 +02:00
Guillaume Raffy 5bc51c78ca fixed bug that caused starbench.py source code to be included twice (and one of them causing syntax errors) 2022-06-09 11:15:00 +02:00
Guillaume Raffy 43ae0a3068 cleanup and documented 2022-06-09 09:09:32 +02:00
Guillaume Raffy 733fda5517 added mechanism to download benchmark results on work.global on success 2022-06-09 08:58:04 +02:00
Guillaume Raffy 75c4b98be0 improved name of job
- also joined job outputs for easier reading
2022-06-09 08:56:47 +02:00
Guillaume Raffy 58dbfc9be6 added ifort build to hibridon's benchmark 2022-06-09 08:54:50 +02:00
Guillaume Raffy f2b8d6cdb4 now the hibridon benchmark runs one job for each machine type on physix
- also tuned memory requirements so that the `representative_test` succeeds
2022-06-07 19:01:30 +02:00
Guillaume Raffy 936dfa793a made changes needed to get hibridon benchmark running on physix (ipr's cluster):
- added in `starbench` the option to choose which cmake executable to use
- fixed typos in sge environment variables
- added job submit mechanism

With these changes, hibridon's benchmark succeeded on physix48
2022-06-07 14:52:56 +02:00
Guillaume Raffy 6715cd1714 made starbench compatible with python 3.5 + :
- removed type hints for variables (they require python 3.6 +)
- converted `Path` into `str` where Path type is not supported
- use `_ForwardRef` instead of `ForwardRef`

note: python3.5 was chosen as a target because that's the version of python3 on ipr's cluster
2022-06-07 12:44:25 +02:00
Guillaume Raffy fab373f3c1 made `hibench.job` hibridon independent so that:
- `hibench.job` could be reused as is by another project
- it gives more flexibility to lauch-perf-jobs.sh
2022-06-03 16:10:47 +02:00
Guillaume Raffy 176db9f719 now that `hibench.py` has been made generic (it has no hibridon related code anymore), I renamed it as `starbench.py` 2022-06-03 14:13:32 +02:00
Guillaume Raffy 959fba13b3 the test command is no longer hardcoded and can be chosen via command line arguments 2022-06-02 17:52:38 +02:00
Guillaume Raffy 226871547f removed compiler loop as the used compiler can now be defined by the user via cmake defines 2022-06-02 17:34:16 +02:00
Guillaume Raffy 431f728793 the cmake options are no longer hardcoded
This way, it's more flexible and it will allow this code to be independent from hibridon
2022-06-02 15:59:27 +02:00
Guillaume Raffy ac639b3a08 fixed bug that caused the checkout to fail because it was performed from a non-git repository
also removed unneeded hibridon dependency (it's planned to make this bench code work for any git/cmake repository)
2022-06-02 15:58:00 +02:00
Guillaume Raffy 97a04d2831 it's now possible to choose a version of the code to test
note: previously, it didn't work
2022-06-02 14:34:14 +02:00
Guillaume Raffy 05ec8a5181 made `hibench.py` (better name than `starbench.py`) parse options so that it is more generic.
This change will allow jobs to perform hibridon benchmarks
2022-06-02 13:01:16 +02:00
Guillaume Raffy c65cd60a54 measure_hibridon_perf now takes the name of the test as a parameter to make it more generic 2022-06-01 16:17:10 +02:00
Guillaume Raffy ddcb9f1175 now the number of threads per run is restricted to what the user asks 2022-06-01 12:06:58 +02:00
Guillaume Raffy 0ecddcf8f8 minor flake fix 2022-06-01 12:06:10 +02:00