performance benchmarks used to automate the measurements of the performance of compute nodes of ipr cluster.
Go to file
Guillaume Raffy 011d4eddf9 refactored iprbench to separate ipr benchmark framework from the actual benchmarks
This decoupling allows to write benchmarks as modules that can be used in various situations (from a benchmark job or directly from a user), but this design will allow automatic registering of the benchmark results in a user selectable form (sql database, stdout, etc.)

- separated `hibenchonphysix.py` into `clusterbench.py` (tool to run a benchmark on a cluster) and `hibench.py` (hibridon benchmark module) so that `clusterbench.py` no longer has a knowledge about hibridon.
- there are currently 2 ways to run a bechmark:
  1. as a simple run through `clusterbench-run` command (which will eventually be renamed as iprbench-run since it might be completely independent from the concept of cluster)
  2. as cluster jobs  through `clusterbench-submit` command
- added unit test
- added another benchmark `mamul1` that is used as a unittest because it has 2 benefits over `hibench` benchmark:
   1. it's standalone (no external resources needed)
   2. it's quicker to execute

note: this refactoring work is not complete yet, but the concept  proof is complete (the 2 unittests pass):
- still need to provide the user a way to switch between IpRCluster and DummyCluster(which is only intended to only be used for testing clusterbench))
- still need to run multiple configs of the same benchmark in one run (as hibenchonphysix did)

work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3958] and [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
2024-10-22 09:16:41 +02:00
iprbench refactored iprbench to separate ipr benchmark framework from the actual benchmarks 2024-10-22 09:16:41 +02:00
test refactored iprbench to separate ipr benchmark framework from the actual benchmarks 2024-10-22 09:16:41 +02:00
.gitignore refactored iprbench to separate ipr benchmark framework from the actual benchmarks 2024-10-22 09:16:41 +02:00
README.md refactored iprbench to separate ipr benchmark framework from the actual benchmarks 2024-10-22 09:16:41 +02:00
pyproject.toml refactored iprbench to separate ipr benchmark framework from the actual benchmarks 2024-10-22 09:16:41 +02:00
setup.py turned starbench into an installable package 2024-06-21 14:48:00 +02:00

README.md

iprbenchmark

This example illustrates how starbench is used at IPR (Institut de Physique de Rennes) to measure the performance of hibridon on IPR's cluster (alambix)

usage:

20241007-15:08:10 graffy@graffy-ws2:~/work/starbench/starbench.git$ rsync --exclude .git --exclude starbench.venv --exclude tmp --exclude usecases/ipr/hibench/results  -va ./ graffy@alambix.ipr.univ-rennes.fr:/opt/ipr/cluster/work.global/graffy/starbench.git/
sending incremental file list

sent 1,416 bytes  received 25 bytes  960.67 bytes/sec
total size is 140,225  speedup is 97.31
last command status : [0]

install iprbench

graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ python3 -m venv iprbench.venv
graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ source ./iprbench.venv/bin/activate
(iprbench.venv) graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ pip install ./iprbench.git
Processing ./iprbench.git
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting starbench@ git+https://github.com/g-raffy/starbench
  Cloning https://github.com/g-raffy/starbench to /tmp/user/59825/pip-install-uw5i22k1/starbench_890d53070dec47738060b57fdd29b001
  Running command git clone --filter=blob:none --quiet https://github.com/g-raffy/starbench /tmp/user/59825/pip-install-uw5i22k1/starbench_890d53070dec47738060b57fdd29b001
  Resolved https://github.com/g-raffy/starbench to commit 3ca66d00636ad055506f6b4e2781b498cc7487ac
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting pandas
  Downloading pandas-2.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.1/13.1 MB 23.6 MB/s eta 0:00:00
Collecting matplotlib
  Downloading matplotlib-3.9.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.3/8.3 MB 20.1 MB/s eta 0:00:00
Collecting sqlalchemy
  Downloading SQLAlchemy-2.0.35-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.2/3.2 MB 17.6 MB/s eta 0:00:00
Collecting contourpy>=1.0.1
  Downloading contourpy-1.3.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (323 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 323.2/323.2 kB 3.4 MB/s eta 0:00:00
Collecting cycler>=0.10
  Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
Collecting fonttools>=4.22.0
  Downloading fonttools-4.54.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 20.9 MB/s eta 0:00:00
Collecting kiwisolver>=1.3.1
  Downloading kiwisolver-1.4.7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 8.9 MB/s eta 0:00:00
Collecting numpy>=1.23
  Downloading numpy-2.1.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.3/16.3 MB 19.9 MB/s eta 0:00:00
Collecting packaging>=20.0
  Downloading packaging-24.1-py3-none-any.whl (53 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.0/54.0 kB 1.0 MB/s eta 0:00:00
Collecting pillow>=8
  Downloading pillow-10.4.0-cp311-cp311-manylinux_2_28_x86_64.whl (4.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 20.9 MB/s eta 0:00:00
Collecting pyparsing>=2.3.1
  Downloading pyparsing-3.1.4-py3-none-any.whl (104 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 104.1/104.1 kB 1.3 MB/s eta 0:00:00
Collecting python-dateutil>=2.7
  Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 kB 933.1 kB/s eta 0:00:00
Collecting pytz>=2020.1
  Downloading pytz-2024.2-py2.py3-none-any.whl (508 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 508.0/508.0 kB 3.9 MB/s eta 0:00:00
Collecting tzdata>=2022.7
  Downloading tzdata-2024.2-py2.py3-none-any.whl (346 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 346.6/346.6 kB 2.0 MB/s eta 0:00:00
Collecting typing-extensions>=4.6.0
  Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Collecting greenlet!=0.4.17
  Downloading greenlet-3.1.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (602 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 602.4/602.4 kB 4.9 MB/s eta 0:00:00
Collecting six>=1.5
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Building wheels for collected packages: iprbench, starbench
  Building wheel for iprbench (pyproject.toml) ... done
  Created wheel for iprbench: filename=iprbench-0.0.1-py3-none-any.whl size=19188 sha256=0ece4a9b1b44434c0f033a253aae301a5d53039955f1f6b182c0b833d44c3e93
  Stored in directory: /tmp/user/59825/pip-ephem-wheel-cache-8yw7rwk6/wheels/84/c9/82/e72d13fb7df12a8004ca6383b185a00f9ab3ddd8695e9e6cd8
  Building wheel for starbench (pyproject.toml) ... done
  Created wheel for starbench: filename=starbench-1.0.0-py3-none-any.whl size=9612 sha256=18968f356bb3d6f6c2337b4bbaf709510c50a6a9902c3363f4b568c409846ac0
  Stored in directory: /tmp/user/59825/pip-ephem-wheel-cache-8yw7rwk6/wheels/cf/73/d3/14e4830d3e06c2c3ab71fdf68e0f14b50132ec23eaa6b2aa65
Successfully built iprbench starbench
Installing collected packages: pytz, tzdata, typing-extensions, starbench, six, pyparsing, pillow, packaging, numpy, kiwisolver, greenlet, fonttools, cycler, sqlalchemy, python-dateutil, contourpy, pandas, matplotlib, iprbench
Successfully installed contourpy-1.3.0 cycler-0.12.1 fonttools-4.54.1 greenlet-3.1.1 iprbench-0.0.1 kiwisolver-1.4.7 matplotlib-3.9.2 numpy-2.1.2 packaging-24.1 pandas-2.2.3 pillow-10.4.0 pyparsing-3.1.4 python-dateutil-2.9.0.post0 pytz-2024.2 six-1.16.0 sqlalchemy-2.0.35 starbench-1.0.0 typing-extensions-4.12.2 tzdata-2024.2

run unit tests

20241018-16:56:05 graffy@graffy-ws2:~/work/starbench/iprbench.git$ python3 -m unittest test.test_iprbench
2024-10-18 16:57:42,589 - INFO - test_iprbench_run
creating build directory /tmp/mamul1_out/output/worker<worker_id>
executing the following command in parallel (2 parallel runs) : '['mkdir', '-p', '/tmp/mamul1_out/output/worker<worker_id>/build']'
mean duration : 0.004 s (2 runs)
configuring /home/graffy/work/starbench/iprbench.git/test/mamul1 into /tmp/mamul1_out/output/worker<worker_id>/build ...
executing the following command in parallel (2 parallel runs) : '['/usr/bin/cmake', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_Fortran_COMPILER=gfortran', '/home/graffy/work/starbench/iprbench.git/test/mamul1']'
mean duration : 0.098 s (2 runs)
building /tmp/mamul1_out/output/worker<worker_id>/build ...
executing the following command in parallel (2 parallel runs) : '['make']'
mean duration : 0.073 s (2 runs)
benchmarking /tmp/mamul1_out/output/worker<worker_id>/build ...
executing the following command in parallel (2 parallel runs) : '['./mamul1', '1024', '10']'
mean duration : 0.660 s (2 runs)
duration : 0.660 s
.
----------------------------------------------------------------------
Ran 1 test in 1.035s

OK
last command status : [0]
20241018-16:56:05 graffy@graffy-ws2:~/work/starbench/iprbench.git$ python3 -m unittest test.test_clusterbench

launch a benchmark on the current system

iprbench-run --benchmark-id 'mamul1' --config '{"compiler_id": "gfortran", "matrix_size": 1024, "num_loops":10, "num_cores":2}' --results-dir /tmp/mamul1_out

launch benchmark jobs on alambix cluster

(iprbench.venv) graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ hibenchonphysix --commit-id 53894da48505892bfa05693a52312bacb12c70c9 --results-dir $GLOBAL_WORK_DIR/graffy/hibridon/benchmarks/starbench/hibench/$(date --iso=seconds) --arch-regexp 'intel_xeon_x5650' --cmake-path /usr/bin/cmake
INFO:root:available host groups: dict_keys(['intel_xeon_x5550', 'intel_xeon_x5650', 'intel_xeon_e5-2660', 'intel_xeon_e5-2660v2', 'intel_xeon_e5-2660v4', 'intel_xeon_gold_6140', 'intel_xeon_gold_6154', 'intel_xeon_gold_5220', 'intel_xeon_gold_6226r', 'intel_xeon_gold_6248r', 'intel_xeon_gold_6348', 'amd_epyc_7282', 'amd_epyc_7452'])
INFO:root:requested host groups: ['intel_xeon_x5650']
INFO:root:using test arch4_quick for benchmarking
INFO:root:creating /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/iprbench.venv.tgz (the virtual environment that will be used in this bench by all its jobs at some point)
Collecting virtualenv-clone
  Using cached virtualenv_clone-0.5.7-py3-none-any.whl (6.6 kB)
Installing collected packages: virtualenv-clone
Successfully installed virtualenv-clone-0.5.7
DEBUG:root:command = /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/gfortran/starbench.job "https://github.com/hibridon/hibridon" "g-raffy" "/mnt/home.ipr/graffy/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" "53894da48505892bfa05693a52312bacb12c70c9" "-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DCMAKE_Fortran_COMPILER=gfortran" "ctest --output-on-failure -L ^arch4_quick$" "" "/usr/bin/cmake"
DEBUG:root:qsub_command = qsub -pe smp 12 -l "hostname=alambix50.ipr.univ-rennes.fr" -S /bin/bash -cwd -m ae -l mem_available=1G -j y -N hibench_intel_xeon_x5650_gfortran_53894da48505892bfa05693a52312bacb12c70c9 /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/gfortran/starbench.job "https://github.com/hibridon/hibridon" "g-raffy" "/mnt/home.ipr/graffy/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" "53894da48505892bfa05693a52312bacb12c70c9" "-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DCMAKE_Fortran_COMPILER=gfortran" "ctest --output-on-failure -L ^arch4_quick$" "" "/usr/bin/cmake"
Your job 17357 ("hibench_intel_xeon_x5650_gfortran_53894da48505892bfa05693a52312bacb12c70c9") has been submitted
INFO:root:using test arch4_quick for benchmarking
INFO:root:skipping the creation of /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/iprbench.venv.tgz because it already exists (probably created for other jobs of the same bench)
DEBUG:root:command = /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/ifort/starbench.job "https://github.com/hibridon/hibridon" "g-raffy" "/mnt/home.ipr/graffy/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" "53894da48505892bfa05693a52312bacb12c70c9" "-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DCMAKE_Fortran_COMPILER=ifort -DBLA_VENDOR=Intel10_64lp" "ctest --output-on-failure -L ^arch4_quick$" "module load compilers/ifort/latest" "/usr/bin/cmake"
DEBUG:root:qsub_command = qsub -pe smp 12 -l "hostname=alambix50.ipr.univ-rennes.fr" -S /bin/bash -cwd -m ae -l mem_available=1G -j y -N hibench_intel_xeon_x5650_ifort_53894da48505892bfa05693a52312bacb12c70c9 /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/ifort/starbench.job "https://github.com/hibridon/hibridon" "g-raffy" "/mnt/home.ipr/graffy/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" "53894da48505892bfa05693a52312bacb12c70c9" "-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DCMAKE_Fortran_COMPILER=ifort -DBLA_VENDOR=Intel10_64lp" "ctest --output-on-failure -L ^arch4_quick$" "module load compilers/ifort/latest" "/usr/bin/cmake"
Your job 17358 ("hibench_intel_xeon_x5650_ifort_53894da48505892bfa05693a52312bacb12c70c9") has been submitted

hibenchonphysix script launches two sge jobs for each machine type in alambix cluster:

  • one job that performs a benchmark of hibridon with gfortran compiler
  • one job that performs a benchmark of hibridon with ifort compiler

When the job successfully completes, it puts the results of the benchmark on alambix's global work directory (eg /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/gfortran)

graph the results of benchmarks

showresults is a command line tool that graphs the results after they've been downloaded from the results directory (for example /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00) to the hardcoded (at the moment) path /home/graffy/work/starbench/starbench.git/usecases/ipr/hibench/results

20241010-16:30:54 graffy@graffy-ws2:~/work/starbench/iprbench.git$ showresults