Refactored all iprbench code found in `usecases/ipr/hibench` into an `iprbench` Python package
The main motivation is to let the code executed by jobs use several packages (e.g. iprbench, [starbench](https://github.com/g-raffy/starbench), cocluto) to perform common tasks that are currently missing, such as registering benchmark results in the iprbench database. Work related to [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3958] and [https://bugzilla.ipr.univ-rennes.fr/show_bug.cgi?id=3372]
parent fb6f926cb1
commit fe4a07a67e
|
@ -1,6 +1,3 @@
|
||||||
**/__pycache__/
|
iprbench.venv/
|
||||||
dist/
|
results/
|
||||||
src/starbench.egg-info/
|
iprbench/__pycache__/__init__.cpython-38.pyc
|
||||||
tmp/
|
|
||||||
starbench.venv/
|
|
||||||
usecases/ipr/hibench/results/
|
|
||||||
|
|
158
README.md
|
@ -1,53 +1,129 @@
|
||||||
# starbench
|
# iprbenchmark
|
||||||
a tool to benchmark a git cmake application using embarrassingly parallel runs
|
|
||||||
|
|
||||||
`starbench` is a tool designed to build and test the performance of an application versioned in a `git` repository and using the `cmake` build system.
|
This example illustrates how `starbench` is used at IPR (Institut de Physique de Rennes) to measure the performance of [hibridon](https://github.com/hibridon/hibridon) on IPR's cluster (`alambix`)
|
||||||
|
|
||||||
In order to measure the performance of the code in an *hpc* (high performance computing) environment, `starbench` is designed to keep all the cores busy. For this, it uses the same technique as `hpl`'s `stardgemm` test (that's where the 'star' prefix comes from): the same code is run on each `CPU` core. This way, the performance measurements are expected to be more realistic, as the cores won't benefit from the unrealistic boost provided by the memory cache of unused cores.
|
|
||||||
|
|
||||||
If the user provides:
|
usage:
|
||||||
- the `url` of the repository
|
|
||||||
- the commit number of the version to test
|
|
||||||
- the number of cores the benchmark should use (usually the number of cores of the machine that executes the benchmark)
|
|
||||||
- the benchmark command to use
|
|
||||||
|
|
||||||
then `starbench` will do the rest:
|
|
||||||
1. clone the repository to a temporary location
|
|
||||||
2. checkout the requested version
|
|
||||||
3. configure the build
|
|
||||||
4. build the code
|
|
||||||
5. run the benchmark command for each core
|
|
||||||
6. output the average duration of the benchmark
|
|
||||||
|
|
||||||
## example
|
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
bob@bob-ws2:~/work/starbench$ python3 -m venv ./starbench.venv
|
20241007-15:08:10 graffy@graffy-ws2:~/work/starbench/starbench.git$ rsync --exclude .git --exclude starbench.venv --exclude tmp --exclude usecases/ipr/hibench/results -va ./ graffy@alambix.ipr.univ-rennes.fr:/opt/ipr/cluster/work.global/graffy/starbench.git/
|
||||||
bob@bob-ws2:~/work/starbench$ source ./starbench.venv/bin/activate
|
sending incremental file list
|
||||||
bob@bob-ws2:~/work/starbench$ pip install wheel
|
|
||||||
Collecting wheel
|
sent 1,416 bytes received 25 bytes 960.67 bytes/sec
|
||||||
Using cached wheel-0.43.0-py3-none-any.whl (65 kB)
|
total size is 140,225 speedup is 97.31
|
||||||
Installing collected packages: wheel
|
last command status : [0]
|
||||||
Successfully installed wheel-0.43.0
|
```
|
||||||
bob@bob-ws2:~/work/starbench$ pip install ./starbench.git
|
## install iprbench
|
||||||
Processing ./starbench.git
|
|
||||||
|
```sh
|
||||||
|
graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ python3 -m venv iprbench.venv
|
||||||
|
graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ source ./iprbench.venv/bin/activate
|
||||||
|
(iprbench.venv) graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ pip install ./iprbench.git
|
||||||
|
Processing ./iprbench.git
|
||||||
Installing build dependencies ... done
|
Installing build dependencies ... done
|
||||||
WARNING: Missing build requirements in pyproject.toml for file:///home/bob/work/starbench/starbench.git.
|
|
||||||
WARNING: The project does not specify a build backend, and pip cannot fall back to setuptools without 'wheel'.
|
|
||||||
Getting requirements to build wheel ... done
|
Getting requirements to build wheel ... done
|
||||||
Preparing wheel metadata ... done
|
Preparing metadata (pyproject.toml) ... done
|
||||||
Building wheels for collected packages: starbench
|
Collecting starbench@ git+https://github.com/g-raffy/starbench
|
||||||
Building wheel for starbench (PEP 517) ... done
|
Cloning https://github.com/g-raffy/starbench to /tmp/user/59825/pip-install-uw5i22k1/starbench_890d53070dec47738060b57fdd29b001
|
||||||
Created wheel for starbench: filename=starbench-1.0.0-py3-none-any.whl size=8011 sha256=a98c590fbc481722aed3512ae6345cce741615a17c24e67dc88070f85b616c4c
|
Running command git clone --filter=blob:none --quiet https://github.com/g-raffy/starbench /tmp/user/59825/pip-install-uw5i22k1/starbench_890d53070dec47738060b57fdd29b001
|
||||||
Stored in directory: /tmp/pip-ephem-wheel-cache-m_0xpm10/wheels/67/41/37/debf4c9251b719f84456398e144dffaa34d18ab336b529dc53
|
Resolved https://github.com/g-raffy/starbench to commit 3ca66d00636ad055506f6b4e2781b498cc7487ac
|
||||||
Successfully built starbench
|
Installing build dependencies ... done
|
||||||
Installing collected packages: starbench
|
Getting requirements to build wheel ... done
|
||||||
Successfully installed starbench-1.0.0
|
Preparing metadata (pyproject.toml) ... done
|
||||||
bob@bob-ws2:~/work/starbench$ starbench --git-repos-url https://github.com/hibridon/hibridon --code-version a3bed1c3ccfbca572003020d3e3d3b1ff3934fad --git-user g-raffy --git-pass-file "$HOME/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" --num-cores 2 --output-dir=/tmp/hibench --cmake-path=/opt/cmake/cmake-3.23.0/bin/cmake --cmake-option=-DCMAKE_BUILD_TYPE=Release --cmake-option=-DBUILD_TESTING=ON --benchmark-command='ctest --output-on-failure -L ^arch4_quick$'
|
Collecting pandas
|
||||||
|
Downloading pandas-2.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.1 MB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.1/13.1 MB 23.6 MB/s eta 0:00:00
|
||||||
|
Collecting matplotlib
|
||||||
|
Downloading matplotlib-3.9.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.3 MB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.3/8.3 MB 20.1 MB/s eta 0:00:00
|
||||||
|
Collecting sqlalchemy
|
||||||
|
Downloading SQLAlchemy-2.0.35-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.2/3.2 MB 17.6 MB/s eta 0:00:00
|
||||||
|
Collecting contourpy>=1.0.1
|
||||||
|
Downloading contourpy-1.3.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (323 kB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 323.2/323.2 kB 3.4 MB/s eta 0:00:00
|
||||||
|
Collecting cycler>=0.10
|
||||||
|
Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
|
||||||
|
Collecting fonttools>=4.22.0
|
||||||
|
Downloading fonttools-4.54.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.9 MB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 20.9 MB/s eta 0:00:00
|
||||||
|
Collecting kiwisolver>=1.3.1
|
||||||
|
Downloading kiwisolver-1.4.7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 8.9 MB/s eta 0:00:00
|
||||||
|
Collecting numpy>=1.23
|
||||||
|
Downloading numpy-2.1.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.3 MB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.3/16.3 MB 19.9 MB/s eta 0:00:00
|
||||||
|
Collecting packaging>=20.0
|
||||||
|
Downloading packaging-24.1-py3-none-any.whl (53 kB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.0/54.0 kB 1.0 MB/s eta 0:00:00
|
||||||
|
Collecting pillow>=8
|
||||||
|
Downloading pillow-10.4.0-cp311-cp311-manylinux_2_28_x86_64.whl (4.5 MB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 20.9 MB/s eta 0:00:00
|
||||||
|
Collecting pyparsing>=2.3.1
|
||||||
|
Downloading pyparsing-3.1.4-py3-none-any.whl (104 kB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 104.1/104.1 kB 1.3 MB/s eta 0:00:00
|
||||||
|
Collecting python-dateutil>=2.7
|
||||||
|
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 kB 933.1 kB/s eta 0:00:00
|
||||||
|
Collecting pytz>=2020.1
|
||||||
|
Downloading pytz-2024.2-py2.py3-none-any.whl (508 kB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 508.0/508.0 kB 3.9 MB/s eta 0:00:00
|
||||||
|
Collecting tzdata>=2022.7
|
||||||
|
Downloading tzdata-2024.2-py2.py3-none-any.whl (346 kB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 346.6/346.6 kB 2.0 MB/s eta 0:00:00
|
||||||
|
Collecting typing-extensions>=4.6.0
|
||||||
|
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
|
||||||
|
Collecting greenlet!=0.4.17
|
||||||
|
Downloading greenlet-3.1.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (602 kB)
|
||||||
|
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 602.4/602.4 kB 4.9 MB/s eta 0:00:00
|
||||||
|
Collecting six>=1.5
|
||||||
|
Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
|
||||||
|
Building wheels for collected packages: iprbench, starbench
|
||||||
|
Building wheel for iprbench (pyproject.toml) ... done
|
||||||
|
Created wheel for iprbench: filename=iprbench-0.0.1-py3-none-any.whl size=19188 sha256=0ece4a9b1b44434c0f033a253aae301a5d53039955f1f6b182c0b833d44c3e93
|
||||||
|
Stored in directory: /tmp/user/59825/pip-ephem-wheel-cache-8yw7rwk6/wheels/84/c9/82/e72d13fb7df12a8004ca6383b185a00f9ab3ddd8695e9e6cd8
|
||||||
|
Building wheel for starbench (pyproject.toml) ... done
|
||||||
|
Created wheel for starbench: filename=starbench-1.0.0-py3-none-any.whl size=9612 sha256=18968f356bb3d6f6c2337b4bbaf709510c50a6a9902c3363f4b568c409846ac0
|
||||||
|
Stored in directory: /tmp/user/59825/pip-ephem-wheel-cache-8yw7rwk6/wheels/cf/73/d3/14e4830d3e06c2c3ab71fdf68e0f14b50132ec23eaa6b2aa65
|
||||||
|
Successfully built iprbench starbench
|
||||||
|
Installing collected packages: pytz, tzdata, typing-extensions, starbench, six, pyparsing, pillow, packaging, numpy, kiwisolver, greenlet, fonttools, cycler, sqlalchemy, python-dateutil, contourpy, pandas, matplotlib, iprbench
|
||||||
|
Successfully installed contourpy-1.3.0 cycler-0.12.1 fonttools-4.54.1 greenlet-3.1.1 iprbench-0.0.1 kiwisolver-1.4.7 matplotlib-3.9.2 numpy-2.1.2 packaging-24.1 pandas-2.2.3 pillow-10.4.0 pyparsing-3.1.4 python-dateutil-2.9.0.post0 pytz-2024.2 six-1.16.0 sqlalchemy-2.0.35 starbench-1.0.0 typing-extensions-4.12.2 tzdata-2024.2
|
||||||
```
|
```
|
||||||
|
|
||||||
## how to test
|
## launch benchmark jobs on alambix cluster
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
(starbench.venv) graffy@graffy-ws2:~/work/starbench/starbench.git$ python3 -m unittest test.test_starbench
|
(iprbench.venv) graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ hibenchonphysix --commit-id 53894da48505892bfa05693a52312bacb12c70c9 --results-dir $GLOBAL_WORK_DIR/graffy/hibridon/benchmarks/starbench/hibench/$(date --iso=seconds) --arch-regexp 'intel_xeon_x5650' --cmake-path /usr/bin/cmake
|
||||||
|
INFO:root:available host groups: dict_keys(['intel_xeon_x5550', 'intel_xeon_x5650', 'intel_xeon_e5-2660', 'intel_xeon_e5-2660v2', 'intel_xeon_e5-2660v4', 'intel_xeon_gold_6140', 'intel_xeon_gold_6154', 'intel_xeon_gold_5220', 'intel_xeon_gold_6226r', 'intel_xeon_gold_6248r', 'intel_xeon_gold_6348', 'amd_epyc_7282', 'amd_epyc_7452'])
|
||||||
|
INFO:root:requested host groups: ['intel_xeon_x5650']
|
||||||
|
INFO:root:using test arch4_quick for benchmarking
|
||||||
|
INFO:root:creating /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/iprbench.venv.tgz (the virtual environment that will be used in this bench by all its jobs at some point)
|
||||||
|
Collecting virtualenv-clone
|
||||||
|
Using cached virtualenv_clone-0.5.7-py3-none-any.whl (6.6 kB)
|
||||||
|
Installing collected packages: virtualenv-clone
|
||||||
|
Successfully installed virtualenv-clone-0.5.7
|
||||||
|
DEBUG:root:command = /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/gfortran/starbench.job "https://github.com/hibridon/hibridon" "g-raffy" "/mnt/home.ipr/graffy/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" "53894da48505892bfa05693a52312bacb12c70c9" "-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DCMAKE_Fortran_COMPILER=gfortran" "ctest --output-on-failure -L ^arch4_quick$" "" "/usr/bin/cmake"
|
||||||
|
DEBUG:root:qsub_command = qsub -pe smp 12 -l "hostname=alambix50.ipr.univ-rennes.fr" -S /bin/bash -cwd -m ae -l mem_available=1G -j y -N hibench_intel_xeon_x5650_gfortran_53894da48505892bfa05693a52312bacb12c70c9 /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/gfortran/starbench.job "https://github.com/hibridon/hibridon" "g-raffy" "/mnt/home.ipr/graffy/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" "53894da48505892bfa05693a52312bacb12c70c9" "-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DCMAKE_Fortran_COMPILER=gfortran" "ctest --output-on-failure -L ^arch4_quick$" "" "/usr/bin/cmake"
|
||||||
|
Your job 17357 ("hibench_intel_xeon_x5650_gfortran_53894da48505892bfa05693a52312bacb12c70c9") has been submitted
|
||||||
|
INFO:root:using test arch4_quick for benchmarking
|
||||||
|
INFO:root:skipping the creation of /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/iprbench.venv.tgz because it already exists (probably created for other jobs of the same bench)
|
||||||
|
DEBUG:root:command = /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/ifort/starbench.job "https://github.com/hibridon/hibridon" "g-raffy" "/mnt/home.ipr/graffy/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" "53894da48505892bfa05693a52312bacb12c70c9" "-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DCMAKE_Fortran_COMPILER=ifort -DBLA_VENDOR=Intel10_64lp" "ctest --output-on-failure -L ^arch4_quick$" "module load compilers/ifort/latest" "/usr/bin/cmake"
|
||||||
|
DEBUG:root:qsub_command = qsub -pe smp 12 -l "hostname=alambix50.ipr.univ-rennes.fr" -S /bin/bash -cwd -m ae -l mem_available=1G -j y -N hibench_intel_xeon_x5650_ifort_53894da48505892bfa05693a52312bacb12c70c9 /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/ifort/starbench.job "https://github.com/hibridon/hibridon" "g-raffy" "/mnt/home.ipr/graffy/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" "53894da48505892bfa05693a52312bacb12c70c9" "-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DCMAKE_Fortran_COMPILER=ifort -DBLA_VENDOR=Intel10_64lp" "ctest --output-on-failure -L ^arch4_quick$" "module load compilers/ifort/latest" "/usr/bin/cmake"
|
||||||
|
Your job 17358 ("hibench_intel_xeon_x5650_ifort_53894da48505892bfa05693a52312bacb12c70c9") has been submitted
|
||||||
```
|
```
|
||||||
|
|
||||||
|
The `hibenchonphysix` script launches two `sge` jobs for each machine type in the `alambix` cluster:
|
||||||
|
- one job that performs a benchmark of hibridon with `gfortran` compiler
|
||||||
|
- one job that performs a benchmark of hibridon with `ifort` compiler
|
||||||
|
|
||||||
|
When a job completes successfully, it stores the results of the benchmark in `alambix`'s global work directory (e.g. `/opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00/53894da48505892bfa05693a52312bacb12c70c9/arch4_quick/intel_xeon_x5650/gfortran`).
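The layout of the results tree visible in the example path can be sketched as follows. This is a minimal illustration, not code from `iprbench`: the names `job_results_dir` and `enumerate_job_dirs` are hypothetical, and only the `<results-dir>/<commit-id>/<test>/<host-group>/<compiler>` layout and the two compilers come from the log output shown in this README.

```python
from pathlib import Path
from typing import Iterable, Iterator


def job_results_dir(results_dir: Path, commit_id: str, test_id: str,
                    host_group_id: str, compiler_id: str) -> Path:
    """Returns the directory in which one benchmark job is expected to store its results.

    Hypothetical helper: it only mirrors the directory layout seen in the logs.
    """
    return results_dir / commit_id / test_id / host_group_id / compiler_id


def enumerate_job_dirs(results_dir: Path, commit_id: str, test_id: str,
                       host_group_ids: Iterable[str],
                       compiler_ids: Iterable[str] = ('gfortran', 'ifort')) -> Iterator[Path]:
    """Yields one results directory per (host group, compiler) pair, ie one per job."""
    compiler_ids = list(compiler_ids)
    for host_group_id in host_group_ids:
        for compiler_id in compiler_ids:
            yield job_results_dir(results_dir, commit_id, test_id, host_group_id, compiler_id)


if __name__ == '__main__':
    bench_root = Path('/opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00')
    commit = '53894da48505892bfa05693a52312bacb12c70c9'
    for job_dir in enumerate_job_dirs(bench_root, commit, 'arch4_quick', ['intel_xeon_x5650']):
        print(job_dir)
```

With a single host group, the loop yields two directories, one for the `gfortran` job and one for the `ifort` job.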
|
||||||
|
|
||||||
|
|
||||||
|
## graph the results of benchmarks
|
||||||
|
|
||||||
|
`showresults` is a command line tool that graphs the results once they have been downloaded from the results directory (for example `/opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-10T12:11:44+02:00`) to the currently hardcoded path `/home/graffy/work/starbench/starbench.git/usecases/ipr/hibench/results`
|
||||||
|
|
||||||
|
```sh
|
||||||
|
20241010-16:30:54 graffy@graffy-ws2:~/work/starbench/iprbench.git$ showresults
|
||||||
|
```
|
||||||
|
|
||||||
|
|
|
@ -2,13 +2,14 @@
|
||||||
# this script launches jobs to run hibridon benchmarks on physix cluster for the given version of hibridon (commit number)
|
# this script launches jobs to run hibridon benchmarks on physix cluster for the given version of hibridon (commit number)
|
||||||
from typing import List, Tuple, Dict
|
from typing import List, Tuple, Dict
|
||||||
from argparse import ArgumentParser
|
from argparse import ArgumentParser
|
||||||
import os
|
|
||||||
from os import getenv, makedirs
|
from os import getenv, makedirs
|
||||||
import shutil
|
import shutil
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
import subprocess
|
import subprocess
|
||||||
import re
|
import re
|
||||||
import logging
|
import logging
|
||||||
|
import importlib.resources
|
||||||
|
import venv
|
||||||
|
|
||||||
|
|
||||||
HostFqdn = str # eg 'physix90.ipr.univ-rennes1.fr'
|
HostFqdn = str # eg 'physix90.ipr.univ-rennes1.fr'
|
||||||
|
@ -168,6 +169,25 @@ class ClusterNodeDb:
|
||||||
return (hosts, num_cores)
|
return (hosts, num_cores)
|
||||||
|
|
||||||
|
|
||||||
|
def duplicate_this_virtualenv_to(duplicate_virtualenv_path: Path):
|
||||||
|
this_virtualenv_path = Path(getenv('VIRTUAL_ENV')) # eg /home/graffy/work/starbench/iprbench.git/iprbench.venv
|
||||||
|
assert this_virtualenv_path.exists(), f'failed to find the root of the virtual environment in use (VIRTUAL_ENV environment variable has the value {this_virtualenv_path})'
|
||||||
|
|
||||||
|
if duplicate_virtualenv_path.exists():
|
||||||
|
shutil.rmtree(duplicate_virtualenv_path)
|
||||||
|
|
||||||
|
cloner_virtualenv_path = Path('/tmp/venv_cloner.venv')
|
||||||
|
venv.create(cloner_virtualenv_path, with_pip=True)
|
||||||
|
subprocess.run(f'source {cloner_virtualenv_path}/bin/activate; pip install virtualenv-clone', shell=True, check=True, executable='/bin/bash')
|
||||||
|
subprocess.run(f'source {cloner_virtualenv_path}/bin/activate; virtualenv-clone {this_virtualenv_path} {duplicate_virtualenv_path}', shell=True, check=True, executable='/bin/bash')
|
||||||
|
shutil.rmtree(cloner_virtualenv_path)
|
||||||
|
|
||||||
|
|
||||||
|
def archive_this_virtualenv_to(venv_archive_path: Path, venv_hardcoded_path: Path):
|
||||||
|
duplicate_this_virtualenv_to(venv_hardcoded_path)
|
||||||
|
subprocess.run(f'tar czvf {venv_archive_path} {venv_hardcoded_path.relative_to(venv_hardcoded_path.parent)}', shell=True, check=True, cwd=venv_hardcoded_path.parent, stdout=subprocess.DEVNULL)
|
||||||
|
|
||||||
|
|
||||||
def launch_job_for_host_group(hibridon_version: GitCommitTag, host_group_id: HostGroupId, results_dir: Path, compiler_id: CompilerId, cmake_path: str):
|
def launch_job_for_host_group(hibridon_version: GitCommitTag, host_group_id: HostGroupId, results_dir: Path, compiler_id: CompilerId, cmake_path: str):
|
||||||
|
|
||||||
cluster_db = ClusterNodeDb()
|
cluster_db = ClusterNodeDb()
|
||||||
|
@ -179,7 +199,7 @@ def launch_job_for_host_group(hibridon_version: GitCommitTag, host_group_id: Hos
|
||||||
|
|
||||||
quick_test = 'arch4_quick' # about 2s on a core i5 8th generation
|
quick_test = 'arch4_quick' # about 2s on a core i5 8th generation
|
||||||
representative_test = 'nh3h2_qma_long' # about 10min on a core i5 8th generation
|
representative_test = 'nh3h2_qma_long' # about 10min on a core i5 8th generation
|
||||||
use_test_mode = False
|
use_test_mode = True
|
||||||
if use_test_mode:
|
if use_test_mode:
|
||||||
benchmark_test = quick_test
|
benchmark_test = quick_test
|
||||||
else:
|
else:
|
||||||
|
@ -220,24 +240,30 @@ def launch_job_for_host_group(hibridon_version: GitCommitTag, host_group_id: Hos
|
||||||
makedirs(this_bench_dir, exist_ok=True)
|
makedirs(this_bench_dir, exist_ok=True)
|
||||||
|
|
||||||
starbench_job_path = this_bench_dir / 'starbench.job'
|
starbench_job_path = this_bench_dir / 'starbench.job'
|
||||||
this_file_path = Path(os.path.realpath(__file__))
|
|
||||||
scripts_dir = this_file_path.parent
|
|
||||||
starbench_root_path = scripts_dir.parent.parent.parent # TODO: beurk
|
|
||||||
|
|
||||||
# create a copy of stargemm for use by the jobs (so that starbench_root_path can be modified without affecting the jobs)
|
job_venv_archive_path = results_dir / 'iprbench.venv.tgz'
|
||||||
jobs_starbench_dir = results_dir / 'starbench' # the location of starbench source code for use by the jobs run by this command
|
iprbench_venv_hardcoded_path = Path('/tmp') / 'iprbench.venv'
|
||||||
shutil.copytree(starbench_root_path, jobs_starbench_dir, dirs_exist_ok=True)
|
if job_venv_archive_path.exists():
|
||||||
|
logging.info('skipping the creation of %s because it already exists (probably created for other jobs of the same bench)', job_venv_archive_path)
|
||||||
|
else:
|
||||||
|
# freeze this virtualenv so that all jobs related to this benchmark will use the same version of iprbench
|
||||||
|
logging.info('creating %s (the virtual environment that will be used in this bench by all its jobs at some point)', job_venv_archive_path)
|
||||||
|
archive_this_virtualenv_to(job_venv_archive_path, iprbench_venv_hardcoded_path)
|
||||||
|
|
||||||
# create the job file (which embeds starbench.py)
|
# create the job file (which embeds starbench.py)
|
||||||
tags_dict = {
|
tags_dict = {
|
||||||
# '<include:starbench.py>': scripts_dir / 'starbench.py',
|
# '<include:starbench.py>': scripts_dir / 'starbench.py',
|
||||||
'<starbench_job_path>': str(starbench_job_path)
|
'<starbench_job_path>': str(starbench_job_path),
|
||||||
|
'<iprbench_venv_hardcoded_path>': str(iprbench_venv_hardcoded_path),
|
||||||
|
'<iprbench_venv_archive_path>': str(job_venv_archive_path)
|
||||||
}
|
}
|
||||||
substitute_tags(input_file_path=scripts_dir / 'starbench-template.job', tags_dict=tags_dict, output_file_path=starbench_job_path)
|
with importlib.resources.path('iprbench.resources', 'starbench-template.job') as job_template_path:
|
||||||
|
# job_template_path = importlib.resources.files('iprbench.resources') / 'hibench' / 'starbench-template.job'
|
||||||
|
substitute_tags(input_file_path=job_template_path, tags_dict=tags_dict, output_file_path=starbench_job_path)
|
||||||
subprocess.run(['chmod', 'a+x', starbench_job_path], check=True)
|
subprocess.run(['chmod', 'a+x', starbench_job_path], check=True)
|
||||||
|
|
||||||
command = f'{starbench_job_path} "{git_repos_url}" "{git_user}" "{git_pass_file}" "{hibridon_version}" "{" ".join(cmake_options)}" "{benchmark_command}" "{env_vars_bash_commands}" "{starbench_root_path}" "{cmake_path}"'
|
command = f'{starbench_job_path} "{git_repos_url}" "{git_user}" "{git_pass_file}" "{hibridon_version}" "{" ".join(cmake_options)}" "{benchmark_command}" "{env_vars_bash_commands}" "{cmake_path}"'
|
||||||
print(f'command = {command}')
|
logging.debug('command = %s', command)
|
||||||
|
|
||||||
qsub_command = 'qsub'
|
qsub_command = 'qsub'
|
||||||
qsub_command += f' -pe smp {num_cores}'
|
qsub_command += f' -pe smp {num_cores}'
|
||||||
|
@ -249,7 +275,7 @@ def launch_job_for_host_group(hibridon_version: GitCommitTag, host_group_id: Hos
|
||||||
qsub_command += ' -j y' # merge stderr file into stdout file for easier reading of history of events
|
qsub_command += ' -j y' # merge stderr file into stdout file for easier reading of history of events
|
||||||
qsub_command += f' -N hibench_{host_group_id}_{compiler_id}_{hibridon_version}'
|
qsub_command += f' -N hibench_{host_group_id}_{compiler_id}_{hibridon_version}'
|
||||||
qsub_command += f' {command}'
|
qsub_command += f' {command}'
|
||||||
print(f'qsub_command = {qsub_command}')
|
logging.debug('qsub_command = %s', qsub_command)
|
||||||
|
|
||||||
subprocess.run(qsub_command, cwd=this_bench_dir, check=True, shell=True)
|
subprocess.run(qsub_command, cwd=this_bench_dir, check=True, shell=True)
|
||||||
|
|
||||||
|
@ -268,9 +294,9 @@ def launch_perf_jobs(hibridon_version: GitCommitTag, results_dir: Path, arch_reg
|
||||||
cluster_db = ClusterNodeDb()
|
cluster_db = ClusterNodeDb()
|
||||||
all_host_groups = cluster_db.cpu_defs.keys()
|
all_host_groups = cluster_db.cpu_defs.keys()
|
||||||
|
|
||||||
print(f'available host groups: {all_host_groups}')
|
logging.info('available host groups: %s', all_host_groups)
|
||||||
host_groups = [host_group for host_group in all_host_groups if re.match(arch_regexp, host_group) is not None]
|
host_groups = [host_group for host_group in all_host_groups if re.match(arch_regexp, host_group) is not None]
|
||||||
print(f'requested host groups: {host_groups}')
|
logging.info('requested host groups: %s', host_groups)
|
||||||
|
|
||||||
for compiler in compilers:
|
for compiler in compilers:
|
||||||
for host_group in host_groups:
|
for host_group in host_groups:
|
||||||
|
@ -290,6 +316,7 @@ def path_is_reachable_by_compute_nodes(path: Path):
|
||||||
|
|
||||||
|
|
||||||
def main():
|
def main():
|
||||||
|
logging.basicConfig(level=logging.DEBUG)
|
||||||
arg_parser = ArgumentParser(description='launches hibridon benchmark jobs on IPR\'s physix cluster', epilog='example:\n --commit-id a3bed1c3ccfbca572003020d3e3d3b1ff3934fad')
|
arg_parser = ArgumentParser(description='launches hibridon benchmark jobs on IPR\'s physix cluster', epilog='example:\n --commit-id a3bed1c3ccfbca572003020d3e3d3b1ff3934fad')
|
||||||
arg_parser.add_argument('--commit-id', type=str, required=True, help='the commit id of the version of code to benchmark')
|
arg_parser.add_argument('--commit-id', type=str, required=True, help='the commit id of the version of code to benchmark')
|
||||||
arg_parser.add_argument('--results-dir', type=Path, required=True, help='the root directory of the tree where the results of the benchmarks are stored (eg $GLOBAL_WORK_DIR/graffy/benchmarks/hibench)')
|
arg_parser.add_argument('--results-dir', type=Path, required=True, help='the root directory of the tree where the results of the benchmarks are stored (eg $GLOBAL_WORK_DIR/graffy/benchmarks/hibench)')
|
||||||
|
@ -311,6 +338,3 @@ def main():
|
||||||
raise ValueError(f'the results path is expected to be on a disk that is accessible to all cluster nodes, and it doesn\'t seem to be the case for {results_dir}')
|
raise ValueError(f'the results path is expected to be on a disk that is accessible to all cluster nodes, and it doesn\'t seem to be the case for {results_dir}')
|
||||||
|
|
||||||
launch_perf_jobs(hibridon_version, results_dir, arch_regexp, cmake_path)
|
launch_perf_jobs(hibridon_version, results_dir, arch_regexp, cmake_path)
|
||||||
|
|
||||||
|
|
||||||
main()
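The archive created by `archive_this_virtualenv_to` is built relative to the parent of the hardcoded path, so that a job can unpack it under the same parent and get the environment back at the same absolute location. This presumably matters because a virtual environment records its own absolute path (for instance in `bin/activate`). The round trip can be sketched with the standard `tarfile` module instead of the `tar` command; the function names `archive_dir` and `unarchive_to` are hypothetical, and this is an illustration under those assumptions rather than the code of this commit.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_dir(src_dir: Path, archive_path: Path) -> None:
    """Archives src_dir so that member names start with src_dir's own name, not its full path."""
    with tarfile.open(archive_path, 'w:gz') as archive:
        archive.add(src_dir, arcname=src_dir.name)


def unarchive_to(archive_path: Path, parent_dir: Path) -> None:
    """Unpacks the archive in parent_dir, recreating parent_dir/<archived directory name>."""
    with tarfile.open(archive_path, 'r:gz') as archive:
        archive.extractall(parent_dir)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # a stand-in for a real virtual environment
        fake_venv = root / 'frontal' / 'iprbench.venv'
        (fake_venv / 'bin').mkdir(parents=True)
        (fake_venv / 'bin' / 'activate').write_text('# activation script\n')

        archive_path = root / 'iprbench.venv.tgz'
        archive_dir(fake_venv, archive_path)

        # a stand-in for the directory used on the compute node
        node_parent = root / 'node'
        node_parent.mkdir()
        unarchive_to(archive_path, node_parent)
        print((node_parent / 'iprbench.venv' / 'bin' / 'activate').exists())
```

Storing member names relative to the parent directory is what lets the job script choose where to unpack, here with `parent_dir`, in the job template with `pushd` before `tar xzvf`.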
|
|
|
@ -397,5 +397,3 @@ def main():
|
||||||
|
|
||||||
create_graphs(engine)
|
create_graphs(engine)
|
||||||
|
|
||||||
|
|
||||||
main()
|
|
|
@ -0,0 +1 @@
|
||||||
|
__version__ = '0.0.1'
|
|
@ -7,8 +7,7 @@ code_version="$4" # git branch id or commit id eg : 'a3bed1c3ccfbca572003020d3e
|
||||||
cmake_options="$5" # eg '-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON'
|
cmake_options="$5" # eg '-DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON'
|
||||||
benchmark_command="$6" # eg 'ctest -L ^arch4_quick$'
|
benchmark_command="$6" # eg 'ctest -L ^arch4_quick$'
|
||||||
env_vars_bash_commands="$7" # defines extra environment variables prior to launching starbench. eg "export MKLROOT=/opt/intel/compilers_and_libraries_2020.1.217/linux/mkl"
|
env_vars_bash_commands="$7" # defines extra environment variables prior to launching starbench. eg "export MKLROOT=/opt/intel/compilers_and_libraries_2020.1.217/linux/mkl"
|
||||||
starbench_src_url="$8" # location of starbench source (eg /opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibench/2024-10-08T16:39:52+02:00/starbench)
|
cmake_path="$8" # eg '/opt/cmake/cmake-3.23.0/bin/cmake'
|
||||||
cmake_path="$9" # eg '/opt/cmake/cmake-3.23.0/bin/cmake'
|
|
||||||
executed_by_sge=''
|
executed_by_sge=''
|
||||||
|
|
||||||
if [ "${JOB_ID}" = '' ]
|
if [ "${JOB_ID}" = '' ]
|
||||||
|
@ -33,27 +32,29 @@ then
|
||||||
fi
|
fi
|
||||||
mkdir -p "${temp_dir}"
|
mkdir -p "${temp_dir}"
|
||||||
|
|
||||||
# create a virtual environment to install starbench
|
iprbench_venv_path='<iprbench_venv_hardcoded_path>'
|
||||||
venv_path="${temp_dir}/starbench.venv"
|
iprbench_venv_parent=$(dirname "$iprbench_venv_path")
|
||||||
python3 -m virtualenv "$venv_path"
|
iprbench_venv_archive_path='<iprbench_venv_archive_path>'
|
||||||
if [ $? != 0 ]
|
echo "unarchiving virtual environment ${iprbench_venv_archive_path} to ${iprbench_venv_parent}"
|
||||||
|
pushd "${iprbench_venv_parent}"
|
||||||
|
tar xzvf "${iprbench_venv_archive_path}"
|
||||||
|
popd
|
||||||
|
if [ ! -d "${iprbench_venv_path}" ]
|
||||||
then
|
then
|
||||||
echo "failed to create the virtual environment $venv_path"
|
echo "failed to find expected directory ${iprbench_venv_path}"
|
||||||
exit 1
|
|
||||||
fi
|
|
||||||
source "$venv_path/bin/activate"
|
|
||||||
if [ $? != 0 ]
|
|
||||||
then
|
|
||||||
echo "failed to activate the virtual environment $venv_path"
|
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
pip install $starbench_src_url
|
echo "using the iprbench virtual environment that has been created for this bench: ${iprbench_venv_path}"
|
||||||
|
source "$iprbench_venv_path/bin/activate"
|
||||||
if [ $? != 0 ]
|
if [ $? != 0 ]
|
||||||
then
|
then
|
||||||
echo "failed to install starbench ($starbench_src_url) in the virtual environment $venv_path"
|
echo "failed to activate the virtual environment $iprbench_venv_path"
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
echo "VIRTUAL_ENV = $VIRTUAL_ENV"
|
||||||
|
# show the list of packages installed in the virtual environment
|
||||||
|
pip list
|
||||||
|
|
||||||
output_dir="${temp_dir}"
|
output_dir="${temp_dir}"
|
||||||
num_cores=${NSLOTS}
|
num_cores=${NSLOTS}
|
||||||
|
@ -94,14 +95,14 @@ eval ${command}
|
||||||
if [ "$?" = '0' ]
|
if [ "$?" = '0' ]
|
||||||
then
|
then
|
||||||
echo "the command ${command} succeeded"
|
echo "the command ${command} succeeded"
|
||||||
rsync -va --exclude 'build' --exclude 'source.git' "${output_dir}/" ${launch_dir}/ # exclude the source.git and build directories (one for each worker) because they are big and not that precious
|
rsync -va --exclude 'build' --exclude 'source.git' --exclude "${iprbench_venv_path}" "${output_dir}/" ${launch_dir}/ # exclude the source.git and build directories (one for each worker) because they are big and not that precious
|
||||||
# TMPDIR will be deleted by sge at the end of the job
|
# TMPDIR will be deleted by sge at the end of the job
|
||||||
else
|
else
|
||||||
if [ "${executed_by_sge}" = 'true' ]
|
if [ "${executed_by_sge}" = 'true' ]
|
||||||
then
|
then
|
||||||
# TMPDIR will be deleted by sge at the end of the job. Backup data for investigation
|
# TMPDIR will be deleted by sge at the end of the job. Backup data for investigation
|
||||||
backup_dir="/opt/ipr/cluster/work.local/$(whoami)/${JOB_ID}"
|
backup_dir="/opt/ipr/cluster/work.local/$(whoami)/${JOB_ID}"
|
||||||
echo "moving ${output_dir} to ${backup_dir} to that it doesn't get deleted by sge at the end of the job. This way, data gets a chance to be investigated then manually deleted."
|
echo "moving ${output_dir} to ${backup_dir} so that it doesn't get deleted by sge at the end of the job. This way, data gets a chance to be investigated then manually deleted."
|
||||||
mv "${output_dir}" "${backup_dir}"
|
mv "${output_dir}" "${backup_dir}"
|
||||||
fi
|
fi
|
||||||
echo "the command ${command} failed... the output data dir (${output_dir}) is expected to be cleaned up manually after investigation"
|
echo "the command ${command} failed... the output data dir (${output_dir}) is expected to be cleaned up manually after investigation"
|
|
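Review note: on success, the job script above copies the results back to the launch directory while skipping the bulky per-worker `build` and `source.git` directories. A rough Python analogue of that `rsync --exclude` step is sketched below. `collect_results` is a hypothetical helper that is not part of this commit, and the sketch assumes Python >= 3.8 for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def collect_results(output_dir: Path, launch_dir: Path) -> None:
    """Copy output_dir into launch_dir, skipping the bulky per-worker directories."""
    shutil.copytree(
        output_dir,
        launch_dir,
        # any directory entry named 'build' or 'source.git' is skipped, at any depth
        ignore=shutil.ignore_patterns('build', 'source.git'),
        dirs_exist_ok=True,  # launch_dir usually exists already (python >= 3.8)
    )
```

Unlike `rsync`, this copies every file unconditionally, so it only illustrates the exclusion logic and is not a drop-in replacement.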
@ -3,15 +3,19 @@ requires = ["setuptools"]
|
||||||
build-backend = "setuptools.build_meta"
|
build-backend = "setuptools.build_meta"
|
||||||
|
|
||||||
[project]
|
[project]
|
||||||
name = "starbench"
|
name = "iprbench"
|
||||||
dynamic = ["version"] # the list of fields whose values are discovered by the backend (eg __version__)
|
dynamic = ["version"] # the list of fields whose values are discovered by the backend (eg __version__)
|
||||||
description = "a tool to benchmark a git cmake application using embarrassingly parallel runs"
|
description = "benchmarks for IPR (Institut de Physique de Rennes) cluster"
|
||||||
readme = "README.md"
|
readme = "README.md"
|
||||||
keywords = ["benchmark", "hpc", "parallel", 'openmp']
|
keywords = ["benchmark", "hpc", "parallel", 'openmp']
|
||||||
license = {text = "MIT License"}
|
license = {text = "MIT License"}
|
||||||
dependencies = [
|
dependencies = [
|
||||||
|
"pandas",
|
||||||
|
"matplotlib",
|
||||||
|
"sqlalchemy",
|
||||||
# "cocluto >= 1.2"
|
# "cocluto >= 1.2"
|
||||||
# "cocluto@git+https://git.ipr.univ-rennes.fr/cellinfo/cocluto"
|
# "cocluto@git+https://git.ipr.univ-rennes.fr/cellinfo/cocluto"
|
||||||
|
"starbench@git+https://github.com/g-raffy/starbench"
|
||||||
]
|
]
|
||||||
requires-python = ">= 3.8"
|
requires-python = ">= 3.8"
|
||||||
authors = [
|
authors = [
|
||||||
|
@ -19,10 +23,17 @@ authors = [
|
||||||
]
|
]
|
||||||
|
|
||||||
[project.scripts]
|
[project.scripts]
|
||||||
starbench = "starbench.main:main"
|
hibenchonphysix = "iprbench.hibench.hibenchonphysix:main"
|
||||||
|
showresults = "iprbench.hibench.showresults:main"
|
||||||
|
|
||||||
[project.urls]
|
[project.urls]
|
||||||
Repository = "https://github.com/g-raffy/starbench"
|
Repository = "https://github.com/g-raffy/starbench"
|
||||||
|
|
||||||
|
[tool.setuptools]
|
||||||
|
packages = ["iprbench", "iprbench.hibench"]
|
||||||
|
|
||||||
[tool.setuptools.dynamic]
|
[tool.setuptools.dynamic]
|
||||||
version = {attr = "starbench.main.__version__"}
|
version = {attr = "iprbench.main.__version__"}
|
||||||
|
|
||||||
|
[tool.setuptools.package-data]
|
||||||
|
iprbench = ["resources/**/*"]
|
||||||
|
|
|
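Review note: each `[project.scripts]` entry above maps a command name to a `module:function` reference. The sketch below shows one way such a reference can be resolved with the standard library. It is only an illustration, not the code that pip or setuptools actually generates, and `resolve_entry_point` is a hypothetical helper name.

```python
import importlib
from typing import Callable


def resolve_entry_point(reference: str) -> Callable:
    """Resolve a 'package.module:function' reference to the callable it names."""
    module_name, _, attr_path = reference.partition(':')
    if not module_name or not attr_path:
        raise ValueError(f'expected "module:attribute", got {reference!r}')
    obj = importlib.import_module(module_name)
    # the attribute part may itself be dotted (eg 'Class.method')
    for attr_name in attr_path.split('.'):
        obj = getattr(obj, attr_name)
    return obj


if __name__ == '__main__':
    # standard-library stand-in; with iprbench installed, the reference would be
    # 'iprbench.hibench.showresults:main'
    dumps = resolve_entry_point('json:dumps')
    print(dumps({'num_cores': 2}))
```

If the referenced module or function does not exist, the import or attribute lookup fails here. For an installed console script, the equivalent failure only shows up when the command is run.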
@ -1,2 +0,0 @@
|
||||||
# from .main import starbench_cmake_app
|
|
||||||
# __all__ = [starbench_cmake_app]
|
|
|
@ -1,284 +0,0 @@
|
||||||
#!/usr/bin/env python3
|
|
||||||
'''starbench is an application that is able to measure the execution time of a user software suite in various conditions (different build modes and different execution modes)
|
|
||||||
|
|
||||||
'''
|
|
||||||
__version__ = '1.0.0'
|
|
||||||
import threading
|
|
||||||
import subprocess
|
|
||||||
import os
|
|
||||||
import sys
|
|
||||||
from typing import List, Dict, Optional, Tuple, Callable
|
|
||||||
from datetime import datetime
|
|
||||||
from pathlib import Path
|
|
||||||
from abc import ABC, abstractmethod
|
|
||||||
# from typing import ForwardRef
|
|
||||||
try:
|
|
||||||
from typing import ForwardRef # type: ignore pylint: disable=ungrouped-imports
|
|
||||||
except ImportError:
|
|
||||||
# python 3.6
|
|
||||||
from typing import _ForwardRef as ForwardRef
|
|
||||||
|
|
||||||
assert sys.version_info >= (3, 5, 0), 'this code requires at least python 3.5' # type hints in arguments
|
|
||||||
|
|
||||||
|
|
||||||
class StarBenchException(Exception):
|
|
||||||
'''base exception for user errors detected by starbench'''
|
|
||||||
|
|
||||||
|
|
||||||
RunId = int # identifier of a run
|
|
||||||
WorkerId = int # identifier of a worker (a run is performed on a worker)
|
|
||||||
DurationInSeconds = float
|
|
||||||
ProcessId = int
|
|
||||||
ReturnCode = int
|
|
||||||
Url = str
|
|
||||||
GitCommitId = str
|
|
||||||
|
|
||||||
|
|
||||||
class Run():
|
|
||||||
"""represents a run of the benchmarked command within its CommandPerfEstimator
|
|
||||||
"""
|
|
||||||
id: RunId # uniquely identifies a run within its CommandPerfEstimator instance
|
|
||||||
worker_id: WorkerId # the worker used for this run (number of workers = number of parallel runs)
|
|
||||||
pid: Optional[ProcessId] # the process identifier of the process used by the command
|
|
||||||
start_time: datetime # the time at which the command process has started
|
|
||||||
return_code: ReturnCode # the exit code of the command process
|
|
||||||
end_time: Optional[datetime] # the time at which the command process has ended. None if the process is still running
|
|
||||||
|
|
||||||
def __init__(self, run_id: RunId, worker_id: WorkerId):
|
|
||||||
self.id = run_id
|
|
||||||
self.worker_id = worker_id
|
|
||||||
self.pid = None
|
|
||||||
self.return_code = 0
|
|
||||||
self.start_time = datetime.now()
|
|
||||||
self.end_time = None
|
|
||||||
|
|
||||||
def has_finished(self) -> bool:
|
|
||||||
"""indicates if this run has finished"""
|
|
||||||
return self.end_time is not None
|
|
||||||
|
|
||||||
def get_duration(self) -> DurationInSeconds:
|
|
||||||
"""returns the duration of this run, provided it has finished
|
|
||||||
"""
|
|
||||||
assert self.has_finished()
|
|
||||||
return (self.end_time - self.start_time).total_seconds()
|
|
||||||
|
|
||||||
|
|
||||||
CommandPerfEstimator = ForwardRef('CommandPerfEstimator')
|
|
||||||
|
|
||||||
|
|
||||||
class IStarBencherStopCondition(ABC):
|
|
||||||
"""abstract handler that decides if the given CommandPerfEstimator has enough runs to estimate the performance or should trigger new runs
|
|
||||||
|
|
||||||
"""
|
|
||||||
@abstractmethod
|
|
||||||
def should_stop(self, star_bencher: CommandPerfEstimator) -> bool:
|
|
||||||
"""decides if the given CommandPerfEstimator instance should trigger new runs
|
|
||||||
|
|
||||||
This method is called at the end of each run, to decide if another run should be triggered or not.
|
|
||||||
"""
|
|
||||||
|
|
||||||
|
|
||||||
class StopAfterSingleRun(IStarBencherStopCondition):
|
|
||||||
"""a stop condition that causes the given CommandPerfEstimator to never start new runs
|
|
||||||
|
|
||||||
as a result, this causes the given CommandPerfEstimator to just use one single run of the command to estimate its performance.
|
|
||||||
"""
|
|
||||||
def __init__(self):
|
|
||||||
pass
|
|
||||||
|
|
||||||
def should_stop(self, star_bencher: CommandPerfEstimator):
|
|
||||||
# never start a new run
|
|
||||||
return True
|
|
||||||
|
|
||||||
|
|
||||||
class StopWhenConverged(IStarBencherStopCondition):
|
|
||||||
"""a stop condition that triggers when the just completed run doesn't have much effect on the average run's duration
|
|
||||||
"""
|
|
||||||
def __init__(self, max_error: float = 0.01):
|
|
||||||
self.max_error = max_error
|
|
||||||
self._last_mean_duration = None
|
|
||||||
|
|
||||||
def should_stop(self, star_bencher: CommandPerfEstimator) -> bool:
|
|
||||||
do_stop = False
|
|
||||||
mean_duration, _num_runs = star_bencher.get_run_mean_duration()
|
|
||||||
print(f'mean_duration = {mean_duration}')
|
|
||||||
if self._last_mean_duration is not None:
|
|
||||||
diff = abs(mean_duration - self._last_mean_duration)
|
|
||||||
print(f'diff = {diff}')
|
|
||||||
if diff < self.max_error:
|
|
||||||
do_stop = True
|
|
||||||
self._last_mean_duration = mean_duration
|
|
||||||
return do_stop
|
|
||||||
|
|
||||||
|
|
||||||
class CommandPerfEstimator(): # (false positive) pylint: disable=function-redefined
|
|
||||||
'''a command runner that runs a given command multiple times and measures the average execution duration
|
|
||||||
|
|
||||||
the 'star' term comes from hpl's stardgemm benchmark, where we launch `n` independent programs on `n` cores
|
|
||||||
'''
|
|
||||||
run_command: List[str] # the command that this instance of CommandPerfEstimator is expected to run (eg: ['ctest', '--output-on-failure', '-L', '^arch4_quick$']). The command supports the following tag: <worker_id>, which is replaced with the zero-padded id of the worker performing the run (eg 003)
|
|
||||||
run_command_cwd: Path # the current directory to use when executing run_command
|
|
||||||
stdout_filepath: Path # the path of the file that records the standard output of run_command
|
|
||||||
stderr_filepath: Path # the path of the file that records the standard error of run_command
|
|
||||||
num_cores_per_run: int # the max number of threads used by each run
|
|
||||||
num_parallel_runs: int # how many times run_command is run simultaneously
|
|
||||||
max_num_cores: int # the maximum allowed number of cores for this CommandPerfEstimator
|
|
||||||
stop_condition: IStarBencherStopCondition # the condition that is used so that this CommandPerfEstimator can decide to stop launching commands
|
|
||||||
stop_on_error: bool
|
|
||||||
_next_run_id: int
|
|
||||||
_runs: Dict[int, Run]
|
|
||||||
_last_mean_duration: Optional[DurationInSeconds]
|
|
||||||
_num_runs: int
|
|
||||||
_runs_lock: threading.Lock
|
|
||||||
_finished_event: threading.Event
|
|
||||||
|
|
||||||
def __init__(self, run_command: List[str], num_cores_per_run: int, num_parallel_runs: int, max_num_cores: int, stop_condition: IStarBencherStopCondition, stop_on_error=True, run_command_cwd: Path = None, stdout_filepath: Path = None, stderr_filepath: Path = None):
|
|
||||||
assert num_cores_per_run * num_parallel_runs <= max_num_cores
|
|
||||||
self.run_command = run_command
|
|
||||||
self.run_command_cwd = run_command_cwd
|
|
||||||
self.stdout_filepath = stdout_filepath
|
|
||||||
self.stderr_filepath = stderr_filepath
|
|
||||||
self.num_cores_per_run = num_cores_per_run
|
|
||||||
self.num_parallel_runs = num_parallel_runs
|
|
||||||
self.max_num_cores = max_num_cores
|
|
||||||
self.stop_condition = stop_condition
|
|
||||||
self.stop_on_error = stop_on_error
|
|
||||||
self._next_run_id = 0
|
|
||||||
self._runs = {}
|
|
||||||
self._last_mean_duration = None
|
|
||||||
self._num_runs = 0
|
|
||||||
self._runs_lock = threading.Lock()
|
|
||||||
self._finished_event = threading.Event()
|
|
||||||
|
|
||||||
def popen_and_call(self, popen_args: List[str], on_exit: Callable[[ProcessId, ReturnCode, RunId], None], run_id: RunId, cwd: Path, stdout_filepath: Path = None, stderr_filepath: Path = None):
|
|
||||||
"""
|
|
||||||
Runs the given args in a subprocess.Popen, and then calls the function
|
|
||||||
on_exit when the subprocess completes.
|
|
||||||
on_exit is a callable object, and popen_args is a list/tuple of args that
|
|
||||||
you would give to subprocess.Popen.
|
|
||||||
"""
|
|
||||||
def run_in_thread(popen_args: List[str], on_exit: Callable[[ProcessId, ReturnCode, RunId], None]):
|
|
||||||
stdout = None
|
|
||||||
stderr = None
|
|
||||||
returncode = -1
|
|
||||||
pid = -1
|
|
||||||
streams_are_ok = True
|
|
||||||
try:
|
|
||||||
# with open(stdout_filepath, 'w', encoding='utf8') as stdout, open(stderr_filepath, 'w', encoding='utf8') as stderr:
|
|
||||||
if stdout_filepath is not None:
|
|
||||||
stdout = open(stdout_filepath, 'w', encoding='utf8')
|
|
||||||
if stderr_filepath is not None:
|
|
||||||
stderr = open(stderr_filepath, 'w', encoding='utf8')
|
|
||||||
except:
|
|
||||||
print(f'failed to open {stdout_filepath} or {stderr_filepath} in write mode')
|
|
||||||
streams_are_ok = False
|
|
||||||
if streams_are_ok:
|
|
||||||
try:
|
|
||||||
env = os.environ.copy()
|
|
||||||
# restrict the number of threads used by openmp
|
|
||||||
env['OMP_NUM_THREADS'] = f'{self.num_cores_per_run}'
|
|
||||||
# restrict the number of threads used by intel math kernel library
|
|
||||||
env['MKL_NUM_THREADS'] = f'{self.num_cores_per_run}'
|
|
||||||
proc = subprocess.Popen(popen_args, cwd=cwd, stdout=stdout, stderr=stderr, env=env)
|
|
||||||
pid = proc.pid
|
|
||||||
proc.wait()
|
|
||||||
returncode = proc.returncode
|
|
||||||
except:
|
|
||||||
print(f'command failed: {popen_args}')
|
|
||||||
on_exit(pid, returncode, run_id)
|
|
||||||
return
|
|
||||||
thread = threading.Thread(target=run_in_thread, args=(popen_args, on_exit))
|
|
||||||
thread.start()
|
|
||||||
# returns immediately after the thread starts
|
|
||||||
return thread
|
|
||||||
|
|
||||||
def get_run_mean_duration(self) -> Tuple[DurationInSeconds, int]:
|
|
||||||
"""returns the average duration of all completed runs of this CommandPerfEstimator instance
|
|
||||||
"""
|
|
||||||
duration_sums = 0.0 # in python3.6+, replace with duration_sums: float = 0.0
|
|
||||||
num_finished_runs = 0 # in python3.6+, replace with num_finished_runs: int = 0
|
|
||||||
with self._runs_lock:
|
|
||||||
for run in self._runs.values():
|
|
||||||
if run.has_finished():
|
|
||||||
num_finished_runs += 1
|
|
||||||
duration_sums += run.get_duration()
|
|
||||||
assert num_finished_runs > 0
|
|
||||||
return duration_sums / num_finished_runs, num_finished_runs
|
|
||||||
|
|
||||||
def _all_runs_have_finished(self):
|
|
||||||
with self._runs_lock:
|
|
||||||
for run in self._runs.values():
|
|
||||||
if not run.has_finished():
|
|
||||||
return False
|
|
||||||
return True
|
|
||||||
|
|
||||||
def on_exit(self, pid: ProcessId, return_code: ReturnCode, run_id: RunId):
|
|
||||||
"""method called when the command executed by a run ends. Unless the stop condition is met, a new run is started.
|
|
||||||
|
|
||||||
pid: the process identifier of the process of the run that just finished
|
|
||||||
return_code: the return code of the process of the run that just finished
|
|
||||||
run_id: the run that just completed
|
|
||||||
"""
|
|
||||||
end_time = datetime.now()
|
|
||||||
# print(self, pid, run_id)
|
|
||||||
run = self._runs[run_id]
|
|
||||||
run.pid = pid
|
|
||||||
run.end_time = end_time
|
|
||||||
run.return_code = return_code
|
|
||||||
|
|
||||||
do_stop = False
|
|
||||||
if self.stop_on_error and run.return_code != 0:
|
|
||||||
do_stop = True
|
|
||||||
else:
|
|
||||||
do_stop = self.stop_condition.should_stop(self)
|
|
||||||
if not do_stop:
|
|
||||||
# print('adding a run')
|
|
||||||
self._start_run(run.worker_id) # reuse the same worker as the run that has just finished
|
|
||||||
if self._all_runs_have_finished():
|
|
||||||
# tell the main thread that all the runs have finished
|
|
||||||
self._finished_event.set()
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _interpret_tags(tagged_string: str, tags_value: Dict[str, str]) -> str:
|
|
||||||
untagged_string = tagged_string
|
|
||||||
for tag_id, tag_value in tags_value.items():
|
|
||||||
assert isinstance(untagged_string, str)
|
|
||||||
untagged_string = untagged_string.replace(tag_id, tag_value)
|
|
||||||
return untagged_string
|
|
||||||
|
|
||||||
def _start_run(self, worker_id: WorkerId):
|
|
||||||
"""starts a run using the given worker"""
|
|
||||||
tags_value = {
|
|
||||||
'<worker_id>': f'{worker_id:03d}'
|
|
||||||
}
|
|
||||||
run_command = [CommandPerfEstimator._interpret_tags(s, tags_value) for s in self.run_command]
|
|
||||||
run_command_cwd = CommandPerfEstimator._interpret_tags(str(self.run_command_cwd), tags_value)
|
|
||||||
stdout_filepath = None
|
|
||||||
if self.stdout_filepath is not None:
|
|
||||||
stdout_filepath = CommandPerfEstimator._interpret_tags(str(self.stdout_filepath), tags_value)
|
|
||||||
Path(stdout_filepath).parent.mkdir(exist_ok=True)
|
|
||||||
stderr_filepath = None
|
|
||||||
if self.stderr_filepath is not None:
|
|
||||||
stderr_filepath = CommandPerfEstimator._interpret_tags(str(self.stderr_filepath), tags_value)
|
|
||||||
Path(stderr_filepath).parent.mkdir(exist_ok=True)
|
|
||||||
|
|
||||||
with self._runs_lock:
|
|
||||||
run = Run(self._next_run_id, worker_id)
|
|
||||||
self._next_run_id += 1
|
|
||||||
self._runs[run.id] = run
|
|
||||||
_run_thread = self.popen_and_call(popen_args=run_command, on_exit=self.on_exit, run_id=run.id, cwd=run_command_cwd, stdout_filepath=stdout_filepath, stderr_filepath=stderr_filepath) # noqa:F841
|
|
||||||
|
|
||||||
def run(self) -> DurationInSeconds:
|
|
||||||
'''performs the runs of the command and returns the runs' average duration'''
|
|
||||||
print(f"executing the following command in parallel ({self.num_parallel_runs} parallel runs) : '{str(self.run_command)}'")
|
|
||||||
for worker_id in range(self.num_parallel_runs):
|
|
||||||
self._start_run(worker_id)
|
|
||||||
# wait until all runs have finished
|
|
||||||
self._finished_event.wait()
|
|
||||||
with self._runs_lock:
|
|
||||||
workers_success = [run.return_code == 0 for run in self._runs.values()]
|
|
||||||
if not all(workers_success):
|
|
||||||
raise StarBenchException(f'at least one run failed (workers_success = {workers_success})')
|
|
||||||
mean_duration, num_runs = self.get_run_mean_duration()
|
|
||||||
print(f'mean duration : {mean_duration:.3f} s ({num_runs} runs)')
|
|
||||||
return mean_duration
|
|
|
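Review note: the `StopWhenConverged` condition in the removed `core.py` compares the mean run duration after each completed run with the previous mean, and stops once the absolute difference drops below `max_error` (in seconds). A minimal standalone sketch of that rule follows. It assumes the durations arrive one at a time, and `runs_until_converged` is a hypothetical helper, not part of starbench.

```python
from typing import Iterable, Optional, Tuple


def runs_until_converged(durations: Iterable[float], max_error: float = 0.01) -> Tuple[int, Optional[float]]:
    """Consume durations until the running mean changes by less than max_error.

    Returns (number of runs consumed, running mean at that point).
    """
    total = 0.0
    num_runs = 0
    last_mean = None
    mean = None
    for duration in durations:
        total += duration
        num_runs += 1
        mean = total / num_runs
        # same test as StopWhenConverged: absolute change of the mean, in seconds
        if last_mean is not None and abs(mean - last_mean) < max_error:
            break
        last_mean = mean
    return num_runs, mean


if __name__ == '__main__':
    print(runs_until_converged([4.0, 2.0, 3.0, 3.0]))
```

Because the threshold is absolute, the same `max_error` is strict for a benchmark lasting a few seconds and loose for one lasting hours. A relative threshold might be worth considering, but that would be a design change rather than what the removed code does.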
@ -1,166 +0,0 @@
|
||||||
#!/usr/bin/env python3
|
|
||||||
'''starbench is an application that is able to measure the execution time of a user software suite in various conditions (different build modes and different execution modes)
|
|
||||||
|
|
||||||
'''
|
|
||||||
__version__ = '1.0.0'
|
|
||||||
import argparse
|
|
||||||
from abc import ABC, abstractmethod
|
|
||||||
import subprocess
|
|
||||||
from typing import List, Optional
|
|
||||||
from pathlib import Path
|
|
||||||
from .core import GitCommitId, Url, CommandPerfEstimator, StopAfterSingleRun
|
|
||||||
|
|
||||||
|
|
||||||
class IFileTreeProvider(ABC):
|
|
||||||
|
|
||||||
@abstractmethod
|
|
||||||
def get_source_tree_path(self) -> Path:
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
class ExistingDir(IFileTreeProvider):
|
|
||||||
dir_path: Path
|
|
||||||
|
|
||||||
def __init__(self, dir_path: Path):
|
|
||||||
self.dir_path = dir_path
|
|
||||||
|
|
||||||
def get_source_tree_path(self) -> Path:
|
|
||||||
return self.dir_path
|
|
||||||
|
|
||||||
|
|
||||||
class GitRepos(IFileTreeProvider):
|
|
||||||
git_repos_url: Url
|
|
||||||
git_user: Optional[str]
|
|
||||||
git_password: Optional[str]
|
|
||||||
code_version: Optional[GitCommitId]
|
|
||||||
src_dir: Optional[Path] # the temporary directory used to populate the source code
|
|
||||||
|
|
||||||
def __init__(self, git_repos_url: Url, git_user: Optional[str] = None, git_password: Optional[str] = None, code_version: Optional[GitCommitId] = None, src_dir: Optional[Path] = None):
|
|
||||||
self.git_repos_url = git_repos_url
|
|
||||||
self.git_user = git_user
|
|
||||||
self.git_password = git_password
|
|
||||||
self.code_version = code_version
|
|
||||||
self.src_dir = src_dir
|
|
||||||
|
|
||||||
def get_source_tree_path(self) -> Path:
|
|
||||||
self.src_dir.mkdir(exist_ok=True)
|
|
||||||
git_credentials = []
|
|
||||||
if self.git_user:
|
|
||||||
git_credentials.append(self.git_user)
|
|
||||||
if self.git_password:
|
|
||||||
git_credentials.append(self.git_password)
|
|
||||||
git_repos_url = self.git_repos_url
|
|
||||||
if len(git_credentials) != 0:
|
|
||||||
git_repos_url = git_repos_url.replace('https://', f"https://{':'.join(git_credentials)}@")
|
|
||||||
# src_dir.mkdir(exist_ok=True)
|
|
||||||
subprocess.run(['git', 'clone', git_repos_url, str(self.src_dir)], cwd=str(self.src_dir), check=True) # use the url with credentials built above
|
|
||||||
if self.code_version:
|
|
||||||
subprocess.run(['git', 'checkout', f'{self.code_version}'], cwd=str(self.src_dir), check=True)
|
|
||||||
return self.src_dir
|
|
||||||
|
|
||||||
|
|
||||||
def starbench_cmake_app(source_code_provider: IFileTreeProvider, tmp_dir: Path, num_cores: int, benchmark_command: List[str], cmake_options: Optional[List[str]] = None, cmake_exe_location: Path = None):
|
|
||||||
"""
|
|
||||||
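benchmark_command : the command to benchmark, as a list of arguments. eg ['ctest', '--output-on-failure', '-L', '^arch4_quick$'], where '^arch4_quick$' is a regular expression as understood by ctest's -L option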
|
|
||||||
"""
|
|
||||||
src_dir = source_code_provider.get_source_tree_path()
|
|
||||||
# we need one build for each parallel run, otherwise running ctest in parallel would overwrite the same files, which causes the tests to fail randomly depending on race conditions
|
|
||||||
worker_dir = tmp_dir / 'worker<worker_id>'
|
|
||||||
build_dir = worker_dir / 'build'
|
|
||||||
if cmake_options is None:
|
|
||||||
cmake_options = []
|
|
||||||
print(f'creating build directory {worker_dir}')
|
|
||||||
create_build_dir = CommandPerfEstimator(
|
|
||||||
run_command=['mkdir', '-p', str(build_dir)],
|
|
||||||
num_cores_per_run=1,
|
|
||||||
num_parallel_runs=num_cores,
|
|
||||||
max_num_cores=num_cores,
|
|
||||||
stop_condition=StopAfterSingleRun(),
|
|
||||||
run_command_cwd=Path('/tmp'),
|
|
||||||
stdout_filepath=worker_dir / 'createdir_stdout.txt',
|
|
||||||
stderr_filepath=worker_dir / 'createdir_stderr.txt')
|
|
||||||
_create_build_dir_duration = create_build_dir.run() # noqa: F841
|
|
||||||
# build_dir.mkdir(exist_ok=True)
|
|
||||||
|
|
||||||
print(f'configuring {src_dir} into {build_dir} ...')
|
|
||||||
cmake_prog = 'cmake'
|
|
||||||
if cmake_exe_location:
|
|
||||||
cmake_prog = str(cmake_exe_location)
|
|
||||||
configure = CommandPerfEstimator(
|
|
||||||
run_command=[cmake_prog] + cmake_options + [str(src_dir)],
|
|
||||||
num_cores_per_run=1,
|
|
||||||
num_parallel_runs=num_cores,
|
|
||||||
max_num_cores=num_cores,
|
|
||||||
stop_condition=StopAfterSingleRun(),
|
|
||||||
run_command_cwd=build_dir,
|
|
||||||
stdout_filepath=worker_dir / 'configure_stdout.txt',
|
|
||||||
stderr_filepath=worker_dir / 'configure_stderr.txt')
|
|
||||||
_configure_duration = configure.run() # noqa: F841
|
|
||||||
|
|
||||||
print(f'building {build_dir} ...')
|
|
||||||
build = CommandPerfEstimator(
|
|
||||||
run_command=['make'],
|
|
||||||
num_cores_per_run=1,
|
|
||||||
num_parallel_runs=num_cores,
|
|
||||||
max_num_cores=num_cores,
|
|
||||||
stop_condition=StopAfterSingleRun(),
|
|
||||||
run_command_cwd=build_dir,
|
|
||||||
stdout_filepath=worker_dir / 'build_stdout.txt',
|
|
||||||
stderr_filepath=worker_dir / 'build_stderr.txt')
|
|
||||||
_build_duration = build.run() # noqa: F841
|
|
||||||
|
|
||||||
print(f'benchmarking {build_dir} ...')
|
|
||||||
stop_condition = StopAfterSingleRun()
|
|
||||||
bench = CommandPerfEstimator(
|
|
||||||
run_command=benchmark_command,
|
|
||||||
num_cores_per_run=1,
|
|
||||||
num_parallel_runs=num_cores,
|
|
||||||
max_num_cores=num_cores,
|
|
||||||
stop_condition=stop_condition,
|
|
||||||
run_command_cwd=build_dir,
|
|
||||||
stdout_filepath=worker_dir / 'bench_stdout.txt',
|
|
||||||
stderr_filepath=worker_dir / 'bench_stderr.txt')
|
|
||||||
mean_duration = bench.run()
|
|
||||||
print(f'duration : {mean_duration:.3f} s')
|
|
||||||
|
|
||||||
|
|
||||||
def main():
|
|
||||||
'''main program'''
|
|
||||||
|
|
||||||
example_text = '''example:
|
|
||||||
|
|
||||||
%(prog)s --git-repos-url https://github.com/hibridon/hibridon --code-version a3bed1c3ccfbca572003020d3e3d3b1ff3934fad --git-user g-raffy --git-pass-file "$HOME/.github/personal_access_tokens/bench.hibridon.cluster.ipr.univ-rennes1.fr.pat" --num-cores 2 --output-dir=/tmp/hibench --cmake-path=/opt/cmake/cmake-3.23.0/bin/cmake --cmake-option=-DCMAKE_BUILD_TYPE=Release --cmake-option=-DBUILD_TESTING=ON --benchmark-command='ctest --output-on-failure -L ^arch4_quick$'
|
|
||||||
|
|
||||||
'''
|
|
||||||
|
|
||||||
parser = argparse.ArgumentParser(description='performs a benchmark on a cmake buildable app hosted on a git repository', epilog=example_text, formatter_class=argparse.RawDescriptionHelpFormatter)
|
|
||||||
parser.add_argument('--git-repos-url', required=True, help='the url of the code to benchmark (eg https://github.com/hibridon/hibridon)')
|
|
||||||
parser.add_argument('--code-version', help='the version of the code to use; either a branch or a commit id (eg a3bed1c3ccfbca572003020d3e3d3b1ff3934fad)')
|
|
||||||
parser.add_argument('--git-user', help='the git user to use to clone the code repository')
|
|
||||||
password_group = parser.add_mutually_exclusive_group()
|
|
||||||
password_group.add_argument('--git-pass-file', help='the path to a file containing the password (or personal access token)')
|
|
||||||
password_group.add_argument('--git-pass', type=str, help='the password (or personal access token) to use (not recommended for security reasons)')
|
|
||||||
parser.add_argument('--num-cores', type=int, required=True, help='the number of cores that the benchmark will use')
|
|
||||||
parser.add_argument('--output-dir', type=Path, required=True, help='where the output files will be placed')
|
|
||||||
parser.add_argument('--cmake-path', type=Path, help='the path to the cmake executable to use in case a specific cmake is wanted')
|
|
||||||
parser.add_argument('--cmake-option', type=str, action='append', help='additional option passed to cmake in the configure step (use this flag multiple times if you need more than one cmake option)')
|
|
||||||
parser.add_argument('--benchmark-command', required=True, type=str, help='the command to benchmark')
|
|
||||||
args = parser.parse_args()
|
|
||||||
|
|
||||||
git_user = args.git_user
|
|
||||||
git_repos_url = args.git_repos_url
|
|
||||||
|
|
||||||
git_password = None
|
|
||||||
if args.git_pass:
|
|
||||||
git_password = args.git_pass
|
|
||||||
elif args.git_pass_file:
|
|
||||||
with open(args.git_pass_file, 'r', encoding='utf8') as f:
|
|
||||||
git_password = f.readline().replace('\n', '') # os.environ['HIBRIDON_REPOS_PAT']
|
|
||||||
|
|
||||||
source_tree_provider = GitRepos(git_repos_url=git_repos_url, code_version=args.code_version, git_user=git_user, git_password=git_password, src_dir=args.output_dir / 'source.git')
|
|
||||||
|
|
||||||
starbench_cmake_app(source_tree_provider, tmp_dir=args.output_dir, num_cores=args.num_cores, cmake_options=args.cmake_option, benchmark_command=args.benchmark_command.split(' '), cmake_exe_location=args.cmake_path)
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
main()
|
|
|
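Review note: `main()` turns the `--benchmark-command` string into an argument list with `split(' ')`, which keeps shell quotes inside the arguments and yields empty strings on repeated spaces. One possible alternative, shown here only as a sketch, is the standard library's `shlex.split`. The command string below is adapted from the help text example, with a double space and quotes added on purpose.

```python
import shlex

command = "ctest --output-on-failure  -L '^arch4_quick$'"

# naive splitting keeps the quotes and produces an empty item for the double space
naive_args = command.split(' ')

# shlex.split follows POSIX shell quoting rules
shell_like_args = shlex.split(command)

print(naive_args)
print(shell_like_args)
```

With the naive split, `ctest` would receive an empty argument and a label regexp that still contains the quote characters.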
@ -1,43 +0,0 @@
|
||||||
|
|
||||||
enable_language (Fortran)
|
|
||||||
|
|
||||||
set(MAMUL1_USE_MAGMA "OFF" CACHE BOOL "if set, mamul1 build uses magma (matrix algebra on gpu)")
|
|
||||||
|
|
||||||
set(MAMUL1_MAGMA_API "CPU_MEM_API" CACHE STRING "which magma API to use when building mamul1: CPU_MEM_API for BLAS compatible API (uses matrices stored on CPU memory) or GPU_MEM_API (use matrices stored on GPU memory)")
|
|
||||||
|
|
||||||
add_executable(mamul1 mamul1.F90)
|
|
||||||
|
|
||||||
if (CMAKE_Fortran_COMPILER_ID STREQUAL "GNU")
|
|
||||||
# Allow arbitrary long lines. Needed as preprocessing could generate long line lengths.
|
|
||||||
target_compile_options(mamul1 PUBLIC -ffree-line-length-none)
|
|
||||||
elseif (Fortran_COMPILER_NAME STREQUAL "ifort")
|
|
||||||
# Intel (ifort)
|
|
||||||
target_compile_options(mamul1 PUBLIC -no-wrap-margin)
|
|
||||||
endif()
|
|
||||||
|
|
||||||
|
|
||||||
if (MAMUL1_USE_MAGMA)
|
|
||||||
find_package( MAGMA REQUIRED )
|
|
||||||
if( MAMUL1_MAGMA_API STREQUAL "CPU_MEM_API" )
|
|
||||||
target_compile_definitions(mamul1 PUBLIC USE_MAGMA_DGEMM)
|
|
||||||
elseif( MAMUL1_MAGMA_API STREQUAL "GPU_MEM_API" )
|
|
||||||
target_compile_definitions(mamul1 PUBLIC USE_MAGMA_DGEMM_GPU)
|
|
||||||
else()
|
|
||||||
message(FATAL_ERROR "unexpected value for MAMUL1_MAGMA_API : ${MAMUL1_MAGMA_API}")
|
|
||||||
endif()
|
|
||||||
message(STATUS "MAGMA_INCLUDES=${MAGMA_INCLUDES}")
|
|
||||||
include_directories("${MAGMA_INCLUDES}")
|
|
||||||
target_link_libraries(mamul1 "${MAGMA_LIBRARIES}")
|
|
||||||
else()
|
|
||||||
find_package( BLAS REQUIRED )
|
|
||||||
find_package( LAPACK REQUIRED )
|
|
||||||
# message("BLAS_LIBRARIES=${BLAS_LIBRARIES}")
|
|
||||||
# message("LAPACK_LIBRARIES=${LAPACK_LIBRARIES}")
|
|
||||||
target_compile_definitions(mamul1 PUBLIC USE_DGEMM)
|
|
||||||
|
|
||||||
# Link Blas and Lapack libraries
|
|
||||||
target_link_libraries(mamul1 "${LAPACK_LIBRARIES}")
|
|
||||||
target_link_libraries(mamul1 "${BLAS_LIBRARIES}")
|
|
||||||
endif()
|
|
||||||
|
|
||||||
install(TARGETS mamul1)
|
|
|
@ -1,339 +0,0 @@
|
||||||
#define MAMUL1_VERSION "1.0.0"
|
|
||||||
|
|
||||||
#define magma_devptr_t integer(kind=8)
|
|
||||||
subroutine print_usage(prog_path)
|
|
||||||
character(len=*), intent(in) :: prog_path
|
|
||||||
character(len=80) :: build_variant
|
|
||||||
#if defined(USE_MAGMA_DGEMM_GPU)
|
|
||||||
build_variant='gpu'
|
|
||||||
#elif defined(USE_DGEMM)
|
|
||||||
build_variant='cpu'
|
|
||||||
#else
|
|
||||||
build_variant='unknown'
|
|
||||||
#endif
|
|
||||||
write(6,'("mamul1 v",a," (variant:",a,"): benchmark performs a square matrix multiplication in double precision")') MAMUL1_VERSION, trim(build_variant);
|
|
||||||
write(6,'()');
|
|
||||||
write(6,'("Usage: ",a," <NDIM> <NUM_LOOPS>")') trim(prog_path);
|
|
||||||
write(6,'(" <NDIM> positive integer representing the size of the square matrices to multiply ")');
|
|
||||||
write(6,'(" <NUM_LOOPS> positive integer representing the number of times the multiplication is performed")');
|
|
||||||
end subroutine
|
|
||||||
|
|
||||||
program mamul1
|
|
||||||
|
|
||||||
implicit none
|
|
||||||
|
|
||||||
|
|
||||||
integer :: argc, info, ndim, num_loops
|
|
||||||
|
|
||||||
character(len=32) :: arg0, arg1, arg2
|
|
||||||
|
|
||||||
|
|
||||||
call get_command_argument(0,arg0)
|
|
||||||
|
|
||||||
argc = command_argument_count()
|
|
||||||
if (argc /= 2) then
|
|
||||||
call print_usage(trim(arg0))
|
|
||||||
! write(6,'("Usage: ",a," NDIM NUM_LOOPS, where NDIM is a positive integer")') trim(arg0);
|
|
||||||
stop
|
|
||||||
end if
|
|
||||||
|
|
||||||
call get_command_argument(1,arg1,status=info)
|
|
||||||
if (info /= 0) then
|
|
||||||
write(6,'("Error reading argument: info = ",i2)') info
|
|
||||||
call print_usage(trim(arg0))
|
|
||||||
stop
|
|
||||||
end if
|
|
||||||
|
|
||||||
call get_command_argument(2,arg2,status=info)
|
|
||||||
if (info /= 0) then
|
|
||||||
write(6,'("Error reading argument: info = ",i2)') info
|
|
||||||
call print_usage(trim(arg0))
|
|
||||||
stop
|
|
||||||
end if
|
|
||||||
|
|
||||||
read(arg1,*,iostat=info) ndim
|
|
||||||
if (info /= 0) then
|
|
||||||
write(6,'("Error converting ndim argument to integer: info = ",i2)') info
|
|
||||||
call print_usage(trim(arg0))
|
|
||||||
stop
|
|
||||||
end if
|
|
||||||
|
|
||||||
read(arg2,*,iostat=info) num_loops
|
|
||||||
if (info /= 0) then
|
|
||||||
write(6,'("Error converting num_loops argument to integer: info = ",i2)') info
|
|
||||||
call print_usage(trim(arg0))
|
|
||||||
stop
|
|
||||||
end if
|
|
||||||
|
|
||||||
|
|
||||||
if (ndim < 1) then
|
|
||||||
call print_usage(trim(arg0))
|
|
||||||
stop
|
|
||||||
end if
|
|
||||||
|
|
||||||
call test_dgemm(ndim, num_loops)
|
|
||||||
|
|
||||||
stop
|
|
||||||
end program mamul1
|
|
||||||
|
|
||||||
subroutine set_random_seed(seed)
|
|
||||||
integer :: seed
|
|
||||||
integer :: seed_array_size
|
|
||||||
INTEGER, ALLOCATABLE :: seed_array (:)
|
|
||||||
CALL RANDOM_SEED (SIZE = seed_array_size) ! seed_array_size is set to the size of
|
|
||||||
! ! the seed array
|
|
||||||
ALLOCATE (seed_array(seed_array_size))
|
|
||||||
seed_array = seed
|
|
||||||
CALL RANDOM_SEED (PUT=seed_array(1:seed_array_size))
|
|
||||||
end subroutine
|
|
||||||
|
|
||||||
subroutine print_matrix(mat, ndim)
|
|
||||||
implicit none
|
|
||||||
integer, parameter :: dp = kind(1.0d0)
|
|
||||||
real(dp), intent(in) :: mat(ndim, ndim)
|
|
||||||
integer, intent(in) :: ndim
|
|
||||||
integer :: irow
|
|
||||||
do irow = 1, ndim
|
|
||||||
write(6, *) mat(irow,:)
|
|
||||||
end do
|
|
||||||
end subroutine
|
|
||||||
|
|
||||||
! square matrix multiplication
subroutine sqmatmul(amat, bmat, cmat, ndim)
#if defined(USE_MAGMA_DGEMM_GPU)
    use magma, only: magmaf_init, magmaf_finalize
    use magma, only: magmaf_queue_create, magmaf_queue_destroy
    use magma, only: magmaf_dmalloc, magmaf_free
    use magma, only: magmaf_dsetmatrix, magmaf_dgetmatrix
    use magma, only: magmablasf_dgemm
#endif
    real*8, intent(in) :: amat(ndim,ndim)
    real*8, intent(in) :: bmat(ndim,ndim)
    real*8, intent(out) :: cmat(ndim,ndim)
    integer :: lda, ldb, ldc
    integer :: info

    real :: time_before, time_after
    integer(8) :: num_ops
    real :: gflops

#ifdef USE_MAGMA_DGEMM_GPU
    magma_devptr_t :: d_amat
    magma_devptr_t :: d_bmat
    magma_devptr_t :: d_cmat
    magma_devptr_t :: queue !! really a CPU pointer
#endif
    lda = ceiling(real(ndim)/32)*32
    ldb = ceiling(real(ndim)/32)*32
    ldc = ceiling(real(ndim)/32)*32


#if defined(USE_MAGMA_DGEMM_GPU)
    !! allocate GPU memory
    write(6,'("DEBUG: before matrix A gpu memory allocation (",i0," doubles)")') lda * ndim
    info = magmaf_dmalloc( d_amat, lda*ndim )
    if (d_amat == 0) then
        print "(a)", "failed to allocate d_amat"
        return
    endif
    write(6,'("DEBUG: before matrix B gpu memory allocation (",i0," doubles)")') ldb * ndim
    info = magmaf_dmalloc( d_bmat, ldb*ndim )
    if (d_bmat == 0) then
        print "(a)", "failed to allocate d_bmat"
        return
    endif
    write(6,'("DEBUG: before matrix C gpu memory allocation (",i0," doubles)")') ldc * ndim
    info = magmaf_dmalloc( d_cmat, ldc*ndim )
    if (d_cmat == 0) then
        print "(a)", "failed to allocate d_cmat"
        return
    endif

    ! copy A to dA and B to dB
    call magmaf_queue_create( 0, queue )
    write(6,'("DEBUG: queue = ",i0)') queue
    if (queue == 0) then
        print "(a)", "failed to create a queue"
        return
    endif

    write(6,*) 'DEBUG: copying matrix A from CPU to GPU memory'
    call magmaf_dsetmatrix( ndim, ndim, amat, ndim, d_amat, lda, queue )
    write(6,*) 'DEBUG: copying matrix B from CPU to GPU memory'
    call magmaf_dsetmatrix( ndim, ndim, bmat, ndim, d_bmat, ldb, queue )

    call cpu_time(time_before)
    write (6,*) 'before magmablasf_dgemm, time=', time_before

    call magmablasf_dgemm ('N', 'N', ndim, ndim, ndim, 1.0d0, d_amat, lda, d_bmat, ldb, 0.0d0, d_cmat, ldc, queue)
    call magmaf_queue_sync(queue)

    call cpu_time(time_after)
    num_ops = real(ndim) * real(ndim) * real(ndim) * 2
    gflops = num_ops / (time_after - time_before) / 1.0e9
    write (6,*) 'after magmablasf_dgemm, time=', time_after
    write (6,*) 'magmablasf_dgemm (from gpu memory to gpu memory) duration :', (time_after - time_before), '(', gflops, ' gflops)'

    write(6,*) 'DEBUG: copying matrix C from GPU to CPU memory'
    call magmaf_dgetmatrix( ndim, ndim, d_cmat, ldc, cmat, ndim, queue )
    call magmaf_queue_destroy( queue )

    info = magmaf_free(d_cmat)
    info = magmaf_free(d_bmat)
    info = magmaf_free(d_amat)

#endif

#ifdef USE_DGEMM
    ! subroutine dgemm ( character TRANSA,
    !                    character TRANSB,
    !                    integer M,
    !                    integer N,
    !                    integer K,
    !                    double precision ALPHA,
    !                    double precision, dimension(lda,*) A,
    !                    integer LDA,
    !                    double precision, dimension(ldb,*) B,
    !                    integer LDB,
    !                    double precision BETA,
    !                    double precision, dimension(ldc,*) C,
    !                    integer LDC
    !                  )
    call dgemm('N', 'N', ndim, ndim, ndim, 1.0d0, amat, ndim, bmat, ndim, 0.0d0, cmat, ndim)
#endif

end subroutine

subroutine check_cmat_element(cmat, row, col, amat, bmat, ndim)
    real(8), intent(in) :: cmat(ndim, ndim)
    integer, intent(in) :: row
    integer, intent(in) :: col
    real(8), intent(in) :: amat(ndim, ndim)
    real(8), intent(in) :: bmat(ndim, ndim)
    integer, intent(in) :: ndim

    real(8) :: x
    x = 0.0d0
    do i = 1, ndim
        x = x + amat(row, i) * bmat(i, col)
    end do

    write(6, '("expected cmat(", i0, ", ", i0, ")", e23.15e3)') row, col, x
    write(6, '("computed cmat(", i0, ", ", i0, ")", e23.15e3)') row, col, cmat(row, col)
    if (abs(cmat(row, col) - x) > 1.0e-8) then
        stop 'a computed element has a wrong value'
    end if
end subroutine


subroutine test_dgemm(ndim, num_loops)
#if defined(USE_MAGMA_DGEMM_GPU)
    use magma, only: magmaf_init, magmaf_finalize
    use magma, only: magmablasf_dgemm !, magmaf_dgemm_gpu
#endif

    implicit none
    integer, intent(in) :: ndim
    integer, intent(in) :: num_loops
    integer, parameter :: dp = kind(1.0d0)
    real :: ct_start, ct_stop ! elapsed cpu time relative to an arbitrary fixed time. Expressed in seconds with the granularity of 1 microsecond
    integer(8) :: num_ops
    real :: gflops

    integer :: sc_start, sc_stop ! system clock time of start and stop events, expressed in ticks
    integer :: sc_count_rate ! number of system clock ticks per second
    integer :: sc_count_max ! the max possible number of system clock ticks returned by system_clock
    integer :: s
    REAL :: a_diff, diff
    REAL :: num_sc_ticks_per_second ! the number of system clock ticks per second

    real*8, allocatable :: amat(:,:)
    real*8, allocatable :: bmat(:,:)
    real*8, allocatable :: cmat(:,:)
    real(dp) :: x
    integer :: i, j

#if defined(USE_MAGMA_DGEMM_GPU)
    write(6,*) 'DEBUG: init magma'
    call magmaf_init()
#endif

    ! First initialize the system_clock
    CALL system_clock(count_rate=sc_count_rate)
    CALL system_clock(count_max=sc_count_max)
    num_sc_ticks_per_second = REAL(sc_count_rate)
    WRITE(*,*) "system_clock rate : ", num_sc_ticks_per_second, " ticks per second"

    diff = 0.0
    a_diff = 0.0
    s = 0

    allocate(amat(ndim, ndim))
    allocate(bmat(ndim, ndim))
    allocate(cmat(ndim, ndim))

    call set_random_seed(42)

    !call random_number(amat)
    !amat = 0.5_dp*(amat + transpose(amat))
    do j = 1, ndim
        do i = 1, ndim
            call random_number(x)
            amat(i,j) = x
            call random_number(x)
            bmat(i,j) = x
        end do
    end do

    call cpu_time(ct_start)
    call system_clock(sc_start)

    do j = 1, num_loops
        ! playmat = amat

        call sqmatmul(amat, bmat, cmat, ndim)

    end do

    call cpu_time(ct_stop)
    call system_clock(sc_stop)
    if ( (sc_stop - sc_start)/num_sc_ticks_per_second < (ct_stop - ct_start) ) s = s + 1
    diff = (sc_stop - sc_start)/num_sc_ticks_per_second - (ct_stop - ct_start) + diff
    a_diff = ABS((sc_stop - sc_start)/num_sc_ticks_per_second - (ct_stop - ct_start)) + a_diff

    ! check the four corner elements of cmat
    call check_cmat_element(cmat, 1, 1, amat, bmat, ndim)
    call check_cmat_element(cmat, 1, ndim, amat, bmat, ndim)
    call check_cmat_element(cmat, ndim, 1, amat, bmat, ndim)
    call check_cmat_element(cmat, ndim, ndim, amat, bmat, ndim)

    ! write(6, *) 'amat = '
    ! call print_matrix(amat, ndim)

    ! write(6, *) 'bmat = '
    ! call print_matrix(bmat, ndim)

    ! write(6, *) 'cmat = '
    ! call print_matrix(cmat, ndim)

    num_ops = real(ndim) * real(ndim) * real(ndim) * 2 * num_loops
    gflops = num_ops / (ct_stop-ct_start) / 1.0e9


    write(6, '("Time taken by dgemm for matrix size ",i8," was ",f10.2," seconds")') ndim, ct_stop-ct_start
    WRITE(*,*) "gflops (including potential memory transfers) : ", gflops

    WRITE(*,*) "system_clock : ",(sc_stop - sc_start)/num_sc_ticks_per_second
    WRITE(*,*) "cpu_time : ",(ct_stop - ct_start)
    WRITE(*,*) "sys_clock < cpu_time : ",s
    WRITE(*,*) "mean diff : ",diff
    WRITE(*,*) "abs mean diff : ",a_diff

#if defined(USE_MAGMA_DGEMM_GPU)
    write(6,*) 'DEBUG: deinit magma'
    call magmaf_finalize()
#endif


    deallocate(amat, bmat, cmat)
end
@ -1,25 +0,0 @@
import unittest
import logging
from pathlib import Path
# from cocluto import ClusterController
from starbench.main import starbench_cmake_app, ExistingDir


class StarbenchTestCase(unittest.TestCase):

    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

    def setUp(self) -> None:  # pylint: disable=useless-parent-delegation
        return super().setUp()

    def test_mamul1_benchmark(self):
        logging.info('test_mamul1_benchmark')
        source_code_provider = ExistingDir(Path('test/mamul1').absolute())
        tmp_dir = Path('tmp').absolute()
        benchmark_command = ['./mamul1', '3000', '10']
        starbench_cmake_app(source_code_provider=source_code_provider, tmp_dir=tmp_dir, num_cores=2, benchmark_command=benchmark_command)
        # self.assertIsInstance(job_state, JobsState)


if __name__ == '__main__':
    unittest.main()
@ -1,27 +0,0 @@
This example illustrates how `starbench` is used at IPR (Institut de Physique de Rennes) to measure the performance of [hibridon](https://github.com/hibridon/hibridon) on IPR's cluster (`physix`).

Usage:

```sh
20241007-15:08:10 graffy@graffy-ws2:~/work/starbench/starbench.git$ rsync --exclude .git --exclude starbench.venv --exclude tmp --exclude usecases/ipr/hibench/results -va ./ graffy@alambix.ipr.univ-rennes.fr:/opt/ipr/cluster/work.global/graffy/starbench.git/
sending incremental file list

sent 1,416 bytes received 25 bytes 960.67 bytes/sec
total size is 140,225 speedup is 97.31
last command status : [0]
```

```sh
graffy@alambix-frontal:/opt/ipr/cluster/work.global/graffy/starbench.git/usecases/ipr/hibench$ ./hibenchonphysix.py --commit-id 53894da48505892bfa05693a52312bacb12c70c9 --results-dir $GLOBAL_WORK_DIR/graffy/hibridon/benchmarks/starbench/hibench/$(date --iso=seconds) --arch-regexp 'intel_xeon_x5650' --cmake-path /usr/bin/cmake
```

The `hibenchonphysix.py` script launches two `sge` jobs for each machine type in the `physix` cluster:
- one job that performs a benchmark of hibridon with the `gfortran` compiler
- one job that performs a benchmark of hibridon with the `ifort` compiler
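
The fan-out described above (one job per machine type and compiler) could be sketched as follows. This is only an illustration and not the actual `hibenchonphysix.py` code: the `BenchmarkJob` type and the `plan_benchmark_jobs` function are hypothetical names, and the machine types are simply the two that appear in this README.

```python
from dataclasses import dataclass
from itertools import product
from typing import List

# The two compilers named in the README.
COMPILERS = ('gfortran', 'ifort')


@dataclass(frozen=True)
class BenchmarkJob:
    """Hypothetical description of one job to submit (not a starbench type)."""
    machine_type: str
    compiler: str


def plan_benchmark_jobs(machine_types: List[str]) -> List[BenchmarkJob]:
    """Return one job per (machine type, compiler) pair."""
    return [BenchmarkJob(machine_type, compiler)
            for machine_type, compiler in product(machine_types, COMPILERS)]


if __name__ == '__main__':
    for job in plan_benchmark_jobs(['intel_xeon_x5550', 'intel_xeon_x5650']):
        print(f'{job.machine_type}: {job.compiler}')
```

With two machine types this plans four jobs; actually submitting them to `sge` is left out of the sketch.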

When a job completes successfully, it writes the results of the benchmark to `physix`'s global work directory (e.g. `/opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/53894da48505892bfa05693a52312bacb12c70c9/nh3h2_qma_long/intel_xeon_x5550/gfortran`)
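
The example path above suggests a layout of the form `<root>/<commit id>/<test name>/<machine type>/<compiler>`. A minimal sketch that builds such a path is given below, assuming that this layout is what the script uses. The function `benchmark_results_dir` and its parameter names are hypothetical, and reading `nh3h2_qma_long` as a test name is a guess based on the example.

```python
from pathlib import PurePosixPath


def benchmark_results_dir(results_root: PurePosixPath, commit_id: str, test_id: str,
                          machine_type: str, compiler: str) -> PurePosixPath:
    """Build a results directory path following the layout seen in the README example.

    Assumed layout: <root>/<commit id>/<test name>/<machine type>/<compiler>
    """
    return results_root / commit_id / test_id / machine_type / compiler


if __name__ == '__main__':
    root = PurePosixPath('/opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench')
    print(benchmark_results_dir(root, '53894da48505892bfa05693a52312bacb12c70c9',
                                'nh3h2_qma_long', 'intel_xeon_x5550', 'gfortran'))
```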