# iprbench
## install iprbench
```sh
graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ python3 -m venv iprbench.venv
graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ source ./iprbench.venv/bin/activate
(iprbench.venv) graffy@alambix-frontal:/opt/ipr/cluster/work.local/graffy/bug3372$ pip install ./iprbench.git
```
## run unit tests
```sh
20241018-16:56:05 graffy@graffy-ws2:~/work/starbench/iprbench.git$ python3 -m unittest
```
## launch a benchmark on the current system
Here's a simple example to run the benchmark `mamul1` on the current system:
```sh
(iprbench.venv) graffy@alambix50:/opt/ipr/cluster/work.local/graffy/bug3958/iprbench.git$ iprbench-run --benchmark-id 'mamul1' --config '{"fortran_compiler": "gfortran:<default>", "blas_library": "<default-libblas>:<default>", "matrix_size": 1024, "num_loops":10, "num_cores":2, "launcher": "manual"}' --results-dir /tmp/mamul1_out --target-system-type-id 'debian' --resultsdb-params '{"type": "tsv-files", "tsv_results_dir": "/tmp/mamul1_out/tsv"}'
DEBUG:root:extracting package iprbench.resources.mamul1 resource CMakeLists.txt to /tmp/mamul1_out/mamul1
DEBUG:root:extracting package iprbench.resources.mamul1 resource mamul1.F90 to /tmp/mamul1_out/mamul1
DEBUG:root:shell_command = "starbench --source-tree-provider '{"type": "existing-dir", "dir-path": "/tmp/mamul1_out/mamul1"}' --num-cores 2 --output-dir=/tmp/mamul1_out/output --cmake-path=/usr/bin/cmake --cmake-option=-DCMAKE_BUILD_TYPE=Release --cmake-option=-DCMAKE_Fortran_COMPILER=gfortran --cmake-option=-DBLA_VENDOR=OpenBLAS --benchmark-command='./mamul1 1024 10' --output-measurements=/tmp/mamul1_out/output/measurements.tsv"
creating build directory /tmp/mamul1_out/output/worker<worker_id>
executing the following command in parallel (2 parallel runs) : '['mkdir', '-p', '/tmp/mamul1_out/output/worker<worker_id>/build']'
mean duration : 0.002 s (2 runs)
configuring /tmp/mamul1_out/mamul1 into /tmp/mamul1_out/output/worker<worker_id>/build ...
executing the following command in parallel (2 parallel runs) : '['/usr/bin/cmake', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_Fortran_COMPILER=gfortran', '-DBLA_VENDOR=OpenBLAS', '/tmp/mamul1_out/mamul1']'
mean duration : 0.057 s (2 runs)
building /tmp/mamul1_out/output/worker<worker_id>/build ...
executing the following command in parallel (2 parallel runs) : '['make']'
mean duration : 0.368 s (2 runs)
benchmarking /tmp/mamul1_out/output/worker<worker_id>/build ...
executing the following command in parallel (2 parallel runs) : '['./mamul1', '1024', '10']'
mean duration : 1.933 s (2 runs)
duration : 1.933 s
DEBUG:root:output_measurements_file_path = /tmp/mamul1_out/output/measurements.tsv
DEBUG:root:row = Unnamed: 0 0.000000
run_id 0.000000
duration 1.932536
Name: 0, dtype: float64
DEBUG:root:row = Unnamed: 0 1.000000
run_id 1.000000
duration 1.933324
Name: 1, dtype: float64
total number of cores (including virtual cores) on this host : 24
DEBUG:root:table_file_path=/tmp/mamul1_out/tsv/mamul1.tsv
measurement_time ipr_bench_version host_id ... duration_stddev duration_min duration_max
0 2024-11-27 10:51:02.551947 0.0.13 <unknown> ... 0.000557 1.932536 1.933324
[1 rows x 18 columns]
```
Now that the run has completed successfully, the results can be found in the directory `/tmp/mamul1_out/tsv` that we passed as `tsv_results_dir`:
```sh
(iprbench.venv) graffy@alambix50:/opt/ipr/cluster/work.local/graffy/bug3958/iprbench.git$ cat /tmp/mamul1_out/tsv/mamul1.tsv
measurement_time ipr_bench_version host_id host_fqdn user num_cpus cpu_model launcher fortran_compiler blas_library num_cores matrix_size num_loops duration_avg duration_med duration_stddev duration_min duration_max
2024-11-27 10:51:02.551947 0.0.13 <unknown> alambix50.ipr.univ-rennes.fr graffy 2 intel_xeon_x5650 manual gfortran:12.2.0 libopenblas-pthread:0.3.21 2 1024 10 1.93293 1.93293 0.0005572001435750071 1.932536 1.933324
```
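Since the results file is plain tab-separated text, it can also be read programmatically. The following is a minimal sketch using only the Python standard library. The helper name `read_results` and the in-memory sample (a subset of the columns shown above) are illustrative assumptions, not part of iprbench:

```python
import csv
import io


def read_results(stream):
    """Parse a tab-separated results stream into a list of dicts (one per row)."""
    return list(csv.DictReader(stream, delimiter="\t"))


# In-memory sample mirroring a subset of the columns of mamul1.tsv shown above.
sample = io.StringIO(
    "cpu_model\tnum_cores\tmatrix_size\tduration_avg\tduration_stddev\n"
    "intel_xeon_x5650\t2\t1024\t1.93293\t0.0005572001435750071\n"
)

rows = read_results(sample)
for row in rows:
    # All values are read as strings; convert the numeric ones explicitly.
    print(f"{row['cpu_model']}: {float(row['duration_avg']):.3f} s "
          f"over matrix size {int(row['matrix_size'])}")
```

To read a real file, pass an open file object instead of the sample, for example `open("/tmp/mamul1_out/tsv/mamul1.tsv", newline="")`.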
## launch benchmark jobs on alambix cluster
The following example command submits one job per cpu architecture to run the benchmark `hibridon` on the cluster `alambix`. In this example, `--arch-regexp` restricts the submission to the architecture `intel_xeon_x5650`.
```sh
(iprbench.venv) graffy@alambix50:/opt/ipr/cluster/work.local/graffy/bug3958/iprbench.git$ results_dir="$GLOBAL_WORK_DIR/graffy/iprbenchs/test_results/clusterbench_submit/$(date --iso-8601=seconds)"; clusterbench-submit --cluster-id 'alambix' --arch-regexp "intel_xeon_x5650.*" --benchmark-id 'hibridon' --config '{"fortran_compiler": "ifort:<default>", "blas_library": "intelmkl:<default>", "test_id": "arch4_quick", "hibridon_version": "a3bed1c3ccfbca572003020d3e3d3b1ff3934fad", "cmake_path": "cmake", "num_cores": 1, "launcher": "graffy.manual"}' --results-dir "${results_dir}" --resultsdb-params '{"type": "tsv-files", "tsv_results_dir": "'"$results_dir"'"}' --target-system-type-id "fr.univ-rennes.ipr.cluster-node"
INFO:root:available host groups: dict_keys(['intel_core_i5_8350u', 'intel_xeon_x5550', 'intel_xeon_x5650', 'intel_xeon_e5-2660', 'intel_xeon_e5-2660v2', 'intel_xeon_e5-2660v4', 'intel_xeon_gold_6140', 'intel_xeon_gold_6154', 'intel_xeon_gold_5220', 'intel_xeon_gold_6226r', 'intel_xeon_gold_6248r', 'intel_xeon_gold_6348', 'amd_epyc_7282', 'amd_epyc_7452'])
INFO:root:requested host groups: ['intel_xeon_x5650']
DEBUG:root:iprbench_venv_hardcoded_path = /tmp/user/59825/iprbench.venv
INFO:root:creating /opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00/iprbench.venv.tgz (the virtual environment that will be used in this bench by all its jobs at some point)
Collecting virtualenv-clone
Using cached virtualenv_clone-0.5.7-py3-none-any.whl (6.6 kB)
Installing collected packages: virtualenv-clone
Successfully installed virtualenv-clone-0.5.7
DEBUG:root:type of resultsdb_params = <class 'dict'>
DEBUG:root:resultsdb_params = {'type': 'tsv-files', 'tsv_results_dir': '/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00'}
DEBUG:root:resultsdb_params = {"type": "tsv-files", "tsv_results_dir": "/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00"}
DEBUG:root:tags_dict = {'<benchmark_id>': 'hibridon', '<starbench_job_path>': '/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00/intel_xeon_x5650/starbench.job', '<iprbench_venv_hardcoded_path>': '/tmp/user/59825/iprbench.venv', '<iprbench_venv_archive_path>': '/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00/iprbench.venv.tgz', '<benchmark_config>': '{\\"fortran_compiler\\": \\"ifort:<default>\\", \\"blas_library\\": \\"intelmkl:<default>\\", \\"test_id\\": \\"arch4_quick\\", \\"hibridon_version\\": \\"a3bed1c3ccfbca572003020d3e3d3b1ff3934fad\\", \\"cmake_path\\": \\"cmake\\", \\"num_cores\\": 12, \\"launcher\\": \\"graffy.manual.alambix.job\'${JOB_ID}\'\\"}', '<results_dir>': '/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00', '<resultsdb_params>': '{\\"type\\": \\"tsv-files\\", \\"tsv_results_dir\\": \\"/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00\\"}', '<num_cores>': '12', '<target_system_type_id>': 'fr.univ-rennes.ipr.cluster-node'}
DEBUG:root:ram_per_core = 1.073741824G
DEBUG:root:qsub_args = ['-pe', 'smp', '12', '-l', '"hostname=alambix50.ipr.univ-rennes.fr"', '-S', '/bin/bash', '-cwd', '-m', 'ae', '-l', 'mem_available=1.073741824G', '-j', 'y', '-N', 'hibridon_intel_xeon_x5650']
DEBUG:root:qsub_command = qsub -pe smp 12 -l "hostname=alambix50.ipr.univ-rennes.fr" -S /bin/bash -cwd -m ae -l mem_available=1.073741824G -j y -N hibridon_intel_xeon_x5650 /opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00/intel_xeon_x5650/starbench.job , working_dir=/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00/intel_xeon_x5650
Your job 18886 ("hibridon_intel_xeon_x5650") has been submitted
```
The following command shows that the job is running:
```sh
(iprbench.venv) graffy@alambix50:/opt/ipr/cluster/work.local/graffy/bug3958/iprbench.git$ qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
18886 0.65000 hibridon_i graffy r 11/26/2024 18:15:32 short.q@alambix50.ipr.univ-ren 12
```
The configuration of the benchmark (`--config`) is set to run the test `arch4_quick` using the latest versions of ifort and mkl:
```json
{
"fortran_compiler": "ifort:<default>",
"blas_library": "intelmkl:<default>",
"test_id": "arch4_quick",
"hibridon_version": "a3bed1c3ccfbca572003020d3e3d3b1ff3934fad",
"cmake_path": "cmake",
"num_cores": 1,
"launcher": "graffy.manual"
}
```
This causes the benchmark to use the latest versions of ifort and mkl available on the cluster node that runs the benchmark.

Note: the value given to `num_cores` does not matter, as `clusterbench-submit` overwrites it with the number of cores of the cluster node that runs the benchmark.
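
Hand-writing the JSON passed to `--config` inside shell quotes is error-prone. As a sketch (this is not part of iprbench), the argument can be generated from a Python dictionary and quoted for the shell. Only the `--config` option and the configuration keys come from the example above; the variable names are illustrative:

```python
import json
import shlex

# Same benchmark configuration as in the example above.
config = {
    "fortran_compiler": "ifort:<default>",
    "blas_library": "intelmkl:<default>",
    "test_id": "arch4_quick",
    "hibridon_version": "a3bed1c3ccfbca572003020d3e3d3b1ff3934fad",
    "cmake_path": "cmake",
    "num_cores": 1,  # overwritten by clusterbench-submit anyway
    "launcher": "graffy.manual",
}

# Serialize to JSON, then quote the result so that it survives shell parsing.
config_json = json.dumps(config)
config_arg = shlex.quote(config_json)

# Print the fragment to paste into the clusterbench-submit command line.
print("--config", config_arg)
```
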
The results database backend used in the benchmark (`--resultsdb-params`) is:
```json
{
"type": "tsv-files",
"tsv_results_dir": "/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00"
}
```
This means that the results of the benchmark are recorded in the tsv (tab-separated values) file `/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00/hibridon.tsv`. Please note that this results database backend is not really appropriate for `clusterbench-submit`, as it suffers from race conditions when several jobs write to the same file (`sqlserver-viassh-database` would be a better alternative, but it requires a more complicated setup).

When the jobs complete successfully, they store the benchmark results in `$results_dir` (e.g. `/opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11:39:42+01:00`):
```sh
(iprbench.venv) graffy@alambix50:/opt/ipr/cluster/work.local/graffy/bug3958/iprbench.git$ cat /opt/ipr/cluster/work.global/graffy/iprbenchs/test_results/clusterbench_submit/2024-11-27T11\:39\:42+01\:00/hibridon.tsv
measurement_time ipr_bench_version host_id host_fqdn user num_cpus cpu_model launcher num_cores hibridon_version fortran_compiler blas_library test_id cmake_path duration_avg duration_med duration_stddev duration_min duration_max num_threads_per_run
2024-11-27 11:42:49.511113 0.0.14 <unknown> alambix50.ipr.univ-rennes.fr graffy 2 intel_xeon_x5650 graffy.manual.alambix.job18886 12 a3bed1c3ccfbca572003020d3e3d3b1ff3934fad ifort:2021.13.1 intelmkl:2024.2.1 arch4_quick cmake 3.8646755 3.8377410000000003 0.2467767843388766 3.569571 4.220125 1
```
## launch benchmark jobs on alambix cluster
### run hibridon benchmark on alambix (production version where results are stored on `iprbenchs` database):
```sh
(iprbench.venv) graffy@alambix50:/opt/ipr/cluster/work.local/graffy/bug3958/iprbench.git$ results_dir="$GLOBAL_WORK_DIR/graffy/iprbenchs/test_results/clusterbench_submit/$(date --iso-8601=seconds)"; clusterbench-submit --cluster-id 'alambix' --arch-regexp ".*" --benchmark-id 'hibridon' --config '{"fortran_compiler": "ifort:<default>", "blas_library": "intelmkl:<default>", "test_id": "nh3h2_qma_long", "hibridon_version": "a3bed1c3ccfbca572003020d3e3d3b1ff3934fad", "cmake_path": "cmake", "num_cores": 1, "launcher": "graffy.manual"}' --results-dir "${results_dir}" --resultsdb-params '{ "type": "sqlserver-viassh-database", "db_server_fqdn": "iprbenchsdb.ipr.univ-rennes1.fr", "db_user": "iprbenchw", "db_name": "iprbenchs", "ssh_user": "iprbenchw" }' --target-system-type-id "fr.univ-rennes.ipr.cluster-node"
```
Note: for these runs to succeed, `graffy@alambix` is expected to have the privileges to write to the `iprbenchs` database. This is ensured by allowing `graffy@alambix` to ssh to `iprbenchw@iprbenchsdb.ipr.univ-rennes1.fr` with the ssh key `perf-bencher` (which has been added to `iprbenchw@iprbenchsdb.ipr.univ-rennes1.fr:~iprbenchw/.ssh/authorized_keys` by maco), using the following setup:
```sh
graffy@alambix-frontal:~$ ls -la ./.ssh/
total 92
drwx------ 2 graffy spm 4096 27 nov. 15:46 .
drwxr-xr-x 171 graffy spm 12288 4 déc. 19:22 ..
-rw------- 1 graffy spm 607 22 mai 2024 authorized_keys
-rw-r----- 1 graffy spm 296 27 nov. 15:46 config
-rw------- 1 graffy spm 16658 18 nov. 22:48 known_hosts
...
-rw------- 1 graffy spm 411 27 nov. 15:29 perf-bencher
-rw-r--r-- 1 graffy spm 98 27 nov. 15:29 perf-bencher.pub
...
```
```sh
graffy@alambix-frontal:~$ cat .ssh/config
...
Host iprbenchsdb.ipr.univ-rennes1.fr
hostname iprbenchsdb.ipr.univ-rennes1.fr
user test_iprbenchw
IdentityFile ~/.ssh/perf-bencher
```
### run hibridon benchmark on alambix (test version where results are stored on `test_iprbenchs` database):
This example is the same as the previous one except:
- use `test_iprbenchs` database instead of `iprbenchs`
- use `test_iprbenchw` user instead of `iprbenchw`
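
As an illustration of these two substitutions, the `--resultsdb-params` value for the test version can be derived from the production one. This is only a sketch: it assumes that both `db_user` and `ssh_user` switch to `test_iprbenchw`, which the list above does not state explicitly, and the variable names are illustrative:

```python
import json

# Production parameters, as passed to --resultsdb-params in the previous example.
production_params = {
    "type": "sqlserver-viassh-database",
    "db_server_fqdn": "iprbenchsdb.ipr.univ-rennes1.fr",
    "db_user": "iprbenchw",
    "db_name": "iprbenchs",
    "ssh_user": "iprbenchw",
}

# Assumption: the test account replaces the production account for both the
# database user and the ssh user; everything else is kept unchanged.
test_params = dict(
    production_params,
    db_name="test_iprbenchs",
    db_user="test_iprbenchw",
    ssh_user="test_iprbenchw",
)

# JSON string to pass (single-quoted) to --resultsdb-params.
print(json.dumps(test_params))
```
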
## graph the results of benchmarks
`showresults` is a command-line tool that graphs the results once they have been downloaded from the results directory (for example `/opt/ipr/cluster/work.global/graffy/hibridon/benchmarks/starbench/hibridon/2024-10-10T12:11:44+02:00`) to the path `/home/graffy/work/starbench/starbench.git/usecases/ipr/hibridon/results`, which is currently hardcoded:
```sh
20241010-16:30:54 graffy@graffy-ws2:~/work/starbench/iprbench.git$ showresults
```