Commit 77c5ef34 authored by Ralf Jung's avatar Ralf Jung
Browse files

tweaks to server docs

parent 315695de
......@@ -7,31 +7,40 @@ two servers are relevant
+ two sockets (10cores each) to prevent interference when measuring timings
+ one socket for timing jobs, one socket for everything else
+ different sockets means different memory controllers, important for timings
+ config: `/etc/systemd/system.conf` sets cpu-affinity system (cpu topology can be validated with `lscpu`)
+ `/etc/gitlab-runner/config.toml` is the main config for the CI runner (affinity for timing jobs configured to avoid interference through SMT)
+ config: `/etc/systemd/system.conf` sets cpu-affinity for system to all the cores on one socket (cpu topology can be validated with `lscpu`):
`CPUAffinity=1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39`
+ `/etc/gitlab-runner/config.toml` is the main config for the CI runner
+ here we set cpu-affinity for timing jobs to the cores on the other socket -- but only one core for each hyper-threading pair:
`cpuset_cpus = "0,2,4,6,8,10,12,14,16,18"`
(so effectively we disable hyper-threading on that socket)
+ the other jobs use the same cores as the system itself:
`cpuset_cpus = "1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39"`
+ using S3-Server for caching (served by MPI)
+ for timing measurements: #instructions are measured with `perf` (less noisy than exact times)
+ server doesn't use custom kernel, but rather the standard Debian kernel (to enable `perf` in Docker)
+ server doesn't use custom MPI kernel, but rather the standard Debian kernel (to enable `perf` in Docker)
+ sever is stateless, can be wiped anytime
+ may need to clean up storage from time to time, the local copies of the caches on this server can run full.
(somewhere in `/local/docker`, a lot of space will be used; deleting old files usually works pretty well.)
- coq-speed.mpi-sws.org
+ runs in some VM somewhere, serves the `coq-speed.mpi-sws.org` frontend
+ PostreSQL + Grafana + home-made Rust-service to fill the DB with timing info (`https://gitlab.mpi-sws.org/iris/ci-timing-webhook`)
+ stateful, has a backup job
+ stateful (data in PostreSQL, also tricky Grafana config), has a backup job
+ Grafana config:
- queries that put data in right format for Grafana, configured through the Grafana frontend
- hopefully won't have to touch those
+ CI scripts
- base-CI-image is on github.com/RalfJung/opam-ci (branch opam2)
- CI config (.gitlab-ci.yml) is identical for most projects
+ `tags` are used for setting the runner (`fp` (default), `fp-timing`)
+ `variables`: sets environment variables for CI, list with variables is at the top of https://gitlab.mpi-sws.org/iris/ci/-/blob/master/buildjob
+ `OPAM_PINS` variable: uses triples separated by spaces to configure opam pins (e.g. `coq-core.dev git git+https://....`); there's a shortcut for use with git repos containing `.opam` configs, just use `git+https://...` instead of the triple
+ `CI_RUNNER_DESCRIPTION`: parsed for activation of `perf`, sets the `TIMECMD` env variable (which is read by Coq); the `perf` script calls `perf` with the right parameters and concatenates the output files to one which is parsed by the frontend (this concatenated info is also shown in the CI output log on Gitlab)
+ ...
+ for creating/updating opam packages: https://gitlab.mpi-sws.org/iris/opam-updater (private due to secrets) runs CI jobs for updating.
+ updating coqdocs: uses a `coqdoc` user which can only use rsync
+ `prepare-opam.sh`: uses a few opam workarounds; in case there are problems with the opam cache, use the `Clear Runner Caches` button on the pipeline overview in Gitlab. may need to clean up storage from time to time.
- iris: `iris-bot` script does two things: checking if reverse dependencies still build and checking two commits for timing differences. To do that, there are `trigger-iris.timing`/`trigger-iris.dev` CI jobs, CI is using the `TIMING_AD_HOC_ID` variable to check which one to run.
- CI scripts
- base-CI-image is on github.com/RalfJung/opam-ci (branch opam2)
- CI config (.gitlab-ci.yml) is identical for most projects
+ `tags` are used for setting the runner (`fp` (default), `fp-timing`)
+ `variables`: sets environment variables for CI, list with variables is at the top of https://gitlab.mpi-sws.org/iris/ci/-/blob/master/buildjob
+ `OPAM_PINS` variable: uses triples separated by spaces to configure opam pins (e.g. `coq-core.dev git git+https://....`); there's a shortcut for use with git repos containing `.opam` configs, just use `git+https://...` instead of the triple
+ `CI_RUNNER_DESCRIPTION`: parsed for activation of `perf`, sets the `TIMECMD` env variable (which is read by Coq); the `perf` script calls `perf` with the right parameters and concatenates the output files to one which is parsed by the frontend (this concatenated info is also shown in the CI output log on Gitlab)
+ see `README.md` for more details
+ for creating/updating opam packages: https://gitlab.mpi-sws.org/iris/opam-updater (private due to secrets) runs CI jobs for updating.
+ updating coqdocs: uses a `coqdoc` user which can only use rsync for the specific coqdoc directory
+ `prepare-opam.sh`: uses a few opam workarounds; in case there are problems with the opam cache, use the `Clear Runner Caches` button on the pipeline overview in Gitlab.
This will not delete anything, just bump up some counter to avoid using the old caches.
- iris: `iris-bot` script does two things: checking if reverse dependencies still build and checking two commits for timing differences. To do that, there are `trigger-iris.timing`/`trigger-iris.dev` CI jobs, CI is using the `TIMING_AD_HOC_ID` variable to check which one to run. Again see `README.md` for more details.
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment