From e98b7d27bc969762ec4952f82634bb6e6375b8c2 Mon Sep 17 00:00:00 2001
From: Boris Kolpackov
Date: Wed, 6 Mar 2024 11:10:27 +0200
Subject: Document auxiliary machine semantics in manual

---
 doc/manual.cli | 295 +++++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 237 insertions(+), 58 deletions(-)

diff --git a/doc/manual.cli b/doc/manual.cli
index 41f0eeb..885fc48 100644
--- a/doc/manual.cli
+++ b/doc/manual.cli
@@ -41,12 +41,24 @@ that are executed on the build host. Inside virtual machines/containers,
 agent. Virtual machines and containers running a \c{bbot} instance in the
 worker mode are collectively called \i{build machines}.
 
+In addition to a build machine, a build task may also require one or more
+\i{auxiliary machines} which provide additional components that are required
+for building or testing a package and that are impossible or impractical to
+provide as part of the build machine itself.
+
 Let's now examine the workflow in the other direction, that is, from a worker
-to a controller. Once a build machine is booted (by the agent), the worker
-inside connects to the TFTP server running on the build host and downloads the
-\i{build task manifest}. It then proceeds to perform the build task and
-uploads the \i{build artifacts archive}, if any, followed by the \i{build
-result manifest} (which includes build logs) to the TFTP server.
+to a controller. Once a build machine (plus auxiliary machines, if any) are
+booted (by the agent), the worker inside the build machine connects to the
+TFTP server running on the build host and downloads the \i{build task
+manifest}. It then proceeds to perform the build task and uploads the \i{build
+artifacts archive}, if any, followed by the \i{build result manifest} (which
+includes build logs) to the TFTP server.
+
+Unlike build machines, auxiliary machines are not expected to run \c{bbot}.
+Instead, on boot, they are expected to upload to the TFTP server a list of
+environment variables to propagate to the build machine (see the
+\c{auxiliary-environment} task manifest value as well as \l{#arch-worker
+Worker Logic} for details).
 
 Once an agent receives a build task for a specific build machine, it goes
 through the following steps. First, it creates a directory on its TFTP server
@@ -94,12 +106,14 @@ implementation of the build artifacts upload handling.
 
 \h#arch-machine-config|Configurations|
 
-The \c{bbot} architecture distinguishes between a \i{machine configuration},
-\i{build target configuration}, and a \i{build package configuration}. The
-machine configuration captures the operating system, installed compiler
-toolchain, and so on. The same build machine may be used to \"generate\"
-multiple \i{build target configurations}. For example, the same machine can
-normally be used to produce 32/64-bit and debug/optimized builds.
+The \c{bbot} architecture distinguishes between a \i{build machine
+configuration}, \i{build target configuration}, and a \i{build package
+configuration}. The machine configuration captures the operating system,
+installed compiler toolchain, and so on. The same build machine may be used to
+\"generate\" multiple \i{build target configurations}. For example, the same
+machine can normally be used to produce debug/optimized builds.
+
+\h2#arch-machine-config-build-machine|Build Machine Configuration|
 
 The machine configuration is \i{approximately} encoded in its \i{machine
 name}. The machine name is a list of components separated with \c{-}.
@@ -110,24 +124,24 @@ component.
 The encoding is approximate in a sense that it captures only what's important
 to distinguish in a particular \c{bbot} deployment.
 
-The first component normally identifies the operating system and has the
-following recommended form:
+The first three components normally identify the architecture, operating
+system, and optional variant.
They have the following recommended form:
 
 \
-[_][_][_]
+-[_][_][-]
 \
 
 For example:
 
 \
-windows
-windows_10
-windows_10.1607
-i686_windows_xp
-bsd_freebsd_10
-linux_centos_6.2
-linux_ubuntu_16.04
-macos_10.12
+x86_64-windows
+x86_64-windows_10
+x86_64-windows_10.1607
+x86_64-windows_10-devmode
+x86_64-bsd_freebsd_10
+x86_64-linux_ubuntu_16.04
+x86_64-linux_rhel_9.2-bindist
+aarch64-macos_10.12
 \
 
 The second component normally identifies the installed compiler toolchain and
@@ -144,38 +158,53 @@ gcc
 gcc_6
 gcc_6.3
 gcc_6.3_mingw_w64
+clang_3.9
 clang_3.9_libc++
-clang_3.9_libstdc++
 msvc_14
-msvc_14u3
-icc
+msvc_14.3
+clang_15.0_msvc_msvc_17.6
+clang_16.0_llvm_msvc_17.6
 \
 
 Some examples of complete machine names:
 
 \
-windows_10-msvc_14u3
-macos_10.12-clang_10.0
-linux_ubuntu_16.04-gcc_6.3
-aarch64_linux_debian_11-gcc_12.2
+x86_64-windows_10-msvc_14.3
+x86_64-macos_10.12-clang_10.0
+aarch64-linux_ubuntu_16.04-gcc_6.3
+aarch64-linux_rhel_9.2-bindist-gcc_11
 \
 
+\h2#arch-machine-config-build-target-config|Build Target Configuration|
+
 Similarly, the build target configuration is encoded in a \i{configuration
 name} using the same overall format. As described in \l{#arch-controller
 Controller Logic}, target configurations are generated from machine
 configurations. As a result, it usually makes sense to have the first
 component identify the operating systems and the second component \- the
-toolchain with the rest identifying a particular target configuration variant,
-for example, optimized, sanitized, etc.
For example:
+compiler toolchain with the rest identifying a particular target configuration
+variant, for example, optimized, sanitized, etc:
+
+\
+[_][_]-[-]
+\
+
+For example:
 
 \
-windows-vc_14-O2
-linux-gcc_6-O3_asan
+windows_10-msvc_17.6
+windows_10-msvc_17.6-O2
+windows_10-msvc_17.6-static_O2
+windows_10-msvc_17.6-relocatable
+windows_10-clang_16.0_llvm_msvc_17.6_lld
+linux_debian_12-clang_16_libc++-static_O3
 \
 
-While we can also specify the \c{} component in a build target
-configuration, this information is best conveyed as part of \c{} as
-described in \l{#arch-controller Controller Logic}.
+Note that there is no \c{} component in a build target configuration:
+this information is best conveyed as part of \c{} as described in
+\l{#arch-controller Controller Logic}.
+
+\h2#arch-machine-config-build-package-config|Build Package Configuration|
 
 A package can be built in multiple package configurations per target
 configuration. A build package configuration normally specifies the options
@@ -187,6 +216,42 @@ originate from the package manifest \c{*-build-config}, \c{*-builds},
 \l{bpkg#manifest-package Package Manifest} for more information on these
 values.
 
+
+\h2#arch-machine-config-auxiliary|Auxiliary Machines and Configurations|
+
+Besides the build machine and the build configuration that is derived from it,
+a package build may also involve one or more \i{auxiliary machines} and the
+corresponding \i{auxiliary configurations}.
+
+An auxiliary machine provides additional components that are required for
+building or testing a package and that are impossible or impractical to
+provide as part of the build machine itself. For example, a package may need
+access to a suitably configured database, such as PostgreSQL, in order to run
+its tests.
+
+The auxiliary machine name follows the same overall format as the build
+machine name except that the last component captures the information about the
+additional component in question rather than the compiler toolchain. For
+example:
+
+\
+x86_64-linux_debian_12-postgresql_16
+aarch64-linux_debian_12-mysql_8
+\
+
+The auxiliary configuration name is automatically derived from the machine
+name by removing the \c{} component. For example:
+
+\
+linux_debian_12-postgresql_16
+linux_debian_12-mysql_8
+\
+
+\N|Note that there is no generation of multiple auxiliary configurations from
+the same auxiliary machine since that would require some communication of the
+desired configuration variant to the machine.|
+
+
 \h#arch-machine-header|Machine Header Manifest|
 
 @@ TODO: need ref to general manifest overview in bpkg, or, better yet,
@@ -201,16 +266,28 @@ followed by the detailed description of each value in subsequent sections.
 id:
 name:
 summary:
+[role]: build|auxiliary
+[ram-minimum]:
+[ram-maximum]:
 \
 
 For example:
 
 \
-id: windows_10-msvc_14-1.3
-name: windows_10-msvc_14
+id: x86_64-windows_10-msvc_14-1.3
+name: x86_64-windows_10-msvc_14
 summary: Windows 10 build 1607 with VC 14 update 3
 \
 
+\
+id: aarch64-linux_debian_12-postgresql_16-1.0
+name: aarch64-linux_debian_12-postgresql_16
+summary: Debian 12 with PostgreSQL 16 test user/database
+role: auxiliary
+ram-minimum: 2097152
+ram-maximum: 4194304
+\
+
 \h2#arch-machine-header-id|\c{id}|
 
 \
@@ -243,11 +320,34 @@ summary:
 The one-line description of the machine.
 
 
+\h2#arch-machine-header-role|\c{role}|
+
+\
+[role]: build|auxiliary
+\
+
+The machine role. If unspecified, then \c{build} is assumed.
+
+
+\h2#arch-machine-header-ram|\c{ram-minimum}, \c{ram-maximum}|
+
+\
+[ram-minimum]:
+[ram-maximum]:
+\
+
+The minimum and the maximum amount of RAM in KiB that the machine requires.
+The maximum amount is interpreted as the amount beyond which there will be no
+benefit.
If unspecified, then it is assumed the machine will run with any
+minimum amount a deployment will provide and will always benefit from more
+RAM, respectively.
+
+
 \h#arch-machine|Machine Manifest|
 
 The build machine manifest contains the complete description of a build
 machine on the build host (see the Build OS documentation for their origin and
-location). The machine manifest starts with the machine manifest header with
+location). The machine manifest starts with the machine header manifest with
 all the header values appearing before any non-header values. The non-header
 part of manifest synopsis is presented next followed by the detailed
 description of each value in subsequent sections.
@@ -360,8 +460,11 @@ repository-url:
 [dependency-checksum]:
 
 machine:
+[auxiliary-machine]:
+[auxiliary-machine-]:
 target:
 [environment]:
+[auxiliary-environment]:
 [target-config]:
 [package-config]:
 [host]: true|false
@@ -459,6 +562,21 @@ machine:
 The name of the build machine to use.
 
 
+\h2#arch-task-auxiliary-machine|\c{auxiliary-machine}|
+
+\
+[auxiliary-machine]:
+[auxiliary-machine-]:
+\
+
+The names of the auxiliary machines to use. These values correspond to the
+\c{build-auxiliary} and \c{build-auxiliary-} values in the package
+manifest. While there each value specifies an auxiliary configuration pattern,
+here it specifies the concrete auxiliary machine name that was picked by the
+controller from the list of available auxiliary machines (sent as part of the
+task request) that match this pattern.
+
+
 \h2#arch-task-target|\c{target}|
 
 \
@@ -484,6 +602,50 @@ The name of the build environment to use. See \l{#arch-worker Worker Logic}
 for details.
 
 
+\h2#arch-task-auxiliary-environment|\c{auxiliary-environment}|
+
+\
+[auxiliary-environment]:
+\
+
+The environment variables describing the auxiliary machines.
If any
+\c{auxiliary-machine*} values are specified, then after starting such
+machines, the agent prepares a combined list of environment variables that
+were uploaded by such machines and passes it in this value to the worker.
+
+The format of this value is a list of environment variable assignments
+one per line, in the form:
+
+\
+=
+\
+
+Whitespaces before \c{}, around \c{=}, and after \c{} as well as
+blank lines are ignored. The \c{} part as a whole can be single ('\ ')
+or double (\"\ \") quoted. For example:
+
+\
+DATABASE_HOST=192.168.0.1
+DATABASE_PORT=1245
+DATABASE_USER='John \"Johnny\" Doe'
+DATABASE_NAME=\" test database \"
+\
+
+If the corresponding machine is specified as \c{auxiliary-machine-},
+then its environment variables are prefixed with capitalized \c{_}. For
+example:
+
+\
+auxiliary-machine-pgsql: x86_64-linux_debian_12-postgresql_16
+auxiliary-environment:
+\\
+PGSQL_DATABASE_HOST=192.168.0.1
+PGSQL_DATABASE_PORT=1245
+...
+\\
+\
+
+
 \h2#arch-task-target-config|\c{target-config}|
 
 \
@@ -699,7 +861,7 @@ Note that the overall \c{status} value should appear before any per-operation
 
 The \c{skip} status indicates that the received from the controller build task
 checksums have not changed and the task execution has therefore been skipped
-under the assumtion that it would have produced the same result. See
+under the assumption that it would have produced the same result. See
 \c{agent-checksum}, \c{worker-checksum}, and \c{dependency-checksum} for
 details.
 
@@ -765,9 +927,9 @@ The version of the worker logic used to perform the package build task.
 
 An agent (or controller acting as an agent) sends a task request to its
 controller via HTTP/HTTPS POST method (@@ URL/API endpoint). The task request
-starts with the task request manifest followed by a list of machine manifests.
-The task request manifest synopsis is presented next followed by the detailed
-description of each value in subsequent sections.
+starts with the task request manifest followed by a list of machine header
+manifests. The task request manifest synopsis is presented next followed by
+the detailed description of each value in subsequent sections.
 
 \
 agent:
@@ -776,6 +938,7 @@ toolchain-version:
 [interactive-mode]: false|true|both
 [interactive-login]:
 [fingerprint]:
+[auxiliary-ram]:
 \
 
 
@@ -842,6 +1005,18 @@ authentication in which case it should respond with the 401 (unauthorized)
 HTTP status code.
 
 
+\h2#arch-task-req-auxiliary-ram|\c{auxiliary-ram}|
+
+\
+[auxiliary-ram]:
+\
+
+The amount of RAM in KiB that is available for running auxiliary machines. If
+unspecified, then assume there is no hard limit (that is, the agent can
+allocate up to the host's available RAM minus the amount required to run the
+build machine).
+
+
 \h#arch-task-res|Task Response Manifest|
 
 A controller sends the task response manifest in response to the task request
@@ -969,20 +1144,24 @@ established for a particular build target. The environment has three
 components: the execution environment (environment variables, etc), build
 system modules, as well as configuration options and variables.
 
-Setting up of the environment is performed by an executable (script, batch
-file, etc). Specifically, upon receiving a build task, if it specifies the
-environment name then the worker looks for the environment setup executable
-with this name in a specific directory and for the executable called
-\c{default} otherwise. Not being able to locate the environment executable is
-an error.
-
-Once the environment setup executable is determined, the worker re-executes
-itself as that executable passing to it as command line arguments the target
-name, the path to the \c{bbot} worker to be executed once the environment is
-setup, and any additional options that need to be propagated to the re-executed
-worker. The environment setup executable is executed in the build directory as
-its current working directory.
The build directory contains the build task
-\c{task.manifest} file.
+Setting up of the execution environment is performed by an executable (script,
+batch file, etc). Specifically, upon receiving a build task, if it specifies
+the environment name then the worker looks for the environment setup
+executable with this name in a specific directory and for the executable
+called \c{default} otherwise. Not being able to locate the environment
+executable is an error.
+
+In addition to the environment executable, if the task requires any auxiliary
+machines, then the \c{auxiliary-environment} value from the task manifest is
+incorporated into the execution environment.
+
+Specifically, once the environment setup executable is determined, the worker
+re-executes itself in the auxiliary environment and as that executable passing
+to it as command line arguments the target name, the path to the \c{bbot}
+worker to be executed once the environment is setup, and any additional
+options that need to be propagated to the re-executed worker. The environment
+setup executable is executed in the build directory as its current working
+directory. The build directory contains the build task \c{task.manifest} file.
 
 The environment setup executable sets up the necessary execution environment
 for example by adjusting \c{PATH} or running a suitable \c{vcvars} batch file.
@@ -2211,7 +2390,7 @@ manifest. The matched machine name, the target, the environment name,
 configuration options/variables, and regular expressions are included into the
 build task manifest.
 
-Values in the \c{} list can be opionally prefixed with the
+Values in the \c{} list can be optionally prefixed with the
 \i{step id} or a leading portion thereof to restrict it to a specific step,
 operation, phase, or tool in the \i{worker script} (see \l{#arch-worker Worker
 Logic}). The prefix can optionally begin with the \c{+} or \c{-} character (in
-- 
cgit v1.1
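
Reader's note, not part of the patch: the auxiliary-environment value format that this change documents (one assignment per line, whitespace around the name, the = sign, and the value ignored, blank lines ignored, the value optionally single- or double-quoted as a whole) could be parsed roughly as in the Python sketch below. The function names parse_auxiliary_environment and prefix_variables are hypothetical and do not come from bbot. The sketch also assumes that no escape sequences are interpreted inside quotes and that "capitalized" means upper-cased, neither of which the manual text states explicitly.

```python
def parse_auxiliary_environment(text):
    """Parse NAME=VALUE lines into a dict (illustrative sketch only)."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # Blank lines are ignored.

        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in assignment: {line!r}")

        # Whitespace around the name, '=', and the value is ignored.
        name = name.strip()
        value = value.strip()
        if not name:
            raise ValueError(f"empty variable name: {line!r}")

        # The value as a whole may be single- or double-quoted.
        # Assumption: quotes are simply removed, with no escape handling.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]

        variables[name] = value
    return variables


def prefix_variables(machine_suffix, variables):
    """Prefix names as for auxiliary-machine-<suffix> (assumed upper-casing)."""
    prefix = machine_suffix.upper() + "_"
    return {prefix + name: value for name, value in variables.items()}


if __name__ == "__main__":
    sample = """
    DATABASE_HOST=192.168.0.1
    DATABASE_PORT = 1245
    DATABASE_NAME=" test database "
    """
    print(prefix_variables("pgsql", parse_auxiliary_environment(sample)))
```

Keeping the prefixing separate from the parsing mirrors the manual's description, in which the agent combines the variables uploaded by the auxiliary machines before passing them to the worker. How bbot actually splits this work internally is not shown in the patch.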