From 3218a4ed8f8bfcde4ab8bf2cd3f27d7f0df47787 Mon Sep 17 00:00:00 2001
From: Boris Kolpackov
Date: Fri, 27 Oct 2023 08:20:52 +0200
Subject: Further work on packaging guide

---
 doc/packaging.cli | 269 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 268 insertions(+), 1 deletion(-)

diff --git a/doc/packaging.cli b/doc/packaging.cli
index f3b9129..96c2861 100644
--- a/doc/packaging.cli
+++ b/doc/packaging.cli
@@ -121,6 +121,7 @@ upstream repository project package (third-party project) package \c{git} repository
+multi-package repository
 \h1#core|Core Guidelines|
@@ -171,6 +172,8 @@ repository name. If there is no upstream repository (for example, because the project doesn't use a version control system), the name used in the source archive distribution would be the natural fallback.
+\N|See \l{#core-package-name Decide on the package name} for the complete
+picture on choosing names.|
 \h2#core-repo-create|Create package repository in personal workspace|
@@ -280,10 +283,274 @@ Next add and commit these files:
 \
 git add .
+git status
 git commit -m \"Initialize repository\"
 \
-@@ TODO: note on multi-package repository
+\N|In these guidelines we will be using the package repository setup that is
+capable of having multiple packages. This is recommended even for upstream
+projects that only provide a single package because it gives us the
+flexibility of adding new packages at a later stage without having to perform
+a major restructuring of our repository.
+
+Note also that upstream providing multiple packages is not the only reason we
+may end up having multiple \c{build2} packages. Another common reason is
+factoring tests into a separate package due to a dependency on a testing
+framework
+(see \l{https://github.com/build2/HOWTO/blob/master/entries/handle-tests-with-extra-dependencies.md
+How do I handle tests that have extra dependencies?} for background and
+details).
+While upstream adding new packages may not be very common, upstream
+deciding to use a testing framework is a lot more plausible.
+
+The only notable drawback of using a multi-package setup with a single package
+is the extra subdirectory for the package and a few extra files (such as
+\c{packages.manifest} that lists the packages) in the root of the repository.
+If you are certain that the project that you are converting is unlikely to
+have multiple packages (for example, because you are the upstream) or need
+extra dependencies for its tests (a reasonable assumption for a C project),
+then you could instead go with the single-package repository where the
+repository root is the package root. See \l{bdep-new(1)} for details on how to
+initialize such a repository. In this guide, however, we will continue to
+assume a multi-package repository setup.|
+
+
+\h2#core-repo-submodule|Add upstream repository as \c{git} submodule|
+
+If the third-party project is available from a \c{git} repository, then the
+recommended approach is to use the \c{git} submodule mechanism to make the
+upstream source code available inside the package repository, customarily in a
+subdirectory called \c{upstream/}.
+
+\N|While \c{git} submodules receive much criticism, in our case we use them
+exactly as intended: to select and track specific (release) commits of an
+external project. As a result, there is nothing tricky about their use for our
+purpose and all the relevant commands will be provided and explained, in case
+you are not familiar with this \c{git} mechanism.|
+
+Given the upstream repository URL, to add it as a submodule, run the following
+command from the package repository root:
+
+\
+git submodule add https://github.com/.../.git upstream
+\
+
+\N|You should prefer \c{https://} over \c{git://} for the upstream repository
+URL since the \c{git://} protocol may not be accessible from all networks.
+Naturally, never use a URL that requires authentication, for example, SSH.|
+
+Besides the repository URL, you also need the commit of the upstream release
+which you will be packaging. It is common practice to tag releases so the
+upstream tags would be the first place to check. Failing that, you can always
+use the commit id.
+
+Assuming the upstream release tag you are interested in is called \c{vX.Y.Z},
+to update the \c{upstream} submodule to point to this release commit, run the
+following commands:
+
+\
+cd upstream
+git checkout vX.Y.Z
+cd ..
+\
+
+Then add and commit these changes:
+
+\
+git add .
+git status
+git commit -m \"Add upstream submodule\"
+\
+
+Now we have all the upstream source code for the release that we are
+interested in available in the \c{upstream/} subdirectory of our repository.
+
+The plan is to then use symbolic links (symlinks) to non-invasively overlay
+the \c{build2} files (\c{buildfile}, \c{manifest}, etc) with the upstream
+source code, if necessary adjusting upstream structure to split it into
+multiple packages and/or to better align with the source/output layouts
+recommended by \c{build2} (see \l{https://build2.org/article/symlinks.xhtml
+Using Symlinks in \c{build2} Projects} for background and rationale). But
+before we can start adding symlinks to the upstream source (and other files
+like \c{README}, \c{LICENSE}, etc), we want to generate the \c{buildfile}
+templates that match the upstream source code layout. This is the subject of
+the next section.
+
+\N|While on UNIX-like operating systems symlinks are in widespread use, on
+Windows it's a niche feature that unfortunately could be cumbersome to use
+(see \l{https://build2.org/article/symlinks.xhtml#windows Symlinks and
+Windows} for details).
+However, the flexibility afforded by symlinks when
+packaging third-party projects is unmatched by any other mechanism and we
+therefore use them despite a potentially sub-optimal experience on Windows.|
+
+
+\h#core-package|Create package and generate \c{buildfile} templates|
+
+This section covers the addition of the package to the repository we have
+prepared in the previous steps and the generation of the \c{buildfile}
+templates that match the upstream source code layout.
+
+
+\h2#core-package-name|Decide on the package name|
+
+While choosing the package repository name was pretty straightforward, things
+get less clear-cut when it comes to the package name.
+
+\N|If you need a refresher on the distinction between projects and packages,
+see \l{#intro-term Terminology}.|
+
+Picking a name for a package that provides an executable is still relatively
+straightforward: you should use the upstream name (which is usually the same
+as the upstream project name) unless there is a good reason to deviate. One
+recommended place to check before deciding on a name is the
+\l{https://packages.debian.org Debian package repository}. If their package
+name differs from upstream, then there is likely a good reason for that and
+it is worth trying to understand what it is.
+
+\N|Tip: when trying to find the corresponding Debian package, search for the
+executable file name in the package contents if you cannot find the package by
+its upstream name. Also consider searching in the \c{unstable} distribution in
+addition to \c{testing} for newer packages.|
+
+Picking a name for a package that provides a library is where things can get
+more complicated. While all the recommendations that have been listed for
+executables apply equally to libraries, there are additional considerations.
+
+In \c{build2} we recommend (but do not require) that new library projects use a
+name that starts with \c{lib} in order to easily distinguish them from
+executables and avoid any clashes, potentially in the future (see
+\l{intro#proj-struct Canonical Project Structure} for details). To illustrate
+the problem, consider the \c{zstd} project which provides a library and an
+executable. In the upstream repository both are part of the same codebase that
+doesn't try to separate them into packages so that, for example, the library
+could be used without downloading and building the executable. In \c{build2},
+however, we do need to split them into two separate packages and the two
+packages cannot both be called \c{zstd}. So we call them \c{zstd} and
+\c{libzstd}.
+
+\N|If you are familiar with the Debian package naming policy, you will
+undoubtedly recognize the approach. In Debian all the library packages (with
+very few exceptions) start with the \c{lib} prefix. So when searching for an
+upstream name in the \l{https://packages.debian.org Debian package repository}
+make sure to prefix it with \c{lib} (unless it already starts with this
+prefix, of course).|
+
+This brings up the question of what to do about third-party libraries: should
+we add the \c{lib} prefix to the package name if it's not already there?
+Unfortunately, there is no clear-cut answer and whichever decision you make,
+there will be drawbacks. Specifically, if you add the \c{lib} prefix, the main
+drawback is that the package name now deviates from the upstream name and if
+the project maintainer ever decides to add \c{build2} support to the upstream
+repository, there could be substantial friction. On the other hand, if you
+don't add the \c{lib} prefix, then you will always run the risk of a future
+clash with an executable name. And, as was illustrated with the \c{zstd}
+example, a late addition of an executable won't necessarily cause any issues
+to upstream.
+As a result, we don't have a hard requirement for the \c{lib}
+prefix unless there is already an executable that would cause the clash (this
+applies even if it's not being packaged yet or is provided by an unrelated
+project). If you don't have a strong preference, we recommend that you add the
+\c{lib} prefix (unless it is already there). In particular, this will free you
+from having to check for any potential clashes. See
+\l{https://github.com/build2/HOWTO/blob/master/entries/name-packages-in-project.md
+How should I name packages when packaging third-party projects?} for
+additional background and details.
+
+To build some intuition for choosing package names, let's consider several
+real examples. We start with executables:
+
+\
+  upstream  |   upstream    |   Debian   | build2 package|   build2
+project name|executable name|package name|repository name|package name
+------------+---------------+------------+---------------+------------
+byacc        byacc           byacc        byacc           byacc
+sqlite       sqlite3         sqlite3      sqlite          sqlite3
+vim          xxd             xxd          xxd             xxd
+OpenBSD      m4              -            openbsd-m4      openbsd-m4
+qtbase 5     moc             qtbase5-\    Qt5             Qt5Moc
+                             dev-tools
+qtbase 6     moc             qt6-base-\   Qt6             Qt6Moc
+                             dev-tools
+\
+
+The examples are arranged from the most straightforward naming to the
+least. The last two examples show that sometimes, after carefully considering
+upstream naming, you nevertheless have no choice but to ignore it and forge
+your own path.
+
+Next let's look at library examples. Notice that some use the same \c{build2}
+package repository name as the executables above. That means they are part of
+the same multi-package repository.
+
+\
+  upstream  |   upstream    |   Debian   | build2 package|   build2
+project name|library name   |package name|repository name|package name
+------------+---------------+------------+---------------+------------
+libevent     libevent        libevent     libevent        libevent
+brotli       brotli          libbrotli    brotli          libbrotli
+zlib         zlib            zlib         zlib            libz
+sqlite       libsqlite3      libsqlite3   sqlite          libsqlite3
+libsig\      libsigc++       libsigc++    libsig\         libsigc++
+cplusplus                                 cplusplus
+qtbase 5     QtCore          qtbase5-dev  Qt5             libQt5Core
+qtbase 6     QtCore          qt6-base-dev Qt6             libQt6Core
+\
+
+If an upstream project is just a single library, then the project name is
+normally the same as the library name (but there are exceptions, like
+\c{libsigcplusplus} in the above table). However, when looking at an upstream
+repository that contains multiple components (libraries and/or executables,
+like \c{qtbase} in the above example), it may not be immediately obvious what
+the upstream's library names are. In such cases, the corresponding Debian
+packages can really help clarify the situation. Failing that, look into the
+existing build system. In particular, if it generates a \c{pkg-config} file,
+then the name of this file is usually the upstream library name.
+
+\N|Looking at the names of the library binaries is less helpful because on
+UNIX-like systems they must start with the \c{lib} prefix. And on Windows the
+names of library binaries often embed extra information (static/import,
+debug/release, etc) and may not correspond directly to the library name.|
+
+And, speaking of multiple components, if you realize the upstream project
+provides multiple libraries and/or executables, then you need to decide
+whether to split them into separate \c{build2} packages and if so, how. Here,
+again, the corresponding Debian packages can be a good starting point. Note,
+however, that in this case we often deviate from their split, especially when
+it comes to libraries.
+For example, \c{libevent} shown in the above table
+provides several libraries (\c{libevent-core}, \c{libevent-extra}, etc) and in
+Debian it is actually split into several binary packages along these lines. In
+\c{build2}, however, there is a single package that provides all these
+libraries with everything except \c{libevent-core} being optional. An example
+which shows the decision made in a different direction would be the Boost
+libraries: in Debian all the header-only Boost libraries are bundled into a
+single package while in \c{build2} they are all separate packages.
+
+The overall criteria here can be stated as follows: if a small family of
+libraries provides complementary functionality (like \c{libevent}), then we put
+them all into a single package, usually making the additional functionality
+optional. However, if the libraries are independent (like Boost) or provide
+alternative rather than complementary functionality (for example,
+different backends in \c{imgui}), then we make them separate packages. Note
+that we never bundle an executable and a (public) library in a single package.
+
+Note also that while it's a good idea to decide on the package split and all
+the package names upfront to avoid surprises later, you don't have to actually
+provide all the packages right away. For example, if upstream provides a
+library and an executable (like \c{zstd}), you can start with the library and
+the executable package can be added later (potentially by someone else).
+
+Admittedly, the recommendations in this section are all a bit fuzzy and one can
+choose different names or different package splits that could all seem
+reasonable. If you are unsure how to split the upstream project or what names
+to use, \l{https://build2.org/community.xhtml#help get in touch} to discuss
+the alternatives. It can be quite painful to change these things after you
+have completed the remaining packaging steps.
+
+
+@@ Where do we overlay the source code?
+
+======================================================================
+
+
+
+
--
cgit v1.1