Update and expand Canonical Project Structure section in intro

author: Boris Kolpackov <boris@codesynthesis.com> 2018-08-30 15:18:17 +0200
committer: Boris Kolpackov <boris@codesynthesis.com> 2018-08-30 15:18:17 +0200
commit: 69da791221597621a4ce0a34d5a3e3cee05adc62 (patch)
tree: 974156e25097117a249efe1f237956f1d0cb9e32 /doc
parent: 57e6b4f4371842dac2ed829525b032845373dc57 (diff)
1 files changed, 155 insertions, 83 deletions
diff --git a/doc/intro.cli b/doc/intro.cli
index efaa0cb..d45904f 100644
--- a/doc/intro.cli
+++ b/doc/intro.cli
@@ -2023,20 +2023,18 @@ good idea to instead add a \l{bpkg#manifest-package-requires \c{requires}}
 entry as documentation for users of our project.|
 
 
-\h1#structure|Project Structure|
-
-\h#structure-canonical|Canonical Project Structure|
+\h1#proj-struct|Canonical Project Structure|
 
 The primary goal of establishing the \i{canonical project structure} is to
 create an ecosystem of packages that can coexist, are easy to comprehend by
 both humans and tools, scale to complex, real-world requirements, and, last
-but not least, are pleasant to develop.
+but not least, are pleasant to work with.
 
 The canonical structure is primarily meant for packages \- a single library or
 program (or, sometimes, a collection of related programs) with a specific and
 well-defined function. While it may be less suitable for more elaborate,
 multi-library/program \i{end-products} that are not meant to be packaged, most
-of the recommendations discussed below would still apply. Oftentimes, you
+of the recommendations discussed below would still make sense. Oftentimes, you
 would start with a canonical project and expand from there. Note also that
 while the discussion below focuses on C++, most of it applies equally to C
 projects.
@@ -2050,6 +2048,7 @@ projects are presented below.
 ├── build/
 ├── <name>/
 │   ├── <name>.cxx
+│   ├── <name>.test.cxx
 │   ├── testscript
 │   └── buildfile
 ├── buildfile
@@ -2062,6 +2061,7 @@ lib<name>/
 ├── lib<name>/
 │   ├── <name>.hxx
 │   ├── <name>.cxx
+│   ├── <name>.test.cxx
 │   ├── export.hxx
 │   ├── version.hxx.in
 │   └── buildfile
@@ -2081,7 +2081,7 @@ Below is a short summary of the key points:
 \li|\n\i{Header and source files (or module interface and implementation
 files) are next to each other (no \c{include/} and \c{src/} split).}|
 
-\li|\n\i{Header inclusions use \c{<>} and contain the project directory
+\li|\n\i{Headers are included with \c{<>} and contain the project directory
 prefix, for example, \c{<libhello/hello.hxx>}.}|
 
 \li|\n\i{Header and source file extensions are either \c{.hpp/.cpp} or
@@ -2097,7 +2097,15 @@ Let's start with naming projects: by convention, library names start with the
 avoided. The \c{bdep-new} command warns about both violations.
 
 The project's root directory should contain the root \c{buildfile} and package
-\c{manifest}.
+\c{manifest} file. Other recommended top-level subdirectory names are
+\c{examples/} (for libraries it is normally a subproject like \c{tests/}, see
+below), \c{doc/}, and \c{etc/} (sample configurations, scripts, third-party
+contributions, etc). See also \l{b#intro-proj-struct Project Structure} in the
+build system manual for details on the build-related files (\c{buildfile}) and
+subdirectories (\c{build/}).
+
+
+\h#proj-struct-src-dir|Source Directory|
 
 The project's source code is placed into a subdirectory of the root directory
 named the same as the project, for example, \c{hello/hello/} or
@@ -2110,12 +2118,12 @@ expect to find our project's source code. Finally, this layout prevents
 clutter in the project's root directory which usually contains various other
 files (like \c{README}, \c{LICENSE}) and directories (like \c{tests/}).
 
-\N|Another popular approach is to place (public) headers into the \c{include/}
-subdirectory and source files (as well as private headers) into \c{src/}. The
-advantage of this layout is the predictable location that contains only the
-project's public headers (that is, its API). This can make the project easier
-to navigate and understand while harder to misuse (for example, by including a
-private header).
+\N|Another popular approach is to place public headers into the \c{include/}
+subdirectory and source files as well as private headers into \c{src/}. The
+cited advantage of this layout is the predictable location that contains only
+the project's public headers (that is, its API). This can make the project
+easier to navigate and understand while harder to misuse, for example, by
+including a private header.
 
 However, this split layout is not without drawbacks:
 
@@ -2134,28 +2142,28 @@ different directories. Even if we can move things around post-generation,
 build systems may not support this arrangement (for example, \c{build2} does
 not currently support target groups with members in different directories).||
 
-Also, the stated advantage of this layout (separation of public headers) is
-not as clear cut as it may seem. The common assumption of the split layout is
-that only headers from \c{include/} are installed and, conversely, to use the
-headers in-place, all one has to do is add \c{-I} pointing to \c{include/}.
-On the other hand, it is common for public headers to include private, for
-example, to call an implementation-detail function in inline or template code
-(note that the same applies to private modules imported in public module
-interfaces). Which means such private, (or, probably now more accurately
-called implementation-detail) headers have to be placed in the \c{include/}
-directory as well, perhaps into a subdirectory (such \c{details/}) or with a
-file name suffix (sich as \c{-impl}) to signal to the user that they are still
-\"private\". Needless to say, keeping track of which private headers can still
-stay in \c{src/} and which have to be moved to \c{include/} (or vice versa) is
-an arduous, error-prone task. As a result, practically, the split layout often
-degrades into the \"all headers in \c{include/}\" arrangement which negates
-its main advantage.
-
-It's also not clear how the split layout will fit modularized projects. With
+Also, the stated advantage of this layout \- separation of public headers from
+private \- is not as clear cut as it may seem. The common assumption of the
+split layout is that only headers from \c{include/} are installed and,
+conversely, to use the headers in-place, all one has to do is add \c{-I}
+pointing to \c{include/}.  On the other hand, it is common for public headers
+to include private, for example, to call an implementation-detail function in
+inline or template code (note that the same applies to private modules
+imported in public module interfaces). This means such private (or, probably
+now more accurately called, implementation-detail) headers have to be placed in
+the \c{include/} directory as well, perhaps into a subdirectory (such as
+\c{details/}) or with a file name suffix (such as \c{-impl}) to signal to the
+user that they are still \"private\". Needless to say, keeping track of which
+private headers can still stay in \c{src/} and which have to be moved to
+\c{include/} (and vice versa) is an arduous, error-prone task. As a result,
+practically, the split layout quickly degrades into the \"all headers in
+\c{include/}\" arrangement which negates its main advantage.
+
+It is also not clear how the split layout will fit modularized projects. With
 modules, both the interface and implementation (including non-inline/template
 function definitions) can reside in the same file with a substantial number of
-developers finding this arrangement appealing. If a project consists of only
-such single-file modules, then \c{include/} and \c{src/} are effectively
+C++ developers finding this arrangement appealing. If a project consists of
+only such single-file modules, then \c{include/} and \c{src/} effectively
 become the same thing. In a sense, we already have this situation with
 header-only libraries except that in case of modules calling the directory
 \c{include/} would be an anachronism.
@@ -2164,19 +2172,17 @@ To summarize, the split directory arrangement offers little benefit over the
 single directory layout, has a number of real drawbacks, and does not fit
 modularized projects well. In practice, private headers are placed into
 \c{include/}, often either in a subdirectory or with a special file name
-suffix, a mechanis that is readily available to the single directory layout.|
+suffix, a mechanism that is readily available to the single directory layout.|
 
-All headers within a project should be included using the \c{<>} style and
-contain the project name as a directory prefix. And all headers means \i{all
-headers} \- public, private, or implementation detail, in executables and in
-libraries.
+All headers within a project should be included using the \c{<>} style
+inclusion and contain the project name as a directory prefix. And all headers
+means \i{all headers} \- public, private, or implementation detail, in
+executables and in libraries.
 
 As an example, let's say we've added \c{utility.hxx} to our \c{hello}
 executable project. This is how it should be included in \c{hello.cxx}:
 
 \
-// hello/hello.cxx
-
 // #include \"utility.hxx\"           // Wrong.
 // #include <utility.hxx>           // Wrong.
 // #include \"../hello/utility.hxx\"  // Wrong.
@@ -2193,11 +2199,11 @@ should look like this:
 
 \N|The problem with the \c{\"\"} style inclusion is that if the header is not found
 relative to the including file, most compilers will continue looking for it in
-the include search paths (\c{-I}). As a result, if the header is not present
-in the right place (for example, because it was mistakenly not listed as to be
-installed), chances are that a completely unrelated header with the same name
-will be found and included. Needless to say, debugging situations like these
-is unpleasant.
+the include search paths, the same as for \c{<>}. As a result, if the header
+is not present in the right place (for example, because it was mistakenly not
+listed as to be installed), chances are that a completely unrelated header
+with the same name will be found and included. Needless to say, debugging
+situations like these is unpleasant.
 
 Similarly, prefixing all inclusions with the project name makes sure that
 headers with common names (for example, \c{utility.hxx}) can coexist (for
@@ -2216,8 +2222,8 @@ So let's imagine the \c{\"\"} style inclusion does not exist and we will all
 have a much better time.|
 
 If you have to disregard every rule and recommendation in this section but
-one, for example, because you are working an existing library, then insist on
-this: \i{public header inclusions must use the library name as a directory
+one, for example, because you are working on an existing library, then insist
+on this: \b{public header inclusions must use the library name as a directory
 prefix}.
 
 The project's source subdirectory can have subdirectories of its own, for
@@ -2228,7 +2234,7 @@ example, into \c{/usr/include}), this subdirectory hierarchy is
 automatically recreated.
 
 If you would like to separate public API headers/modules from implementation
-details, then the convention is to place then into the \c{details/}
+details, then the convention is to place them into the \c{details/}
 subdirectory. For example:
 
 \
@@ -2242,14 +2248,49 @@ libhello/
 It is recommended that you still install the implementation details headers
 and modules for the reasons discussed above. If, however, you would like to
 disable their installation, you can add the following line to your source
-subdirectory's \c{buildfile}:
+subdirectory \c{buildfile}:
 
 \
 details/hxx{*}: install = false
 \
 
+\N|If you are creating a \i{family of libraries} with a common name prefix,
+then it makes sense to use a nested source directory layout with a common
+top-level directory. As an example, let's say we have the \c{libstud-path} and
+\c{libstud-url} libraries that belong to the same \c{libstud} family. Their
+source subdirectory layouts could look like this:
+
+\
+libstud-path/
+└── libstud/
+    └── path/
+        ├── path.hxx
+        ├── path-io.hxx
+        ├── ...
+        └── buildfile
+
+libstud-url/
+└── libstud/
+    └── url/
+        ├── url.hxx
+        ├── url-io.hxx
+        ├── ...
+        └── buildfile
+\
+
+With the header inclusion paths adjusted accordingly:
+
+\
+#include <libstud/path/path.hxx>
+#include <libstud/url/url.hxx>
+\
+
+|
+
+\h#proj-struct-src-name|Source Naming|
+
 When naming source files, only use ASCII alphabetic characters, digits, as
-well as \c{_} (underscore) and \c{-} (minus). Use \c{.} (dot/period) only for
+well as \c{_} (underscore) and \c{-} (minus). Use \c{.} (dot) only for
 extensions, that is, trailing parts of the name that \i{classify} your files.
 Examples of good names:
 
@@ -2287,7 +2328,7 @@ files with the same name (or, sometimes, name prefix) are assumed to be
 related and are collectively called a \i{module}. \N{This term is meant to
 correspond directly to a C++ module.}
 
-By default the \l{bdep-new(1)} command use the \c{.?xx} scheme. To use
+By default the \l{bdep-new(1)} command uses the \c{.?xx} naming scheme. To use
 \c{.?pp} instead, pass \c{-t\ c++,cpp}.
 
 \N|There are several reasons not to \"reuse\" the \c{.h} C header extension
@@ -2305,6 +2346,9 @@ for C++ files:
 The last two reasons are also why headers without extensions are probably not
 worth the trouble.|
 
+
+\h#proj-struct-src-content|Source Contents|
+
 Let's now move inside our source file: All macros defined by a project, such
 as include guards, version macro, etc., must all start with the project name
 (including the \c{lib} prefix for libraries), for example
@@ -2323,22 +2367,22 @@ namespace hello
 \
 
 Executable project may use a namespace (in which case it is natural to name it
-after the project) and its modules shouldn't be qualified with the project
-name (in order not to clash with similarly named modules from the
+after the project) and its (private) modules shouldn't be qualified with the
+project name (in order not to clash with similarly named modules from the
 corresponding library, if any).
 
 \N|Hopefully by now the insistence on the \c{lib} prefix should be easy to
 understand: oftentimes executables and libraries come in pairs, for example
 \c{hello} and \c{libhello}, with the reusable functionality being factored out
-from the executable and into the library. It is natural to want to use the
-same name \i{stem} (\c{hello} in our case) for both.
+from the executable into the library. It is natural to want to use the same
+name \i{stem} (\c{hello} in our case) for both.
 
 The above naming scheme (with the \c{lib} prefix present in some names but not
 the others) is carefully crafted to allow such library/executable pairs to
 coexist and be used together without too much friction. For example, both the
 library and executable can have a header called \c{utility.hxx} with the
 executable being able to include both and even get the \"merged\"
-functionality without any extra effort (since they use the same namespace):
+functionality without extra effort (since they use the same namespace):
 
 \
 // hello/hello.cxx
@@ -2354,11 +2398,19 @@ namespace hello
 
 |
 
-The source file that implements a module's unit tests should be placed next to
-that module's other files and called with the module's name plus the \c{.test}
+A canonical library project contains two special headers: \c{export.hxx} (or
+\c{export.hpp}) that defines the library's symbol exporting macro as well as
+\c{version.hxx} (or \c{version.hpp}) that defines the library's version macros
+(see \l{b#module-version \c{version} Module} for details).
+
+
+\h#proj-struct-tests|Tests|
+
+A source file that implements a module's unit tests should be placed next to
+that module's files and called with the module's name plus the \c{.test}
 second-level extension. If a module uses Testscript for unit testing, then the
-corresponding file should be called with the module's name plus the \c{.test}
-extension. For example:
+corresponding file should be called with the module's name plus the
+\c{.test.testscript} extension. For example:
 
 \
 libhello/
@@ -2366,7 +2418,7 @@ libhello/
     ├── hello.hxx
     ├── hello.cxx
     ├── hello.test.cxx
-    └── hello.test
+    └── hello.test.testscript
 \
 
 \N|All source files (that is, headers, modules, etc.) with the \c{.test}
@@ -2374,14 +2426,15 @@ second-level extension are assumed to belong to unit tests and are
 automatically excluded from the library/executable sources.|
 
 The canonical library project created by \c{bdep-new} includes the \c{tests/}
-subdirectory which contains the library's integration (as opposed to unit)
-tests. Or, in other words, these are the tests that excercise the library via
-its public interface, just like the real users of the library would. The
+subdirectory which contains the library's functional/integration (as opposed
+to unit) tests. Or, in other words, these are the tests that exercise the
+library via its public API, just like the real users of the library would. The
 \c{tests/} subdirectory is an unnamed subproject (in the build system terms)
 which allows us to build and run tests against an installed version of the
-library. \N{The \c{build2} CI implementation will automatically perform this
-test is a library contains the \c{tests/} subproject. See \c{bbot}
-\l{bbot#arch-worker Worker Logic} for details.}
+library (see \l{b#intro-operations-test Testing} for more information on the
+contents of this directory). \N{The \c{build2} CI implementation will
+automatically perform this test if a library contains the \c{tests/}
+subproject. See \c{bbot} \l{bbot#arch-worker Worker Logic} for details.}
 
 By default executable projects do not have the \c{tests/} subprojects instead
 placing integration tests next to the source code (the \c{testscript} file;
@@ -2389,23 +2442,43 @@ see \l{testscript The build2 Testscript Language} for details). However, if
 desired, executable projects can have the \c{tests/} subproject, the same as
 libraries.
 
-Other recommended top-level subdirectory names are \c{examples/} (for
-libraries it is normally a subproject like \c{tests/}), \c{doc/}, and \c{etc/}
-(sample configurations, scripts, third-party contributions, etc).
+\N|By default projects created by \c{bdep-new} include support for
+functional/integration testing but exclude support for unit testing. These
+defaults, however, can be overridden with \c{no-tests} and \c{unit-tests}
+options, respectively. For example:
+
+\
+$ bdep new -t lib,unit-tests -l c++ libhello
+\
+
+The rationale behind these defaults is that if a functionality can be tested
+through the public API, then we should generally prefer integration to unit
+testing. And in simple projects the entire functionality is often exposed
+through the public API. At the same time, support for unit testing adds extra
+complexity to the build infrastructure. Note also that it is fairly
+straightforward to add support for unit testing at a later stage. The relevant
+build logic is localized in the source subdirectory \c{buildfile} so you can
+simply generate a new project with unit tests enabled and copy over the
+relevant parts.|
+
+
+\h#proj-struct-build-out|Build Output|
 
-Note that there are no \c{bin/} or \c{obj/} subdirectories: output (object
-files, libraries, executables, etc.) go into a parallel directory structure
-(in case of an out-of-source build) or next to the sources (in case of an
-in-source build).
+There are no \c{bin/} or \c{obj/} subdirectories: build output (object files,
+libraries, executables, etc.) goes into a parallel directory structure (in case
+of an out of source build) or next to the sources (in case of an in source
+build). See \l{b#intro-dirs-scopes Directories and Scopes} for details on
+in/out of source builds.
 
 Projects managed with \l{bdep(1)} are always built out-of-source. However, by
 default, the source directory is configured as \i{forwarded} to one of the
 out-of-source builds. This has two effects: we can run the build system driver
 \l{b(1)} directly in the source directory and certain \"interesting\" output
 (such as executables, documentation, test results, etc) will be automatically
-\i{backlinked} to the source directory (see \l{b(1)} for details on forwarded
-configurations). The following listing illustrates this setup for our
-\c{hello} project (executables are marked with \c{*}):
+\i{backlinked} to the source directory (see \l{b#intro-operations-config
+Configuration} for details on forwarded configurations). The following listing
+illustrates this setup for our \c{hello} project (executables are marked with
+\c{*}):
 
 \
                  hello-gcc/
@@ -2416,17 +2489,16 @@ hello/    ~~~    └── hello/
     └── hello     -->    └── *hello
 \
 
-The result is an \i{as-if} in-source build with all the benfits (such as
+The result is an \i{as-if} in-source build with all the benefits (such as
 having both source and relevant output in the same directory) but without any
 of the drawbacks (such as the inability to have multiple builds or source
 directory cluttered with object files).
 
 \N|The often cited motivation for placing executables into \c{bin/} is that in
-many build system by also copying shared libraries there it is the only way to
-make things runnable in a reasonably cross-platform manner. The major drawback
-of this arrangement is the need for unique executable names which is
-especially constraining when writing tests where it is convenient to call the
-executable just \c{driver} or \c{test}.
+many build systems it is the only way to make things runnable in a reasonably
+cross-platform manner. The major drawback of this arrangement is the need for
+unique executable names which is especially constraining when writing tests
+where it is convenient to call the executable just \c{driver} or \c{test}.
 
 In \c{build2} there are no such restrictions and all executables can run
 \i{in-place}. This is achieved with \c{rpath} which is emulated with DLL