// file      : doc/bash-style.cli
// license   : MIT; see accompanying LICENSE file

"\title=Bash Style Guide"

// NOTES
//
// - Maximum <pre> line is 70 characters.
//

"\h1|Table of Contents|"
"\$TOC$"

"
\h1#intro|Introduction|

Bash works best for simple tasks. Needing arrays, arithmetic, and so on, is
usually a good indication that the task at hand may be too complex for Bash.

Most of the rules below can be broken if there is a good reason to do so.
Besides making things consistent, rules free you from having to stop and think
every time you encounter a particular situation. But if it feels that the
prescribed way is clearly wrong, then it probably makes sense to break it.
You just need to be clear on why you are doing it.

See also \l{https://google.github.io/styleguide/shell.xml Google's Bash Style
Guide} as well as \l{https://github.com/progrium/bashstyle Let's do Bash
right!}; we agree with quite a few (but not all) items in there. In particular,
the former provides a lot more rationale compared to this guide.

\h1#style|Style|

Don't use any extensions for your scripts. That is, call it just \c{foo}
rather than \c{foo.sh} or \c{foo.bash} (though we do use the \c{.bash}
extension for
\l{https://build2.org/build2/doc/build2-build-system-manual.xhtml#module-bash
Bash modules}). Use lower-case letters and dash to separate words, for example
\c{foo-bar}.

Indentation is two spaces (not tabs). Maximum line length is 79 characters
(excluding newline). Use blank lines between logical blocks to improve
readability.

Variable and function names should use lower-case letters with underscores
separating words.

For \c{if}, \c{while}, and \c{for} the corresponding \c{then} or \c{do}
is written on the same line after a semicolon, for example:

\
if [[ ... ]]; then
  ...
fi

for x in ...; do
  ...
done
\

Do use \c{elif} instead of nested \c{else} and \c{if} (and consider if
\c{case} can be used instead).

For \c{if}/\c{while} use \c{[[ ]]} since it results in cleaner code for
complex expressions, for example:

\
if [[ \"$foo\" && (\"$bar\" || \"$baz\") ]]; then
  ...
fi
\

\N|If for some reason you need the semantics of \c{[}, use \c{test} instead to
make it clear this is intentional.|

\h1#struct|Structure|

The overall structure of the script should be as follows:

\
#! /usr/bin/env bash

# <SUMMARY>
#
# [<FUNCTIONALITY-DESCRIPTION>]
#
# [<OPTIONS-DESCRIPTION>]
#
usage=\"usage: $0 <OPTIONS>\"

owd=\"$(pwd)\"
trap \"{ cd '$owd'; exit 1; }\" ERR
set -o errtrace   # Trap in functions and subshells.
set -o pipefail   # Fail if any pipeline command fails.
shopt -s lastpipe # Execute last pipeline command in the current shell.
shopt -s nullglob # Expand no-match globs to nothing rather than themselves.

function info () { echo \"$*\" 1>&2; }
function error () { info \"$*\"; exit 1; }

[<OPTIONS-ARGUMENTS-DEFAULTS>]

[<OPTIONS-ARGUMENTS-PARSING>]

[<OPTIONS-ARGUMENTS-VALIDATION>]

<FUNCTIONALITY>
\

\h#struct-summary|SUMMARY|

One or two sentences describing what the script does.

\h#struct-func-desc|FUNCTIONALITY-DESCRIPTION|

More detailed functionality description for more complex scripts.

\h#struct-opt-desc|OPTIONS-DESCRIPTION|

Description of command line options. For example:

\
# -q
#   Run quiet.
#
# -t <dir>
#   Specify the alternative toolchain installation directory.
\

\h#struct-opt|OPTIONS|

Command line options summary. For example:

\
usage=\"usage: $0 [-q] [-t <dir>] <file>\"
\

\h#struct-opt-arg-default|OPTIONS-ARGUMENTS-DEFAULTS|

Set defaults for the variables that will contain option/argument values. For
example:

\
quiet=\"n\"
tools=/usr/local
file=
\

\h#struct-opt-arg-parse|OPTIONS-ARGUMENTS-PARSING|

Parse the command line options/arguments. For example:

\
while [[ \"$#\" -gt 0 ]]; do
  case \"$1\" in
    -q)
      quiet=\"y\"
      shift
      ;;
    -t)
      shift
      tools=\"${1%/}\"
      shift
      ;;
    *)
      if [[ -n \"$file\" ]]; then
        error \"$usage\"
      fi

      file=\"$1\"
      shift
      ;;
  esac
done
\

If the value you are expecting from the command line is a directory path,
then always strip the trailing slash (as shown above for the \c{-t} option).

\h#struct-opt-arg-valid|OPTIONS-ARGUMENTS-VALIDATION|

Validate option/argument values. For example:

\
if [[ -z \"$file\" ]]; then
  error \"$usage\"
fi

if [[ ! -d \"$file\" ]]; then
  error \"'$file' does not exist or is not a directory\"
fi
\

\h#struct-func|FUNCTIONALITY|

Implement script logic. For diagnostics use the \c{info()} and \c{error()}
functions defined above (so that it goes to stderr, not stdout). If using
functions, then define them just before use.

\h1#quote|Quoting|

We quote every variable expansion, no exceptions. For example:

\
if [[ -n \"$foo\" ]]; then
  ...
fi
\

\N|While there is no word splitting in the \c{[[ ]]} context, we still quote
variable expansions for consistency.|

This also applies to command substitution (which we always write as
\c{$(foo arg)} rather than \c{`foo arg`}), for example:

\
list=\"$(cat foo)\"
\

Note that a command substitution creates a new quoting context, for example:

\
list=\"$(basename \"$1\")\"
\

We also quote values that are \i{strings} as opposed to options/file names,
paths, enum-like values, or integers. Prefer single quotes for \c{sed}
scripts, for example:

\
url=\"https://example.org\"  # String.
quiet=y                    # Enum-like.
verbosity=1                # Integer.
dir=/etc                   # Directory path.
out=/dev/null              # File path.
file=manifest              # File name.
option=--quiet             # Option name.
seds='s%^./%%'             # sed script.
\

Take care to quote globs that are not meant to be expanded, for example:

\
unset \"array[0]\"
\

And since quoting will inhibit globbing, you may end up with expansions along
these lines:

\
rm -f \"$dir/$name\".*
\
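
The following sketch shows the quoted-prefix, unquoted-glob pattern at work.
The scratch directory and the file names are made up for the illustration and
the availability of \c{mktemp -d} is assumed:

```shell
shopt -s nullglob   # As in the script preamble.

dir="$(mktemp -d)"  # Scratch directory for the demonstration.
name='my file'      # Hypothetical name containing a space.

touch "$dir/$name.txt" "$dir/$name.log" "$dir/other.txt"

matches=("$dir/$name".*)  # Quoted prefix, unquoted glob suffix.
count="${#matches[@]}"    # 2: 'my file.log' and 'my file.txt'.

rm -r "$dir"
```

Only the two files that start with the quoted prefix are matched; the space
in the name does not cause word splitting because it is inside the quotes.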

Note also that globbing is not performed in the \c{[[ ]]} context so this is
ok:

\
if [[ -v array[0] ]]; then
  ...
fi
\

\N|One exception to this quoting rule is arithmetic expansion (\c{$((\ ))}):
Bash treats it as if it was double-quoted and, as a result, any inner quoting
is treated literally. For example:

\
z=$(($x + $y))           # Ok.
z=$((\"$x\" + \"$y\"))       # Error.
z=$(($x + $(echo \"$y\"))) # Ok.
\

|


If you have multiple values (e.g., program arguments) that may contain spaces,
don't try to handle them with quoting and use arrays instead. Here is a
typical example of space-aware argument handling:

\
files=()

while [[ \"$#\" -gt 0 ]]; do
  case \"$1\" in

    ...

    *)
      files+=(\"$1\")
      shift
      ;;
  esac
done

rm -f \"${files[@]}\"
\

In the same vein, never write:

\
cmd $*
\

Instead always write:

\
cmd \"$@\"
\

Also understand the difference between \c{@} and \c{*} expansion:

\
files=('one' '2 two' 'three')
echo \"files: ${files[@]}\"  # $1='files: one', $2='2 two', $3='three'
echo \"files: ${files[*]}\"  # $1='files: one 2 two three'
\
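
One way to observe the difference is to count the resulting arguments. In
this sketch \c{count_args} is a hypothetical helper and the default \c{IFS}
is assumed:

```shell
# Hypothetical helper: print the number of arguments received.
#
function count_args () { echo "$#"; }

files=('one' '2 two' 'three')

n_at="$(count_args "files: ${files[@]}")"    # 3
n_star="$(count_args "files: ${files[*]}")"  # 1
n_bare="$(count_args ${files[@]})"           # 4: '2 two' was split.
```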


\h1#bool|Boolean|

For boolean values use empty for false and \c{true} for true. This way you
can have terse and natural looking conditions, for example:

\
first=true
while ...; do

  if [[ ! \"$first\" ]]; then
     ...
  fi

  if [[ \"$first\" ]]; then
     first=
  fi

done
\
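
As a concrete (and hypothetical) instance of the above pattern, the following
sketch uses such a flag to join values with a separator that is omitted before
the first element:

```shell
# Hypothetical helper: join the arguments with ', '.
#
function join_comma () # <value>...
{
  local first=true
  local r=
  local v

  for v in "$@"; do
    if [[ ! "$first" ]]; then
      r+=", "
    fi

    r+="$v"
    first=
  done

  echo "$r"
}

s="$(join_comma a 'b c' d)"  # a, b c, d
```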


\h1#subshell|Subshell|

Bash executes certain constructs in \i{subshells} and some of these constructs
may not be obvious:

\ul|

\li|Explicit subshell: \c{(...)}|

\li|Pipeline: \c{...|...}|

\li|Command substitution: \c{$(...)}|

\li|Process substitution: \c{<(...)}, \c{>(...)}|

\li|Background: \c{...&}, \c{coproc ...}|

|

Naturally, a subshell cannot modify any state in the parent shell, which
sometimes leads to counter-intuitive behavior, for example:

\
lines=()

... | while read l; do
  lines+=(\"$l\")
done
\

At the end of the loop, \c{lines} will remain empty since the loop body is
executed in a subshell. One way to resolve this is to use the process
substitution instead of the pipeline:

\
lines=()

while read l; do
  lines+=(\"$l\")
done < <(...)
\

This, however, results in unnatural, backwards-looking code (compared to the
pipeline). Instead, we can request the last command of the pipeline to be
executed in the parent shell with the \c{lastpipe} shell option, for example:

\
shopt -s lastpipe

lines=()

... | while read l; do
  lines+=(\"$l\")
done
\

\N|The \c{lastpipe} shell option is inherited by functions and subshells.|
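
A runnable variant of the above loop might look as follows. Here \c{printf}
merely stands in for the elided producer command. Keep in mind that, according
to the Bash manual, \c{lastpipe} only takes effect when job control is not
active, which is normally the case in scripts but not in an interactive shell:

```shell
shopt -s lastpipe

lines=()

# printf stands in for the elided producer command.
#
printf '%s\n' 'one' '2 two' 'three' | while IFS= read -r l; do
  lines+=("$l")
done

n="${#lines[@]}"  # 3 when run as a script (job control inactive).
```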


\h1#function|Functions|

If a function takes arguments, provide a brief usage after the function
header, for example:

\
function dist() # <pkg> <dir>
{
  ...
}
\

For non-trivial/non-obvious functions also provide a short description of
their functionality/purpose, for example:

\
# Prepare a distribution of the specified packages and place it
# into the specified directory.
#
function dist() # <pkg> <dir>
{
  ...
}
\

Inside functions use local variables, for example:

\
function dist()
{
  local x=\"foo\"
}
\

If the evaluation of the value may fail (e.g., it contains a command
substitution), then place the assignment on a separate line since \c{local}
will cause the error to be ignored. For example:

\
function dist()
{
  local b
  b=\"$(basename \"$2\")\"
}
\
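
The masking effect can be observed with a small sketch in which \c{false}
stands in for a failing command (the function names are hypothetical):

```shell
function masked ()
{
  local v="$(false)"  # The status is that of local itself: 0.
  echo "$?"
}

function checked ()
{
  local v
  v="$(false)"        # The status is that of the substitution: 1.
  echo "$?"
}

m="$(masked)"   # 0
c="$(checked)"  # 1
```

Only in the second form does the failure remain visible, and therefore only
there can it trigger the \c{ERR} trap.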

A function can return data in two primary ways: exit code and stdout.
Normally, exit code 0 means success and exit code 1 means failure though
additional codes can be used to distinguish between different kinds of
failures (for example, \"hard\" and \"soft\" failures), signify special
conditions, etc., see \l{#error-handing Error Handling} for details.

A function can also write to stdout with the result available to the caller in
the same way as from programs (command substitution, pipeline, etc). If a
function needs to return multiple values, then it can print them separated
with newlines with the caller using the \c{readarray} builtin to read them
into an indexed array, for example:

\
function func ()
{
  echo one
  echo two
  echo three
}

func | readarray -t r
\

\N|The use of the newline as a separator means that values may not contain
newlines. While \c{readarray} supports specifying a custom separator with the
\c{-d} option, including a \c{NUL} separator, this support is only available
since Bash 4.4.|

This technique can also be extended to return an associative array by first
returning the values as an indexed array and then converting them to
an associative array with \c{eval}, for example:

\
function func ()
{
  echo \"[a]=one\"
  echo \"[b]=two\"
  echo \"[c]=three\"
}

func | readarray -t ia

eval declare -A aa=(\"${ia[@]}\")
\

Note that if a key or a value contains whitespace, then it must be quoted.
The recommendation is to always quote both, for example:

\
function func ()
{
  echo \"['a']='one ONE'\"
  echo \"['b']='two'\"
  echo \"['c']='three'\"
}
\

Or, if returning a local array:

\
function func ()
{
  declare -A a=([a]='one ONE' [b]=two [c]=three)

  for k in \"${!a[@]}\"; do
    echo \"['$k']='${a[$k]}'\"
  done
}
\

For more information on returning data from functions, see
\l{https://mywiki.wooledge.org/BashFAQ/084 BashFAQ#084}.


\h1#error-handing|Error Handling|

Our scripts use the \c{ERR} trap to automatically terminate the script in case
any command fails. This semantics is also propagated to functions and subshells
by specifying the \c{errtrace} shell option and to all the commands of a
pipeline by specifying the \c{pipefail} option.

\N|Without \c{pipefail}, a non-zero exit of any command in the pipeline except
the last is ignored. The \c{pipefail} shell option is inherited by functions
and subshells.|

\N|While the \c{nounset} option may also seem like a good idea, it has
subtle, often latent pitfalls that make it more trouble than it's worth (see
\l{https://mywiki.wooledge.org/BashPitfalls#nounset \c{nounset} pitfalls}).|

The \c{pipefail} semantics is not without pitfalls which should be kept in
mind. In particular, if a command in a pipeline exits before reading the
preceding command's output in its entirety, the preceding command may be
terminated by \c{SIGPIPE} and thus exit with a non-zero status (see
\l{https://mywiki.wooledge.org/BashPitfalls#pipefail
\c{pipefail} pitfalls} for details).

\N|Note that in such a situation the preceding command may exit with zero
status not only because it gracefully handled \c{SIGPIPE} but also because all
of its output happened to fit into the pipe buffer.|

For example, these are the two common pipelines that may exhibit this issue:

\
prog | head -n 1
prog | grep -q foo
\

In these two cases, the simplest (though not the most efficient) way to work
around this issue is to reimplement \c{head} with \c{sed} and to get rid of
\c{-q} in \c{grep}, for example:

\
prog | sed -n -e '1p'
prog | grep foo >/dev/null
\
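
The difference between the two forms can be observed with a sketch like the
one below. It assumes that \c{seq} is available and that its output is much
larger than the pipe buffer so that the writer is still running when
\c{head} exits:

```shell
set -o pipefail

# The seq output is assumed to be much larger than the pipe buffer.
#
seq 1 100000 | head -n 1 >/dev/null
r_head="$?"  # Non-zero: seq is killed by SIGPIPE (typically 141).

seq 1 100000 | sed -n -e '1p' >/dev/null
r_sed="$?"   # 0: sed reads its input to the end.
```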

If you need to check the exit status of a command, use \c{if}, for example:

\
if grep -q \"foo\" /tmp/bar; then
  info \"found\"
fi

if ! grep -q \"foo\" /tmp/bar; then
  info \"not found\"
fi
\

Note that the \c{if}-condition can be combined with capturing the output, for
example:

\
if v=\"$(...)\"; then
  ...
fi
\

But keep in mind that in Bash a failure is often indistinguishable from a
true/false result. For example, in the above \c{grep} command, the result will
be the same whether there is no match or if the file does not exist.

Furthermore, in certain contexts, the above-mentioned \c{ERR} trap is ignored.
Quoting from the Bash manual:

\i{The \c{ERR} trap is not executed if the failed command is part of the
command list immediately following an \c{until} or \c{while} keyword, part of
the test following the \c{if} or \c{elif} reserved words, part of a command
executed in a \c{&&} or \c{||} list except the command following the final
\c{&&} or \c{||}, any command in a pipeline but the last, or if the command’s
return status is being inverted using \c{!}. These are the same conditions
obeyed by the \c{errexit} (\c{-e}) option.}

To illustrate the gravity of this point, consider the following example:

\
function cleanup()
{
  cd \"$1\"
  rm -f *
}

if ! cleanup /no/such/dir; then
  ...
fi
\

Here, the \c{cleanup()} function will continue executing (and may succeed)
even if the \c{cd} command has failed.
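
The same effect can be reproduced harmlessly with the following sketch, in
which \c{false} stands in for the failing \c{cd} and a simplified trap is
used instead of the one from the script preamble (the \c{step} function and
the variable names are hypothetical):

```shell
set -o errtrace
trap 'exit 1' ERR  # Simplified version of the preamble trap.

reached=

function step ()
{
  false         # Fails, but the ERR trap is not run here.
  reached=true  # Still executed.
}

if step; then
  result=success
else
  result=failure
fi

trap - ERR  # Remove the trap again.
```

After this code runs, \c{reached} is \c{true} and \c{result} is
\c{success}: the failure inside the function has gone unnoticed.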

Note, however, that notwithstanding the above statement from the Bash manual,
the \c{ERR} trap is executed inside all the subshell commands of a pipeline
provided the \c{errtrace} option is specified. As a result, the above code can
be made to work by temporarily disabling \c{pipefail} and reimplementing it as
a pipeline:

\
set +o pipefail
cleanup /no/such/dir | cat
r=\"${PIPESTATUS[0]}\"
set -o pipefail

if [[ \"$r\" -ne 0 ]]; then
  ...
fi
\

\N|Here, if \c{cleanup}'s \c{cd} fails, the \c{ERR} trap will be executed in
the subshell, causing it to exit with an error status, which the parent shell
then makes available in \c{PIPESTATUS}.|
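
A self-contained version of this trick might look as follows. It uses a
simplified trap, replaces the destructive \c{rm} with a harmless \c{echo},
and assumes that \c{/no/such/dir} does not exist:

```shell
set -o errtrace
trap 'exit 1' ERR  # Simplified version of the preamble trap.

function cleanup () # <dir>
{
  cd "$1"
  echo "cleaning $1"  # Not reached if cd fails.
}

set +o pipefail
cleanup /no/such/dir 2>/dev/null | cat
r="${PIPESTATUS[0]}"  # Non-zero: the subshell exited from the trap.
set -o pipefail

trap - ERR  # Remove the trap again.
```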

The recommendation is then to avoid calling functions in contexts where the
\c{ERR} trap is ignored, resorting to the above pipe trick where that's not
possible, and to be mindful of the potential ambiguity between the true/false
result and failure for other commands. The use of the \c{&&} and \c{||}
command expressions is best left to the interactive shell.

\N|The pipe trick cannot be used if the function needs to modify the global
state. Such a function, however, might as well return the exit status also as
part of the global state. The pipe trick can also be used to ignore the exit
status of a command.|

The pipe trick can also be used to distinguish between different exit codes,
for example:

\
function func()
{
  bar  # If this command fails, the function returns 1.

  if ... ; then
    return 2
  fi
}

set +o pipefail
func | cat
r=\"${PIPESTATUS[0]}\"
set -o pipefail

case \"$r\" in
  0)
    ;;
  1)
    exit 1
    ;;
  2)
    ...
    ;;
esac
\

\N|In such functions it makes sense to keep exit code 1 to mean failure so
that the inherited \c{ERR} trap can be re-used.|

This technique can be further extended to implement functions that both
return multiple exit codes and produce output, for example:

\
function func()
{
  bar  # If this command fails, the function returns 1.

  if ... ; then
    return 2
  fi

  echo result
}

set +o pipefail
func | readarray -t ro
r=\"${PIPESTATUS[0]}\"
set -o pipefail

case \"$r\" in
  0)
    echo \"${ro[0]}\"
    ;;
  1)
    exit 1
    ;;
  2)
    ...
    ;;
esac
\

\N|We use \c{readarray} instead of \c{read} since the latter fails if the left
hand side of the pipeline does not produce anything.|
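
The difference between the two builtins on empty input can be seen in this
small sketch, where \c{/dev/null} stands in for a left hand side that
produces no output:

```shell
read -r line </dev/null
r_read="$?"       # Non-zero: read hit end-of-file without input.

readarray -t ro </dev/null
r_readarray="$?"  # 0
n="${#ro[@]}"     # 0: the array is simply empty.
```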

"