
2025 Bash tips

Been doing a lot of Bash coding in the past year developing Zesk Build (“Pipeline, build, and operations tools useful for any project”), an open source project I started after I realized I kept reusing the same code base to build and maintain all of my projects.

bash was a logical choice, largely because it is universal, stable, secure, and installed by default on (nearly) all Linux and variant operating systems.

Mac OS X has bash 3 installed by default and recently moved to zsh as its default shell (as an operating system default…), largely, I assume, because newer versions of GNU bash carry licensing that Apple does not like – just a guess. Apple’s version of bash is GNU bash, version 3.2.57 as of February 2025. bash also adheres to a standard defined in IEEE 1003.2 (the POSIX shell specification), which is behind a paywall.

As well, it’s where I spend most of my time in addition to using the great iTerm2 terminal project.

Quickly: Zesk Build gives you tools for your pipelines and production systems and is a framework as well as a library of functions written in bash. It is near its 1.0 version and has self-documenting functions, management functions for nearly every aspect of your system, and tools to integrate with many of the common tools used to build web applications.

In the process of building it, I have found many tips for working with bash, as well as things to watch out for.

Beware read

read returns an error when it hits end of file, yet it can still have read a value when the last entry sits right at the end of file (with no trailing newline). This pattern in your code:

while read -r lineContent; do
  [ -n "$lineContent" ] || continue
  # handle line content
  ...
done < "$fileToLoad"

will skip the last line when the file does not end with a newline: read returns non-zero at end of file, so the loop exits even though $lineContent holds a valid value for that final line.

The revised pattern is:

local done=false
while ! $done; do
  read -r lineContent || done=true
  [ -n "$lineContent" ] || continue
  # handle line content
  ...
done < "$fileToLoad"

Compact conditions with || and true-ish statements

Most statements which can fail appear in the form:

doSomething || return $?

For those unfamiliar, it means: run doSomething and, if it fails with a non-zero exit code, return that exit code ($? expands to the exit code of the most recent statement).

You can chain these commands to handle primitive ternary functions in bash like:

[ "$value" -gt 0 ] && printf "%s" "Some" || printf "%s" "Zero"

The key part of this is to avoid having any statement evaluate to false without exiting afterwards. Why?

So, this is bad:

$verbose && printf "%s\n" "Here's my verbose message to you"

And this is good:

! $verbose || printf "%s\n" "Here's my verbose message to you"

The first one will fail in scripts using set -e (most visibly when it is the last statement of a function, where it sets the function’s exit status) and the second one will not.
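
To make the failure mode concrete, here is a minimal demonstration; the logMaybe function name is illustrative:

#!/usr/bin/env bash
set -e
verbose=false

logMaybe() {
  # When verbose is false this && list fails, and since it is the last
  # command in the function, the function itself returns non-zero
  $verbose && printf "%s\n" "Here's my verbose message to you"
}

logMaybe # set -e sees the non-zero return and aborts the script here
printf "%s\n" "never reached"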

This advice largely applies to those who are writing libraries for other people to use, but I would argue that you should do this all of the time.

set -e – if set by a calling script, for example – fails on any error, so this is a practice which protects code running under it.

As a result, many tests evaluate for the negative of a result to test whether failure occurs:

[ -n "$codeName" ] || __throwArgument "$usage" "codeName is blank" || return $?

The meaning:

  • If codeName is not empty, success.
  • If not, return the argument error code and output “codeName is blank” to stderr.
  • __throwArgument always returns 1, so return 1 is called immediately afterwards
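
A throw-style helper can be tiny. Here is a hypothetical sketch, not Zesk Build’s actual implementation:

__throwArgument() {
  local usage="$1" # the real helper presumably routes through this usage function
  shift
  printf "%s\n" "$*" 1>&2 # report the problem on stderr
  return 1                # non-zero, so the trailing || return $? in the caller fires
}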

Be verbose and clean

The default of many (most?) shell commands is to output nothing and exit with an exit code. As a corollary, I have found that when things fail it is usually best to output an error to stderr before returning non-zero.

There are a few ways to do this but – in general – it means checking the return code of each and every command that matters, with slim exceptions made for decoration. Anything that fails should then be cleaned up before returning.
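
As a sketch of the pattern (the copyAssets function and the ./assets path are illustrative, not from the library):

copyAssets() {
  local temp
  temp=$(mktemp -d) || return $?
  if ! cp -R ./assets "$temp/"; then
    printf "%s\n" "copyAssets: copying ./assets failed" 1>&2
    rm -rf "$temp" # clean up what we created before failing
    return 1
  fi
  printf "%s\n" "$temp" # success: hand the populated directory to the caller
}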

Assume (and practice) set -eou pipefail

Put this at the top of any script which is meant to be invoked directly (as opposed to sourced).

Why?

For those who are unfamiliar:

set -e – will exit the script immediately if any command fails. Good practice.

set -o pipefail – will exit the script immediately if any command fails midway through a pipeline (otherwise you’d be surprised how this behaves)

set -u – will cause an error (and exit, wink wink) when any undefined variable is referenced

The above combines all of them into a single statement.
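
A quick illustration of why pipefail matters:

# Without pipefail, a pipeline's exit status is that of the LAST command,
# so the failure of cat below would be masked by the successful wc
set -o pipefail
if ! cat /nonexistent/file | wc -l; then
  printf "%s\n" "pipeline failed" 1>&2
fi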

Now that you know what it does, how will this blow up your bash scripts?

  1. You will be forced to explicitly define exported globals with default values if they are not set
  2. Any arrays which are zero length must be handled with a special + parameter expansion to produce an empty list in for loops (see the sketch after this list)
  3. Local variables must be initialized to a default value
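
Here is a minimal sketch showing all three; the APP_ENV and items names are illustrative:

#!/usr/bin/env bash
set -eou pipefail

# 1. Exported globals: supply a default when unset
: "${APP_ENV:=development}"

# 2. Empty arrays: a plain "${items[@]}" errors under set -u on older bash;
#    the ${items[@]+...} form safely expands to nothing instead
items=()
for item in ${items[@]+"${items[@]}"}; do
  printf "%s\n" "$item"
done

# 3. Local variables: initialize before first use
example() {
  local result=""
  result="$APP_ENV"
  printf "%s\n" "$result"
}
example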

All of the above are very good things, as they require us to initialize the variables we use and ensure they have values. All good development.

Beware cd

Due to the fact that cd essentially modifies global state (it changes PWD, which is simply the current working directory, I believe) (I know, right?) and affects future calls, avoid doing cd to ensure you are in the right directory, unless you control all aspects of all steps of your internal processes or in special cases.

If you need to change directories, an easier route is using pushd and popd with a muzzle wrapper (which hides stdout when not needed). A few tools like terraform and opentofu for some reason really prefer to be in the current directory, as do git and other project-related tools.

A pushd recipe:

local undo=()
# Enter $path quietly, reporting any failure via the usage handler
__catchEnvironment "$usage" muzzle pushd "$path" || return $?
undo+=(-- muzzle popd)
# ... do the work in $path; on failure, unwind the directory change
__catchEnvironment "$usage" ... || _undo $? "${undo[@]}" || return $?
__catchEnvironment "$usage" muzzle popd || return $?

mktemp creates files

Really, it does – for a short second there I had thought that it solely gave you a file name, but in fact it creates the file on disk and returns the file name. Similarly for a directory with mktemp -d.

While most mktemp files are cleaned up eventually (as they are typically located in /tmp) – on systems which run for months (or years!) without reboot this volume may never be cleaned up – so it’s best to clean up any temporary files prior to exiting any function. I typically do this by keeping a list of files to clean up in an array.

local clean=() # paths to remove on failure
foo=$(mktemp) || return $?
clean+=("$foo")
someNextThing || _clean $? "${clean[@]}" || return $?

Use sugar (Syntactic)

I am not sure which professor at Dartmouth drilled the concept of syntactic sugar into me, but it is a lesson and a term which I still remember to this day – and it makes a world of difference when writing libraries.

Syntactic sugar is “dressing code” which makes other code look cleaner or make more sense.

When I used to write a lot of C (no, not C++) it was essential, usually in the form of macros handled by the preprocessor. You could build entire worlds in the preprocessor prior to compiling your code. In bash, you can write utility functions like _undo and _clean, which are used in the bash CI toolkit to … undo steps and delete files prior to exiting a function.
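
To give a flavor of what these helpers might look like, here are hypothetical sketches; Zesk Build’s real versions likely differ:

# Sketch of _clean: remove the listed paths, then return the given code
_clean() {
  local exitCode="$1"
  shift
  [ $# -eq 0 ] || rm -rf "$@"
  return "$exitCode"
}

# Sketch of _undo: run each ---separated command in order, ignoring
# individual failures, then return the given code
_undo() {
  local exitCode="$1" command=()
  shift
  while [ $# -gt 0 ]; do
    if [ "$1" = "--" ]; then
      [ ${#command[@]} -eq 0 ] || "${command[@]}" || :
      command=()
    else
      command+=("$1")
    fi
    shift
  done
  [ ${#command[@]} -eq 0 ] || "${command[@]}" || :
  return "$exitCode"
}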

An example here is a script which is intended to upgrade our live application.

Unsugared version

# Script to upgrade our application
...
# Create new app first
if ! newApp=$(mktemp -d); then
  return 1
fi
if ! copyAppTo "$newApp"; then
  rm -rf "$newApp"
  return 1
fi
# Backup our app
local backup="$oldApp.BACKUP.$$"
# ensure target is empty
# Now tell LB to stop serving from here
# move old app to back up
if ! rm -rf "$backup" || 
  ! notifyLoadBalancerWeAreDown; then
  rm -rf "$newApp" || :
  return 1
fi
# LB thinks we are down
if ! mv "$oldApp" "$backup"; then
  rm -rf "$newApp" || :
  notifyLoadBalancerWeAreBackUp || :
  return 1
fi
# Unstable state
if ! mv "$newApp" "$oldApp"; then
    mv -f "$backup" "$oldApp" || :
    rm -rf "$newApp" || :
    notifyLoadBalancerWeAreBackUp || :
    return 1
fi
# Can this fail?
rm -rf "$backup"

notifyLoadBalancerWeAreBackUp
# Stable state

Sugared version

Note we use the undo and clean arrays as stacks, essentially, adding steps as we create them; the unrolling is handled by _undo and _clean.

# Script to upgrade our application
...
# Create new app first
newApp=$(mktemp -d) || return $?

local clean=("$newApp") undo=()
copyAppTo "$newApp" || _clean "$?" "${clean[@]}" || return $?

# ensure target is empty
backup="$oldApp.BACKUP.$$"
rm -rf "$backup" || _clean "$?" "${clean[@]}"|| return $?

# Now tell LB to stop serving from here
notifyLoadBalancerWeAreDown || _clean "$?" "${clean[@]}" || return $?
undo+=(-- notifyLoadBalancerWeAreBackUp)

# Handle move
mv "$oldApp" "$backup" && undo+=(-- mv -f "$backup" "$oldApp") && mv "$newApp" "$oldApp" || _undo $? "${undo[@]}" -- rm -rf "${clean[@]}" || return $?

# Stable state

Looks a lot cleaner, eh? The formatting, which puts all of the failure handling for a step on one line, is clean and helps you see how things work.

Lint your scripts

shellcheck is essential, as is bash -n (a syntax check) for any script. bashSanitize does this for you and more.

I have added a few additional checks which are helpful for any project which contains bash scripts:

  1. Scan for set -x or set "-x (yes, that is one quote) – avoid checking in debugging code unless you are sure you want to. Having done this more than once while debugging, it is worthwhile to avoid having it happen.
  2. Scan for any code which does not capture an error and return it properly (e.g. look for lines without || return $? at the end). Using this library, calls which contain __throw or __catch must have these, so the checks are done automatically. A sketch of such a scan follows this list.
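
A rough version of such a scan could look like this; the ./bin path and the exact patterns are illustrative, and Zesk Build’s actual checks are more thorough:

# 1. Flag debugging flags left in committed scripts
if grep -rnE --include='*.sh' 'set -x|set "-x' ./bin; then
  printf "%s\n" "Debugging set -x found; remove before committing" 1>&2
  exit 1
fi
# 2. Flag __throw/__catch calls that do not propagate their error
if grep -rnE --include='*.sh' '__throw|__catch' ./bin | grep -v '|| return \$?'; then
  printf "%s\n" "Unguarded __throw/__catch call found" 1>&2
  exit 1
fi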

If you want to have an easy pre-commit hook you can run gitInstallHooks in your project to install the default git pre-commit and post-commit hooks which handle bash linting for free.

Handle all errors (except)

Handle all errors, except anything which is

  • cosmetic (statusMessage for example, or user messages)
  • will be captured as an error later anyway, so checking now wastes time

In general, any failure usually means something is terribly wrong, so errors should be the exception in any process which is more than a single step.

An example: if, after I deploy my application, I can not delete my temporary files (the deletion fails!), should I consider this a failure?

I would argue that the reasons why this would happen are:

  • Bad permissions on these temp files (owned by someone else?)
  • Disk is full
  • Disk error

Long story short – which of the above is not absolutely terrible?

None. So handle all errors.

Stay tuned …

No language is without its issues … bash is not alone in having some minor gotchas.

I have found that the quality and speed of building and testing code have improved rapidly with these tips. I will continue to post interesting notes about bash and its workings as I find out more.
