Performance & efficiency
Before we begin, I would like to clarify that the power of shell scripting stems from the Unix-native utilities. In this post, I will focus on what I believe to be the most crucial aspect of any programming language: efficiency.
One should do as little as possible in the shell script itself and use it mainly to connect the existing logic in the rich set of utilities available on a UNIX system.
It is worth noting that even ChatGPT, while powerful in its own right, is not
specifically trained to write efficient code, due to the limitations of its
training data. Generated scripts therefore risk being suboptimal.
To illustrate why efficiency is critical, let us consider a straightforward shell function that counts the number of lines in a file:
- Inefficient Approach:
count_lines_in_file() {
    lines=$(cat "$1" | wc -l)
    echo "Number of lines: $lines"
}
In this approach, cat reads the entire file and pipes it to wc, which counts the lines. This spawns an extra process and copies all the data through a pipe for no benefit, so it can be improved.
- Efficient Approach:
count_lines_in_file() {
    # Redirect the file into wc so it prints only the count, not the file name
    lines=$(wc -l < "$1")
    echo "Number of lines: $lines"
}
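As a quick check of what wc prints in each form, here is a minimal sketch. The sample file is created with mktemp purely for the demonstration and is not part of the original example.

```shell
# Hypothetical three-line sample file, created only for this demonstration.
sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"

# With a file argument, wc prints the count followed by the file name.
with_name=$(wc -l "$sample")

# Reading from standard input, wc prints only the count.
count_only=$(wc -l < "$sample")

printf 'with file argument: %s\n' "$with_name"
printf 'with redirection:   %s\n' "$count_only"

rm -f "$sample"
```

The redirected form is the one to use when the result goes into a variable, because there is no file name to strip afterwards.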
Seems very simple, right? Yet this is the core idea of writing efficient scripts. The same applies to long pipelines such as:
cat FILE.txt | grep DO_Something | grep SomethingElse | sed 's/SomethingElse/Replacement/'
You can achieve the same result more efficiently with a single sed command, since sed can both filter lines and perform the substitution. By avoiding unnecessary tool usage and combining commands efficiently, we simplify the script and improve its performance.
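One way to collapse that pipeline into a single process is sketched below. The sample file and its contents are made up for the demonstration; the three sed expressions drop lines that lack either pattern and then substitute on what remains.

```shell
# Hypothetical sample data, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' \
    'DO_Something then SomethingElse' \
    'DO_Something alone' \
    'SomethingElse alone' > "$sample"

# Four processes: cat, grep, grep, sed.
piped=$(cat "$sample" | grep DO_Something | grep SomethingElse | sed 's/SomethingElse/Replacement/')

# One process: delete lines missing either pattern, then substitute.
single=$(sed -e '/DO_Something/!d' \
             -e '/SomethingElse/!d' \
             -e 's/SomethingElse/Replacement/' "$sample")

printf 'pipeline: %s\n' "$piped"
printf 'one sed:  %s\n' "$single"

rm -f "$sample"
```

Both variables end up holding the same single line, but the second form starts one process instead of four.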
cut is usually much faster than awk for simple field extraction, so if you don't really need awk, don't use it!
Stop using cat if you don't need it!
# Bad practice
cat file.txt | cut -d' ' -f1
cat file.txt | grep "Search For Something"
# Good practice
cut -d' ' -f1 file.txt
grep "Search For Something" file.txt
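- The same is true for other utilities (grep, find, sed, etc.): pass the file as an argument, or use input redirection for tools such as tr that only read standard input.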
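tr is a slightly special case because it does not accept a file name at all, so the cat-free form uses input redirection. A minimal sketch, with a throwaway sample file created only for the demonstration:

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf 'hello world\n' > "$sample"

# tr reads only standard input, so redirect the file instead of piping cat.
upper=$(tr '[:lower:]' '[:upper:]' < "$sample")
printf '%s\n' "$upper"

rm -f "$sample"
```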
Use Streams
- Using streams instead of writing to files is more efficient and helps avoid unnecessary disk I/O. When you write to a file, the data has to be written to disk, which can slow down your script if you are writing a lot of data.
- Use variables, arrays, etc. instead of storing data in a file.
command1 | command2
This sends the output of “command1” to “command2” without having to write it to a file first. This can be especially useful when dealing with large amounts of data or when working with sensitive information that you don’t want to save to disk.
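As a small illustration of both points, the sketch below streams data through two utilities and captures the result in a variable, with nothing written to disk. The word list is invented for the example.

```shell
# Stream a hypothetical word list through sort and wc; no file is created.
unique_count=$(printf '%s\n' pear apple pear fig apple | sort -u | wc -l)

# $(( )) normalises the number in case wc pads its output with spaces.
printf 'unique words: %s\n' "$((unique_count))"
```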
- When you use a temporary file, make sure you clean up!
some_function() {
    # tmpfile is assumed to have been created earlier, for example with mktemp
    cleanup() {
        # remove the temporary file when something goes wrong (or when the script finishes)
        [ -f "$tmpfile" ] && rm -f "$tmpfile"
    }
    trap cleanup INT QUIT TERM EXIT
    # Doing something
}
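For a complete, runnable variant of this idea, here is a sketch under a few assumptions: the function name with_tmpfile is hypothetical, mktemp is available, and the function body is a subshell so that the EXIT trap fires as soon as the function finishes rather than at the end of the whole script.

```shell
# Hypothetical helper: the ( ... ) body runs in a subshell, so the EXIT
# trap below is triggered when the function returns.
with_tmpfile() (
    tmpfile=$(mktemp) || exit 1
    trap 'rm -f "$tmpfile"' EXIT

    printf 'scratch data\n' > "$tmpfile"
    # Report the path so the caller can verify that it was removed.
    printf '%s\n' "$tmpfile"
)

leftover=$(with_tmpfile)
if [ -e "$leftover" ]; then
    echo "temporary file still exists"
else
    echo "temporary file was cleaned up"
fi
```

Scoping the trap to a subshell is a design choice: it keeps the cleanup local to the function and avoids overwriting a trap the calling script may already have set.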
Stop using sed for simple stuff
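Use ${a// /_} (supported by bash, ksh and zsh, but not by plain POSIX sh) to replace spaces in a variable's value with underscores instead of piping the value through sed:
# Bad practice
echo "$a" | sed 's/ /_/g'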
Best Practices for File Naming
Use “./*.pdf” instead of “*.pdf”
To improve security, prefix glob patterns with ./ when passing PDF files to a
command. A bare *.pdf expands to the names of the matching files in the current
directory, and any name that begins with a dash (for example -rf.pdf) is then
parsed by the command as options rather than as a file name.
Using ./*.pdf matches exactly the same files (neither pattern descends into
subdirectories), but every expanded name starts with ./ and can never be
mistaken for an option. This is an important security measure to prevent
accidental or malicious actions triggered by crafted file names.
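The difference is easy to reproduce. The sketch below works in a scratch directory made with mktemp -d and uses a made-up file name, -v.pdf, chosen only because it starts with a dash.

```shell
# Scratch directory holding one hypothetical file whose name starts with a dash.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/-v.pdf"

# Bare glob: expands to `cat -v.pdf`, which cat tries to parse as options.
unsafe_status=0
(cd "$workdir" && cat *.pdf) >/dev/null 2>&1 || unsafe_status=$?

# Prefixed glob: expands to `cat ./-v.pdf`, which is unambiguously a file.
safe_output=$(cd "$workdir" && cat ./*.pdf)

printf 'bare glob exit status: %s\n' "$unsafe_status"
printf 'prefixed glob output:  %s\n' "$safe_output"

rm -rf "$workdir"
```

With a harmless command like cat the bare glob merely fails; with a destructive command, an attacker-chosen file name could change what the command does.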
Why use sh over bash?
While bash is a more powerful shell language than sh, it also has more
complexity and features that can make scripts more difficult to read and
maintain.
Here are some reasons why you might choose sh over bash:
Portability:
sh is more widely available on different Unix-like systems than bash, which may not be installed by default on some systems. This means that scripts written in sh are more likely to work on different systems without modifications.
Efficiency:
sh is a simpler and more lightweight language than bash, which can make scripts run faster and use fewer system resources.
Simplicity:
sh has a simpler syntax and fewer features than bash, which can make scripts easier to read and maintain.
Use set -e to exit on errors
Add set -e at the top of your script to exit immediately if any command returns
a non-zero status code. This can help catch errors early and prevent your
script from continuing in an invalid state.
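To see the effect without aborting the current shell, the sketch below runs a tiny script in a child sh process and inspects its output and exit status. The script text is invented for the demonstration.

```shell
status=0
output=$(sh -c '
    set -e
    echo "step 1"
    false            # returns non-zero, so the script stops here
    echo "step 2"    # never reached
') || status=$?

printf 'output: %s\n' "$output"
printf 'exit status: %s\n' "$status"
```

Only "step 1" is printed, and the child exits with the status of the failing command.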
Use $(command) instead of backticks
Note: backticks look like this: `someCommand`
Use $(command) instead of backticks to execute commands and capture their
output. Backticks are harder to read, and nesting them requires escaping the
inner backticks, which is an easy source of syntax errors.
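Nesting shows the difference most clearly. In this sketch the path /usr/local/bin/tool is only an example string; dirname and basename operate on the text and do not need the file to exist.

```shell
# $(...) nests naturally, and the inner quotes need no escaping.
nested_new=$(basename "$(dirname /usr/local/bin/tool)")

# The backtick form needs the inner backticks escaped.
nested_old=`basename \`dirname /usr/local/bin/tool\``

printf 'with $():      %s\n' "$nested_new"
printf 'with backticks: %s\n' "$nested_old"
```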
Debugging tricks
set -x # activate debugging from here
# Some Logic
set +x # stop debugging from here
Use printf instead of echo
Use printf instead of echo for more consistent and portable output formatting.
printf also supports more advanced formatting options.
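Two short cases illustrate the point, as a sketch with made-up values: a string that echo might treat as an option, and a padded, zero-filled table row.

```shell
# A value that some echo implementations would interpret as an option.
value='-n'
plain=$(printf '%s\n' "$value")

# Left-align a label in 8 columns and zero-pad a number to 5 digits.
row=$(printf '%-8s|%05d' id 42)

printf 'value: %s\n' "$plain"
printf 'row:   %s\n' "$row"
```

Because the format string is separate from the data, printf prints the value verbatim no matter what it contains.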
Styling and readability
# Use this to conditionally execute a command based on the value of a variable
[ "$var" ] && command1 # If var is empty, command1 will not execute
# Instead of this, which can lead to unexpected behavior if var contains whitespace or special characters
[ ! -z $var ] && something
# Use this to check if the value of a variable is equal to a specific string
[ "$var" = "find" ] && echo found
# Instead of this, which is longer and less readable
if [ "$var" = 'find' ]; then
echo found
fi
# Use this to set a default value for a variable that is unset or empty
: "${var:=value}"
# Instead of this, which is longer
[ "$var" ] || var="value"
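The following sketch exercises two of these idioms with made-up variable names: it shows the unquoted test failing on a value that contains a space, and the default-value expansion filling in an unset variable.

```shell
var='two words'

# Unquoted: word splitting turns this into `[ ! -z two words ]`,
# which test rejects, so the condition is not satisfied.
if [ ! -z $var ] 2>/dev/null; then unquoted=passed; else unquoted=failed; fi

# Quoted: a single non-empty argument, so the test succeeds.
if [ "$var" ]; then quoted=passed; else quoted=failed; fi

# Assign a default only when the variable is unset or empty.
unset name
: "${name:=anonymous}"

printf 'unquoted test: %s\n' "$unquoted"
printf 'quoted test:   %s\n' "$quoted"
printf 'name:          %s\n' "$name"
```

The leading colon is the shell's no-op command; without it the shell would try to run the expanded value as a command.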
