bash functions: enclosing the body in braces vs. parentheses

Why are braces used by default to enclose the function body instead of parentheses?

The body of a function can be any compound command. This is typically { list; }, but every other compound command is technically allowed as well, including (list), ((expression)), [[ expression ]], and constructs such as if, for, while, and case.
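As a rough illustration of those body forms, the sketch below defines one function per form. The function names are made up for this example; only the syntax matters.

```bash
#!/bin/bash
# Hypothetical function names, chosen only for this illustration.

# Body is a brace group: runs in the current shell.
in_braces() { echo "braces"; }

# Body is a subshell.
in_parens() ( echo "parens" )

# Body is an arithmetic command: succeeds when the expression is non-zero.
is_positive() (( $1 > 0 ))

# Body is a conditional expression: succeeds when the test is true.
is_dir() [[ -d $1 ]]

# Body is a for loop.
count_to_three() for i in 1 2 3; do echo "$i"; done

in_braces
in_parens
is_positive 5 && echo "5 is positive"
is_dir / && echo "/ is a directory"
count_to_three
```

Apart from the brace and parenthesis forms, these are rarely seen in real scripts, which is part of why they tend to surprise readers.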

C and languages in the C family like C++, Java, C#, and JavaScript all use curly braces to delimit function bodies. Curly braces are the most natural syntax for programmers familiar with those languages.

Are there other major downsides to using parentheses instead of braces (which might explain why braces seem to be preferred)?

Yes. There are numerous things you can't do from a sub-shell, including:

  • Change global variables. Variable changes will not propagate to the parent shell.
  • Exit the script. An exit statement will exit only the sub-shell.
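
A minimal sketch of the exit behavior, using a hypothetical function name. With a brace body, the same exit 3 would terminate the whole script instead of just the function.

```bash
#!/bin/bash
# Hypothetical function name, for illustration only.

# Parenthesized body: "exit" ends only the subshell that runs the function.
die_in_subshell() (
    echo "about to exit"
    exit 3
)

# The script keeps running; the subshell's exit code becomes the
# function's exit status.
die_in_subshell || echo "still running; the function returned status $?"
```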

Starting a sub-shell can also be a performance hit. You're forking a new process each time you call the function, which adds up quickly if the function is called in a loop.
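
To see that a parenthesized body really runs in another process, you can compare $BASHPID (available in bash 4.0 and later) with the value saved in the main shell. This is only a sketch, and the function and variable names are hypothetical.

```bash
#!/bin/bash
# Requires bash 4.0+ for BASHPID. Names are hypothetical.

main_pid=$BASHPID   # process ID of the shell running this script

# Brace body: executes in the calling shell, so the PID matches.
same_process_braces() { [[ $BASHPID == "$main_pid" ]]; }

# Parenthesized body: executes in a forked subshell with its own PID.
same_process_parens() ( [[ $BASHPID == "$main_pid" ]] )

same_process_braces && echo "braces: same process as the caller"
same_process_parens || echo "parentheses: a different process"
```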

You might also get weird behavior if your script is killed. The signals the parent and child shells receive will change. It's a subtle effect, but if you have trap handlers or you kill your script, those parts may not work the way you want.
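
One trap-related difference that is easy to reproduce: a trap set inside a parenthesized body belongs to the sub-shell, so an EXIT trap fires when the function finishes rather than when the script ends. A small sketch with a hypothetical function name:

```bash
#!/bin/bash
# Hypothetical function name, for illustration only.

# Parenthesized body: the EXIT trap is installed in the subshell,
# so it runs as soon as the subshell (that is, the function) ends.
cleanup_in_parens() (
    trap 'echo "cleanup (subshell exit)"' EXIT
    echo "working in subshell"
)

cleanup_in_parens
echo "back in the main shell"
```

With a brace body the same trap would be installed in the main shell and would run only when the script itself exits.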

When shall I use curly braces to enclose the function body, and when is it advisable to switch to parentheses?

I would advise you to always use curly braces. If you want an explicit sub-shell, then add a set of parentheses inside the curly braces. Using just parentheses is highly unusual syntax and would confuse many people reading your script.

foo() {
   (
       subshell commands;
   )
}

It really matters. Since bash functions do not return values (only an exit status) and the variables they use are global by default (that is, a function can read and modify variables defined outside it), the usual way to get a result out of a function is to have the function store the value in a variable that the caller then reads.
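
As a sketch of that pattern, the hypothetical function below leaves its result in a global variable named result (the name is an assumption for this example) and uses its exit status only to report success or failure.

```bash
#!/bin/bash
# Hypothetical function and variable names, for illustration only.

# Stores twice its argument in the global variable "result".
# The exit status signals success (0) or invalid input (1).
double() {
    [[ $1 =~ ^-?[0-9]+$ ]] || return 1   # reject non-integer input
    result=$(( $1 * 2 ))
}

if double 21; then
    echo "result=$result"
fi

double abc || echo "double failed with status $?"
```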

When you define a function with (), you are right: it will create a sub-shell. That sub-shell starts with a copy of the variables the original shell had, but any changes it makes stay in the copy. So you lose the ability to change global variables.

See an example:

$ cat a.sh
#!/bin/bash

func_braces() {       # body in curly braces: runs in the current shell
    echo "in $FUNCNAME. the value of v=$v"
    v=4
}

func_parentheses() (  # body in parentheses: runs in a subshell
    echo "in $FUNCNAME. the value of v=$v"
    v=8
)


v=1
echo "v=$v. Let's start"
func_braces
echo "Value after func_braces is: v=$v"
func_parentheses
echo "Value after func_parentheses is: v=$v"

Let's execute it:

$ ./a.sh
v=1. Let's start
in func_braces. the value of v=1
Value after func_braces is: v=4
in func_parentheses. the value of v=4
Value after func_parentheses is: v=4   # the value did not change in the main shell

I tend to use a subshell when I want to change directories but always start from the same original directory, and cannot be bothered to use pushd/popd or track the directories myself.

for d in */; do
    ( cd "$d" && dosomething )
done

This would work just as well as a parenthesized function body, but even if you define the function with curly braces, you can still call it from a subshell.

doit() {
    cd "$1" && dosomething
}
for d in */; do
    ( doit "$d" )
done

Of course, you can still maintain variable scope inside a curly-brace-defined function using declare or local:

myfun() {
    local x=123
}
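
A slightly fuller sketch of the same idea, showing what does and does not leak out of the function. The function name scoped_demo and the variables x and y are hypothetical.

```bash
#!/bin/bash
# Hypothetical names, for illustration only.

x=1

scoped_demo() {
    local x=123    # shadows the global x inside the function only
    y=456          # no "local": this assignment is global
    echo "inside: x=$x"
}

scoped_demo
echo "outside: x=$x y=$y"
```

The global x is untouched after the call, while y, which was assigned without local, is visible to the caller.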

So I would say, explicitly define your function as a subshell only if not being a subshell is detrimental to the obvious correct behavior of that function.

Trivia: As a side note, consider that bash always displays a function definition as a curly-brace compound command. A parenthesized body simply shows up as a subshell inside the braces:

$ f() ( echo hi )
$ type f
f is a function
f () 
{ 
    ( echo hi )
}