How to modify a global variable within a function in bash?

This needs bash 4.1 if you use {fd} or local -n.

The rest should work in bash 3.x I hope. I am not completely sure due to printf %q - this might be a bash 4 feature.

Summary

Your example can be modified as follows to archive the desired effect:

# Add following 4 lines:
_passback() { while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }
passback() { _passback "$@" "$?"; }
_capture() { { out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)"; }
capture() { eval "$(_capture "$@")"; }

e=2

# Add following line, called "Annotation"
function test1_() { passback e; }
function test1() {
  e=4
  echo "hello"
}

# Change following line to:
capture ret test1 

echo "$ret"
echo "$e"

prints as desired:

hello
4

Note that this solution:

  • Works for e=1000, too.
  • Preserves $? if you need $?

The only bad sideffects are:

  • It needs a modern bash.
  • It forks quite more often.
  • It needs the annotation (named after your function, with an added _)
  • It sacrifices file descriptor 3.
    • You can change it to another FD if you need that.
      • In _capture just replace all occurances of 3 with another (higher) number.

The following (which is quite long, sorry for that) hopefully explains, how to adpot this recipe to other scripts, too.

The problem

d() { let x++; date +%Y%m%d-%H%M%S; }

x=0
d1=$(d)
d2=$(d)
d3=$(d)
d4=$(d)
echo $x $d1 $d2 $d3 $d4

outputs

0 20171129-123521 20171129-123521 20171129-123521 20171129-123521

while the wanted output is

4 20171129-123521 20171129-123521 20171129-123521 20171129-123521

The cause of the problem

Shell variables (or generally speaking, the environment) is passed from parental processes to child processes, but not vice versa.

If you do output capturing, this usually is run in a subshell, so passing back variables is difficult.

Some even tell you, that it is impossible to fix. This is wrong, but it is a long known difficult to solve problem.

There are several ways on how to solve it best, this depends on your needs.

Here is a step by step guide on how to do it.

Passing back variables into the parental shell

There is a way to pass back variables to a parental shell. However this is a dangerous path, because this uses eval. If done improperly, you risk many evil things. But if done properly, this is perfectly safe, provided that there is no bug in bash.

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }

d() { let x++; d=$(date +%Y%m%d-%H%M%S); _passback x d; }

x=0
eval `d`
d1=$d
eval `d`
d2=$d
eval `d`
d3=$d
eval `d`
d4=$d
echo $x $d1 $d2 $d3 $d4

prints

4 20171129-124945 20171129-124945 20171129-124945 20171129-124945

Note that this works for dangerous things, too:

danger() { danger="$*"; passback danger; }
eval `danger '; /bin/echo *'`
echo "$danger"

prints

; /bin/echo *

This is due to printf '%q', which quotes everything such, that you can re-use it in a shell context safely.

But this is a pain in the a..

This does not only look ugly, it also is much to type, so it is error prone. Just one single mistake and you are doomed, right?

Well, we are at shell level, so you can improve it. Just think about an interface you want to see, and then you can implement it.

Augment, how the shell processes things

Let's go a step back and think about some API which allows us to easily express, what we want to do.

Well, what do we want do do with the d() function?

We want to capture the output into a variable. OK, then let's implement an API for exactly this:

# This needs a modern bash 4.3 (see "help declare" if "-n" is present,
# we get rid of it below anyway).
: capture VARIABLE command args..
capture()
{
local -n output="$1"
shift
output="$("$@")"
}

Now, instead of writing

d1=$(d)

we can write

capture d1 d

Well, this looks like we haven't changed much, as, again, the variables are not passed back from d into the parent shell, and we need to type a bit more.

However now we can throw the full power of the shell at it, as it is nicely wrapped in a function.

Think about an easy to reuse interface

A second thing is, that we want to be DRY (Don't Repeat Yourself). So we definitively do not want to type something like

x=0
capture1 x d1 d
capture1 x d2 d
capture1 x d3 d
capture1 x d4 d
echo $x $d1 $d2 $d3 $d4

The x here is not only redundant, it's error prone to always repeate in the correct context. What if you use it 1000 times in a script and then add a variable? You definitively do not want to alter all the 1000 locations where a call to d is involved.

So leave the x away, so we can write:

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }

d() { let x++; output=$(date +%Y%m%d-%H%M%S); _passback output x; }

xcapture() { local -n output="$1"; eval "$("${@:2}")"; }

x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4

outputs

4 20171129-132414 20171129-132414 20171129-132414 20171129-132414

This already looks very good. (But there still is the local -n which does not work in oder common bash 3.x)

Avoid changing d()

The last solution has some big flaws:

  • d() needs to be altered
  • It needs to use some internal details of xcapture to pass the output.
    • Note that this shadows (burns) one variable named output, so we can never pass this one back.
  • It needs to cooperate with _passback

Can we get rid of this, too?

Of course, we can! We are in a shell, so there is everything we need to get this done.

If you look a bit closer to the call to eval you can see, that we have 100% control at this location. "Inside" the eval we are in a subshell, so we can do everything we want without fear of doing something bad to the parental shell.

Yeah, nice, so let's add another wrapper, now directly inside the eval:

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
# !DO NOT USE!
_xcapture() { "${@:2}" > >(printf "%q=%q;" "$1" "$(cat)"); _passback x; }  # !DO NOT USE!
# !DO NOT USE!
xcapture() { eval "$(_xcapture "$@")"; }

d() { let x++; date +%Y%m%d-%H%M%S; }

x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4

prints

4 20171129-132414 20171129-132414 20171129-132414 20171129-132414                                                    

However, this, again, has some major drawback:

  • The !DO NOT USE! markers are there, because there is a very bad race condition in this, which you cannot see easily:
    • The >(printf ..) is a background job. So it might still execute while the _passback x is running.
    • You can see this yourself if you add a sleep 1; before printf or _passback. _xcapture a d; echo then outputs x or a first, respectively.
  • The _passback x should not be part of _xcapture, because this makes it difficult to reuse that recipe.
  • Also we have some unneded fork here (the $(cat)), but as this solution is !DO NOT USE! I took the shortest route.

However, this shows, that we can do it, without modification to d() (and without local -n)!

Please note that we not neccessarily need _xcapture at all, as we could have written everyting right in the eval.

However doing this usually isn't very readable. And if you come back to your script in a few years, you probably want to be able to read it again without much trouble.

Fix the race

Now let's fix the race condition.

The trick could be to wait until printf has closed it's STDOUT, and then output x.

There are many ways to archive this:

  • You cannot use shell pipes, because pipes run in different processes.
  • One can use temporary files,
  • or something like a lock file or a fifo. This allows to wait for the lock or fifo,
  • or different channels, to output the information, and then assemble the output in some correct sequence.

Following the last path could look like (note that it does the printf last because this works better here):

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }

_xcapture() { { printf "%q=%q;" "$1" "$("${@:2}" 3<&-; _passback x >&3)"; } 3>&1; }

xcapture() { eval "$(_xcapture "$@")"; }

d() { let x++; date +%Y%m%d-%H%M%S; }

x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4

outputs

4 20171129-144845 20171129-144845 20171129-144845 20171129-144845

Why is this correct?

  • _passback x directly talks to STDOUT.
  • However, as STDOUT needs to be captured in the inner command, we first "save" it into FD3 (you can use others, of course) with '3>&1' and then reuse it with >&3.
  • The $("${@:2}" 3<&-; _passback x >&3) finishes after the _passback, when the subshell closes STDOUT.
  • So the printf cannot happen before the _passback, regardless how long _passback takes.
  • Note that the printf command is not executed before the complete commandline is assembled, so we cannot see artefacts from printf, independently how printf is implemented.

Hence first _passback executes, then the printf.

This resolves the race, sacrificing one fixed file descriptor 3. You can, of course, choose another file descriptor in the case, that FD3 is not free in your shellscript.

Please also note the 3<&- which protects FD3 to be passed to the function.

Make it more generic

_capture contains parts, which belong to d(), which is bad, from a reusability perspective. How to solve this?

Well, do it the desparate way by introducing one more thing, an additional function, which must return the right things, which is named after the original function with _ attached.

This function is called after the real function, and can augment things. This way, this can be read as some annotation, so it is very readable:

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
_capture() { { printf "%q=%q;" "$1" "$("${@:2}" 3<&-; "$2_" >&3)"; } 3>&1; }
capture() { eval "$(_capture "$@")"; }

d_() { _passback x; }
d() { let x++; date +%Y%m%d-%H%M%S; }

x=0
capture d1 d
capture d2 d
capture d3 d
capture d4 d
echo $x $d1 $d2 $d3 $d4

still prints

4 20171129-151954 20171129-151954 20171129-151954 20171129-151954

Allow access to the return-code

There is only on bit missing:

v=$(fn) sets $? to what fn returned. So you probably want this, too. It needs some bigger tweaking, though:

# This is all the interface you need.
# Remember, that this burns FD=3!
_passback() { while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }
passback() { _passback "$@" "$?"; }
_capture() { { out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)"; }
capture() { eval "$(_capture "$@")"; }

# Here is your function, annotated with which sideffects it has.
fails_() { passback x y; }
fails() { x=$1; y=69; echo FAIL; return 23; }

# And now the code which uses it all
x=0
y=0
capture wtf fails 42
echo $? $x $y $wtf

prints

23 42 69 FAIL

There is still a lot room for improvement

  • _passback() can be elmininated with passback() { set -- "$@" "$?"; while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }

  • _capture() can be eliminated with capture() { eval "$({ out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)")"; }

  • The solution pollutes a file descriptor (here 3) by using it internally. You need to keep that in mind if you happen to pass FDs.
    Note thatbash 4.1 and above has {fd} to use some unused FD.
    (Perhaps I will add a solution here when I come around.)
    Note that this is why I use to put it in separate functions like _capture, because stuffing this all into one line is possible, but makes it increasingly harder to read and understand

  • Perhaps you want to capture STDERR of the called function, too. Or you want to even pass in and out more than one filedescriptor from and to variables.
    I have no solution yet, however here is a way to catch more than one FD, so we can probably pass back the variables this way, too.

Also do not forget:

This must call a shell function, not an external command.

There is no easy way to pass environment variables out of external commands. (With LD_PRELOAD= it should be possible, though!) But this then is something completely different.

Last words

This is not the only possible solution. It is one example to a solution.

As always you have many ways to express things in the shell. So feel free to improve and find something better.

The solution presented here is quite far from being perfect:

  • It was nearly not tested at all, so please forgive typos.
  • There is a lot of room for improvement, see above.
  • It uses many features from modern bash, so probably is hard to port to other shells.
  • And there might be some quirks I haven't thought about.

However I think it is quite easy to use:

  • Add just 4 lines of "library".
  • Add just 1 line of "annotation" for your shell function.
  • Sacrifices just one file descriptor temporarily.
  • And each step should be easy to understand even years later.

When you use a command substitution (i.e., the $(...) construct), you are creating a subshell. Subshells inherit variables from their parent shells, but this only works one way: A subshell cannot modify the environment of its parent shell.

Your variable e is set within a subshell, but not the parent shell. There are two ways to pass values from a subshell to its parent. First, you can output something to stdout, then capture it with a command substitution:

myfunc() {
    echo "Hello"
}

var="$(myfunc)"

echo "$var"

The above outputs:

Hello

For a numerical value in the range of 0 through 255, you can use return to pass the number as the exit status:

mysecondfunc() {
    echo "Hello"
    return 4
}

var="$(mysecondfunc)"
num_var=$?

echo "$var - num is $num_var"

This outputs:

Hello - num is 4