What exactly is an environment variable?

An environment is not as magical as it might seem. The shell stores it in memory and passes it to the execve() system call. The child process inherits it through a pointer called environ, which points to an array of strings. From the execve manpage:

SYNOPSIS
   #include <unistd.h>

   int execve(const char *filename, char *const argv[],
              char *const envp[]);
argv is an array of argument strings passed to the new program. By convention, the first of these strings should contain the filename associated with the file being executed. envp is an array of strings, conventionally of the form key=value, which are passed as environment to the new program.

The environ(7) manpage also offers some insight:

SYNOPSIS
   extern char **environ;
DESCRIPTION

The variable environ points to an array of pointers to strings called the "environment". The last pointer in this array has the value NULL. (This variable must be declared in the user program, but is declared in the header file <unistd.h> in case the header files came from libc4 or libc5, and in case they came from glibc and _GNU_SOURCE was defined.) This array of strings is made available to the process by the exec(3) call that started the process.

Both of these Linux man pages match the POSIX specification.
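As a rough illustration of the name=value strings that the man page excerpts above describe, the env utility can start a program with an explicitly chosen environment, and the program can print what it was handed. This is only a sketch: the variable name GREETING is made up for the example, and it assumes env is installed at /usr/bin/env, as it is on most systems.

```shell
# env -i starts the given command with an empty environment plus any
# NAME=value pairs listed before the command. The command here is env
# itself, which prints the environment it received, one string per line.
received=$(env -i GREETING=hello /usr/bin/env)
printf '%s\n' "$received"
```

The only line printed is the single string that was passed in, which is the same kind of string that ends up in envp and environ.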

You've got it just a little wrong: SOME_NAME=value creates a shell variable (in most shells). export SOME_NAME=value creates an environment variable. For better or for worse, most Unix/Linux/*BSD shells use identical syntax for accessing environment variables and shell variables.
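The difference is easiest to see from a child process. The following is a minimal sketch, assuming a POSIX-style shell such as bash or dash with the allexport option turned off; the text "not set" is just a placeholder chosen for the example.

```shell
unset SOME_NAME                 # start from a clean slate

SOME_NAME=value                 # a plain shell variable
before=$(sh -c 'printf "%s" "${SOME_NAME-not set}"')

export SOME_NAME                # now part of the environment
after=$(sh -c 'printf "%s" "${SOME_NAME-not set}"')

printf 'before export: %s\nafter export: %s\n' "$before" "$after"
```

The child sh only sees SOME_NAME once it has been exported, even though the parent shell reads it with the same $SOME_NAME syntax in both cases.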

In some larger sense, an "environment" is just the information that goes along with program execution. In a C program you might find the process ID with a getpid() call; in a shell program you would use a variable access: $$. The process ID is just part of the program's environment. I believe the term "environment" comes from some of the more theoretical computer science topics, like modelling program execution. Models of program execution have an environment "which contains the associations between variables and their values".

And this latter, stronger definition is what an "environment" is for Unix/Linux/*BSD shells: an association between names ("variables") and their values. For most Unix-style shells, the values are all character strings, although that's not as strictly true as it used to be. Ksh, Zsh and Bash all have typed variables these days. Even shell function definitions can be exported.

The use of an environment separate from plain shell variables involves the fork/exec method of starting a new process that all Unixes use. When you export a name/value pair, that name/value pair will be present in the environment of new executables, started by the shell with an execve(2) system call (usually following a fork(2), except when the exec shell command was used).

Following an execve(), the main() function of the new binary receives its command-line arguments and the environment (stored as a NULL-terminated array of pointers to var=value strings; see the environ(7) man page). Other state that's inherited includes ulimit settings, the current working directory, and any open file descriptors for which the execve() caller didn't set FD_CLOEXEC. The current state of the tty (echo enabled, raw mode, etc.) could also be considered part of the execution state inherited by a newly-execed process.
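Two of the inherited items listed above, the working directory and open file descriptors, can be observed from the shell. This is a sketch that assumes bash or a POSIX sh such as dash (some shells mark descriptors opened this way as close-on-exec); the scratch directory and the file name log.txt are made up for the example.

```shell
start_dir=$(pwd)
tmpdir=$(mktemp -d)             # scratch directory for the demonstration
cd "$tmpdir" || exit 1

# The child shell starts in whatever directory the parent was in.
parent_cwd=$(pwd -P)
child_cwd=$(sh -c 'pwd -P')

# Open descriptor 3 on a file, then let a child write to it. The child
# never opens the file itself; it simply inherits the descriptor.
exec 3>"$tmpdir/log.txt"
sh -c 'echo "written by the child" >&3'
exec 3>&-                       # close descriptor 3 in the parent

logged=$(cat "$tmpdir/log.txt")
printf 'child cwd: %s\nlog: %s\n' "$child_cwd" "$logged"

cd "$start_dir" && rm -rf "$tmpdir"   # clean up
```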

See the bash manual's description of the execution environment for simple commands (other than builtin or shell functions).

The Unix environment differs from that of at least some other operating systems: VMS "lexicals" could be changed by a child process, and that change was visible in the parent. A VMS cd in a child process would affect the working directory of the parent. That was the case in at least some circumstances, though my memory may be failing me.
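The Unix side of that comparison can be checked directly: a child gets its own copy of the environment and its own working directory, so its changes do not travel back to the parent. A small sketch, using the made-up variable name COLOR:

```shell
export COLOR=blue
dir_before=$(pwd)

# The child changes its own copy of COLOR and its own working directory.
sh -c 'COLOR=red; export COLOR; cd /'

dir_after=$(pwd)
printf 'COLOR=%s\n' "$COLOR"
printf 'directory unchanged: %s\n' "$dir_after"
```

After the child exits, COLOR is still blue in the parent and the parent is still in the directory it started in.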

Some environment variables are well known: $HOME, $PATH, $LD_LIBRARY_PATH and others. Some are conventional to a given programming system, so that a parent shell can pass lots of special-purpose information to some program, like a specific temporary directory, or a user ID and password that don't show up in ps -ef. Simple CGI programs, for example, inherit a lot of information from the web server via environment variables.

Environment variables in their rawest form are just a set of name/value pairs. As described in the bash man page (man 1 bash) under the ENVIRONMENT section:

   When a program is invoked it is given an array of strings called the
   environment. This is a list of name-value pairs, of the form
   name=value.

   The shell provides several ways to manipulate the environment. On
   invocation, the shell scans its own environment and creates a parameter
   for each name found, automatically marking it for export to child
   processes. Executed commands inherit the environment.
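The "automatically marking it for export" part of that excerpt has a visible consequence: a name that a shell received from its own environment stays exported when the shell reassigns it, with no export command needed. A minimal sketch, using the made-up variable name ORIGIN:

```shell
# ORIGIN is placed in the environment of the inner sh only. That shell
# reassigns it without ever calling export, then runs env as its child.
seen_by_env=$(ORIGIN=parent sh -c 'ORIGIN=child; env' | grep '^ORIGIN=')
printf '%s\n' "$seen_by_env"
```

The env process sees the reassigned value, because the inner shell treated ORIGIN as exported from the moment it started.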

In practical terms, the environment lets you define behavior that is shared by, or unique to, programs invoked from the present shell. For example, when using crontab or visudo you can set the EDITOR environment variable to choose an editor other than the one your system would use by default. The same holds for the man command, which looks at your PAGER environment variable to work out which pager program should display the man page.
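The way a program might honor EDITOR can be sketched in a few lines of shell. This is not how crontab or visudo are actually implemented: pick_editor is a hypothetical helper, and both vi as the fallback and nano as the alternative are assumptions made for the example. Real tools differ in their exact lookup rules and defaults.

```shell
# Hypothetical helper: use EDITOR if it is set and non-empty, else vi.
pick_editor() {
    printf '%s\n' "${EDITOR:-vi}"
}

# Each command substitution runs in a subshell, so the unset and the
# assignment below do not alter the calling shell's own EDITOR.
default_choice=$(unset EDITOR; pick_editor)
custom_choice=$(EDITOR=nano; pick_editor)

printf 'without EDITOR: %s\nwith EDITOR=nano: %s\n' "$default_choice" "$custom_choice"
```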

Quite a lot of Unix commands read the environment and alter their output, processing, or actions depending on what is set there. Some variables are shared between programs; some are unique to one program. Most man pages contain information on how environment variables affect the program they describe.

Another practical illustration is a system with several installs of Oracle on the same platform. By setting ORACLE_HOME, the whole suite of Oracle commands (as loaded from your PATH environment variable) then pulls settings, definitions, mappings and libraries from under that top-level directory. The same holds true for other programs, such as Java with its JAVA_HOME environment variable.

bash itself has many environment variables which can change the behavior of a range of things: history (HISTSIZE, HISTFILE, etc.), screen size (COLUMNS), filename completion and globbing (FIGNORE, GLOBIGNORE), locale and character encoding/decoding (LANG, LC_*), prompts (PS1 through PS4), and so forth (again, seek knowledge from the bash man page).

You can also write scripts or programs that make use of your own custom environment variables (to pass settings or change functionality).
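As a sketch of that last point, the temporary script below stands in for any script of your own. The variable name MYAPP_MODE and its values are invented for the example; nothing here is a standard convention.

```shell
# Write a small script to a temporary file. It reads the made-up
# variable MYAPP_MODE and falls back to "normal" when it is not set.
script=$(mktemp)
cat >"$script" <<'EOF'
case "${MYAPP_MODE:-normal}" in
    debug) echo "debug output enabled" ;;
    *)     echo "running normally" ;;
esac
EOF

plain_run=$(unset MYAPP_MODE; sh "$script")
debug_run=$(MYAPP_MODE=debug sh "$script")   # set for this one command only

rm -f "$script"
printf '%s\n%s\n' "$plain_run" "$debug_run"
```

Putting the assignment in front of the command, as in the second run, adds the variable to that one child's environment without changing the calling shell.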

