What exactly does "/usr/bin/env node" do at the beginning of node files?
The exec system call of the Linux kernel understands shebangs (#!) natively.
When you run the following in Bash:
./something
on Linux, this calls the exec system call with the path ./something.
This line of the kernel gets called on the file passed to exec: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_script.c#L25
if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!'))
It reads the very first bytes of the file and compares them to #!.
If the file does start with #!, the Linux kernel parses the rest of the line and makes another exec call with:
- executable: /usr/bin/env
- first argument: node
- second argument: script path
This is therefore equivalent to:
/usr/bin/env node /path/to/script.js
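The construction described above can be sketched in JavaScript. This is a simplified model and not kernel code: shebangToArgv is a hypothetical helper name, and the sketch assumes the Linux behavior of treating everything after the interpreter path as a single optional argument.

```javascript
// Simplified model of how a "#!" line becomes the argument list of a second exec call.
// shebangToArgv is a hypothetical helper, not a kernel or Node.js API.
function shebangToArgv(firstLine, scriptPath) {
  if (!firstLine.startsWith('#!')) {
    return null; // not a script; another handler (e.g. ELF) would be tried instead
  }
  const rest = firstLine.slice(2).trim();
  if (rest === '') {
    return null; // no interpreter given
  }
  // Interpreter path first, then (assumption: Linux semantics) the remainder
  // of the line as ONE optional argument, not split further on spaces.
  const match = /^(\S+)(?:[ \t]+(.*))?$/.exec(rest);
  const argv = [match[1]];
  if (match[2] !== undefined) {
    argv.push(match[2]);
  }
  argv.push(scriptPath); // the script path always comes last
  return argv;
}

console.log(shebangToArgv('#!/usr/bin/env node', '/path/to/script.js'));
```

Under this model, a first line such as #!/usr/bin/env node --flag yields the single argument "node --flag", which is one reason extra arguments on a shebang line can behave unexpectedly on Linux.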
env is an executable that searches PATH to find e.g. /usr/bin/node, and then finally calls:
/usr/bin/node /path/to/script.js
The Node.js interpreter does see the #! line in the file, but it must be programmed to ignore that line, even though # is not in general a valid comment character in Node (unlike many other languages such as Python, where it is). See also: Pound Sign (#) As Comment Start In JavaScript?
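As an illustration of what "ignoring that line" could look like, here is a minimal sketch. It is not Node.js's actual implementation: stripShebang is a hypothetical helper that drops the first line when it starts with #! and keeps the newline, so that line numbers in later error messages stay the same.

```javascript
// Hypothetical helper, for illustration only: remove a leading "#!" line
// from source text before it is handed to a JavaScript parser.
function stripShebang(source) {
  if (!source.startsWith('#!')) {
    return source; // nothing to strip
  }
  const newline = source.indexOf('\n');
  // Keep the newline itself so the remaining code stays on its original line numbers.
  return newline === -1 ? '' : source.slice(newline);
}

console.log(JSON.stringify(stripShebang('#!/usr/bin/env node\nconsole.log(1);\n')));
```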
And yes, you can make an infinite loop with:
printf '#!/a\n' | sudo tee /a
sudo chmod +x /a
/a
Bash recognizes the error:
-bash: /a: /a: bad interpreter: Too many levels of symbolic links
#! just happens to be human readable, but that is not required.
If the file started with different bytes, then the exec system call would use a different handler. The other most important built-in handler is for ELF executable files: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_elf.c#L1305 which checks for the bytes 7f 45 4c 46 (which also happen to be human readable as .ELF). Let's confirm that by reading the first 4 bytes of /bin/ls, which is an ELF executable:
head -c 4 "$(which ls)" | hd
output:
00000000 7f 45 4c 46 |.ELF|
00000004
So when the kernel sees those bytes, it takes the ELF file, puts it into memory correctly, and starts a new process with it. See also: How does kernel get an executable binary file running under linux?
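The dispatch on leading bytes can be mimicked in a few lines of Node.js. This is only a sketch of the idea: detectExecutableFormat and readMagic are hypothetical helper names, and only the two signatures mentioned above are checked.

```javascript
const fs = require('fs');

// Hypothetical helper: classify a file by its first bytes, the way the
// kernel picks a handler. Only "#!" and the ELF magic are recognized here.
function detectExecutableFormat(bytes) {
  const SHEBANG = Buffer.from('#!');
  const ELF = Buffer.from([0x7f, 0x45, 0x4c, 0x46]); // 0x7f 'E' 'L' 'F'
  if (bytes.subarray(0, SHEBANG.length).equals(SHEBANG)) return 'script';
  if (bytes.subarray(0, ELF.length).equals(ELF)) return 'elf';
  return 'unknown';
}

// Hypothetical helper: read at most the first 4 bytes of a file.
function readMagic(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const buf = Buffer.alloc(4);
    const bytesRead = fs.readSync(fd, buf, 0, 4, 0);
    return buf.subarray(0, bytesRead);
  } finally {
    fs.closeSync(fd);
  }
}

// Classify the running node binary itself; the result depends on the platform.
console.log(detectExecutableFormat(readMagic(process.execPath)));
```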
Finally, you can add your own handlers with the binfmt_misc mechanism. For example, you can add a custom handler for .jar files. This mechanism even supports handlers by file extension. Another application is to transparently run executables of a different architecture with QEMU.
I don't think POSIX specifies shebangs, however: https://unix.stackexchange.com/a/346214/32558 , although it does mention them in rationale sections, in the form "if executable scripts are supported by the system something may happen". macOS and FreeBSD also seem to implement them.
PATH search motivation
Likely, one big motivation for the existence of shebangs is the fact that in Linux, we often want to run commands from PATH just as:
basename-of-command
instead of:
/full/path/to/basename-of-command
But then, without the shebang mechanism, how would Linux know how to launch each type of file?
Hardcoding the extension in commands:
basename-of-command.js
or implementing PATH search on every interpreter:
node basename-of-command
would be a possibility, but this has the major problem that everything breaks if we ever decide to refactor the command into another language.
Shebangs solve this problem beautifully.
Scripts that are to be executed by an interpreter normally have a shebang line at the top to tell the OS how to execute them.
If you have a script named foo whose first line is #!/bin/sh, the system will read that first line and execute the equivalent of /bin/sh foo. Because of this, most interpreters are set up to accept the name of a script file as a command-line argument.
The interpreter name following the #! has to be a full path; the OS won't search your $PATH to find the interpreter.
If you have a script to be executed by node, the obvious way to write the first line is:
#!/usr/bin/node
but that doesn't work if the node command isn't installed in /usr/bin.
A common workaround is to use the env command (which wasn't really intended for this purpose):
#!/usr/bin/env node
If your script is called foo, the OS will do the equivalent of
/usr/bin/env node foo
The env command executes another command whose name is given on its command line, passing any following arguments to that command. The reason it's used here is that env will search $PATH for the command. So if node is installed in /usr/local/bin/node, and you have /usr/local/bin in your $PATH, the env command will invoke /usr/local/bin/node foo.
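The lookup that happens on behalf of env can be approximated as follows. This is a rough sketch and not how env is actually implemented: findInPath and isExecutableFile are hypothetical helper names, and empty $PATH entries (which conventionally mean the current directory) are simply skipped.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical default check: is there an executable regular file at this location?
function isExecutableFile(candidate) {
  try {
    fs.accessSync(candidate, fs.constants.X_OK);
    return fs.statSync(candidate).isFile();
  } catch {
    return false;
  }
}

// Hypothetical helper: return the first "<dir>/<name>" in pathVar that passes
// the isExecutable check, or null if none does.
function findInPath(name, pathVar = process.env.PATH || '', isExecutable = isExecutableFile) {
  for (const dir of pathVar.split(':')) {
    if (dir === '') continue; // simplification: ignore empty entries
    const candidate = path.posix.join(dir, name);
    if (isExecutable(candidate)) {
      return candidate; // first match wins, as with a shell's command lookup
    }
  }
  return null;
}

// Where would "node" be found with the current $PATH? Depends on the machine.
console.log(findInPath('node'));
```

Because the first match wins, a different command called node that appears earlier in $PATH would be picked instead of the intended one.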
The main purpose of the env command is to execute another command with a modified environment, adding or removing specified environment variables before running the command. But with no additional arguments, it just executes the command with an unchanged environment, which is all you need in this case.
There are some drawbacks to this approach. Most modern Unix-like systems have /usr/bin/env, but I worked on older systems where the env command was installed in a different directory. There might be limitations on additional arguments you can pass using this mechanism. If the user doesn't have the directory containing the node command in $PATH, or has some different command called node, then it could invoke the wrong command or not work at all.
Other approaches are:
- Use a #! line that specifies the full path to the node command itself, updating the script as needed for different systems; or
- Invoke the node command with your script as an argument.
See also this question (and my answer) for more discussion of the #!/usr/bin/env trick.
Incidentally, on my system (Linux Mint 17.2), it's installed as /usr/bin/nodejs. According to my notes, it changed from /usr/bin/node to /usr/bin/nodejs between Ubuntu 12.04 and 12.10. The #!/usr/bin/env trick won't help with that (unless you set up a symlink or something similar).
UPDATE: A comment by mtraceur says (reformatted):
A workaround for the nodejs vs node problem is to start the file with the following six lines:
#!/bin/sh -
':' /*-
test1=$(nodejs --version 2>&1) && exec nodejs "$0" "$@"
test2=$(node --version 2>&1) && exec node "$0" "$@"
exec printf '%s\n' "$test1" "$test2" 1>&2
*/
This will first try nodejs and then try node, and only print the error messages if neither of them is found. An explanation is out of scope of these comments, I'm just leaving it here in case it helps anyone deal with the problem since this answer brought the problem up.
I haven't used NodeJS lately. My hope is that the nodejs vs. node issue has been resolved in the years since I first posted this answer. On Ubuntu 18.04, the nodejs package installs /usr/bin/nodejs as a symlink to /usr/bin/node. On some earlier OS (Ubuntu or Linux Mint, I'm not sure which), there was a nodejs-legacy package that provided node as a symlink to nodejs. No guarantee that I have all the details right.
#!/usr/bin/env node is an instance of a shebang line: the very first line in an executable plain-text file on Unix-like platforms that tells the system what interpreter to pass that file to for execution, via the command line following the magic #! prefix (called shebang).
Note: Windows does not support shebang lines, so they're effectively ignored there; on Windows it is solely a given file's filename extension that determines what executable will interpret it. However, you still need them in the context of npm.[1]
The following, general discussion of shebang lines is limited to Unix-like platforms:
In the following discussion I'll assume that the file containing source code for execution by Node.js is simply named file.
- You NEED this line if you want to invoke a Node.js source file directly, as an executable in its own right. This assumes that the file has been marked as executable with a command such as chmod +x ./file, which then allows you to invoke the file with, for instance, ./file, or, if it's located in one of the directories listed in the $PATH variable, simply as file.
  - Specifically, you need a shebang line to create CLIs based on Node.js source files as part of an npm package, with the CLI(s) to be installed by npm based on the value of the "bin" key in a package's package.json file; also see this answer for how that works with globally installed packages. Footnote [1] shows how this is handled on Windows.
- You do NOT need this line to invoke a file explicitly via the node interpreter, e.g., node ./file
Optional background information:
#!/usr/bin/env <executableName> is a way of portably specifying an interpreter: in a nutshell, it says: execute <executableName> wherever you (first) find it among the directories listed in the $PATH variable (and implicitly pass it the path to the file at hand).
This accounts for the fact that a given interpreter may be installed in different locations across platforms, which is definitely the case with node, the Node.js binary.
By contrast, the env utility itself can be relied upon to be in the same location across platforms, namely /usr/bin/env - and specifying the full path to an executable is required in a shebang line.
Note that the POSIX utility env is being repurposed here to locate by filename and execute an executable in the $PATH.
The true purpose of env is to manage the environment for a command - see env's POSIX spec and Keith Thompson's helpful answer.
It's also worth noting that Node.js is making a syntax exception for shebang lines, given that they're not valid JavaScript code (# is not a comment character in JavaScript, unlike in POSIX-like shells and other interpreters).
[1] In the interest of cross-platform consistency, npm creates wrapper *.cmd files (batch files) on Windows when installing executables specified in a package's package.json file (via the "bin" property). Essentially, these wrapper batch files mimic Unix shebang functionality: they invoke the target file explicitly with the executable specified in the shebang line - thus, your scripts must include a shebang line even if you only ever intend to run them on Windows - see this answer of mine for details.
Since *.cmd files can be invoked without the .cmd extension, this makes for a seamless cross-platform experience: on both Windows and Unix you can effectively invoke an npm-installed CLI by its original, extension-less name.