Single dashes `-` for single-character options, but double dashes `--` for words?
In The Art of Unix Programming, Eric Steven Raymond describes how this practice evolved:
In the original Unix tradition, command-line options are single letters preceded by a single hyphen... The original Unix style evolved on slow ASR-33 teletypes that made terseness a virtue; thus the single-letter options. Holding down the shift key required actual effort; thus the preference for lower case, and the use of “-” (rather than the perhaps more logical “+”) to enable options.
The GNU style uses option keywords (rather than keyword letters) preceded by two hyphens. It evolved years later when some of the rather elaborate GNU utilities began to run out of single-letter option keys (this constituted a patch for the symptom, not a cure for the underlying disease). It remains popular because GNU options are easier to read than the alphabet soup of older styles. [1]
[1] http://www.catb.org/esr/writings/taoup/html/ch10s05.html
One reason for continuing to use the single-letter options is that they can be strung together: `ls -ltr` is a lot easier to type than `ls --sort=time --reverse --format=long`. There are a number of times when both are good to use. As for searching for this topic, try "unix command line options convention".
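To illustrate the stringing-together point, here is a minimal sketch using Python's standard `getopt` module, which follows the traditional Unix parsing rules. The helper name `parse_short` is made up for this example, and the letters `l`, `t`, `r` are only borrowed from the `ls` example; nothing here reproduces what `ls` itself does.

```python
import getopt

# Short-option string: each letter is a flag that takes no argument.
# The letters mirror the ls example above; this is not ls's real option set.
SHORT_OPTS = "ltr"

def parse_short(argv):
    """Return the option names found in argv (program name excluded)."""
    opts, _rest = getopt.getopt(argv, SHORT_OPTS)
    return [name for name, _value in opts]

bundled = parse_short(["-ltr"])
separate = parse_short(["-l", "-t", "-r"])
print(bundled)               # ['-l', '-t', '-r']
print(bundled == separate)   # True
```

The bundled form and the three separate options parse to the same result, which is what makes single letters cheap to type.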
The quote from Raymond by @jasonwryan has some useful information, but starts in the middle of the story:
- Keep in mind that Unix started as a reduced-scope version of Multics, and that throughout its history, features in Unix were often imitations or adaptations of features seen and used on other systems.
- The `'-'` option character was used in Multics. Bitsavers has a manual for its user commands.
- Other systems used different characters, some with a stronger claim to being keystroke-efficient (such as `'/'` used for TOPS and VMS) and some less (such as `'('` used in VM/SP CMS).
- Multics options were multi-character, e.g., keywords separated by underscore.
- Longer Multics options frequently had a shorter, abbreviated form, such as `-print` vs `-pr` (page 3-8).
- Unix options were single-character, and after several years, `getopt` was introduced. Because it was not part of the original Unix, there are utilities which did not use `getopt` and were left as-is. But having `getopt` helped with making programs consistent.
On the other hand, Unix options using `getopt` were single-character. Other systems, in particular all larger ones, used keywords. Some (not all) allowed those keywords to be abbreviated, i.e., not all characters had to be provided as long as the option was unambiguous. There are pitfalls in that test for ambiguity. For example:
- Early in 1985, I was working on a program which had to be ported to PrimOS. Prime's developers competed with several other companies by offering a command language that (tried to) imitate each of those others, providing the most commonly used commands from each. Of course, they supported abbreviations (as did VMS). After reading the online help, I typed `sta`, thinking to get `status`. That was the abbreviation for `start`, and having given nothing to start, the command interpreter logged me off.
- The X Toolkit (used by xterm) allows abbreviated options. To use this effectively in xterm, it has to preprocess the command parameters to prefer `-v` (for version) over `-vb` (visual bell). The X Toolkit has no direct way to specify a preferred option when there is an ambiguity.
Because of this potential for ambiguity, some developers prefer to not allow abbreviations. Lynx, for example, uses multi-character options without allowing abbreviations.
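The ambiguity test itself is easy to sketch. The function below is a hypothetical illustration of unique-prefix matching, not the actual logic of PrimOS, VMS, or the X Toolkit; the name `resolve`, the `preferred` table, and the keyword lists are all assumptions made up for this example.

```python
def resolve(word, keywords, preferred=None):
    """Resolve a possibly abbreviated keyword against a list of full keywords.

    `preferred` is a hypothetical table that maps an abbreviation straight to
    one keyword, overriding the ambiguity test (the kind of preprocessing
    described above for -v). Otherwise an exact match wins, and failing that
    the word must be a prefix of exactly one keyword.
    """
    preferred = preferred or {}
    if word in preferred:
        return preferred[word]
    if word in keywords:
        return word
    matches = sorted(k for k in keywords if k.startswith(word))
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown keyword: " + word)
    raise ValueError("ambiguous keyword " + word + ": " + ", ".join(matches))

commands = ["start", "status"]
print(resolve("stat", commands))   # status
try:
    resolve("sta", commands)
except ValueError as err:
    print(err)                     # ambiguous keyword sta: start, status
print(resolve("-v", ["-version", "-vb"], preferred={"-v": "-version"}))  # -version
```

A strict resolver like this refuses `sta` rather than guessing. A system that quietly assigns the abbreviation to one keyword behaves like the `preferred` table: convenient when you know the table, surprising when you do not.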
Not all programs used `getopt`: `tar` and `ps` did not. Nor did `rcs` (or `sccs`), as you can see by noting where the dash was optional, and option values were optional.
Taking all of this into account, GNU developers adapted the keyword options used in other systems by extending `getopt` to provide a long version of each short option. For instance, the textutils 1.0 changelog says:
Tue May 8 03:41:42 1990 David J. MacKenzie (djm at abyss)
* tac.c: Use regular expressions as the record boundaries.
Give better error messages.
Reformat code and make it more readable.
(main): Use getopt_long to parse options.
The change in fileutils was earlier:
Tue Oct 31 02:03:32 1989 David J. MacKenzie (djm at spiff)
* ls.c (decode_switches): Add long options, using getopt_long
instead of getopt.
and someone may find one still earlier, but it seems that the file-header shows the earliest date:
/* Getopt for GNU.
Copyright (C) 1987, 1989 Free Software Foundation, Inc.
which is (for instance) concurrent with the X Toolkit (1987). Most of the Unix utilities with which you are familiar (such as `ls`, `ps`) used the existing single-character options that require periodic visits to the manual. When introducing `getopt_long`, the GNU developers did not do this by first adding new options; they began by tabulating the existing options and providing a matching long option.
Because they were adding to an existing repertoire, there was (again) the problem of conflict with existing options. To avoid this, they changed the syntax, using two dashes before long options.
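The conflict is easy to see in a small sketch. Python's `getopt` module accepts a list of long options alongside the short-option string, in the same spirit as `getopt_long`. The option table below is hypothetical (it is not the real option set of `ls` or any other utility), and `parse` is a name made up for this example.

```python
import getopt

# Hypothetical option table: each short letter paired with a long keyword.
OPTION_TABLE = [("a", "all"), ("l", "long"), ("r", "reverse")]

SHORT_OPTS = "".join(short for short, _long in OPTION_TABLE)
LONG_OPTS = [long_name for _short, long_name in OPTION_TABLE]
LONG_TO_SHORT = {"--" + long_name: "-" + short for short, long_name in OPTION_TABLE}

def parse(argv):
    """Parse argv and report every option under its short name."""
    opts, _rest = getopt.getopt(argv, SHORT_OPTS, LONG_OPTS)
    return [LONG_TO_SHORT.get(name, name) for name, _value in opts]

print(parse(["-all"]))           # ['-a', '-l', '-l']
print(parse(["--all"]))          # ['-a']
print(parse(["-lr", "--all"]))   # ['-l', '-r', '-a']
```

With a single dash, `-all` already has a meaning as the bundle `-a -l -l`, so it cannot also name a keyword. The second dash gives keywords their own namespace while leaving every existing bundle untouched.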
These programs continue to use `getopt_long` in this manner for the usual reasons:
- scripts depend upon the options; developers are not anxious to break scripts
- there's a written coding standard (which may be effective)
- no one has come up with a competing set of tools which is markedly incompatible (both BSDs and GNU developers copy option names from each other)