Why does C's "fopen" take a "const char *" as its second argument?
I believe that one of the advantages of the character string instead of a simple bit-mask is that it allows for platform-specific extensions which are not bit-settings. Purely hypothetically:
FILE *fp = fopen("/dev/something-weird", "r+,bs=4096");
For this gizmo, the open() call needs to be told the block size, and different calls can use radically different sizes, etc. Granted, I/O has been organized pretty well now (such was not the case originally; devices were enormously diverse and the access mechanisms far from unified), so it seldom seems to be necessary. But the string-valued open mode argument allows for that extensibility far better than a bit-mask would.
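To make the hypothetical concrete, here is a minimal sketch of how a library could split such a mode string into the standard part and a platform-specific option. Everything here is made up for illustration: `struct mode_opts`, `parse_mode`, and the `bs=` key are assumptions, not part of any real fopen() implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing an extended mode string. */
struct mode_opts {
    char base[4];     /* standard part, e.g. "r+" or "rb+" */
    long block_size;  /* value of the made-up "bs=" key, 0 if absent */
};

/* Returns 0 on success, -1 if the standard part is empty or too long. */
int parse_mode(const char *mode, struct mode_opts *out)
{
    size_t base_len = strcspn(mode, ",");  /* length up to the first comma */
    if (base_len == 0 || base_len >= sizeof out->base)
        return -1;
    memcpy(out->base, mode, base_len);
    out->base[base_len] = '\0';
    out->block_size = 0;

    const char *p = mode + base_len;
    while (*p == ',') {
        p++;                               /* skip the comma */
        while (*p == ' ')
            p++;                           /* tolerate ", key=value" spacing */
        size_t len = strcspn(p, ",");      /* length of this key=value item */
        if (len > 3 && strncmp(p, "bs=", 3) == 0)
            out->block_size = strtol(p + 3, NULL, 10);
        /* unknown keys are silently ignored in this sketch */
        p += len;
    }
    return 0;
}
```

Calling `parse_mode("r+,bs=4096", &opts)` leaves `"r+"` in `opts.base` and 4096 in `opts.block_size`. A bit-mask has no room for a value like 4096 without a separate argument or call.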
On IBM's mainframe MVS o/s, the fopen() function does indeed take extra arguments along the general lines described here, as noted by Andrew Henle (thank you!). The manual page includes the example call (slightly reformatted):
FILE *fp = fopen("myfile2.dat", "rb+, lrecl=80, blksize=240, recfm=fb, type=record");
The underlying open() has to be augmented by the ioctl() (I/O control) call or fcntl() (file control), or by functions hiding them, to achieve similar effects.
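As a rough illustration of that two-step pattern on a POSIX system: open the file first, then adjust the descriptor with a separate fcntl() call. The helper name `open_then_tune` is hypothetical, and O_NONBLOCK is only a stand-in for whatever setting you actually need; device-specific parameters such as a block size would typically go through ioctl() with a request code defined by the driver.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open path for read/write, then add O_NONBLOCK afterwards via fcntl().
 * Returns the file descriptor, or -1 on failure. */
int open_then_tune(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    int flags = fcntl(fd, F_GETFL);        /* read current status flags */
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With a string-valued mode, all of this could in principle be folded into the single fopen() call, which is what the MVS example does.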
One word: legacy. Unfortunately we have to live with it.
Just speculation: maybe at the time a "const char *" seemed a more flexible solution, because it is not limited in any way. A bit mask can only hold as many independent flags as the integer has bits (32 for a 32-bit int). Looks like YAGNI to me now.
More speculation: dudes were lazy, and writing "rb" requires less typing than MASK_THIS | MASK_THAT :)
Dennis Ritchie (in 1993) wrote an article about the history of C, and how it evolved gradually from B. Some of the design decisions were motivated by avoiding source changes to existing code written in B or embryonic versions of C.
In particular, Lesk wrote a 'portable I/O package' [Lesk 72] that was later reworked to become the C `standard I/O' routines
The C preprocessor wasn't introduced until 1972/3, so Lesk's I/O package was written without it! (In very early not-yet-C, pointers fit in integers on the platforms being used, and it was totally normal to assign an implicit-int return value to a pointer.)
Many other changes occurred around 1972-3, but the most important was the introduction of the preprocessor, partly at the urging of Alan Snyder [Snyder 74]
Without #include and #define, an expression like IO_READ | IO_WRITE wasn't an option.
The options in 1972 for what fopen calls could look like in typical source without CPP are:
FILE *fp = fopen("file.txt", 1); // magic constant integer literals
FILE *fp = fopen("file.txt", 'r'); // character literals
FILE *fp = fopen("file.txt", "r"); // string literals
Magic integer literals are obviously horrible, so unfortunately the obviously most efficient option, an integer argument (which Unix later adopted for open(2)), was ruled out by lack of a preprocessor.
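For comparison, here is a sketch of how the standard mode strings line up with the named flag constants that open(2) takes once a preprocessor is available. The correspondence follows the table in the fopen(3) man page; the function name `mode_to_flags` is made up, and the sketch ignores 'b' and does not handle C11's 'x'.

```c
#include <fcntl.h>   /* O_RDONLY, O_WRONLY, O_RDWR, O_CREAT, O_TRUNC, O_APPEND */
#include <string.h>

/* Translate a standard fopen() mode string into open(2) flags.
 * Returns -1 if the first character is not 'r', 'w', or 'a'. */
int mode_to_flags(const char *mode)
{
    int plus = strchr(mode, '+') != NULL;  /* "+" anywhere means read and write */
    int access = plus ? O_RDWR : (mode[0] == 'r' ? O_RDONLY : O_WRONLY);

    switch (mode[0]) {
    case 'r': return access;
    case 'w': return access | O_CREAT | O_TRUNC;
    case 'a': return access | O_CREAT | O_APPEND;
    default:  return -1;
    }
}
```

So `"a+"` corresponds to `O_RDWR | O_CREAT | O_APPEND`, which is readable only because the header supplies the names.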
A character literal is obviously not extensible; presumably that was obvious to API designers even back then. But it would have been sufficient (and more efficient) for early implementations of fopen: they only supported single-character strings, checking for *mode being 'r', 'w', or 'a'. (See @Keith Thompson's answer.) Apparently "r+" for read+write (without truncating) came later. (See fopen(3) for the modern version.)
C did have a character data type (added to B in 1971 as one of the first steps in producing embryonic C, so it was still new in 1972). Original B didn't have char, having been written for machines that pack multiple characters into a word, so char() was a function that indexed a string! (See Ritchie's history article.)
Using a single-byte string is effectively passing a char by const-reference, with all the extra overhead of memory accesses because library functions can't inline. (And primitive compilers probably weren't inlining anything, even trivial functions (unlike fopen) in the same compilation unit where inlining would shrink total code size; modern-style tiny helper functions rely on modern compilers to inline them.)
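A small sketch of the difference, with made-up names (`enum open_kind`, `kind_from_char`, `kind_from_string`). This is not the historical source, just an illustration of a by-value character API next to a string API that, like the early fopen described above, only looks at the first byte.

```c
enum open_kind { OPEN_READ, OPEN_WRITE, OPEN_APPEND, OPEN_BAD };

/* What a character-literal API could have looked like:
 * the mode arrives by value, so no memory access is needed to read it. */
enum open_kind kind_from_char(int mode)
{
    switch (mode) {
    case 'r': return OPEN_READ;
    case 'w': return OPEN_WRITE;
    case 'a': return OPEN_APPEND;
    default:  return OPEN_BAD;
    }
}

/* String API in the early style: one extra load through the pointer,
 * and anything after the first byte is ignored. */
enum open_kind kind_from_string(const char *mode)
{
    return kind_from_char(*mode);
}
```

Here `kind_from_string("r+")` gives the same answer as `kind_from_char('r')`, which is why the string form cost something at first and only paid off once longer modes were added.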
PS: Steve Jessop's answer with the same quote inspired me to write this.
Possibly related: strcpy() return value. strcpy was probably written pretty early, too.