How exactly do fopen() and fclose() work?
With VS2017 on Windows 10, you can see the internals in the call stack:
ntdll.dll!NtCreateFile() Unknown
KernelBase.dll!CreateFileInternal() Unknown
KernelBase.dll!CreateFileW() Unknown
ucrtbased.dll!create_file(const wchar_t * const path, _SECURITY_ATTRIBUTES * const security_attributes, const `anonymous-namespace'::file_options options) Line 234 C++
ucrtbased.dll!_wsopen_nolock(int * punlock_flag, int * pfh, const wchar_t * path, int oflag, int shflag, int pmode, int secure) Line 702 C++
ucrtbased.dll!_sopen_nolock(int * punlock_flag, int * pfh, const char * path, int oflag, int shflag, int pmode, int secure) Line 852 C++
ucrtbased.dll!__crt_char_traits<char>::tsopen_nolock<int * __ptr64,int * __ptr64,char const * __ptr64 const & __ptr64,int const & __ptr64,int,int const & __ptr64,int>(int * && <args_0>, int * && <args_1>, const char * const & <args_2>, const int & <args_3>, int && <args_4>, const int & <args_5>, int && <args_6>) Line 109 C++
ucrtbased.dll!common_sopen_dispatch<char>(const char * const path, const int oflag, const int shflag, const int pmode, int * const pfh, const int secure) Line 172 C++
ucrtbased.dll!_sopen_dispatch(const char * path, int oflag, int shflag, int pmode, int * pfh, int secure) Line 204 C++
ucrtbased.dll!_sopen_s(int * pfh, const char * path, int oflag, int shflag, int pmode) Line 895 C++
ucrtbased.dll!__crt_char_traits<char>::tsopen_s<int * __ptr64,char const * __ptr64 const & __ptr64,int const & __ptr64,int const & __ptr64,int>(int * && <args_0>, const char * const & <args_1>, const int & <args_2>, const int & <args_3>, int && <args_4>) Line 109 C++
ucrtbased.dll!common_openfile<char>(const char * const file_name, const char * const mode, const int share_flag, const __crt_stdio_stream stream) Line 38 C++
ucrtbased.dll!_openfile(const char * file_name, const char * mode, int share_flag, _iobuf * public_stream) Line 67 C++
ucrtbased.dll!__crt_char_traits<char>::open_file<char const * __ptr64 const & __ptr64,char const * __ptr64 const & __ptr64,int const & __ptr64,_iobuf * __ptr64>(const char * const & <args_0>, const char * const & <args_1>, const int & <args_2>, _iobuf * && <args_3>) Line 109 C++
ucrtbased.dll!common_fsopen<char>(const char * const file_name, const char * const mode, const int share_flag) Line 54 C++
ucrtbased.dll!fopen(const char * file, const char * mode) Line 104 C++
Most of the code is in:
C:\Program Files (x86)\Windows Kits\10\Source\10.0.17763.0\ucrt\stdio\fopen.cpp
C:\Program Files (x86)\Windows Kits\10\Source\10.0.17763.0\ucrt\stdio\openfile.cpp
C:\Program Files (x86)\Windows Kits\10\Source\10.0.17763.0\ucrt\lowio\open.cpp
In _wsopen_nolock in open.cpp, there is:
// Allocate the CRT file handle. Note that if a handle is allocated, it is
// locked when it is returned by the allocation function. It is our caller's
// responsibility to unlock the file handle (we do not unlock it before
// returning).
*pfh = _alloc_osfhnd();
Finally, it calls the Windows API CreateFileW, which calls the native API NtCreateFile in ntdll.dll, whose assembly code is:
NtCreateFile:
00007FFFD81A0120 mov r10,rcx
00007FFFD81A0123 mov eax,55h
00007FFFD81A0128 test byte ptr[7FFE0308h],1
00007FFFD81A0130 jne NtCreateFile+15h(07FFFD81A0135h)
00007FFFD81A0132 syscall
00007FFFD81A0134 ret
00007FFFD81A0135 int 2Eh
00007FFFD81A0137 ret
00007FFFD81A0138 nop dword ptr[rax + rax]
So in the end it executes the syscall instruction, which transfers control into kernel code.
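The same layering, a stdio stream wrapping an OS-level handle, can also be observed from user code. The following is only a sketch for POSIX systems rather than Windows, since fileno() and write() are POSIX calls; the function name stream_owns_descriptor and the scratch path passed to it are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the descriptor behind a stream is usable while the stream is
   open and rejected by the kernel (EBADF) after fclose(), 0 if not, and -1
   if the scratch file could not be opened. */
int stream_owns_descriptor(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int fd = fileno(fp);                 /* OS-level handle stored in the FILE */
    ssize_t before = write(fd, "x", 1);  /* direct system call, no stdio buffer */

    fclose(fp);                          /* flushes the stream, then closes fd */

    errno = 0;
    ssize_t after = write(fd, "x", 1);   /* fd is no longer valid here */
    int closed = (after == -1 && errno == EBADF);

    remove(path);                        /* delete the scratch file */
    return before == 1 && closed;
}
```

Calling stream_owns_descriptor with a writable scratch path should return 1 on a typical POSIX system: the write before fclose() succeeds, and the same write afterwards is refused because the handle has been given back to the OS.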
Disclaimer: I'm mostly unqualified to talk about this. It'd be great if someone more knowledgeable posted too.
Files
The details of how things like fopen() are implemented will depend a lot on the operating system (UNIX has fopen() too, for example). Even versions of Windows can differ a lot from each other.
I'll give you my idea of how it works, but it's basically speculation.
- When called, fopen allocates a FILE object on the heap. Note that the data in a FILE object is undocumented - FILE is an opaque struct, you can only use pointers-to-FILE from your code.
- The FILE object gets initialized. For example, something like
fillLevel = 0
where fillLevel is the amount of buffered data that hasn't been flushed yet.
- A call to the filesystem driver (FS driver) opens the file and provides a handle to it, which is put somewhere in the FILE struct.
  - To do this, the FS driver figures out the HDD address corresponding to the requested path, and internally remembers this HDD address, so it can later fulfill calls to fread etc.
    - The FS driver uses a sort of indexing table (stored on the HDD) to figure out the HDD address corresponding to the requested path. This will differ a lot depending on the filesystem type - FAT32, NTFS and so on.
  - The FS driver relies on the HDD driver to perform the actual reads and writes to the HDD.
- A cache (buffer) might be allocated in RAM for the file. This way, if the user requests 1 byte to be read, the C library may read a whole KB up front, so later reads can be served from memory without another trip to the OS.
- A pointer to the allocated FILE gets returned from fopen.
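The steps above can be sketched in code. This is not how any real C library lays out FILE; my_file, my_fopen_read, my_fgetc and my_fclose are hypothetical names, the 1024-byte buffer size is arbitrary, and the sketch assumes a POSIX system where open(), read() and close() stand in for "a call to the FS driver". It only handles reading, so fill_level here means the number of bytes currently sitting in the buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for FILE; the real layout is private to the C library. */
typedef struct {
    int    fd;          /* handle returned by the OS */
    char  *buffer;      /* RAM cache for reads */
    size_t capacity;    /* size of the cache */
    size_t fill_level;  /* bytes currently held in the cache */
    size_t pos;         /* next unread byte in the cache */
} my_file;

/* Allocates and initializes the struct, asks the OS for a handle,
   and returns a pointer to the struct (NULL on failure). */
my_file *my_fopen_read(const char *path)
{
    my_file *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->capacity = 1024;
    f->fill_level = 0;
    f->pos = 0;
    f->buffer = malloc(f->capacity);
    if (f->buffer == NULL) {
        free(f);
        return NULL;
    }
    f->fd = open(path, O_RDONLY);
    if (f->fd < 0) {
        free(f->buffer);
        free(f);
        return NULL;
    }
    return f;
}

/* Returns the next byte, refilling the cache with one large read when it
   runs empty. Returns -1 at end of file or on error. */
int my_fgetc(my_file *f)
{
    if (f->pos == f->fill_level) {
        ssize_t n = read(f->fd, f->buffer, f->capacity);
        if (n <= 0)
            return -1;
        f->fill_level = (size_t)n;
        f->pos = 0;
    }
    return (unsigned char)f->buffer[f->pos++];
}

/* Gives the handle back to the OS and frees everything my_fopen_read allocated. */
int my_fclose(my_file *f)
{
    int rc = close(f->fd);
    free(f->buffer);
    free(f);
    return rc;
}
```

Skipping my_fclose in this sketch would leave exactly the pieces described in the list behind: the struct, the buffer and the OS handle.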
If you open a file and never close it, some things will leak, yes. The FILE struct will leak, the FS driver's internal data will leak, the cache (if any) will leak too.
But memory is not the only thing that will leak. The file itself will leak, because the OS will keep treating it as open even though your code is no longer using it. This can become a problem, for example, on Windows, where a file that was opened without sharing enabled cannot be opened again until it has been closed.
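Open handles are also a finite resource in their own right, which can be provoked on purpose. Here is a minimal sketch, assuming a POSIX system with getrlimit()/setrlimit(); the function name and the cap of 32 are arbitrary choices for the demo. It lowers the per-process limit on open files, then keeps opening the same file without closing it until fopen fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/resource.h>

#define DEMO_LIMIT 32   /* arbitrary small cap chosen for the demo */

/* Temporarily lowers the soft limit on open files to DEMO_LIMIT, then opens
   `path` repeatedly without closing it until fopen() fails. Returns the
   number of successful opens, or -1 if the limit could not be changed.
   Everything is closed and the old limit restored before returning. */
int count_opens_until_failure(const char *path)
{
    struct rlimit old_limit, demo_limit;
    if (getrlimit(RLIMIT_NOFILE, &old_limit) != 0)
        return -1;
    demo_limit = old_limit;
    demo_limit.rlim_cur = DEMO_LIMIT;
    if (setrlimit(RLIMIT_NOFILE, &demo_limit) != 0)
        return -1;

    FILE *leaked[DEMO_LIMIT];
    int count = 0;
    while (count < DEMO_LIMIT) {
        FILE *fp = fopen(path, "r");
        if (fp == NULL)
            break;                /* the process has run out of handles */
        leaked[count++] = fp;     /* deliberately not closed yet */
    }

    for (int i = 0; i < count; i++)
        fclose(leaked[i]);
    setrlimit(RLIMIT_NOFILE, &old_limit);
    return count;
}
```

With a readable path such as /dev/null, the count normally stops a little short of the cap, because stdin, stdout and stderr already occupy handles. In a long-running app the same failure shows up much later and far from the code that forgot to call fclose.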
If your app exits without closing some file, most OSes will clean up after it. But that's not much use, because your app will probably run for a long time before exiting, and during that time, it will still need to properly close all files. Also, you can't fully rely on this cleanup: the C Standard only guarantees that open streams are flushed and closed on normal termination (returning from main or calling exit). After abort() or a crash, buffered data may be lost.
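One concrete reason to close (or at least flush) streams yourself is that written data can sit in the stream's buffer in RAM until then. The sketch below assumes a POSIX system for stat(); the function names are made up, and the observation that nothing reaches the file before fclose() holds on typical implementations but is not something the C Standard promises.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of `path` as the OS currently sees it, or -1 on error. */
static long size_on_disk(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Writes 5 bytes to a fully buffered stream and records the file size
   before and after fclose(). Returns 0 on success, -1 on error. */
int buffered_write_demo(const char *path, long *before_close, long *after_close)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    setvbuf(fp, NULL, _IOFBF, BUFSIZ);   /* request full buffering explicitly */
    fputs("hello", fp);                  /* lands in the stream's buffer */
    *before_close = size_on_disk(path);  /* the OS has not seen the data yet */

    if (fclose(fp) != 0)                 /* flushes the buffer, then closes */
        return -1;
    *after_close = size_on_disk(path);

    remove(path);                        /* delete the scratch file */
    return 0;
}
```

If the process were killed between the fputs and the fclose, those bytes would never reach the file, even though the OS would still reclaim the handle itself.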
Sockets
A socket's implementation will depend on the type of socket - network listen socket, network client socket, inter-process socket, etc.
A full discussion of all types of sockets and their possible implementations wouldn't fit here.
In short:
- just like a file, a socket keeps some info in RAM, describing things relevant to its operation, such as the IP of the remote host.
- it can also have caches in RAM for performance reasons
- it can hold onto finite OS resources such as open ports, making them unavailable for use by other apps
All these things will leak if you don't close the socket.
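The point about ports can be shown with the Berkeley Sockets API. This is a sketch assuming a POSIX system with a working loopback interface; bind_loopback and port_is_held_until_close are hypothetical helper names. It lets the OS pick a free TCP port, shows that a second socket cannot bind to it while the first one is open, and that the port becomes available again once the first socket is closed.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP socket bound to 127.0.0.1:port (0 lets the OS pick a port).
   Returns the socket, or -1 with errno set by socket() or bind(). */
static int bind_loopback(unsigned short port)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(s, (struct sockaddr *)&addr, sizeof addr) != 0) {
        int saved = errno;   /* keep bind()'s error across close() */
        close(s);
        errno = saved;
        return -1;
    }
    return s;
}

/* Returns 1 if the port is refused while the first socket holds it and
   granted again after that socket is closed, 0 if not, -1 on setup failure. */
int port_is_held_until_close(void)
{
    int first = bind_loopback(0);
    if (first < 0)
        return -1;

    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(first, (struct sockaddr *)&addr, &len) != 0) {
        close(first);
        return -1;
    }
    unsigned short port = ntohs(addr.sin_port);  /* port the OS assigned */

    int second = bind_loopback(port);            /* same port, still in use */
    int refused = (second < 0 && errno == EADDRINUSE);
    if (second >= 0)
        close(second);

    close(first);                                /* release the port */

    int third = bind_loopback(port);             /* now it can be bound again */
    int granted = (third >= 0);
    if (third >= 0)
        close(third);

    return refused && granted;
}
```

A socket that is never closed keeps its port in the "refused" state for every other app until the owning process exits.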
The role of the OS in sockets
The OS implements the TCP/IP standard, Ethernet and other protocols needed to schedule/dispatch/accept connections and to make them available to user code via an API like Berkeley Sockets.
The OS will delegate network I/O (communication with the network card) to the network driver.