Writing Secure C and Secure C Idioms
I think your sscanf example is wrong. It can still overflow when used that way.
Try this, which specifies the maximum number of bytes to read:
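#include <stdio.h>

int main(int argc, char **argv)
{
    char buf[256];
    /* %255s stores at most 255 characters plus the terminating NUL */
    if (argc > 0)
        sscanf(argv[0], "%255s", buf);
    return 0;
}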
Take a look at this IBM dev article about protecting against buffer overflows.
In terms of testing, I would write a program that generates random strings of random length, feeds them to your program, and checks that they are handled appropriately.
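As a rough sketch of that kind of test (my own illustration, not something from the linked article): the harness below generates random strings of random length, passes them to a hypothetical parse_token() wrapper around the width-limited sscanf call, and uses a canary byte placed after the buffer to check that nothing was written past the end. The names parse_token, random_string and fuzz_parse_token, the seed, and the size limits are all assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 256

/* Hypothetical function under test: copies the first whitespace-delimited
 * token of `in` into `out`, storing at most BUF_SIZE - 1 characters. */
static int parse_token(const char *in, char out[BUF_SIZE])
{
    out[0] = '\0';                      /* stay terminated if nothing matches */
    return sscanf(in, "%255s", out);
}

/* Fill `s` with `len` random non-NUL bytes and terminate it.
 * `s` must have room for len + 1 bytes. */
static void random_string(char *s, size_t len)
{
    for (size_t i = 0; i < len; i++)
        s[i] = (char)(1 + rand() % 255);    /* byte values 1..255 */
    s[len] = '\0';
}

/* Push `rounds` random inputs of up to `max_len` bytes through parse_token().
 * Returns the number of rounds that broke the bound (0 means all passed),
 * or -1 if the input buffer could not be allocated. */
static int fuzz_parse_token(unsigned seed, int rounds, size_t max_len)
{
    struct { char out[BUF_SIZE]; unsigned char canary; } t;
    char *in = malloc(max_len + 1);
    int failures = 0;

    if (in == NULL)
        return -1;
    srand(seed);
    for (int i = 0; i < rounds; i++) {
        t.canary = 0xA5;
        random_string(in, (size_t)rand() % (max_len + 1));
        parse_token(in, t.out);
        /* The canary must be intact and the output must be NUL-terminated
         * somewhere inside the buffer. */
        if (t.canary != 0xA5 || memchr(t.out, '\0', BUF_SIZE) == NULL)
            failures++;
    }
    free(in);
    return failures;
}
```

Something like fuzz_parse_token(12345u, 1000, 4096) would be called from a test driver and its result compared against 0. A proper fuzzer, or running the same loop under a memory checker, would give far better coverage than this.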
A good place to start looking at this is David Wheeler's excellent secure coding site.
His free online book "Secure Programming for Linux and Unix HOWTO" is an excellent resource that is regularly updated.
You might also like to look at his excellent static analyser FlawFinder to get some further hints. But remember, no automated tool is a replacement for a good pair of experienced eyes, or as David so colourfully puts it:
Any static analysis tool, such as Flawfinder, is merely a tool. No tool can substitute for human thought! In short, "a fool with a tool is still a fool". It's a mistake to think that analysis tools (like flawfinder) are a substitute for security training and knowledge.
I have personally used David's resources for several years now and find them to be excellent.
Reading from a stream
The fact that getline() "will automatically enlarge the block of memory as needed" means that this could be used as a denial-of-service attack, as it would be trivial to generate an input that was so long it would exhaust the available memory for the process (or worse, the system!). Once an out-of-memory condition occurs, other vulnerabilities may also come into play. The behaviour of code in low/no memory is rarely nice, and very hard to predict. IMHO it is safer to set reasonable upper bounds on everything, especially in security-sensitive applications.
Furthermore (as you anticipate by mentioning special characters), getline() only gives you a buffer; it does not make any guarantees about the contents of the buffer (as the safety is entirely application-dependent). So sanitising the input is still an essential part of processing and validating user data.
sscanf
I would tend to prefer to use a regular expression library, and have very narrowly defined regexps for user data, rather than use sscanf. This way you can perform a good deal of validation at the time of input.
General comments
- Fuzzing tools are available which generate random input (both valid and invalid) that can be used to test your input handling
- Buffer management is critical: buffer overflows, underflows, out-of-memory
- Race conditions can be exploited in otherwise secure code
- Binary files could be manipulated to inject invalid values or oversized values into headers, so file format code must be rock-solid and not assume binary data is valid
- Temporary files can often be a source of security issues, and must be carefully managed
- Code injection can be used to replace system or runtime library calls with malicious versions
- Plugins provide a huge vector for attack
- As a general principle, I would suggest having clearly defined interfaces where user data (or any data from outside the application) is assumed invalid and hostile until it is processed, sanitised and validated, and making those interfaces the only way for user data to enter the application.
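To make that last principle a bit more concrete, here is a minimal sketch of a single entry point that combines the earlier suggestions: a hard upper bound on line length (instead of letting the buffer grow) and a narrowly defined regexp applied before the caller sees the data. It assumes a POSIX system for <regex.h>. The names read_bounded_line, matches_pattern and read_username, the 64-byte limit, and the username pattern are all hypothetical choices for illustration. Embedded NUL bytes in the input are not handled in this sketch.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 64   /* assumed upper bound, including the terminating NUL */

enum input_status { INPUT_OK, INPUT_EOF, INPUT_TOO_LONG, INPUT_INVALID };

/* Read one line from `fp` into `out`. Over-long lines are consumed in full
 * and rejected, rather than truncated or allowed to grow without limit. */
static enum input_status read_bounded_line(FILE *fp, char out[MAX_LINE])
{
    if (fgets(out, MAX_LINE, fp) == NULL)
        return INPUT_EOF;

    size_t len = strlen(out);
    if (len > 0 && out[len - 1] == '\n') {
        out[len - 1] = '\0';            /* strip the newline */
        return INPUT_OK;
    }
    if (len < MAX_LINE - 1)             /* final line without a newline */
        return INPUT_OK;

    /* The buffer is full: see whether the line really ended here. */
    int c = fgetc(fp);
    if (c == EOF || c == '\n')
        return INPUT_OK;
    while (c != EOF && c != '\n')       /* discard the rest of the line */
        c = fgetc(fp);
    return INPUT_TOO_LONG;
}

/* Return 1 if `s` matches the POSIX extended regexp `pattern`, else 0. */
static int matches_pattern(const char *s, const char *pattern)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;                       /* treat a bad pattern as "no match" */
    int ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}

/* Single entry point: the caller only gets data that has been bounded and
 * validated. On any failure `out` is left as an empty string. */
static enum input_status read_username(FILE *fp, char out[MAX_LINE])
{
    enum input_status st = read_bounded_line(fp, out);
    if (st == INPUT_OK && !matches_pattern(out, "^[A-Za-z0-9_]{1,32}$"))
        st = INPUT_INVALID;
    if (st != INPUT_OK)
        out[0] = '\0';
    return st;
}
```

The rest of the program would call only read_username() and never touch the raw stream, so every byte of outside data passes through the same length check and the same whitelist. Whether to reject or truncate an over-long line is a design choice; rejecting is used here because silently truncated input can itself be a source of bugs.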