"The average man does not want to be free. He simply wants to be safe." - H. L. Menken
I am attempting to write very secure C. Below I list some of the techniques I use and ask are they as secure as I think they are. Please don't not hesitate to tear my code/preconceptions to shreds. Any answer that finds even the most trivial vulnerability or teaches me a new idea will be highly valued.
According to the GNU C Programming Tutorial getline:
The getline function will automatically enlarge the block of memory as needed, via the realloc function, so there is never a shortage of space -- one reason why getline is so safe. [..] Notice that getline can safely handle your line of input, no matter how long it is.
I assume that getline should, under all inputs, prevent a buffer overflow from occurring when reading from a stream.
If malloc encounters an error malloc returns a NULL pointer. This presents a security risk since one can still apply pointer arithmetic to a NULL (0x0) pointer, thus wikipedia recommends
/* Allocate space for an array with ten elements of type int. */
int *ptr = (int*)malloc(10 * sizeof (int));
if (ptr == NULL) {
/* Memory could not be allocated, the program should handle
the error here as appropriate. */
}
When using sscanf I've gotten in the habit of allocating the size to-be-extracted strings to the size of the input string hopefully avoiding the possibility of an overrun. For example:
const char *inputStr = "a01234b4567c";
const char *formatStr = "a%[0-9]b%[0-9]c":
char *str1[strlen(inputStr)];
char *str2[strlen(inputStr)];
sscanf(inputStr, formatStr, str1, str2);
Because str1 and str2 are the size of the inputStr and no more characters than strlen(inputStr) can be read from inputStr, it seems impossible, given all possible values for the inputStr to cause a buffer overflow?
While I've posted a large number of questions I don't expect anyone to answer all of them. The questions are more of guideline to the sorts of answers I am looking for. I really want to learn the secure C mindset.
Many of the resources were borrowed from the answers.
Reading from a stream
The fact that getline()
"will automatically enlarge the block of memory as needed" means that this could be used as a denial-of-service attack, as it would be trivial to generate an input that was so long it would exhaust the available memory for the process (or worse, the system!). Once an out-of-memory condition occurs, other vulnerabilities may also come into play. The behaviour of code in low/no memory is rarely nice, and very hard to predict. IMHO it is safer to set reasonable upper bounds on everything, especially in security-sensitive applications.
Furthermore (as you anticipate by mentioning special characters), getline()
only gives you a buffer; it does not make any guarantees about the contents of the buffer (as the safety is entirely application-dependent). So sanitising the input is still an essential part of processing and validating user data.
sscanf
I would tend to prefer to use a regular expression library, and have very narrowly defined regexps for user data, rather than use sscanf
. This way you can perform a good deal of validation at the time of input.
General comments