
Escaped backslash (double backslash) in C counting as two bytes in string


I'm a relatively inexperienced programmer and I'm running into some confusion about escaped backslashes. I'm using strings containing literal backslashes to represent Windows file paths (for some reason forward slashes weren't working, even though Windows supports them), and the double backslashes are being accepted by Windows. Unlike other escaped characters, though, the double backslashes seem to be counted by strlen() as two characters instead of one, and I have to malloc() an extra byte for the escape character to prevent a seg fault. Other escapes like '\n' seem to work perfectly normally and count as a single char. Why are my double backslashes in "foo\\bar" counting as two chars instead of one?

I'm using VSCodium to write my code, and compiling with MinGW64 GCC in MSYS2.

Code sample: strlen("documents\\notes") returns 16 instead of 15, and takes up 17 bytes including the null terminator.

full code: https://github.com/WyntrHeart/note/blob/main/note.c

EDIT: To clarify the solution, I thought that my escape backslashes were taking up an extra byte because adding an extra byte to my malloc() calls made the seg fault crashes appear to go away, but I had actually forgotten to allocate memory at all for part of my strings, so it was only ever "working" by luck.
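
For anyone who wants to verify the escape behaviour directly, here is a minimal sketch. The function names are made up for illustration and are not taken from the linked note.c. It measures the same literal with strlen() and sizeof; the values are cast to unsigned long for printing because some older MinGW runtimes may not handle %zu.

```c
#include <stdio.h>
#include <string.h>

/* In source code, "\\" is an escape sequence that produces ONE
 * backslash character in the compiled string. */
size_t example_path_length(void)
{
    const char *path = "documents\\notes";
    return strlen(path);              /* 9 + 1 + 5 characters */
}

/* Storage occupied by the same literal, terminating '\0' included. */
size_t example_path_storage(void)
{
    return sizeof "documents\\notes";
}

/* Prints both values so they can be compared by eye. */
void print_example_sizes(void)
{
    printf("strlen = %lu, sizeof = %lu\n",
           (unsigned long)example_path_length(),
           (unsigned long)example_path_storage());
}
```

Calling print_example_sizes() from main() should show the escaped backslash contributing one byte, the same as '\n' would.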


Solution

  • The problem is likely this (from the fullPathOfFileName function):

    fullPath = malloc((strlen(notesDirName)+strlen(fileName)+3)*sizeof(char));
    

    You copy notesDirName into fullPath, which is fine.

    Then you append a slash (backward or forward) which is fine.

    Then you append fileName, which is also fine.

    Then comes the problem: You append ".txt" to the string.

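    In total you write strlen(notesDirName) + 1 + strlen(fileName) + 4 + 1 characters to the buffer: the two names, one separator, the four characters of ".txt", and the null terminator. You only allocated strlen(notesDirName) + strlen(fileName) + 3, so the last three bytes are written out of bounds, which is undefined behavior.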

    The moral of the story is not to be stingy with allocations. Unless you're targeting a system known to have limited memory (small embedded systems), there's plenty of memory around. Heck, even on my old 512 KiB Amiga, there was usually enough memory to allow some safety margin.
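
The byte count above can be turned into a corrected allocation along these lines. This is only a sketch: build_full_path is a hypothetical stand-in for fullPathOfFileName, its parameters and the backslash separator are assumptions, and the real function in note.c may be structured differently. Deriving the size from the pieces that are actually written, instead of a hand-counted constant, keeps the allocation and the copy in agreement.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fullPathOfFileName logic.
 * Returns a newly allocated "<dir>\<file>.txt" string, or NULL if the
 * allocation fails. The caller is responsible for free()ing the result. */
char *build_full_path(const char *notesDirName, const char *fileName)
{
    static const char separator[] = "\\";    /* one backslash character */
    static const char extension[] = ".txt";

    /* Count every piece that will be written, plus the terminator. */
    size_t size = strlen(notesDirName) + strlen(separator)
                + strlen(fileName) + strlen(extension)
                + 1;                          /* terminating '\0' */

    char *fullPath = malloc(size);
    if (fullPath == NULL)
        return NULL;

    /* snprintf writes at most `size` bytes, terminator included, so it
     * cannot run past the end of the buffer even if the count is wrong. */
    snprintf(fullPath, size, "%s%s%s%s",
             notesDirName, separator, fileName, extension);
    return fullPath;
}
```

Using snprintf instead of a chain of strcpy/strcat calls is a design choice here, not something the original answer prescribes: a miscounted size then truncates the path instead of corrupting memory.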