
What does the backslash "\" really mean?


I'm wondering about Java's backslash. How does the computer or the compiler see this backslash, and how is it stored in the computer?

I read that a backslash removes the special meaning of the following character. But how does the computer treat it, and under what conditions is it treated differently?

For example, the null character \0 in C marks the end of a string, but is it a single character or two characters, i.e., backslash + zero?

Is the purpose of the backslash to signal something to humans, or to the computer at the level of 0s and 1s?


Solution

  • The backslash \ is a character, just like the letter A, the comma ,, and the digit 4. In some programming languages, notably C and its descendants (and maybe ancestors), it is used inside a string or character literal to escape other characters. For instance, '\a' represents the bell character, and printing it (printf("%c", '\a')) makes many terminals beep.

    As a C-language escape character, it is largely a human construct allowed by the compiler so humans can express, e.g., the bell character. The compiled code simply stores the character: a single byte with the value 7. Just to be absolutely clear, it does not store a \ followed by an a. The same holds for '\0' in C: it is one character with the value 0, not a backslash followed by a zero.

    In other contexts, the backslash means something to the program at runtime. The best-known instance of this is regular expression syntax, in which a backslash escapes other characters in order to either give them a special meaning or take away a special meaning they might have. For example, grep '\<foo\>' file.txt will locate lines with the word foo in file.txt. Here the backslashes really are there at runtime, and grep interprets them as escapes for the < and > that follow them. In this case, \< and \> don't represent characters at all; they denote a zero-width match against the beginning and end of a word, respectively.