c# string escaping string-literals

Why can users not put escape sequences in their input by default?


I'm working on a challenge in which I have to take user input, check whether it contains an escape sequence, and then execute the escape sequence.

My question is: why do escape sequences take effect in predetermined string variables, but not in a string read from user input? When the input happens to contain an escape sequence such as \n, it is not executed.

Example without user input:

string noInput = "This is an escape \n sequence";
Console.WriteLine(noInput);
Console.ReadLine();

Output is : This is an escape 
             sequence

Example with user input:

string input = Console.ReadLine();
Console.WriteLine(input);
Console.ReadLine();

Output is : This is an escape \n sequence 

Hopefully I explained my question well enough. I'm assuming this may be for security reasons, but I would like to know the actual answer.


Solution

  • An "escape sequence" is a feature of the language and its compiler, in this case C#. The relevant part of the language specification is section 2.4.4.5, String literals.

    Note that the reference is to an older version of the language specification, but it still applies. The latest version can be found here.

    From the spec:

    A character that follows a backslash character (\) in a regular-string-literal-character must be one of the following characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v. Otherwise, a compile-time error occurs. The example:

    • string a = "hello, world"; // hello, world
    • string b = @"hello, world"; // hello, world
    • string c = "hello \t world"; // hello      world
    • string d = @"hello \t world"; // hello \t world

    The point is that a .NET language is free to define which characters in a string literal are treated as escape sequences. In practice, the set is typically the one that has been used for ages in languages such as C and C++.

    When you accept user input, the input is treated literally, character for character: a backslash typed by the user is just a backslash. Another way to think about it: a compiled .NET program is independent of the compiler and source language, and the runtime (the CLR) has no concept of escape sequences in strings.
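
To make the difference concrete, here is a small sketch that prints the character codes of a compiled literal next to those of a string holding what the user would type. The names `LiteralVsInput` and `Describe` are invented for this illustration, and the verbatim literal merely stands in for the value `Console.ReadLine()` would return.

```csharp
using System;
using System.Linq;

public static class LiteralVsInput
{
    // Hypothetical helper: lists the numeric code of every char in the string.
    public static string Describe(string s)
    {
        return string.Join(" ", s.Select(c => ((int)c).ToString()));
    }

    public static void Main()
    {
        // The compiler turns \n into one line-feed character (code 10).
        string fromLiteral = "a\nb";

        // Stand-in for user input: the user typed a, backslash, n, b.
        // A verbatim literal keeps the backslash, just as ReadLine would.
        string fromUser = @"a\nb";

        Console.WriteLine(Describe(fromLiteral)); // 97 10 98
        Console.WriteLine(Describe(fromUser));    // 97 92 110 98
    }
}
```

The first string has three characters and the second has four. Nothing at run time turns the backslash (92) followed by n (110) into a line feed (10).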

    If you wish to provide such a feature (maybe you have a good scenario), you have limited options:

    1. Use the compiler's own APIs, such as Roslyn, to process the input string for you. I have never personally looked at which specific Roslyn API would help with that, but it has to be there, given that Roslyn is the compiler itself.

    Note that a downside of this approach is that Roslyn may be pretty heavyweight to include in your app for only one feature.

    2. Write a small routine yourself that performs the same unescaping as the compiler. For production-quality code this can be tricky: you have to understand and follow the specification to match it exactly, and perhaps keep your implementation up to date, since it may change in future versions of C# (for example, if a new escape sequence is introduced).

    Practically speaking, the escape sequences in the C# specification should not change willy-nilly, but I would not bet on it.

    3. Find a third-party library that already does this for you (included for the sake of completeness).
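
The "write a small routine yourself" option might look roughly like the sketch below. It is a minimal illustration, not a spec-complete implementation. The names `EscapeDemo` and `Unescape` are made up for this example. It handles only the simple single-character escapes and deliberately leaves `\u`, `\U`, `\x`, and any unknown sequence untouched (the compiler would reject unknown sequences instead).

```csharp
using System;
using System.Text;

public static class EscapeDemo
{
    // Hypothetical helper: converts a subset of C# escape sequences found in
    // runtime text into the characters they stand for.
    public static string Unescape(string input)
    {
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c != '\\' || i == input.Length - 1)
            {
                sb.Append(c); // ordinary character, or a lone trailing backslash
                continue;
            }

            char next = input[++i]; // the character after the backslash
            switch (next)
            {
                case 'n': sb.Append('\n'); break;
                case 't': sb.Append('\t'); break;
                case 'r': sb.Append('\r'); break;
                case '0': sb.Append('\0'); break;
                case 'a': sb.Append('\a'); break;
                case 'b': sb.Append('\b'); break;
                case 'f': sb.Append('\f'); break;
                case 'v': sb.Append('\v'); break;
                case '\\': sb.Append('\\'); break;
                case '\'': sb.Append('\''); break;
                case '"': sb.Append('"'); break;
                default:
                    // Assumption: unsupported sequences (\u, \U, \x, others)
                    // are copied through unchanged rather than rejected.
                    sb.Append('\\').Append(next);
                    break;
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Stand-in for Console.ReadLine(): the text contains a backslash and an n.
        string input = @"This is an escape \n sequence";
        Console.WriteLine(Unescape(input)); // now printed on two lines
    }
}
```

Copying unknown sequences through unchanged is a leniency choice for user-facing input. A stricter routine could throw instead, which would be closer to the compile-time error the specification describes.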

    EDIT: Proof that the escape sequence you see in the source code is only an artifact of the source language:

    Compile a C# app with the string "Hello\nWorld" in it, then open the compiled binary in a binary editor. The string you find there will not contain "\n"; it will have been replaced with the bytes of the newline character.