Tags: encoding, character-encoding, special-characters, symbols, undefined-symbol

Symbols and encoding: what symbol is this?


I'm working with a large text file filled with data. The data blocks in it are separated by a symbol (or a pair of similar symbols) that looks strange. I need to find out what symbol this is so that I can use it properly (!) to split the data blocks when I read the file. Could you assist me with that?

Here is how the pair of symbols looks in the Stack Overflow "Ask Question" editing field:

Below are some pictures showing how differently the symbol looks from place to place:

In the original data file

[screenshot]

In the Brackets editor (it looks the same with every available encoding)

[screenshot]

In the Brave Browser search bar

[screenshot]

In Visual Studio 2019

[screenshot]

In the Stack Overflow editing field (it looks different while I type than it does in the posted question)

[screenshot]

In some places it is converted to one of the following:

[screenshot]

When I read the symbol in C# with Encoding.UTF8, the console gives the following result: [screenshot]

But with Encoding.Unicode, the console prints an endless stream of something like this:

[screenshot]

What exactly do I have to write to make my C# code recognize and react to those symbols?


Solution

  • I used this Unicode char finder to find out what the characters are.

    In order, they are:

    U+0003 : END OF TEXT [ETX]

    U+0001 : START OF HEADING [SOH]
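As a rough sketch of how this could be used from C#: both characters are control characters in the ASCII range, so they can be written in a string literal as `\u0003` and `\u0001` and passed to `string.Split`. The class and method names below (`ControlCharSplitter`, `DescribeChars`, `SplitBlocks`, `ReadBlocks`) are made up for illustration, and the code assumes the file is UTF-8 (or plain ASCII) and that the separator is always ETX immediately followed by SOH. In .NET, `Encoding.Unicode` means UTF-16 little-endian, which would explain the garbage output if the file is actually single-byte encoded.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class ControlCharSplitter
{
    // Assumption: the separator is ETX (U+0003) immediately followed by SOH (U+0001),
    // in the order reported by the character finder.
    public const string Delimiter = "\u0003\u0001";

    // Lists the UTF-16 code unit of every char as "U+XXXX", which helps
    // identify invisible characters without relying on how an editor draws them.
    public static string DescribeChars(string text)
    {
        return string.Join(" ", text.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    // Splits already-decoded text into data blocks, dropping empty blocks.
    public static string[] SplitBlocks(string text)
    {
        return text.Split(new[] { Delimiter }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Assumption: the file is UTF-8 (or ASCII). Adjust the encoding if it is not.
    public static string[] ReadBlocks(string path)
    {
        return SplitBlocks(File.ReadAllText(path, Encoding.UTF8));
    }
}
```

For example, `ControlCharSplitter.ReadBlocks("data.txt")` (with a hypothetical file name) would return the blocks as a string array, and `ControlCharSplitter.DescribeChars(someFragment)` can be used to confirm which code points actually appear between blocks. If the two characters can also occur on their own, splitting on a `char[]` containing both would be the alternative.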