
Line feed character reading in Java


I am wondering about something: when I open a file in Notepad, I see one continuous line without any carriage returns or line feeds.

I wrote a Java program to read the file. When I split the file's data using \n or System.getProperty("line.separator"), I get lots of lines.

In a hex editor I found that the file uses '0A' for new lines (the UNIX convention), and that character appears as a rectangle in Notepad.

My question is: if the file doesn't contain '0D' and '0A' (used in Windows for carriage return and line feed), how is my Java program splitting the data into lines? It should not split it.

Does anyone have any idea?


Solution

  • Java internally works with Unicode.

    The Unicode standard defines a large number of characters that conforming applications should recognize as line terminators:
    LF: Line Feed, U+000A
    VT: Vertical Tab, U+000B
    FF: Form Feed, U+000C
    CR: Carriage Return, U+000D
    CR+LF: CR (U+000D) followed by LF (U+000A)
    NEL: Next Line, U+0085
    LS: Line Separator, U+2028
    PS: Paragraph Separator, U+2029

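    (Source: http://en.wikipedia.org/wiki/Newline) In Java, \n is LF (U+000A), the same 0A byte you saw in the hex editor, so splitting on \n breaks the text into lines even though no CR (0D) is present. Notepad showing the file as one continuous line only reflects how Notepad handles a lone LF, not what the file contains.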
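
To make this concrete, here is a minimal sketch (not the asker's actual program) showing that splitting on \n works on LF-only text, and that the regex linebreak matcher \R (available since Java 8) accepts every terminator in the list above. The class and method names (LineSplitDemo, splitOnLf, splitOnAnyLinebreak) are made up for illustration.

```java
public class LineSplitDemo {

    // Splits on the single character LF (U+000A); no preceding CR is required.
    static String[] splitOnLf(String text) {
        return text.split("\n");
    }

    // Splits on any linebreak sequence matched by \R:
    // CR+LF as one unit, or any of LF, VT, FF, CR, NEL, LS, PS.
    static String[] splitOnAnyLinebreak(String text) {
        return text.split("\\R");
    }

    public static void main(String[] args) {
        String unixStyle = "first\nsecond\nthird";        // 0A only, like the file in the question
        String windowsStyle = "first\r\nsecond\r\nthird"; // 0D 0A pairs

        System.out.println(splitOnLf(unixStyle).length);              // 3
        System.out.println(splitOnAnyLinebreak(windowsStyle).length); // 3

        // Splitting Windows-style text on "\n" alone leaves a trailing CR on each piece.
        System.out.println(splitOnLf(windowsStyle)[0].endsWith("\r")); // true
    }
}
```

If the same code has to handle files from both Windows and UNIX, splitting on \R (or reading with BufferedReader.readLine(), which also accepts LF, CR, or CR+LF) is likely the safer choice than relying on the platform's line.separator.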