bash, shell, file-format

What does saving a file in Kedit for Windows change that the unix2dos command doesn't?


I have a strange question. I have written a script that reformats data files: it creates new files with the right column order, spacing, and so on. I then run unix2dos on these files (the program I am formatting them for is DIPS for Windows, and I assume the files should be ANSI). When I try to open the files in DIPS, however, an error occurs and the file won't open.

When I create the same kind of data file in DIPS itself and open it in Notepad, it looks exactly the same as the data files created by my script.

On the other hand, if I first open the data files created by my script in Kedit, save them, and then open them in DIPS, everything works.

My question is: what could saving in Kedit possibly do that unix2dos does not?

(Also, if I save the file with Notepad or WordPad instead of Kedit, it still doesn't open in DIPS.)

Here is the output of the diff command on Unix:

" 1,16c1,16
* This file is generated by Dips for Windows.
* The following 2 lines are the Title of this file.
Cobre Panama
Drill Hole B11106-GT

Number of Traverses: 0

* Global Orientation is:
DIP/DIPDIRECTION

0.000000 (Declination)

NO QUANTITY

Number of extra columns are: 0

--
* This file is generated by Dips for Windows.
* The following 2 lines are the Title of this file.
Cobre Panama
Drill Hole B11106-GT

Number of Traverses: 0

* Global Orientation is:
DIP/DIPDIRECTION

0.000000 (Declination)

NO QUANTITY

Number of extra columns are: 0

18c18

--

440c440

--

442c442

-1

-1
"

Any help would be appreciated! Thanks!


Solution

  • Okay! Figured it out.

    Put simply, unix2dos does not strip space characters between the last visible character on a line and the line break, whereas saving in Kedit does.

    My script had a poor programming practice: it was writing strings like this:

    echo "This is an example string       " >> outfile.txt

    The character count is 32 (25 visible characters plus 7 trailing spaces), and if you could see the line break character (chr(10)), the line would read:

    This is an example string       <chr(10)>

    If you run unix2dos on outfile.txt, the line looks the same as above but with a different line break sequence (carriage return plus line feed instead of a bare line feed). However, when you open the file in Kedit and save it, the character count drops to 25 and the line looks like this:

    This is an example string<chr(10)>

    This occurs because Kedit does not preserve spaces at the end of a line. It places the line break immediately after the last non-space character on the line.

    So programs that read their input literally, like DIPS (I'm guessing) or, more widely, AutoCAD scripts, will have a real problem with extra spaces before the line break. In AutoCAD scripting, a space in a line is treated as a return. So ten extra spaces at the end of a line are treated as ten returns instead of the one you probably intended.
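
The fix described above can be sketched in the shell itself, so the files never need a round trip through Kedit. This is only a minimal sketch under some assumptions: the example line is taken to end in seven spaces (which matches the counts of 32 and 25), the helper names `strip_trailing_blanks` and `has_trailing_blanks` are made up for illustration, and `clean.txt` is a hypothetical output file name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the file with trailing spaces and tabs
# removed from every line (roughly what saving in Kedit does).
strip_trailing_blanks() {
    sed 's/[[:blank:]]*$//' "$1"
}

# Hypothetical helper: succeed if any line ends in a space or tab.
has_trailing_blanks() {
    grep -q '[[:blank:]]$' "$1"
}

tmp=$(mktemp -d)

# Recreate the problem line: 25 visible characters plus 7 trailing spaces
# (the 7 is an assumption; %7s pads an empty string to seven spaces).
printf 'This is an example string%7s\n' '' > "$tmp/outfile.txt"

before=$(head -n 1 "$tmp/outfile.txt")
strip_trailing_blanks "$tmp/outfile.txt" > "$tmp/clean.txt"
after=$(head -n 1 "$tmp/clean.txt")

echo "before: ${#before} characters"   # before: 32 characters
echo "after:  ${#after} characters"    # after:  25 characters

if has_trailing_blanks "$tmp/clean.txt"; then
    echo "trailing blanks remain"
else
    echo "no trailing blanks"
fi
```

Stripping has to happen before the line-ending conversion, because after unix2dos each line ends in a carriage return and the pattern would no longer see the spaces as trailing. Once the file is clean, `unix2dos clean.txt` can be run as before.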