Search code examples
linuxiconvtrcsv

unable to convert tab delimited .txt file to csv


I am using tr to convert tab delimited .txt file to comma delimited csv file. Here is my file output

arron@arron-Ideapad-Z570 ~/Phd $ cat test_pph_s.txt | cut -f 1,2,3
#o_acc                   o_pos  o_aa1
ENSG00000145888            455      H
ENSG00000145888            450      R
ENSG00000145888            440      M
ENSG00000145888            428      R
ENSG00000145888            428      R

here is full file link https://drive.google.com/file/d/0B0iDswLYaZ0zV3ktekhyeGxwTlk/view?usp=sharing

and trying to swap whitespace for commas using tr:

arron@arron-Ideapad-Z570 ~/Phd $ cat test_pph_s.csv | cut -f 1,2,3 | tr "\\t" ","
#o_acc               , o_pos,o_aa1
ENSG00000145888     ,   455,    H
ENSG00000145888     ,   450,    R
ENSG00000145888     ,   440,    M
ENSG00000145888     ,   428,    R
ENSG00000145888     ,   428,    R

which appears to just add a comma in and not remove the whitespace.

However I notice if I open the .txt file in libreoffice and save as .csv, it gets converted to utf8 format and then running

arron@arron-Ideapad-Z570 ~/Phd $ cat test_pph_s.csv | cut -f 1,2,3 | tr "\\t" ","

gives me

#o_acc,o_pos,o_aa1
    ENSG00000145888,455,H
    ENSG00000145888,450,R
    ENSG00000145888,440,M
    ENSG00000145888,428,R
    ENSG00000145888,428,R

which is what I want.

what am I doing wrong in my first attempt? It seems to me that using tr is not the "correct" way to convert to a .csv file.

thanks.


Solution

  • It seems you have mix of tabs and spaces

    cut -f 1,2,3 < input.txt | tr -s [:blank:] ','
    

    Here tr will collapse all white space to a single character and then replace it with comma. You also do not need cat, but you can use it if you prefer it that way :)