
Using cut on stdout with tabs


I have a file containing lines of tab-separated text:

echo -e "foo\tbar\tfoo2\nx\ty\tz" > file.txt

I'd like to get the first column with cut. It works if I do

$ cut -f 1 file.txt
foo
x

But if I read it in a bash script

while read line
do
    new_name=`echo -e $line | cut -f 1`
    echo -e "$new_name"
done < file.txt

Then I get instead

foo bar foo2
x y z

What am I doing wrong?

Edit: my script currently looks like this:

while IFS=$'\t' read word definition
do
    clean_word=`echo -e $word | external-command`
    echo -e "$clean_word\t<b>$word</b><br>$definition" >> $2
done < $1

The external command removes diacritics from a Greek word. Can the script be optimized any further without changing external-command?


Solution

  • The problem is that $line is not quoted when you echo it. The shell performs word splitting on the unquoted expansion, so echo receives foo, bar and foo2 as separate arguments and prints them joined by single spaces. The tabs are lost, and since cut's default delimiter is a TAB, it finds none and prints the whole line.

    Quoting the variable fixes it:

    while read line
    do
        new_name=`echo -e "$line" | cut -f 1`
        #----------------^^^^^^^
        echo -e "$new_name"
    done < file.txt
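
The effect of the missing quotes can be reproduced without a file or a loop. The following is a minimal sketch, assuming the default IFS (space, tab, newline); the variable names are only for illustration.

```shell
# Build a tab-separated line; printf interprets \t portably.
line=$(printf 'foo\tbar\tfoo2')

# Unquoted: the shell splits $line on whitespace and echo rejoins
# the resulting words with single spaces, so cut sees no TAB.
unquoted=$(echo $line | cut -f 1)

# Quoted: the tabs reach cut intact, so field 1 is isolated.
quoted=$(echo "$line" | cut -f 1)

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

The first line printed should show all three words separated by spaces, the second only foo.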
    

    Note, however, that you could have used IFS to set the tab as the field separator and let read split each line into several variables at once:

    while IFS=$'\t' read name rest;
    do
       echo "$name"
    done < file.txt
    

    returning:

    foo
    x
    

    Note also that awk is faster than a shell loop for this purpose:

    $ awk -F"\t" '{print $1}' file.txt
    foo
    x
    

    So, unless you need to call an external command for each line of the file, awk (or sed) is the better choice.
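
For the case where an external command does have to run per line, as in the edited question, the loop might look roughly like the sketch below. It makes several assumptions: clean is a hypothetical stand-in for external-command (it just upper-cases its input), words.tsv and out.tsv are made-up file names, and the sample data is invented. The tab is obtained with printf so the snippet also runs in shells that lack bash's $'\t'.

```shell
# Hypothetical stand-in for the question's external-command: upper-cases stdin.
clean() { tr '[:lower:]' '[:upper:]'; }

# Made-up sample input in the two-column word/definition layout.
printf 'foo\tfirst definition\nbar\tsecond definition\n' > words.tsv

tab=$(printf '\t')   # portable spelling of bash's $'\t'

while IFS="$tab" read -r word definition
do
    # Quote "$word" so its content reaches the command unchanged.
    clean_word=$(printf '%s\n' "$word" | clean)
    printf '%s\t<b>%s</b><br>%s\n' "$clean_word" "$word" "$definition"
done < words.tsv > out.tsv   # open the output file once, not on every iteration
```

Compared with the script in the question, this quotes the variable, uses read -r so backslashes in the input are kept literally, and redirects the whole loop instead of appending inside it. Use >> after done if the output file should be appended to rather than overwritten. The per-line call to the external command remains the dominant cost.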