
Printing specific parts from a file in shell


I'm trying to print specific information from a file with a fixed format (each line is id|lastName|firstName|gender|birthday|creationDate|locationIP|browserUsed). I want to print only the firstName values, sorted and unique. The script (let's call it script.sh) must be called with these arguments:

./script.sh --firstnames -f <file>

My code so far is the following :

--firstnames )
OlIFS=$IFS
content=$(cat "$3" | grep -v "#")
content=$(cat "$3" | tr -d " ") #cut -d " " -f6 )
for i in $content
do

IFS="|"
first=( $i ) 
echo ${first[2]}
IFS=$OlIFS
done | sort | uniq
;;
esac

For example, the following file:

#id|lastName|firstName|gender|birthday|creationDate|locationIP|browserUsed
933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.12|Firefox
1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58:781+0000|81.25.252.111|Internet Explorer

is supposed to produce the output:

Carmen
Mahinda

One problem I've noticed is that the script prints the comments too. The above will print :

Carmen
firstnames
Mahinda

even though I've used grep to get rid of the lines starting with "#". This is only part of the code (the part where I believe the problem is); it's supposed to recognize the "--firstnames" option. Since some fields in the file contain spaces, specifically the last one (the browser field), I wanted to remove just that field. This is for a school project, and according to the program that grades this section, it's all wrong. As far as I can tell from my own testing, though, the script works. I don't know what's wrong, so I don't know what to correct. Please help!


Solution

  • grep -vE '^#' "$3" | cut -d'|' -f3 should be enough :

    $ echo '#id|lastName|firstName|gender|birthday|creationDate|locationIP|browserUsed
    > 933|Perera|Mahinda|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.12|Firefox
    > 1129|Lepland|Carmen|female|1984-02-18|2010-02-28T04:39:58:781+0000|81.25.252.111|Internet Explorer
    >' | grep -vE '^#' | cut -d'|' -f3
    Mahinda
    Carmen
    

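    The grep command removes lines starting with # (the ^ anchors the pattern to the start of the line; -E selects extended regular expressions, although this simple pattern also works without it). If you would rather keep removing any line that contains a # anywhere, your current grep -v "#" already does that. The cut -d'|' -f3 command splits each line on the | delimiter and returns its third field. Keep your existing | sort | uniq after it to get the sorted, de-duplicated list.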
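
    Putting the pieces together, here is a minimal sketch of what the --firstnames branch could call. The function name first_names is hypothetical (it does not appear in the question), and the sketch assumes comment lines always start with # and that firstName is always the third |-separated field. Note that in the question's code the second content= assignment reads the file again with cat, which overwrites the grep-filtered result; that would explain why the header line still shows up in the output.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not from the question):
# print the unique, sorted first names from a pipe-delimited file,
# skipping comment lines.
first_names() {
    # $1: path to the data file
    grep -v '^#' "$1" |   # drop lines that start with '#'
        cut -d'|' -f3 |   # keep only the third field (firstName)
        sort -u           # sort and remove duplicates in one step
}
```

    With the invocation from the question (./script.sh --firstnames -f <file>), the file is the third argument, so the branch body would reduce to first_names "$3". No IFS juggling is needed, and spaces in the browser field do no harm because cut only splits on |.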