Tags: awk, special-characters, aix, records

Count the number of delimiters made up of a combination of special characters


I have a very large file whose columns are delimited by the two-character sequence |^.

One important detail: I am on an AIX server.

An example of the data would be:

name|^surname|^age|^city|^country
john|^doe|^15|^chicago|^usa
george|^reese|^14|^london|^england

After searching the internet, the best thing I found is the following:

cat TEST_FILE.DAT | awk -F"\|\^" '{ print NF }'

However, this always returns 1 because it does not treat the two characters as a single delimiter string.

The following command returns the correct count, but I want to use the combination |^ as the delimiter:

cat TEST_FILE.DAT | awk -F"|" '{ print NF }'

Solution

  • Using \\ instead of \ works for me. I'm not sure exactly why. It is probably related to how escape characters are interpreted by bash, by awk, and by awk's regex engine, but I can't give a good explanation.

    $ cat test
    name   | ^surname| ^age | ^city    | ^country
    john   | ^doe    | ^15  | ^chicago | ^usa
    george | ^reese  | ^14  | ^london  | ^england
    
    $ cat test |awk -F'\\| \\^' '{for(i=1;i<=NF;i++){if($i)print $i}}'
    name   
    surname
    age 
    city    
    country
    john   
    doe    
    15  
    chicago 
    usa
    george 
    reese  
    14  
    london  
    england
    

    By the way, it's important to use single quotes around the -F argument. With double quotes the shell strips one level of backslashes itself, so this line also works but it's ugly:

    cat test |awk -F"\\\\| \\\\^" '{for(i=1;i<=NF;i++){if($i)print $i}}'
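
To tie this back to the data in the question (which has no space between | and ^), here is a minimal sketch of the same fix applied to it. As far as I understand, the doubled backslash is needed because awk first treats the -F value like a string literal, so \\ collapses to \ before the result is compiled as a regular expression. With a single backslash the regex most likely ends up as a bare |^, which is read as an alternation plus an anchor rather than two literal characters. The file name sample.dat is a placeholder of mine, not something from the original post.

```shell
# Sample data copied from the question; sample.dat is a hypothetical scratch file.
printf '%s\n' \
  'name|^surname|^age|^city|^country' \
  'john|^doe|^15|^chicago|^usa' \
  'george|^reese|^14|^london|^england' > sample.dat

# Single quotes: awk receives \\|\\^ unchanged, unescapes it to \|\^,
# and the regex then matches a literal | followed by a literal ^.
fields=$(awk -F'\\|\\^' '{ print NF }' sample.dat)

# NF - 1 is the number of |^ delimiters on each line.
delims=$(awk -F'\\|\\^' '{ print NF - 1 }' sample.dat)

# Double quotes: the shell turns each \\ into \ first, so four
# backslashes are needed for awk to see the same two.
fields_dq=$(awk -F"\\\\|\\\\^" '{ print NF }' sample.dat)

printf 'fields per line (single quotes):\n%s\n' "$fields"
printf 'delimiters per line:\n%s\n' "$delims"
printf 'fields per line (double quotes):\n%s\n' "$fields_dq"

rm -f sample.dat
```

For the three sample lines this should report 5 fields and 4 delimiters each, and the single-quoted and double-quoted forms should agree.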