Tags: bash, shell, awk, grep, cut

How to write a bash script to filter some data in logs


I have a log file in this format.

-----------------------------------------------------------
name=abc
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:5/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=xyz
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:3/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=awd
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:2/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

For each record in the log file, I want to extract the person's name and years lived, but only when the years lived are greater than a given threshold (say 2). The file can also contain duplicate names with different details.

Output:

name:abc
yearsLived:5
name:xyz
yearsLived:3

I tried using grep and cut for this. The problem is that once I apply either one, I lose the other part of the record, i.e. either the name or the address. How do I resolve this?


Solution

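  • Here's a stab at it (this needs GNU awk, because match() is called with a third array argument, which is a gawk extension):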

    awk 'BEGIN {RS = "name="} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {print $1 "\t" years[2]}' records_file
    

    Edit: Accommodating the updated log line sample and desired output:

    awk 'BEGIN {RS = "-{59}"} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {sub("=", ":", $1); print $1 "\n" yl[0]}' records
    

    Edit 2: Oops, I meant to add a comment. To change the threshold for the number of years, change the second 2 in years[2] > 2 (the value to the right of the >). Hope that helps.
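
If GNU awk is not available, the same idea can be sketched in a more portable way by reading the log line by line instead of changing RS. This is only a minimal sketch under a few assumptions: the function name filter_years is hypothetical, each record's name= line comes before its address= line, and the value always appears as yearsLived:<digits>. The threshold is passed in with -v, so it does not have to be edited inside the awk program.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): print name and
# yearsLived for records whose yearsLived value exceeds a threshold.
# Usage: filter_years LOGFILE [MIN_YEARS]   (MIN_YEARS defaults to 2)
filter_years() {
    awk -v min="${2:-2}" '
        # Remember the most recent name; "name=" is 5 characters long.
        /^name=/ { name = substr($0, 6) }

        # On an address line, locate "yearsLived:<digits>".
        /^address=/ && match($0, /yearsLived:[0-9]+/) {
            # RSTART/RLENGTH describe the match; skip the 11-char label
            # "yearsLived:" to keep only the digits.
            years = substr($0, RSTART + 11, RLENGTH - 11)
            # "+ 0" forces a numeric comparison.
            if (years + 0 > min + 0)
                print "name:" name "\n" "yearsLived:" years
        }
    ' "$1"
}
```

For example, filter_years records 2 should print the name:/yearsLived: pairs in the format asked for in the question, and filter_years records 4 raises the threshold to 4. Because the name is held in a variable until the address line arrives, duplicate names are handled record by record.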