Collect the data (two parameters) between two keywords (variable+string) from ini file


I have a txt.ini file with the following content (I cannot modify the structure of this file):

txt.ini

[person_0:public]
name=john
groups=0,1,2
age=30

[person_0:private]
married=false
weight=190
height=100

[person_1:public]
name=mark
groups=0,4
age=28

[person_1:private]
married=false
weight=173
height=70

[person_2:public]
name=tony
groups=3,4
age=30

[person_2:private]
married=true
weight=202
height=120

I have a variable "person" that takes one of the values person_0, person_1, person_2 in a loop, and I would like to collect each person's data, such as age and groups, one person at a time.

My idea is to cut out the part between $person:public and $person:private and then collect the values from that part.

E.g. with person=person_1 the output should be: groups=0,4 age=28

I prepared the code in bash (persons is a list of person_0, person_1, person_2):

for person in ${persons[@]}; do
    file="txt.ini"
    echo "$person"
    a=$(awk -v a=$person":private" -v b=$person":public" '/a/{found=0} {if(found) print} /b/{found=1}' $file)

    IFS=$'\n' lines=($a)
    IFS='=' read grouplist grouplist_values <<< ${lines[1]}
    IFS='=' read age age_values <<< ${lines[4]}
    echo "Group list = $grouplist_values"
    echo "Age = $age_values"
done

Group list and age are empty. Output:

person_0
Group list =
Age =

person_1
Group list =
Age =

person_2
Group list =
Age =

Expected:

person_0
Group list =0,1,2
Age =30

person_1
Group list =0,4
Age =28

person_2
Group list =3,4
Age =30

I will use this data "per person" in another part of my code. I'm working on files with a varying number of "persons".

I really don't know what is wrong.

I also tried:

#export person="person_0"
#a=$(awk '/ENVIRON["person"]:private/{found=0} {if(found) print} /ENVIRON["person"]:public/{found=1}' $file)

and

private=$person":private"
public=$person":public"
echo "private=$private"
echo "public=$public"
a=$(awk -v a=$private -v b=$public '/a/{found=0} {if(found) print} /b/{found=1}' $config_file)

but the output was the same:

person_0
private=person_0:private
public=person_0:public
Group list =
Age =

What is strange to me is that when I hardcode the range to cut, it works properly:

a=$(awk '/person_0:private/{found=0} {if(found) print} /person_0:public/{found=1}' $file)

or

a=$(awk '/person_1:private/{found=0} {if(found) print} /person_1:public/{found=1}' $file)

Do you have any idea how I can collect this data in a clever way?


Solution

  • Would you please try the following:

    awk -v RS='' '                          # split the records on the blank lines
    /public/ {                              # "public" record
        split($1, a, /[\[:]/); print a[2]   # extract the "person_xx" substring
        for (i = 2; i <= NF; i++) {         # iterate over the lines of the record
            split($i, a, /=/)
            if (a[1] == "groups") print "Group list =" a[2]
            else if (a[1] == "age") print "Age =" a[2]
        }
        print ""                            # insert a blank line
    }
    ' txt.ini
    

    Output:

    person_0
    Group list =0,1,2
    Age =30
    
    person_1
    Group list =0,4
    Age =28
    
    person_2
    Group list =3,4
    Age =30
    
    
    • By setting the awk variable RS to the null string, the records are separated by blank lines, and the newline character also acts as a field separator. As the lines here contain no spaces, each line of a block becomes one field.
    • Assuming the desired data are included in the public block, we can parse the fields of the public record one by one.
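
    To make the record and field splitting described above concrete, here is a minimal sketch. The temporary file and the variable names sample and paragraph_demo are made up for this illustration; the data is a two-block excerpt of txt.ini.

```shell
#!/bin/bash
# Hypothetical two-block excerpt of txt.ini, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
    '[person_0:public]' 'name=john' 'groups=0,1,2' 'age=30' \
    '' \
    '[person_0:private]' 'married=false' 'weight=190' 'height=100' > "$sample"

# RS='' selects paragraph mode: each blank-line-separated block is one record,
# and every line of the block becomes one field.
paragraph_demo=$(awk -v RS='' '{ print "record " NR ": NF=" NF " first=" $1 " third=" $3 }' "$sample")
echo "$paragraph_demo"
# record 1: NF=4 first=[person_0:public] third=groups=0,1,2
# record 2: NF=4 first=[person_0:private] third=weight=190

rm -f "$sample"
```

    The section header is always $1, which is why the loops in this answer start at field 2.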

    [Edit]
    According to the OP's comment, here is the updated version:

    #!/bin/bash
    
    persons=("person_0")                            # list of desired person(s)
    for person in "${persons[@]}"; do               # loop over the bash array
        awk -v RS='' -v person="$person" '          # assign awk variables
        $1 ~ person ":public" {                     # "public" record of the person
            split($1, a, /[\[:]/); print a[2]       # extract the "person_xx" substring
            for (i = 2; i <= NF; i++) {             # iterate over the lines of the record
                split($i, a, /=/)
                if (a[1] == "groups") print "Group list =" a[2]
                else if (a[1] == "age") print "Age =" a[2]
            }
        }
        ' txt.ini
        echo                                        # insert a blank line
    done
    
    • You can fill the persons array with whichever persons you want.
    • The pattern $1 ~ person ":public" tests if the 1st field of the record $1 (e.g. [person_0:public]) matches the value of the awk variable person (passed with the -v option) followed by the string ":public".
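
    The difference between a regex literal and a match against a variable is probably what broke the attempts in the question: inside /a/, the a is the letter "a", not the awk variable a. The following is only a sketch to show the contrast; the header value and the shell variable names literal, dynamic and exact are made up for the illustration.

```shell
#!/bin/bash
header='[person_1:public]'      # hypothetical header line for illustration

# /a/ is a regex literal: it looks for the letter "a" and ignores the variable a.
literal=$(printf '%s\n' "$header" | awk -v a="person_1:public" '/a/ { print "literal match" }')

# $0 ~ a uses the value of the variable a as a dynamic regex.
dynamic=$(printf '%s\n' "$header" | awk -v a="person_1:public" '$0 ~ a { print "dynamic match" }')

# A plain string comparison avoids regex metacharacters and partial matches.
exact=$(printf '%s\n' "$header" | awk -v a="person_1:public" '$0 == ("[" a "]") { print "exact match" }')

echo "literal: [$literal]"      # literal: []
echo "dynamic: [$dynamic]"      # dynamic: [dynamic match]
echo "exact:   [$exact]"        # exact:   [exact match]
```

    The header contains no letter "a", so the literal pattern does not fire, while both variable-based tests do.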

    Note that this loop makes awk read the txt.ini file once for every element of the persons array. If the txt.ini file is long and/or the persons array has many elements, the loop will be inefficient. Here is another variant that reads the file only once:

    #!/bin/bash
    
    persons=("person_0" "person_1")         # bash array just for an example
    awk -v RS='' -v persons_list="${persons[*]}" '
                                            # persons_list is a blank separated list of persons
    BEGIN {
        split(persons_list, a)              # split persons_list back to an array
        for (i in a) persons[a[i]]          # create a new array indexed by person
    }
    /public/ {                              # "public" record
        split($1, a, /[\[:]/)               # extract the "person_xx" substring
        if (a[2] in persons) {              # if the person exists in the list
            print a[2]
            for (i = 2; i <= NF; i++) {     # iterate over the lines of the record
                split($i, a, /=/)
                if (a[1] == "groups") print "Group list =" a[2]
                else if (a[1] == "age") print "Age =" a[2]
            }
            print ""                        # insert a blank line
        }
    }
    ' txt.ini
    

    Please note that this assumes the person strings do not contain whitespace characters. If they do, join persons_list with a character that does not occur in the names, such as a comma, and split on that character.
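
    As a sketch of that change of delimiter: the names below are hypothetical (one contains a space on purpose), and only the passing of the list into awk is shown. With such names the record parsing itself would also need the field separator set to a newline, so that $1 stays the whole header line.

```shell
#!/bin/bash
# Hypothetical names; the first one contains a space on purpose.
persons=("person 0" "person_1")

# Join the array with commas. The IFS change stays inside the command substitution.
persons_list=$(IFS=,; printf '%s' "${persons[*]}")

joined=$(awk -v persons_list="$persons_list" '
BEGIN {
    n = split(persons_list, a, ",")     # split on the comma, not on whitespace
    for (i = 1; i <= n; i++) print i ": " a[i]
}')
echo "$joined"
# 1: person 0
# 2: person_1
```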