
Using bash to parse the output of ldapsearch


I recently wrote a bash script that had to parse the output of ldapsearch results. The script works, but I imagine there is a more efficient way to accomplish this.

The script executes an ldapsearch command, which outputs multiple records that are in a multiline format. Each record is separated by a blank line. What I ended up doing was the following:

  1. Added a delimiter character (#) to the end of each line
  2. Replaced each blank line (which is now just '#') with the string 'DELIM'
  3. Removed all newlines
  4. Replaced 'DELIM' with a newline

What this effectively did was turn the multiline output of ldapsearch into one line of delimiter-separated values per record. I then use cut twice to parse each line (once to split on the delimiter, and again to split the attribute name from its value).

Here is the code:

while IFS= read -r line ; do
 dn=$(echo "$line" | cut -d '#' -f 1 | cut -d " " -f 2)
 uid=$(echo "$line" | cut -d '#' -f 2 | cut -d " " -f 2)
 uidNumber=$(echo "$line" | cut -d '#' -f 3 | cut -d " " -f 2)
 gidNumber=$(echo "$line" | cut -d '#' -f 4 | cut -d " " -f 2)

 # Code omitted since it's not relevant

done < <(ldapsearch -x -H "$ldap_server" -D 'cn=Directory Manager' -w "$ds_password" -b "$searchbase" -LLL uid uidNumber gidNumber | sed 's/$/#/g' | sed 's/^#$/DELIM/g' | tr -d '\n' | sed 's/DELIM/\n/g')

The output of the ldapsearch command is the following:

dn: uid=userone,ou=People,dc=team,dc=company,dc=local
uid: userone
uidNumber: 5000
gidNumber: 5000

dn: uid=usertwo,ou=People,dc=team,dc=company,dc=local
uid: usertwo
uidNumber: 5001
gidNumber: 5001

Is there a more efficient way to accomplish this? Specifically one that doesn't use piping so extensively?


Solution

  • Assumptions:

    • the values in the ldapsearch data contain no whitespace
    • reformatting the data into single lines (via OP's current code or via jotne's answer) includes replacing the # delimiter with a space ( )

    Using a space (instead of a #) as the delimiter we have the following reformatted ldapsearch data (8x space-delimited fields):

    dn: uid=userone,ou=People,dc=team,dc=company,dc=local uid: userone uidNumber: 5000 gidNumber: 5000
    dn: uid=usertwo,ou=People,dc=team,dc=company,dc=local uid: usertwo uidNumber: 5001 gidNumber: 5001
    

    The while read operation can be modified to eliminate the (currently) 12x subprocess calls (4x $(echo|cut|cut)) on each pass through the while loop, eg:

    while read -r _ dn _ uid _ uidNumber _ gidNumber
    do
        echo "############"
        echo ".$dn."
        echo ".$uid."
        echo ".$uidNumber."
        echo ".$gidNumber."
    done < <(ldapsearch ... | other_code_to_reformat_ldapsearch_data_as_single_lines_but_with_space_delimiter)
    

    NOTES:

    • each _ is a dummy placeholder for a field we don't care about (the dn:, uid:, uidNumber: and gidNumber: labels)
    • periods (.) added to echo statements as visual delimiters

    This generates:

    ############
    .uid=userone,ou=People,dc=team,dc=company,dc=local.
    .userone.
    .5000.
    .5000.
    ############
    .uid=usertwo,ou=People,dc=team,dc=company,dc=local.
    .usertwo.
    .5001.
    .5001.
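
To try the placeholder technique without an LDAP server, here is a minimal sketch that feeds the two sample lines through a here-document. The function name print_fields and the | output separator are illustrative choices, not part of the original script.

```bash
#!/usr/bin/env bash
# Sketch: read 8 space-delimited fields and keep only the 4 values.
# print_fields is a hypothetical helper name used for illustration.
print_fields() {
    # "_" is reused as a throwaway variable for the "dn:", "uid:", ... labels
    while read -r _ dn _ uid _ uidNumber _ gidNumber
    do
        printf '%s|%s|%s|%s\n' "$dn" "$uid" "$uidNumber" "$gidNumber"
    done
}

# Sample data copied from the reformatted ldapsearch output above
print_fields <<'EOF'
dn: uid=userone,ou=People,dc=team,dc=company,dc=local uid: userone uidNumber: 5000 gidNumber: 5000
dn: uid=usertwo,ou=People,dc=team,dc=company,dc=local uid: usertwo uidNumber: 5001 gidNumber: 5001
EOF
```

Because read is a shell builtin, the loop body starts no subprocesses; the only external commands left are the ones that produce the input.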
    

    Another awk idea for reformatting the ldapsearch results that outputs just the fields we're interested in:

    awk '{for (i=2;i<=NF;i=i+2) {printf "%s%s", (i==2 ? "" : " "), $i}; print ""}' RS= ORS='\n'
    

    Where:

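    • we re-use jotne's RS/ORS settings; an empty RS puts awk in paragraph mode, so each blank-line-separated block is read as one record
    • (i=2;i<=NF;i=i+2) - print only the even-numbered fields (the values), skipping the attribute labels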

    This generates:

    uid=userone,ou=People,dc=team,dc=company,dc=local userone 5000 5000
    uid=usertwo,ou=People,dc=team,dc=company,dc=local usertwo 5001 5001
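
The awk step can be checked in isolation against the sample records from the question. This is a sketch only: flatten_records is a hypothetical wrapper name, and it passes the value through an explicit "%s" format so that a % character in the data is not interpreted by printf.

```bash
#!/usr/bin/env bash
# flatten_records is a hypothetical helper name used for illustration.
flatten_records() {
    # An empty RS selects paragraph mode: records are separated by blank
    # lines, and newlines inside a record act as field separators.
    awk -v RS= '{
        for (i = 2; i <= NF; i += 2)                 # even fields are the values
            printf "%s%s", (i == 2 ? "" : " "), $i
        print ""                                     # end the output line
    }'
}

# Sample data copied from the ldapsearch output in the question
flatten_records <<'EOF'
dn: uid=userone,ou=People,dc=team,dc=company,dc=local
uid: userone
uidNumber: 5000
gidNumber: 5000

dn: uid=usertwo,ou=People,dc=team,dc=company,dc=local
uid: usertwo
uidNumber: 5001
gidNumber: 5001
EOF
```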
    

    With this change (4x space-delimited fields instead of 8x space-delimited fields) the proposed while read becomes:

    while read -r dn uid uidNumber gidNumber
    do
        ....
    done < <(ldapsearch ... | awk '{for (i=2;i<=NF;i=i+2) {printf "%s%s", (i==2 ? "" : " "), $i}; print ""}' RS= ORS='\n')
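
If the goal is to drop the reformatting pipeline entirely, one possible alternative (not taken from the answers above) is to read the multiline records directly in the shell and treat each blank line as the end of a record. The sketch below assumes plain "name: value" lines as in the sample output; it does not handle wrapped lines or base64 ("name:: value") attributes. The names parse_ldif, line, key and value are illustrative.

```bash
#!/usr/bin/env bash
# parse_ldif is a hypothetical helper: it reads "name: value" lines from
# stdin and prints one "dn|uid|uidNumber|gidNumber" line per record.
parse_ldif() {
    local line key value
    local dn="" uid="" uidNumber="" gidNumber=""
    # "|| [ -n "$line" ]" also processes a final line that lacks a newline
    while IFS= read -r line || [ -n "$line" ]
    do
        if [ -z "$line" ]; then
            # A blank line closes the current record
            if [ -n "$dn" ]; then
                printf '%s|%s|%s|%s\n' "$dn" "$uid" "$uidNumber" "$gidNumber"
            fi
            dn="" uid="" uidNumber="" gidNumber=""
            continue
        fi
        key=${line%%: *}      # text before the first ": "
        value=${line#*: }     # text after the first ": " (may contain spaces)
        case $key in
            dn)        dn=$value ;;
            uid)       uid=$value ;;
            uidNumber) uidNumber=$value ;;
            gidNumber) gidNumber=$value ;;
        esac
    done
    # Flush the last record when the input does not end with a blank line
    if [ -n "$dn" ]; then
        printf '%s|%s|%s|%s\n' "$dn" "$uid" "$uidNumber" "$gidNumber"
    fi
}

# Sample data copied from the ldapsearch output in the question
parse_ldif <<'EOF'
dn: uid=userone,ou=People,dc=team,dc=company,dc=local
uid: userone
uidNumber: 5000
gidNumber: 5000

dn: uid=usertwo,ou=People,dc=team,dc=company,dc=local
uid: usertwo
uidNumber: 5001
gidNumber: 5001
EOF
```

In a real script the per-record work would replace the printf calls, and the here-document would be replaced by the ldapsearch command from the question. Since the value is taken as everything after the first ": ", this variant does not depend on the no-whitespace assumption.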