grep, cut and remove \n from file


I'm working with an input file containing a list of user IDs, one per line. Within a bash script I run a while loop over that input file, doing an ldapsearch query for each ID and using grep -E to filter for my desired results. The generated output file (/mountpoint/out_file_1.out) is currently formatted as follows:

uid=user_id1,cn=Users,ou=Department,dc=myORG    
LDAPresource=myORG_RESname1   LDAPresource=myORG_RESname2  
uid=user_id2,cn=Users,ou=Department,dc=myORG  
LDAPresource=myORG_RESname2   LDAPresource=myORG_RESname3

The desired output, however, should look as follows:

user_id1;myORG_RESname1
user_id1;myORG_RESname2
user_id2;myORG_RESname2
user_id2;myORG_RESname3

So far, I've tried using grep and cut to achieve the desired output above. Here are the exact commands I'm running on that first results file:

grep -E '(^uid=|myORG_RESname1|myORG_RESname2|myORG_RESname3)' /mountpoint/out_file_1.out | cut -d, -f1 >&5

which results in a second output (/mountpoint/out_file_2.out):

uid=user_id1  
LDAPresource=myORG_RESname1     
LDAPresource=myORG_RESname2  

Running another grep with cut on that second file:

grep -E 'LDAPresource|uid=' /mountpoint/out_file_2.out | cut -d= -f2 >&6

finally produces this output (/mountpoint/out_file_3.out):

user_id1  
myORG_RESname1  
myORG_RESname2  

which is almost what I need. This last output still needs the newlines removed and the user ID repeated for every resource name found, as already described for the desired output (/mountpoint/final_output.out):

user_id1;myORG_RESname1  
user_id1;myORG_RESname2 

Using:

tr '\n' ';' < input_file > output_file doesn't give me the desired result...

Any ideas how to achieve that? Any help is very much appreciated.

EDIT:

Here is the actual bash script I'm running for reference:

#!/bin/bash

# assign file descriptor for input fd
exec 3< /mountpoint/userlist
# assign file descriptor for output fd unfiltered
exec 4> /mountpoint/out_file_1.out
# assign file descriptor for output fd filtered
exec 5> /mountpoint/out_file_2.out
# assign file descriptor for output fd final
exec 6> /mountpoint/out_file_3.out

while IFS= read -ru 3 LINE; do
    ldapsearch -h IPADDR -D "uid=admin,cn=Users,ou=Department,dc=myDC" -w somepwd "(uid=$LINE)" LDAPresource >&4
    grep -E '(^uid=|Resource1|Resource2|Resource3)' /mountpoint/out_file_1.out | cut -d, -f1 >&5
    grep -E 'TAMresource|uid=' /mountpoint/out_file_2.out | cut -d= -f2 >&6
    #tr '\n' ';' < input_filename > file
done
# close fd #3 inputfile
exec 3<&-
# close fd #4, 5 & 6 outputfiles
exec 4>&-
exec 5>&-
exec 6>&-
# exit with 0 success status
exit 0

Solution

  • With your shown samples, please try the following, written for GNU awk. It prints one uid;resource pair per line.

    awk '
    match($0,/uid=[^,]*/){
      val1=substr($0,RSTART+4,RLENGTH-4)
      next
    }
    {
      while(match($0,/LDAPresource=[^ ]*/)){
        print val1 ";" substr($0,RSTART+13,RLENGTH-13)
        $0=substr($0,RSTART+RLENGTH)
      }
    }' Input_file
    

    Explanation: A detailed explanation of the above.

    awk '                                     ##Start the awk program.
    match($0,/uid=[^,]*/){                    ##If the line contains uid= followed by text up to the first comma:
      val1=substr($0,RSTART+4,RLENGTH-4)      ##store the matched text without the leading "uid=" in val1,
      next                                    ##and skip the remaining statements for this line.
    }
    {
      while(match($0,/LDAPresource=[^ ]*/)){  ##Loop while the line still contains LDAPresource= followed by non-space text; trailing blanks end the loop.
        print val1 ";" substr($0,RSTART+13,RLENGTH-13)  ##Print val1, a semicolon and the matched text without the leading "LDAPresource=".
        $0=substr($0,RSTART+RLENGTH)          ##Keep only the rest of the line after the match.
      }
    }' Input_file                             ##Name of the input file.
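
As a variation on the same idea, here is a minimal sketch that lets awk split each line into whitespace-separated fields instead of calling match() in a loop. It assumes the file format shown in the question; the temporary directory and the variable names workdir, sample and result are made up for this demonstration and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical scratch location for the demo; not part of the original script.
workdir=$(mktemp -d)
sample="$workdir/out_file_1.out"
result="$workdir/final_output.out"

# Recreate the sample from the question, including the trailing blanks.
printf '%s\n' \
  'uid=user_id1,cn=Users,ou=Department,dc=myORG    ' \
  'LDAPresource=myORG_RESname1   LDAPresource=myORG_RESname2  ' \
  'uid=user_id2,cn=Users,ou=Department,dc=myORG  ' \
  'LDAPresource=myORG_RESname2   LDAPresource=myORG_RESname3' > "$sample"

awk '
/^uid=/ {                       # entry line: remember the user ID
  split($0, part, /[=,]/)       # part[1] is "uid", part[2] is the ID
  uid = part[2]
  next
}
{                               # any other line: check every field
  for (i = 1; i <= NF; i++)
    if (sub(/^LDAPresource=/, "", $i))   # strip the prefix; true only for resource fields
      print uid ";" $i
}' "$sample" > "$result"

cat "$result"
```

With the sample above this should print the four user_id;resource lines from the desired output. Because awk ignores leading and trailing blanks when splitting on the default field separator, the trailing spaces in the ldapsearch output need no special handling here.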