
Create postfix aliases file from LDIF using awk


I want to create a Postfix aliases file from the LDIF output of ldapsearch.

The LDIF file contains records for approximately 10,000 users. Each user has at least one entry for the proxyAddresses attribute. I need to create an alias corresponding with each proxyAddress that meets the conditions below. The created aliases must point to sAMAccountName@other.domain.

  • Type is SMTP or smtp (case-insensitive)
  • Domain is exactly contoso.com

I'm not sure if the attribute ordering in the LDIF file is consistent. I don't think I can assume that sAMAccountName will always appear last.

Example input file

dn: CN=John Smith,OU=Users,DC=contoso,DC=com
proxyAddresses: SMTP:smith@contoso.com
proxyAddresses: smtp:John.Smith@contoso.com
proxyAddresses: smtp:jsmith@elsewhere.com
proxyAddresses: MS:ORG/ORGEXCH/JOHNSMITH
sAMAccountName: smith

dn: CN=Tom Frank,OU=Users,DC=contoso,DC=com
sAMAccountName: frank
proxyAddresses: SMTP:frank@contoso.com
proxyAddresses: smtp:Tom.Frank@contoso.com
proxyAddresses: smtp:frank@elsewhere.com
proxyAddresses: MS:ORG/ORGEXCH/TOMFRANK

Example output file

smith: smith@other.domain
John.Smith: smith@other.domain
frank: frank@other.domain
Tom.Frank: frank@other.domain

Ideal solution

I'd like to see a solution using awk, but other methods are acceptable too. Here are the qualities that are most important to me, in order:

  1. Simple and readable. Self-documenting is better than one-liners.
  2. Efficient. This will be used thousands of times.
  3. Idiomatic. Doing it "the awk way" would be nice if it doesn't compromise the first two goals.

What I've tried

I've managed to make a start on this, but I'm struggling to understand the finer points of awk.

  • I tried using csplit to create separate files for each record in the LDIF output, but that seems wasteful since I only want a single file in the end.
  • I tried setting RS="" in awk to get complete records instead of individual lines, but then I wasn't sure where to go from there.
  • I tried using awk to split the big LDIF file into separate files for each record and then processing those with another shell script, but that seemed wasteful.
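
For the RS="" attempt, it may help to see what awk does with the input in that mode. With RS set to the empty string, awk reads blank-line-separated blocks as records ("paragraph mode"), and with FS="\n" each line of a block becomes one field. A minimal sketch, where the function name show_records and the three-line input are made up for illustration:

```shell
# show_records is a hypothetical helper name, not part of the original question.
show_records() {
    awk '
    BEGIN {
        RS = ""      # paragraph mode: blank lines separate records
        FS = "\n"    # each line of a record becomes one field
    }
    { printf "record %d has %d fields, first is [%s]\n", NR, NF, $1 }'
}

printf 'a: 1\nb: 2\n\nc: 3\n' | show_records
# record 1 has 2 fields, first is [a: 1]
# record 2 has 1 fields, first is [c: 3]
```

From there, a loop over $1 .. $NF can inspect every attribute line of one user at once, regardless of the order in which the attributes appear.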

Solution

  • Here is a gawk script, which you could run like this: gawk -f ldif.awk yourfile.ldif. Please note that the multi-character value of RS is a gawk extension.

    $ cat ldif.awk
    BEGIN {
        RS = "\n\n"  # Record separator: empty line
        FS = "\n"    # Field separator: newline
    }
    
    # For each record: loop twice through fields
    {
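        # Loop #1 identifies the sAMAccountName
        sAN = ""  # Reset, so a record without one cannot reuse the previous value
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^sAMAccountName: /) {
                sAN = substr($i, 17)  # Text after "sAMAccountName: " (16 characters)
                break
            }
        }
        if (sAN == "") next  # Skip records that have no sAMAccountName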
    
        # Loop #2 prints output lines
        for (i = 1; i <= NF; i++) {
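            # Anchored, with escaped dot, so only SMTP addresses at exactly contoso.com match
            if (tolower($i) ~ /^proxyaddresses: smtp:[^@]+@contoso\.com$/) {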
                split($i, n, ":|@")
                print n[3] ": " sAN "@other.domain"
            }
        }
    }
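
If gawk is not available, a similar result can probably be had with paragraph mode (RS = ""), which is standard awk. The following is only a sketch under some assumptions: the wrapper name ldif_to_aliases and the variable names are made up here, the input has no folded (continued) lines, no base64-encoded values (written with "::" in LDIF), and no CRLF line endings. It also uses a single loop per record, collecting the matching local parts and printing them once the whole record has been scanned, so the position of sAMAccountName does not matter.

```shell
# ldif_to_aliases is a hypothetical wrapper name; it reads LDIF from
# the files given as arguments, or from stdin if there are none.
ldif_to_aliases() {
    awk '
    BEGIN {
        RS = ""      # paragraph mode: blank lines separate records (POSIX)
        FS = "\n"    # one attribute line per field
    }
    {
        san = ""     # reset per record
        count = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^sAMAccountName: /) {
                san = substr($i, 17)
            } else if (tolower($i) ~ /^proxyaddresses: smtp:[^@]+@contoso\.com$/) {
                # "proxyAddresses: SMTP:" is 21 characters, so the local
                # part runs from position 22 up to the "@"
                name[++count] = substr($i, 22, index($i, "@") - 22)
            }
        }
        if (san != "")   # records without sAMAccountName are skipped
            for (j = 1; j <= count; j++)
                print name[j] ": " san "@other.domain"
    }' "$@"
}

# The example input from the question, shortened to the relevant lines
sample='dn: CN=John Smith,OU=Users,DC=contoso,DC=com
proxyAddresses: SMTP:smith@contoso.com
proxyAddresses: smtp:John.Smith@contoso.com
proxyAddresses: smtp:jsmith@elsewhere.com
sAMAccountName: smith

dn: CN=Tom Frank,OU=Users,DC=contoso,DC=com
sAMAccountName: frank
proxyAddresses: SMTP:frank@contoso.com
proxyAddresses: smtp:Tom.Frank@contoso.com
proxyAddresses: MS:ORG/ORGEXCH/TOMFRANK'

printf '%s\n' "$sample" | ldif_to_aliases
```

On a real export it would be called as ldif_to_aliases yourfile.ldif > aliases. The match is done on a lowercased copy of the line, while the alias is cut from the original line, so the case of John.Smith is preserved.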