Zimbra zmprov formatted file to csv and ldif

I'm learning Python and my first assignment is to convert a Zimbra zmprov formatted file to CSV and LDIF.

Since I don't know the Python builtins to accomplish the task, I'm taking the long way: iterating over the lines and printing.

I would really appreciate it if you could show me how to do it properly.

This is the input zmp_file, to be converted to CSV and LDIF:

ca user1@domain.com.br      ''
ma user1@domain.com.br cn   'User One'
ma user1@domain.com.br cpf  ''
ma user1@domain.com.br l    'Porto Alegre'

ca user2@domain.com.br      ''
ma user2@domain.com.br cn   'User Two'
ma user2@domain.com.br cpf  '0123456789'
ma user2@domain.com.br l    ''

The desired .csv output (order of the fields is not important):

mail,cn,cpf,l
user1@domain.com.br,"User One",,"Porto Alegre"
user2@domain.com.br,"User Two",0123456789,

And the desired .ldif output (order of the fields is not important):

dn:   'uid=user1@domain.com.br'
cn:   'User One'
l:    'Porto Alegre'
mail: 'user1@domain.com.br'

dn:   'uid=user2@domain.com.br'
cn:   'User Two'
cpf:  '0123456789'
mail: 'user2@domain.com.br'

This is how far I got:

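import shlex

with zmp_file as input_file:
    for line in input_file:
        if line.startswith('ca'):
            # "ca <mail> <password>" starts a new entry
            mail = line.split()[1]
            print("dn: uid={0}".format(mail))
            print("mail: {0}".format(mail))
        elif line.startswith('ma'):
            # "ma <mail> <attr> <value>"; shlex keeps quoted values together
            words = shlex.split(line)[-2:]
            print("{0}: {1}".format(words[0], words[1]))
        else:
            # a blank line separates entries
            print("")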
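
A possible next step from the loop above, sketched under assumptions: instead of printing each line as it is read, collect every account's attributes into a dict first, so that both output formats can be produced from the same data. The helper name `parse_zmprov` is hypothetical, and the sketch assumes every line is either blank, a `ca <mail> <password>` line, or a `ma <mail> <attr> <value>` line.

```python
import shlex

def parse_zmprov(lines):
    """Group zmprov 'ca'/'ma' lines into {mail: {attr: value}} (hypothetical helper)."""
    records = {}
    for line in lines:
        # shlex honours the single quotes, so '' becomes an empty string
        words = shlex.split(line)
        if not words:
            continue  # blank separator line
        if words[0] == 'ca':
            # ca <mail> <password>: start a fresh record for this account
            records[words[1]] = {'mail': words[1]}
        elif words[0] == 'ma':
            # ma <mail> <attr> <value>: attach the attribute to its account
            cmd, mail, attr, value = words
            records.setdefault(mail, {'mail': mail})[attr] = value
    return records

sample = [
    "ca user1@domain.com.br      ''",
    "ma user1@domain.com.br cn   'User One'",
    "ma user1@domain.com.br cpf  ''",
    "ma user1@domain.com.br l    'Porto Alegre'",
]
records = parse_zmprov(sample)
print(records['user1@domain.com.br']['cn'])  # prints: User One
```

Keying the records by mail address means the `ma` lines do not have to follow their `ca` line directly, which the printing loop silently relies on.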

Solution

Ok. Got it.

I know this is not codereview.stackexchange.com, but if anyone has comments, I'm here to learn.

#!/usr/bin/env python

import csv
import shlex
from ldif import LDIFWriter  # provided by the python-ldap package

def zmp_to_csv_and_ldif(zmp_file):

    all_attrs = set()
    data      = {}
    records   = {}

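    with zmp_file as input_file:
        for line in input_file:
            if line.startswith('ca'):
                # "ca <mail> <password>": start a fresh record for this account.
                # shlex.split strips the quotes, so '' becomes an empty string.
                cmd, mail, pwd       = shlex.split(line)
                data                 = {'mail': mail, 'userpassword': pwd}
                records[mail]        = data
                all_attrs.update(['mail','userpassword'])
            elif line.startswith('ma'):
                # "ma <mail> <attr> <value>": add the attribute to the current account
                cmd, mail, attr, value = shlex.split(line)
                data[attr]             = value
                records[mail]          = data
                all_attrs.add(attr)
            else:
                # a blank line ends the current account's block
                data = {}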

    with open('/tmp/rag-parsed.csv', 'w') as output_file:
        # sort the attribute names so the column order is the same on every run
        csv_writer = csv.DictWriter(output_file, fieldnames=sorted(all_attrs), extrasaction='ignore', lineterminator='\n')
        csv_writer.writeheader()
        for mail, data in sorted(records.items()):
            csv_writer.writerow(data)

    with open('/tmp/rag-parsed.ldif', 'w') as output_file:
        # Written for Python 2; python-ldap 3.x on Python 3 expects attribute values as bytes.
        b64_attrs   = map(str.lower, ['jpegPhoto', 'userPassword'])
        ldif_writer = LDIFWriter(output_file, base64_attrs=b64_attrs, cols=999)
        for mail, data in sorted(records.items()):
            dn = "uid={0}".format(mail)
            # LDAP attributes are multi-valued: one list per attribute, empty values dropped
            data_in_ldap_fmt = dict([k, v.split('\n')] for k, v in data.items() if v)
            ldif_writer.unparse(dn, data_in_ldap_fmt)
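
For trying the output step without python-ldap installed, here is a minimal standard-library sketch. It assumes Python 3 and a `records` dict shaped like the one built above (`{mail: {attr: value}}`). The function names `write_csv` and `write_ldif` are hypothetical, and the LDIF writer is deliberately simplified: it does no base64 encoding and no line folding, so it is not a replacement for `LDIFWriter` when values contain non-ASCII or binary data.

```python
import csv
import io

def write_csv(records, out):
    """Write one CSV row per account; columns are the sorted union of all attributes."""
    fieldnames = sorted(set(k for data in records.values() for k in data))
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    for mail in sorted(records):
        writer.writerow(records[mail])

def write_ldif(records, out):
    """Write a simplified LDIF: no base64 encoding, no line folding, empty values skipped."""
    for mail in sorted(records):
        out.write("dn: uid={0}\n".format(mail))
        for attr, value in sorted(records[mail].items()):
            if value:
                out.write("{0}: {1}\n".format(attr, value))
        out.write("\n")  # a blank line separates LDIF entries

records = {
    'user1@domain.com.br': {'mail': 'user1@domain.com.br', 'cn': 'User One',
                            'cpf': '', 'l': 'Porto Alegre'},
    'user2@domain.com.br': {'mail': 'user2@domain.com.br', 'cn': 'User Two',
                            'cpf': '0123456789', 'l': ''},
}
csv_out, ldif_out = io.StringIO(), io.StringIO()
write_csv(records, csv_out)
write_ldif(records, ldif_out)
print(csv_out.getvalue())
print(ldif_out.getvalue())
```

Writing to any file-like object instead of a hard-coded path under /tmp keeps the two writers easy to test; passing an `open(..., 'w')` handle works the same way.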