node.js typescript email-attachments

How to parse email attachments in Node.js using the mailparser npm module?


I am using the mailparser npm module with TypeScript and Node 14. I am reading a message and trying to parse its attachment. As far as I can tell, the email has a single file attachment, a CSV file.

So in the code, I have the following:

const MailParser = require('mailparser').MailParser;

Where I process the email message, I have:

const parser = new MailParser();

    parser.on('headers', headers => {
        console.log(headers.get('subject'));
    });

    parser.on('data', data => {
        if (data.type === 'attachment') {
            console.log(data.filename);
            console.log(data.contentType);
            data.content.pipe(process.stdout);
            data.content.on('end', () => data.release());
        }
    });

This is the output I see (I have truncated some of the control characters):

Outlook-dxbeseix.png
image/png
�PNG

IHDR�SLs IDATx�wtW�-z�zo���[��߽?ۀ ��y���4Nc�gl�x<cόA�Ev�1��8�4��d����[�d���������A��~k�%5
....
�)IEND�B`�2020-8-24-20-4-24 (1).csv
application/vnd.ms-excel
IMEI,Result
353071093175234,UNPAID
356759089843552,UNLOCKED
358709098168945,UNLOCKED

So I see a png stream ending with the actual attached file. Can someone explain what is going on here? And how to find the csv file attachment content in this buffer so I can parse it?


Solution

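  • When using the MailParser class, an attachment's content is not a Buffer but a Stream. Your output logs a filename and content type twice, so the parser emitted two attachments: Outlook-dxbeseix.png (image/png), most likely an inline image such as a signature, followed by the CSV file. Because both streams are piped to stdout, the PNG bytes and the CSV text appear back to back.

    Here is your handler with comments added: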

    parser.on('data', data => {
        if (data.type === 'attachment') {
            // prints the file name of the attachment
            console.log(data.filename);
    
            // prints the contentType of the attachment
            console.log(data.contentType);
    
            // Content is a stream which is piped to stdout
            data.content.pipe(process.stdout);
    
            // call release after attachment processing to continue
            // message processing. Message processing will be paused
            // until release is called
            data.content.on('end', () => data.release());
        }
    });
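To keep the attachments apart instead of sending them all to stdout, one option is to write each one to its own file. This is a minimal sketch, not part of the mailparser API: `saveAttachment` and `outDir` are hypothetical names, and the attachment is assumed to have the `filename`, `content`, and `release` members used in the handler above.

```javascript
const fs = require('fs');
const path = require('path');
const { pipeline } = require('stream');

// Hypothetical helper: write one attachment stream to outDir and
// call release() once the stream has been written (or has failed).
function saveAttachment(attachment, outDir) {
    // Keep only the last path component of the filename taken from the email.
    const safeName = path.basename(attachment.filename || 'attachment.bin');
    const target = path.join(outDir, safeName);

    return new Promise((resolve, reject) => {
        pipeline(attachment.content, fs.createWriteStream(target), err => {
            // Let the parser continue with the rest of the message.
            attachment.release();
            if (err) {
                reject(err);
            } else {
                resolve(target);
            }
        });
    });
}

// Usage with the parser from the question (not run here):
// parser.on('data', data => {
//     if (data.type === 'attachment') {
//         saveAttachment(data, '/tmp').then(file => console.log('saved', file));
//     }
// });
```

With the message from the question, this should produce one PNG file and one CSV file rather than a single mixed stream.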
    

    As for "how to find the csv file attachment content":

    You can check data.contentType to pick out the CSV attachment's stream, like this:

    // text/csv (RFC 4180) plus the MS Excel type that your CSV was sent with
    const csvContentTypes = ['text/csv','application/vnd.ms-excel'];
    
    parser.on('data', data => {
        if (data.type === 'attachment') {
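            if (csvContentTypes.includes(data.contentType)) {
              // consume the stream here (pipe it to a file, S3, etc.);
              // 'end' only fires once the stream has been fully read
    
              // call release when the stream has ended
              data.content.on('end', () => data.release());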
            } else {
              // skip the attachment
              data.release();
            }
        }
    });
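To get from the stream to parsed rows, one option is to buffer the attachment and split it into lines. The following is a minimal sketch under stated assumptions: `isCsvAttachment`, `streamToString`, `parseSimpleCsv`, and `handleAttachment` are hypothetical helpers, not mailparser functions, and the parser only handles simple files like the one in the question (no quoted fields or embedded commas). It also accepts any attachment whose filename ends in .csv, since the file in the question arrived as application/vnd.ms-excel, a type that can also describe real Excel files.

```javascript
// Hypothetical helper: treat an attachment as CSV if its content type is
// text/csv or its filename ends in .csv.
function isCsvAttachment(attachment) {
    const name = (attachment.filename || '').toLowerCase();
    return attachment.contentType === 'text/csv' || name.endsWith('.csv');
}

// Read a whole readable stream into a UTF-8 string.
async function streamToString(stream) {
    const chunks = [];
    for await (const chunk of stream) {
        chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
    }
    return Buffer.concat(chunks).toString('utf8');
}

// Naive CSV parser: first line is the header, fields are split on commas.
// Assumes no quoted fields; use a CSV library for anything more complex.
function parseSimpleCsv(text) {
    const lines = text
        .replace(/^\uFEFF/, '')          // drop a leading byte order mark
        .split(/\r?\n/)
        .filter(line => line.trim() !== '');
    if (lines.length === 0) {
        return [];
    }
    const header = lines[0].split(',');
    return lines.slice(1).map(line => {
        const values = line.split(',');
        const row = {};
        header.forEach((key, i) => { row[key] = values[i]; });
        return row;
    });
}

// Returns the parsed rows for a CSV attachment, or null for anything else.
async function handleAttachment(attachment) {
    if (!isCsvAttachment(attachment)) {
        attachment.release();            // skip it, as in the answer above
        return null;
    }
    try {
        return parseSimpleCsv(await streamToString(attachment.content));
    } finally {
        attachment.release();            // release once the stream is read
    }
}

// Usage with the parser from the question (not run here):
// parser.on('data', data => {
//     if (data.type === 'attachment') {
//         handleAttachment(data).then(rows => { if (rows) console.log(rows); });
//     }
// });
```

Buffering is reasonable for small files like this one; for large attachments, piping the stream into a streaming CSV parser would avoid holding the whole file in memory.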