I'm having trouble parsing specific body parts of MIME messages.
I have an email client web interface. I want to allow the user to download the attachments of an email. In the past, each time I wanted to download an attachment I would make a call to the IMAP server with the argument RFC822 to obtain the whole message, that I could easily parse with Python.
However, this is not efficient and I need a way to obtain just the required attachment. I'm using the alternative of making a call to the IMAP server with the BODY[1], BODY[2], etc index of the specific bodypart.
When I make this IMAP call I obtain back the correct body part (when I make a call to BODYSTRUCTURE, the number of bytes in the part I'm looking for adds up, so I'm definitely obtaining the correct part).
However, I cannot parse this body part into something useable, or save it for that matter.
A specific example: I make a call to obtain the BODY[1] of an email and obtain back
('4 (UID 26776 BODY[2] {5318}', '/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAoHBwkHBgoJCAkLCwoMDxkQDw4ODx4WFxIZJCAmJSMg\r\nIyIoLTkwKCo2KyIjMkQyNjs9QEBAJjBGS0U+Sjk/QD3/2wBDAQsLCw8NDx0QEB09KSMpPT09PT09\r\nPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT3/wAARCABRAQIDASIA\r\nAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQA\r\nAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3\r\nODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWm\r\np6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEA\r\nAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSEx\r\nBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElK\r\nU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3\r\nuLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD2aiii\r\ngAooooAKKKKACorm4jtLaS4mbbFEpdzjOAOtS1neIf8AkXdR/wCvaT/0E1UVeSRM3yxbRbtbuC9t\r\n1ntZUliboyHIqauA+F+f+JkMnH7vj/vqu/rSvS9lUcE9jLDVvbUlNq1wooorE3CiiigAooooAKKK\r\nKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKzvEP8AyLuo/wDX\r\ntJ/6Cabruu23h/T2urlJZAPuxxLuZv8A63vXlWo+M/EXjC9FnpMUsMRORBb8sR6u3p+QrehSlJ83\r\nRGFerGKcerOp+F/XUv8Atn/7NXfE4GT0FcN4fx4Ss5jqTwS6jPt3w2vRcZxuPQHnnH5VDe6vqOuS\r\n+RGG2t0hi7/X1/HiniZqpVckThKbpUVCW534IYAggg8gilqG0QxWcKMMMsaqR6YFTVznSFFFFABR\r\nRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFITgZPSuen8daJDqLW\r\na3Xmug3SPENyRjIHJ+p7ZqowlLSKuTKcYq8nY6Kio1uIngEyyIYmGQ4PBH1qCLVLSa5Nuky+bjIU\r\n8bh7etSUcn8TGaOz05kYqwlYgg4I4qbRiLXw7ZG3SOFrmLzJmjQKZGyeSRUHxO/48dP/AOurfyqb\r\nTf8AkXNK/wCvf+prtn/usPVnn0/98n6Ikh8KzXt9LPdP5UDOWAXlmH9K6Wy0+20+Ly7WJUHc9z9T\r\n3pXdotOZ0OGWLcPqBXF6V4m1S4k0xpLvzPtTESRvaeWijBPyv0Y8dBXPClKabXQ6qlaNNpPqd5RX\r\nKWuvX8ug6HdPIhlvLsRTHYMFSW6Dt0FM0HxBqF/faZHcSIyXEM7yAIBkq5A+nFU8PNJvt/wf8iVi\r\nYNpd7fjb/M66iuZvrzVZ/Ed3ZWV7HbRQWqzDdAH3E5461lr4w1E2clyRESumLcBNvHmGTZn1x7UR\r\nw8pK6/q4SxMIuzT/AOGO6orjL7XdW0QzR3FzDds1ibmNvJ2bGBAxgHkc1saXJeJewx3+rxTySwea\r\nLcW4Q445yD0HSlKi4q9xxxCk+VJ/h/mbdFcodS1G6uNVnOqw2FnY3Bh+a3D8DHJJPqa09B1G4v7j\r\nVEuHVlt7too8Lj5QBSlSaVxxrxk7W/r+kbFFcW+v6t9in1ZbiEW0V79n+y+T1XcFzvznPNVF8T6q\r\nS8gvBuF55Iia0xGV345l6A4q1hpvqZvFwW6Z39FcRdeJr+P7VeJfWqrBeGBbEoNzoGC7s5znv0rf\r\n0HUbi/l1MXDBhb3jwx4XGFGMD3qZUJRjzMuGIhOXKv6/qxsUVzWva9d6XqsqQ7Gij057gIy9XDYH\r\nPpVOTXNV0lomu7iG8W4sZLlR5Pl+WyqGA4PI5ojQk0muopYmEW0+h2NFcjBrOq2U+nteXMN1HfWs\r\nk+wQ7PLKpvABB5Haq665rMFnpt5LdQzLqKviEQBfKO0kYOeenen9Xl3X9X/yF9aj2f8AVv8ANHbU\r\nVysXiG6e18PyCVHa7jd7gBR821CT9ORVTT/EmoPJpc0t/azi/Yq9skYDQcEjkHPbvR9Xn/Xz/wAh\r\n/Woaf12/zO1orkdI8R6tLpVpdXsFmYZiVEzT7Gc5PATHXjpntUdnrmriHSL+4uYZINSm8s24h2+X\r\nnOCGzk9O9H1eSuhLFQaTSev9fqdlRXB6Z4m1K4Gnyf2lbXM9xcCOSyWEB0XJy2Qc9Bn8as6T4gvp\r\n9Rgj1K+kt5JJin2Y2JCnk4USU5Yaav5Cji4Stbr6f5/8E7OikornOoWoLyWWC0llt4DcSopKxBgp\r\nc+mTwKnrP1LWrPTFInkzJ2jTlj/h+NAHkPiLxL4m8Rak2lyW9xbEnH2GFCGP+8erfyq7F4FvdA8P\r\n3ep6jIiSsixrbpztBdeWPTPHQV1g8Q3OoavDsVYIySMIPmIwTgt6e3SpPFF5JdeD71ZcEoY/m9cs\r\nK7qWIbnGEVZXRwV6CVOc5O7syj4SYnwlgkkLdsACeg2iofFhGn2cF8gkmldSBCi8gKfvZ9Km8I/8\r\nim3/AF+N/wCgiqXjpikGkMpKsElIIOCPmFDpqpinF92TGq6WDjNdkcoPEGu+IZ0SQrc28fAjkHyJ\r\n77uuffOa9JtvIXSbKG1mEqwQ7GI7H0NadnpFnqHh6xWeFQTAjb0G1gSAScj3qnZ+EDb3xka8byl+\r\n6EGGb2NY1qvN7iVkjpo0lH327tm5cyJFpEskgLIkBZgvUgLziuUtLPTrWSEhbyQWsgFtbTXWfnLB\r\nQQmOB82c+hrspII5rd4ZFDRupRl9QRjFVZNHs5m3Sxs5Awu6Rjs5B+Xn5eg6Y6VjGco6JmsqcZWc\r\nkc+ulWulXsDyWtz5UKyXSRG63xQ7cbiq+vzcVAunaUmlwXJkurcwRyGAQ3XzupO5hkDrk11v2C3I\r\nUOhfajRguxY7WxkEnrnA601dOt1tHtdrtA67SjyM3GMY5PAqvbT7k+wp/wApy0thpxa2nM2qCS5z\r\nbl1ucMwWQJ8x78t+VW9T0TSLEW8EkM4iuo1sSUfiNAdwJz/tAc+9bDaJYOWLQH5ju4dhtJYMSvPy\r\n/MAeMVLNplrcWwgnjMsYVlxIxY4YEHknPQmj20+4ewp/ynM3Fxp+rySzGxuJRFF9kIaUIGRpNoP4\r\nkA59DUxt7Tw7qMM5a6muBauQJ7oEKgK5Vc9TzwK310myQyFbdR5hUtgnnacj8jUzWsL3KzvGrSqp\r\nRWPYEg/zApe0la1x+yhfmtqcZfQ6ZdiS4eO+hh1Bt+xbkIkpDqhLA/d5INTQ2dvNfLNajUIzdXD7\r\n/KvNqF1zk8dRgf0rpRotgH3fZwfm3AFiQp3BuBnA+YA8VLHp9tE6tHCqlXaRcZ4Zup/Gn7WdrXF7\r\nCne9jlpLLTJLi2eOC8YXpS6S18/bCXbJyR+GahmsdMgeRZReNAswklhW8BQykb+F7jpz/hXULodg\r\ni4WAjGNpEjZTGcBTn5RyeBjrSLoGmLHsFnGRkHJyWzjHXr0o9tU7h7Cn/KZd1oWkDSWmltmIuZ0m\r\naTI8xWd1/i7AE9PTNU47SyfXrmO1lv4pPPMswW72ITk5IUDn7p49q6ePTreO0e2Cs0DrtKSOzjGM\r\nY5Jpkek2cRjMUbJsQINkjDKgk4bB+bknrnqaPaz7h7Gn2OfvZbHVLuKS7tLsSXlqIYxE4OYny2cd\r\nj8v61JfRaZdxFpYrkrZxC0G1wCVkAB/EVuPo9k6xgw48pFjjKsVKKvTBByOtMGh6eCuLcAKANoZs\r\nHGcEjOCRk8nml7SS2ZTpwd7o55b/AE+efS1FuzSQQbYB9oUpsZMEOR/FgdPerMOj2Wm6rAsUV1N9\r\nkj84JLcEx26kkfKO54NbJ0Wx3xusGx40CI0bshVRkAZBB7mppNPt5Z0mdW8xF2bg7DK9cHB5H1zR\r\n7Se1xeyhvY5nT4dPtbxmtrCYXF9GDCjSgqscgZjt/ufdOR9Kgt00m3h0xl05oHiKSQO0qKXDBgPM\r\nb8Dx7iupg0iyt3R44cNGRsJZm24BAAyeAAx46c0DSbIGEi3T9yAsecnaBnH8zR7WfcPYw7HJ2Mel\r\nB5LiGC6eCxDzfZ5LnKIwBJ2J0I54Oe9T/ZNK0u8siIdSfEp8i3eQlYWJAyEJ/wBr+ddO+mWryyO0\r\nZzKNsih2CuMbeVzg8cdKi/sLTyOYCzZyHZ2LA8YO4nPG0Y9MU/az7iVGmvsmEtppkQstNtrSdpYJ\r\nVmhLOFbdlyQzdcDaePpTLRLK4vrW7dtSnjSdNpnudyxyuMj5fbIGa6EaJYgDbCVYHO9XYPnLHO7O\r\nc/M3PfNPj0qziQJHAqqHWQAZ4ZQAp/AAUvaz7j9jDsW6KKKg0AjIIPQ1zV34NjluxJBcskbHLq/z\r\nEfQ/4101FAGbDpFpptjMttF85QgueWbj1rlNYu7ebSriwDl3mKZZOQu056967zqOawtX8LW17DM9\r\noqwXLL8pyQmfcD+lVB2kmiZpSi01c5HToZLO0aZLlLOyhY75pXwgOORj+I9OKx/EXijTddntrW2e\r\nRFtlZEmkTCyliM8clenGf0pn/CBeJ9W1U2t6iQQRHPmlv3IB7oB1P6+pr0Pw34G0rw4Fkij+0Xg6\r\n3EoBYf7o6L+HPvXYpQoy5+bmkccoSrx9ny2ibGkRtFo9lHIpV0gRWB7EKKuUUVxN3dztSsrBRRRS\r\nGFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQA\r\nneloooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKA\r\nCiiigD//2Q==\r\n').
This specific response corresponds to a JPEG image attachment.
I tried extracting the string representing the body part (so, I'm talking about the string starting in '/9j' and ending in '2Q==\r\n') and saving that to a file as a .jpg, but it's not a valid file.
I then though that, as there are multiple instances of \r\n in that string, that the string might be split with newline/carriage return, so I split the string and stripped it of the \r\n, then joined the substrings and tried to save that to a file. Still not a valid JPEG file.
What can I do to try and parse this response?
Thank you.
You need to parse the BODYRESPONSE
string to see what format the data is encoded in, see the IMAP RFC 3501, section 7.4.2. The 5th field is the content encoding:
['IMAGE', 'JPEG', ['NAME', 'image001.jpg'], '<image001.jpg@01CDE914.6E62F850>', None, 'BASE64', 5318, None, None, None]
The fields are, in order, the type and subtype (so image/jpeg
in this case), body parameters (such as characterset, format-flowed, or the filename in this case), the attachment id, description, encoding, size, MD5 signature (if any), disposition and language.
In this case the data is base-64 encoded:
>>> imagedata = datastring.decode('base64')
>>> imagedata[:10]
'\xff\xd8\xff\xe0\x00\x10JFIF'
which looks like JPEG data to me.