PDFiumSharp PDF to Image size


I am using PDFiumSharp to generate JPGs from a PDF file. Here is my code:

using (WebClient client = new WebClient())
{
    byte[] pdfData = await client.DownloadDataTaskAsync(pdfUrl);

    using (var doc = new PdfDocument(pdfData))
    {
        int i = 0;
        foreach (var page in doc.Pages)
        {
            using (var bitmap = new PDFiumBitmap((int)page.Width, (int)page.Height, true))
            using (var stream = new MemoryStream())
            {
                page.Render(bitmap);
                bitmap.Save(stream);
                ...
                i++;
            }
        }
    }
}

The code works well and the images are rendered accurately. However, each JPG is about 2 MB, so with a multi-page PDF the total size adds up quickly. Is there any way to reduce the JPG file size? I only need the JPGs for preview purposes, not for printing, so a lower resolution or quality is fine.


Solution

  • When you call bitmap.Save(...), the data written to the MemoryStream is a BMP, not a JPG. BMP data is uncompressed, which explains the large files. You need to convert it to JPG yourself, for example with System.Drawing:

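        using System.Drawing;
        using System.Drawing.Imaging;
        using System.IO;
        using PDFiumSharp;

        public static byte[] Render(PdfDocument pdfDocument, int pageNumber, (int width, int height) outputSize)
        {
            var page = pdfDocument.Pages[pageNumber];

            // Size the target bitmap to the requested output size instead of the page size.
            // Assumption: page.Render(bitmap) scales the page to fill the whole bitmap.
            using var thumb = new PDFiumBitmap(outputSize.width, outputSize.height, false);
            page.Render(thumb);

            // PDFiumBitmap.Save writes BMP data to the stream.
            using MemoryStream memoryStreamBMP = new();
            thumb.Save(memoryStreamBMP);
            memoryStreamBMP.Position = 0; // rewind so the decoder reads from the start

            using Image imageBmp = Image.FromStream(memoryStreamBMP);

            // Re-encode the decoded image as JPEG.
            using MemoryStream memoryStreamJPG = new();
            imageBmp.Save(memoryStreamJPG, ImageFormat.Jpeg);

            return memoryStreamJPG.ToArray();
        }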
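
Since the question says lower resolution or quality is acceptable, both can be reduced further. The following is a minimal sketch, not part of the original answer: PreviewHelpers, FitWithin, EncodeJpeg, and maxSide are hypothetical names introduced here for illustration. FitWithin computes a preview size that keeps the page's aspect ratio, and EncodeJpeg saves an Image with an explicit JPEG quality through System.Drawing's Encoder.Quality parameter. It assumes System.Drawing is available, which on .NET 6 and later is supported on Windows only.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;

// Hypothetical helpers for illustration; not part of PDFiumSharp.
public static class PreviewHelpers
{
    // Scales a page size so its longer side is at most maxSide pixels,
    // preserving the aspect ratio. Pages already small enough are not enlarged.
    public static (int width, int height) FitWithin(double pageWidth, double pageHeight, int maxSide)
    {
        if (pageWidth <= 0 || pageHeight <= 0 || maxSide <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxSide), "Sizes must be positive.");

        double scale = Math.Min(1.0, maxSide / Math.Max(pageWidth, pageHeight));
        int width = Math.Max(1, (int)Math.Round(pageWidth * scale));
        int height = Math.Max(1, (int)Math.Round(pageHeight * scale));
        return (width, height);
    }

    // Encodes an image as JPEG with the given quality (0 to 100).
    // Lower values produce smaller files with more visible artifacts.
    public static byte[] EncodeJpeg(Image image, long quality)
    {
        ImageCodecInfo jpegCodec = ImageCodecInfo.GetImageEncoders()
            .First(codec => codec.FormatID == ImageFormat.Jpeg.Guid);

        using var parameters = new EncoderParameters(1);
        parameters.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, quality);

        using var output = new MemoryStream();
        image.Save(output, jpegCodec, parameters);
        return output.ToArray();
    }
}
```

For example, PreviewHelpers.FitWithin(page.Width, page.Height, 400) could supply the outputSize argument of Render, and the imageBmp.Save(...) call could be replaced by PreviewHelpers.EncodeJpeg(imageBmp, 60L). The values 400 and 60 are arbitrary starting points to tune against your own previews.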