I am looking to convert PDF files into images. Docnet is able to convert the PDF into a byte[], and their samples show how to save this byte[] into an image file using Bitmap (see the Docnet documentation).
However, that solution won't work on a Linux machine, since Bitmap requires a few libraries to be pre-installed on the system.
I've tried ImageSharp to convert the byte[] using SixLabors.ImageSharp.Image.Load<Bgra32>(rawBytes), but it throws:
Unhandled exception. SixLabors.ImageSharp.InvalidImageContentException: PNG Image does not contain a data chunk
Does anyone know of an alternative way to achieve this?
PS: I'm open to exploring any other free, supported, cross-platform alternatives for converting PDF files to images.
This works fine with ImageSharp: as long as Docnet works, ImageSharp will work for you.
The trick is to use the Image.LoadPixelData<Bgra32>(rawBytes, width, height); API, not the Image.Load<Bgra32>(encodedBytes); one. Docnet hands you raw pixel data rather than an encoded image file, so Image.Load<Bgra32>() fails when it tries to decode the bytes.
using Docnet.Core;
using Docnet.Core.Models;
using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing;
using var docReader = DocLib.Instance.GetDocReader(
    "wikipedia_0.pdf",
    new PageDimensions(1080, 1920));
using var pageReader = docReader.GetPageReader(0);
var rawBytes = pageReader.GetImage();
var width = pageReader.GetPageWidth();
var height = pageReader.GetPageHeight();
// This is the important line: here you are passing a byte array that
// represents the raw pixels directly, whereas Image.Load<Bgra32>()
// expects an encoded image in PNG, JPEG, etc. format.
using var img = Image.LoadPixelData<Bgra32>(rawBytes, width, height);
// You will likely want this as well; otherwise you might end up with transparent parts.
img.Mutate(x => x.BackgroundColor(Color.White));
img.Save("wikipedia_0.png");
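
If you need every page rather than just the first, the same approach can be put in a loop. The following is a minimal sketch, not a tested implementation: it assumes Docnet's GetPageCount() on the document reader and that GetImage() returns 4 bytes per pixel in BGRA order. The output file naming is only an illustration.

```csharp
using System;
using Docnet.Core;
using Docnet.Core.Models;
using SixLabors.ImageSharp;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing;

using var docReader = DocLib.Instance.GetDocReader(
    "wikipedia_0.pdf",
    new PageDimensions(1080, 1920));

// Assumption: GetPageCount() reports the number of pages in the document.
var pageCount = docReader.GetPageCount();

for (var i = 0; i < pageCount; i++)
{
    using var pageReader = docReader.GetPageReader(i);
    var rawBytes = pageReader.GetImage();
    var width = pageReader.GetPageWidth();
    var height = pageReader.GetPageHeight();

    // Assumption: raw BGRA data, i.e. 4 bytes per pixel. Failing early here
    // gives a clearer error than a failure inside LoadPixelData.
    if (rawBytes.Length != width * height * 4)
    {
        throw new InvalidOperationException(
            $"Page {i}: expected {width * height * 4} bytes, got {rawBytes.Length}.");
    }

    // Interpret the buffer as raw pixels, not as an encoded image.
    using var img = Image.LoadPixelData<Bgra32>(rawBytes, width, height);

    // Fill transparent areas with white before saving.
    img.Mutate(x => x.BackgroundColor(Color.White));

    // Illustrative naming scheme: one PNG per page.
    img.Save($"wikipedia_0_page{i}.png");
}
```

The length check also shows why Image.Load<Bgra32>() cannot work here: the buffer has no header describing its size or format, so the width and height have to be supplied by the caller.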