c# .net pdf-conversion ghostscript.net

Cannot convert PDF page to image


I want to convert each page of a PDF file to a separate image. To do this, I use Ghostscript.NET. The problem is that I can't figure out why pageImage is null after the line System.Drawing.Image pageImage = rasterizer.GetPage(dpi, i);. Here is the method I use:

    public static List<string> GetPDFPageText(Stream pdfStream, string dataPath)
    {

        try
        {
            int dpi = 100;
            GhostscriptVersionInfo lastInstalledVersion =
                GhostscriptVersionInfo.GetLastInstalledVersion(
                    GhostscriptLicense.GPL | GhostscriptLicense.AFPL,
                    GhostscriptLicense.GPL);
            List<string> textParagraphs = new List<string>();

            using (GhostscriptRasterizer rasterizer = new GhostscriptRasterizer())
            {
                rasterizer.Open(pdfStream, lastInstalledVersion,false);

                for (int i = 1; i <= rasterizer.PageCount; i++)
                {
                    // Here is the problem: GetPage returns null, so pageImage is null
                    System.Drawing.Image pageImage = rasterizer.GetPage(dpi, i);

                    // rest of code is unrelated to problem..
                    
                }
            }

            return textParagraphs;
        }
        catch (Exception ex)
        {
            // Pass the original exception along as InnerException so the root cause is not lost
            throw new Exception("An error occurred.", ex);
        }
        
    }

The Stream pdfStream parameter comes from the code below:

            using (StreamCollection streamCollection = new StreamCollection())
            {
                FileStream imageStream = new FileStream(imagePath, FileMode.Open, FileAccess.Read);
                // This is the parameter I used for "Stream pdfStream"
                FileStream pdfStream = new FileStream(pdfPath, FileMode.Open, FileAccess.Read);
                streamCollection.Streams.Add(imageStream);
                streamCollection.Streams.Add(pdfStream);
                PDFHelper.SavePDFByFilesTest(dataPath, streamCollection.Streams,mergedFilePath);
            }

I am comfortable with the StreamCollection class because I have used it before in a similar situation and it worked. I verified that the file path is correct and that the stream contains the file. I also tried a MemoryStream instead of a FileStream, and a file name instead of a stream, to see whether the problem was related to either of them. Do you have any suggestions? I would really appreciate it.


Solution

  • Okay, I figured out why it didn't work. I use the latest version of Ghostscript (9.56.1), as K J mentioned (thank you for the response), and it uses a new PDF interpreter by default. I assume it didn't work properly because the new interpreter is very recent and may still have some minor problems. I added the following line to fall back to the old PDF interpreter:

    rasterizer.CustomSwitches.Add("-dNEWPDF=false");
    

    I also set the resolution of the produced image with the following line:

    rasterizer.CustomSwitches.Add("-r300x300");
    

    In addition, I am sharing the structure of the StreamCollection class, which I implemented based on an existing reference. I hope it helps someone.

        public class StreamCollection : IDisposable
        {
            private bool disposedValue;
            
            public List<Stream> Streams { get; set; }
    
            public StreamCollection()
            {
                Streams = new List<Stream>();
            }
            
            protected virtual void Dispose(bool disposing)
            {
                if (!disposedValue)
                {
                    if (disposing)
                    {
                        // TODO: dispose managed state (managed objects)
                        if (this.Streams != null && this.Streams.Count>0)
                        {
                            foreach (var stream in this.Streams)
                            {
                                if (stream != null)
                                    stream.Dispose();
                            }
                        }
                    }
    
                    // TODO: free unmanaged resources (unmanaged objects) and override finalizer
                    // TODO: set large fields to null
                    disposedValue = true;
                }
            }
    
            // // TODO: override finalizer only if 'Dispose(bool disposing)' has code to free unmanaged resources
            // ~StreamCollection()
            // {
            //     // Do not change this code. Put cleanup code in 'Dispose(bool disposing)' method
            //     Dispose(disposing: false);
            // }
    
            public void Dispose()
            {
                // Do not change this code. Put cleanup code in 'Dispose(bool disposing)' method
                Dispose(disposing: true);
                GC.SuppressFinalize(this);
            }
        }
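
Coming back to the fix itself, here is a minimal sketch of where the switch could go relative to the Open call. It is not the exact code from my project: PdfPageRenderer, RenderPagesToPng and the output file naming are hypothetical names made up for illustration, and I am assuming that Ghostscript.NET reads CustomSwitches when Open is called, so the switch is added before opening the stream. It also assumes a Ghostscript.NET version where GetPage takes (dpi, pageNumber), as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using Ghostscript.NET;
using Ghostscript.NET.Rasterizer;

// Hypothetical helper class, not part of Ghostscript.NET.
public static class PdfPageRenderer
{
    // Renders every page of the PDF to a PNG file in outputDir
    // and returns the paths of the files that were written.
    public static List<string> RenderPagesToPng(Stream pdfStream, string outputDir, int dpi)
    {
        GhostscriptVersionInfo version =
            GhostscriptVersionInfo.GetLastInstalledVersion(
                GhostscriptLicense.GPL | GhostscriptLicense.AFPL,
                GhostscriptLicense.GPL);

        var pagePaths = new List<string>();

        using (var rasterizer = new GhostscriptRasterizer())
        {
            // Assumption: switches have to be registered before Open,
            // because that is when the interpreter is started.
            rasterizer.CustomSwitches.Add("-dNEWPDF=false");

            rasterizer.Open(pdfStream, version, false);

            for (int i = 1; i <= rasterizer.PageCount; i++)
            {
                Image pageImage = rasterizer.GetPage(dpi, i);

                // Fail loudly instead of passing a null image further down.
                if (pageImage == null)
                {
                    throw new InvalidOperationException(
                        "Ghostscript returned no image for page " + i + ".");
                }

                // Hypothetical naming scheme: page-1.png, page-2.png, ...
                string pagePath = Path.Combine(outputDir, "page-" + i + ".png");
                pageImage.Save(pagePath, ImageFormat.Png);
                pagePaths.Add(pagePath);
            }
        }

        return pagePaths;
    }
}
```

The explicit null check is only there so that a failing page shows up as a clear error at the point where it happens. Each page is saved inside the using block so that nothing produced by the rasterizer is used after it has been disposed.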