Reduce runtime of iText7 HTML-to-PDF conversion


My program converts HTML to PDF. I have compiled it several times using Visual Studio's Release functionality, targeting both the win-x64 and win-x86 runtimes, with and without removing unused code. It is produced as a single file and targets net6.0.

static void ConvertHTMLtoPDF(string source_code, string output_file)
{
    string executablePath = AppDomain.CurrentDomain.BaseDirectory;

    string HTMLFilePath = System.IO.Path.Combine(executablePath, source_code);
    string PdfFilePath = System.IO.Path.Combine(executablePath, output_file);

    if (!File.Exists(HTMLFilePath))
    {
        throw new FileNotFoundException(
            "The source file: ["+ HTMLFilePath + "] does not exist, please check that your code is correct");
    }

    Console.WriteLine("Valid Arguments, Converting to PDF");

    PdfWriter writer = new PdfWriter(PdfFilePath);
    PdfDocument pdfDocument = new PdfDocument(writer);
    pdfDocument.SetDefaultPageSize(PageSize.LETTER);

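    // Dispose the input stream once conversion finishes, even if it throws.
    using FileStream htmlStream = new FileStream(HTMLFilePath, FileMode.Open, FileAccess.Read);
    HtmlConverter.ConvertToPdf(htmlStream, pdfDocument);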

    pdfDocument.Close();
}
static void Main(string[] args)
{
    int totalCount = args.Length;

    if (totalCount != 2) 
    {
        throw new ArgumentException(
            "An unacceptable amount of arguments [" + totalCount + "] was provided, this program requires two args, {source_code, output_file}");
    }

    string source_code = args[0];
    string output_file = args[1];

    if (string.IsNullOrEmpty(source_code))
    {
        throw new ArgumentException(
            "Your input for the source file is either Null or Empty, please correct this in your code");
    }

    if (string.IsNullOrEmpty(output_file))
    {
        throw new ArgumentException(
            "Your input for the output file is either Null or Empty, please correct this in your code");
    }

    ConvertHTMLtoPDF(source_code, output_file);
}

I compiled the same code a month ago (VS Community 2019) and again recently (VS Community 2022), both times targeting x86, single file, net6.0. The old build averaged 181 s per 100 iterations and the new build averages 237 s per 100. Both builds were compiled on the same machine, but I was forced to reset Windows in between. Targeting x64 only slightly improves the runtime, to 230 s per 100.

Any suggestions for reducing the runtime?


Solution

  • After going through the code a few more times and testing, it turned out that I was compiling a Debug version of the code. Since each HTML-to-PDF conversion is independent of the others, I also made the program multithreaded, running on n threads, and added the beta version of .NET CommandLine (see more here).

    If you're curious about how I did this, feel free to check out the full code on GitHub. Recompiling for Release and multithreading together gave an 87% decrease in runtime. The program is called from Python, so you may have to change it to suit your needs.
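The two changes described above can be sketched roughly as follows. This is not the code from the GitHub repository, only a minimal illustration under stated assumptions: the class name ParallelConversionSketch, the helpers IsDebugBuild and ConvertAll, and the shape of the job list are all hypothetical. The conversion itself is passed in as a delegate so that the sketch depends only on the .NET base class library.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;
using System.Threading.Tasks;

// Hypothetical helper class; none of these names come from the original project.
static class ParallelConversionSketch
{
    // Returns true when the entry assembly was built with the JIT optimizer
    // disabled, which is what a Debug configuration normally produces.
    public static bool IsDebugBuild()
    {
        DebuggableAttribute? attr =
            Assembly.GetEntryAssembly()?.GetCustomAttribute<DebuggableAttribute>();
        return attr != null && attr.IsJITOptimizerDisabled;
    }

    // Runs the supplied conversion for every (source, output) pair, using at
    // most maxThreads conversions at a time. Each job is assumed to be
    // independent and to write to its own output file.
    public static void ConvertAll(
        IEnumerable<(string Source, string Output)> jobs,
        int maxThreads,
        Action<string, string> convert)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        // Exceptions thrown by individual jobs are collected and rethrown
        // together as an AggregateException once the loop ends.
        Parallel.ForEach(jobs, options, job => convert(job.Source, job.Output));
    }
}
```

Assuming the ConvertHTMLtoPDF method from the question is in scope, a call could look like ParallelConversionSketch.ConvertAll(jobs, Environment.ProcessorCount, ConvertHTMLtoPDF), preceded by a warning when IsDebugBuild() returns true. Capping the degree of parallelism keeps the number of simultaneous conversions near the core count. The best value of n likely depends on the machine and the documents, so it is worth measuring.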