docx pandoc

Faster way to achieve document conversion/preview task


I created a WinForms app that converts docx to HTML using pandoc and displays the resulting HTML file in a WebBrowser control. My colleagues at the university badly need this application to preview docx files, since we don't have access to MS Office any more.

I tested this on my PC and it works fine: on each item click in the listbox, the preview loads in the WebBrowser quickly. I would still like to make it quicker. Are there any recommendations for making it faster?

Also, which is faster: setting wb.DocumentText to an empty string, or navigating to the about:blank page?

I can provide the full code if needed, but the following is the main listbox SelectedIndexChanged handler:

    private void lbFiles_SelectedIndexChanged(object sender, EventArgs e)
    {
        // SelectedIndex is -1 when nothing is selected; indexing the lists would throw
        if (lbFiles.SelectedIndex < 0) return;

        wb.DocumentText = "";

        // Two string lists: full paths and file names only
        SelectedFile = AllFiles[lbFiles.SelectedIndex];
        NameOnly = AllNamesOnly[lbFiles.SelectedIndex];

        if (NameOnly.EndsWith(".txt") || NameOnly.EndsWith(".docx"))
        {
            #region MediaFolder
            if (Directory.Exists("MF")) Directory.Delete("MF", true);
            Directory.CreateDirectory("MF");
            #endregion

            string cmd = "pandoc --extract-media ./MF \"" + SelectedFile + "\" -o output.html";

            File.WriteAllText("BatchFile.bat", cmd);

            StartHidden("BatchFile.bat"); // Process object with ProcessWindowStyle.Hidden and a 3-second exit wait

            wb.Navigate(Path.Combine(Environment.CurrentDirectory, "output.html"));
        }
        // The original catch block only did "throw ex;", which resets the stack trace,
        // so exceptions are simply left to propagate.
    }
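
One way the batch-file step in the handler above might be trimmed is to launch pandoc directly, which avoids writing BatchFile.bat and starting a command interpreter on every click. This is only a sketch under assumptions: `PandocRunner`, `BuildArguments`, and `Convert` are hypothetical names, pandoc is assumed to be on the PATH, and the quoting is naive (it does not handle paths that contain quotes or end in a backslash). Whether it is measurably faster would have to be timed on the real machine.

```csharp
using System;
using System.Diagnostics;

public static class PandocRunner
{
    // Builds the same command line the batch file contained, with every path quoted.
    public static string BuildArguments(string inputPath, string outputPath, string mediaDir)
    {
        return "--extract-media \"" + mediaDir + "\" \"" + inputPath + "\" -o \"" + outputPath + "\"";
    }

    // Runs pandoc without a console window and waits up to timeoutMs for it to exit.
    // Returns true only if pandoc finished in time with exit code 0.
    public static bool Convert(string inputPath, string outputPath, string mediaDir, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "pandoc",          // assumption: resolvable through PATH
            Arguments = BuildArguments(inputPath, outputPath, mediaDir),
            UseShellExecute = false,      // start the executable directly, no shell
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            if (!process.WaitForExit(timeoutMs))
            {
                process.Kill();           // give up on a conversion that hangs
                return false;
            }
            return process.ExitCode == 0;
        }
    }
}
```

In the handler this would stand in for the File.WriteAllText and StartHidden lines, for example `PandocRunner.Convert(SelectedFile, "output.html", "./MF", 3000)`, navigating to output.html only when it returns true.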

Solution

  • I tried hard with various solutions.

    Most of them were not free, so they were not usable. Using pandoc, as in the question, was not feasible either, since it doesn't support the variety of fonts and formats.

    After going through many possible alternatives (GemBox, Spire.Doc, etc.), I finally moved to the Syncfusion community edition. It is free, its libraries can convert all major word-processor formats, and its results were the same as those of the non-free solutions. It also works faster than pandoc.

    Another thing to note: I also switched from WebBrowser to CefSharp, as it's faster and lighter than WebBrowser, and it works better for PDF previews in the browser (you can pass the zoom level and page number as part of the URL).
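
To illustrate the last point, here is a minimal sketch of building such a preview URL. It assumes the Chromium PDF viewer's `page` and `zoom` open parameters in the URL fragment; `PdfPreviewUrl` and `Build` are hypothetical names, not part of CefSharp.

```csharp
using System;
using System.Globalization;

public static class PdfPreviewUrl
{
    // Turns a local PDF path into a file URL and appends the page number and
    // zoom percentage as a URL fragment, e.g. "...#page=3&zoom=150".
    public static string Build(string pdfPath, int page, int zoomPercent)
    {
        // Uri escapes spaces and other special characters in the path.
        string baseUrl = new Uri(pdfPath).AbsoluteUri;

        return string.Format(
            CultureInfo.InvariantCulture,
            "{0}#page={1}&zoom={2}",
            baseUrl, page, zoomPercent);
    }
}
```

The resulting string would then be handed to the browser control, for instance through the CefSharp control's Load method, assuming the version in use exposes it.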