c# pdf wordprocessingml spire.doc

Spire.Doc loses formatting when converting from docx to PDF


I'm writing a system that modifies a template letter (via OpenXml Wordprocessing) for different individuals and then converts the letters into PDFs for printing. However, on conversion to PDF the address loses its line spacing, switching from a normal multi-line address

mrs1 Test2 Name2
that
house
down
inr32m

to a flat address line

mrs1 Test2 Name2thathousedowninr32m

The XML produced when writing the same kind of address in Word is

<w:p>
  <w:r>
    <w:t>Mrs</w:t>
  </w:r>
  <w:r>
    <w:br />
    <w:t>test</w:t>
  </w:r>
  <w:r>
    <w:br />
    <w:t>value</w:t>
  </w:r>
  <w:r>
    <w:br />
    <w:t>for</w:t>
  </w:r>
  <w:r>
    <w:br />
    <w:t>the</w:t>
  </w:r>
  <w:r>
    <w:br />
  </w:r>
</w:p>

And the XML from my generated version is

<w:r>
    <w:t>
      <w:r>
        <w:t> mrs1 Test2 Name2<w:br /></w:t>
      </w:r>
      <w:r>
        <w:t> that<w:br /></w:t>
      </w:r>
      <w:r>
        <w:t> house<w:br /></w:t>
      </w:r>
      <w:r>
        <w:t> down<w:br /></w:t>
      </w:r>
      <w:r>
        <w:t> inr32m<w:br /></w:t>
      </w:r>
    </w:t>
  </w:r>

My generated Word doc and the resulting PDF: [image: Word doc and resulting PDF]

And a manually written Word doc and the resulting PDF: [image: manually generated Word doc and resulting PDF]

The conversion currently runs through two main methods

private void ConvertToPdf()
    {
        try
        {
            for (int i = 0; i < listOfDocx.Count; i++)
            {
                // Update the progress indicator shown to the user
                CurrentModalText = "Converting To PDF";
                CurrentLoadingNum += 1;

                string savePath = PdfTempStorage + i + ".pdf";
                listOfPDF.Add(savePath);

                // Load the generated docx and save it back out as a PDF
                Spire.Doc.Document document = new Spire.Doc.Document(listOfDocx[i], FileFormat.Auto);
                document.SaveToFile(savePath, FileFormat.PDF);
            }
        }
        catch (Exception)
        {
            // "throw;" rethrows without resetting the stack trace (unlike "throw e;")
            throw;
        }
    }

and

 private string ReplaceAddressBlock(string[] address, string localDocText)
    {
        //This is done to force the array to have 6 indices (with one potentially being empty)
        string[] addressSize = new string[6];
        address.CopyTo(addressSize, 0);
        //defines the new save location of the object

        //add an xml linebreak to each piece of the address
        var addressString ="";
        var counter = 0;
        foreach (var t in address)
        {
            if (counter != 0)
            {
                addressString += "<w:r><w:t> ";
            }

            addressString += t + "<w:br />";
            if (counter != 4)
            {
                addressString += "</w:r></w:t> ";
            }
            counter += 1;

        }

        //look for the triple pipes, then replace them and everything between them with the address
        var regExp = @"(\|\|\|).*(\|\|\|)";
        Regex regexText = new Regex(regExp, RegexOptions.Singleline);
        localDocText = regexText.Replace(localDocText, addressString);
        return localDocText;
    }

with localDocText being a copy of the full document's XML.

I need it to output the address in the normal multi-line format, and I'm not sure what is causing this.


Solution

  • Using line breaks did not work; I had to put each address line in its own paragraph (<w:p>) instead. Thanks to Kevin for prompting me in this direction. Below is the updated code for generating the address.

    /// <summary>
    /// This replaces the address block
    /// </summary>
    /// <param name="address">The address array</param>
    /// <param name="localDocText">The text we want to modify</param>
    /// <returns>The document XML with the address block filled in</returns>
    private string ReplaceAddressBlock(string[] address, string localDocText)
    {
        //This is done to force the array to have 6 indices (with one potentially being empty)
        string[] addressSize = new string[6];
        address.CopyTo(addressSize, 0);

        //put each piece of the address in its own paragraph
        var addressString = "";
        var counter = 0;
        foreach (var t in address)
        {
            //the first piece reuses the opening tags that surround the placeholder in the template
            if (counter != 0)
            {
                addressString += " <w:p> <w:r><w:t> ";
            }

            addressString += t;

            //the last piece reuses the closing tags that surround the placeholder in the template
            //(the original compared against a hard-coded 4, which only works for a 5-line address)
            if (counter != address.Length - 1)
            {
                addressString += "</w:t> </w:r></w:p> ";
            }
            counter += 1;
        }

        //look for the triple pipes, then replace them and everything between them with the address
        var regExp = @"(\|\|\|).*(\|\|\|)";
        Regex regexText = new Regex(regExp, RegexOptions.Singleline);
        localDocText = regexText.Replace(localDocText, addressString);
        return localDocText;
    }
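
For anyone adapting this, here is a more compact sketch of the same paragraph-per-line idea. It is not the code I actually run, and it rests on a few assumptions: the |||...||| placeholder sits inside a single <w:p><w:r><w:t> in the template (as the code above also assumes), empty address lines can simply be skipped, and the class name AddressBlockSketch is made up for illustration. It also escapes characters such as & and <, and uses a lazy match with a MatchEvaluator so that a $ in an address is not treated as a regex substitution.

```csharp
using System;
using System.Linq;
using System.Security;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for illustration only.
public static class AddressBlockSketch
{
    // Closes the current text/run/paragraph and opens a new one. Joining the
    // lines with this leaves the first opening and last closing tags to the
    // template markup that surrounds the placeholder (an assumption).
    private const string ParagraphBreak = "</w:t></w:r></w:p><w:p><w:r><w:t>";

    public static string ReplaceAddressBlock(string[] address, string localDocText)
    {
        // Skip null/empty lines and escape XML special characters (&, <, >, quotes).
        var lines = address
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => SecurityElement.Escape(line.Trim()));

        string fragment = string.Join(ParagraphBreak, lines);

        // ".*?" is lazy, so each |||...||| pair is matched on its own rather than
        // everything between the first and last |||. The MatchEvaluator overload
        // inserts the fragment literally, without interpreting "$" sequences.
        return Regex.Replace(
            localDocText,
            @"\|\|\|.*?\|\|\|",
            _ => fragment,
            RegexOptions.Singleline);
    }
}
```

Calling AddressBlockSketch.ReplaceAddressBlock(new[] { "mrs1 Test2 Name2", "that", "house" }, xml) on a template paragraph such as <w:p><w:r><w:t>|||address|||</w:t></w:r></w:p> should give three sibling paragraphs, one per address line. Like the code above, it does not copy any paragraph or run properties from the template onto the new paragraphs.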