I parse MS Word documents with OpenXML to find tracked changes that have an Author value, in order to blank that value. It works except for one particular Word document, and I don't know why: only 6 of its 9 tracked changes are found.
This is how I look for tracked changes:
var file = Path.GetFileName(filePath);
using (WordprocessingDocument document = WordprocessingDocument.Open(filePath, true))
{
    // Every SDK element type that exposes a StringValue "Author" property.
    var types = typeof(OpenXmlCompositeElement).Assembly.GetTypes()
        .Where(p => !p.IsInterface && p.GetProperty("Author") != null && p.GetProperty("Author").PropertyType.Equals(typeof(StringValue)))
        .ToList();
    var body = document.MainDocumentPart.Document.Body;
    // Elements under the main document body whose type is one of those types.
    var changes = body.Descendants().Where(x => types.Contains(x.GetType()));
}
Is there a better way to achieve this?
I have also tried passing the list of types found on this page, but the result was the same: the same 3 tracked changes were still ignored.
EDIT
Using the Open XML SDK Productivity Tool, here is one of the nodes that should be found as a tracked change but is missed by the code above:
<w:ins w:id="283" w:author="Mr Smith" w:date="2019-09-11T12:34:00Z">
  <w:r w:rsidR="008458AA">
    <w:rPr>
      <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial"/>
    </w:rPr>
    <w:t>2</w:t>
  </w:r>
</w:ins>
I found the problem.
The line
var body = document.MainDocumentPart.Document.Body;
covers only the main document body. It does not include headers and footers, which are stored in separate parts of the package.
So I also have to search for tracked changes within the headers and footers.
For example, for the footers:
var footer = document.MainDocumentPart.FooterParts.ToList();
var footerChanges = new List<OpenXmlElement>();
footer.ForEach(f =>
    footerChanges.AddRange(f.Footer.Descendants().Where(x =>
        types.Contains(x.GetType()))));
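The same idea can be applied to every part at once instead of repeating the loop per part. The following is only a sketch under a few assumptions: TrackedChangeFinder and FindTrackedChanges are hypothetical names that are not in my original code, types is the list built in the question, and footnotes and endnotes are included on the assumption that they may hold tracked changes too, since they also live in their own parts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;

public static class TrackedChangeFinder
{
    // Hypothetical helper: returns every element whose type is in `types`,
    // searching the body plus the header, footer, footnote and endnote parts.
    public static List<OpenXmlElement> FindTrackedChanges(
        WordprocessingDocument document, ICollection<Type> types)
    {
        var main = document.MainDocumentPart;

        // Root elements to search. The footnote and endnote parts are optional,
        // so their roots may be null and are filtered out below.
        var roots = new List<OpenXmlElement> { main.Document.Body };
        roots.AddRange(main.HeaderParts.Select(h => h.Header));
        roots.AddRange(main.FooterParts.Select(f => f.Footer));
        roots.Add(main.FootnotesPart?.Footnotes);
        roots.Add(main.EndnotesPart?.Endnotes);

        return roots
            .Where(root => root != null)
            .SelectMany(root => root.Descendants())
            .Where(x => types.Contains(x.GetType()))
            .ToList();
    }
}
```

Called as TrackedChangeFinder.FindTrackedChanges(document, types) inside the using block, this should return the body, header and footer matches in a single list. The ToList() call materializes the results before the document is disposed.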