c# .net ms-office

Programmatically comparing Word documents


I need to compare two Office documents, in this case two Word documents, and produce a diff similar to what SVN shows. It doesn't need to go that far, but it should at least highlight the differences.

I tried using the Office COM DLL and got this far:

// Assumes a reference to the Word interop assembly and the alias:
// using WRD = Microsoft.Office.Interop.Word;
object fileToOpen = @"D:\doc1.docx";
string fileToCompare = @"D:\doc2.docx";

WRD.Application WA = new WRD.Application();

// Open the first document, then ask Word to compare it against the second.
WRD.Document wordDoc = WA.Documents.Open(ref fileToOpen, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
wordDoc.Compare(fileToCompare, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

Any tips on how to proceed further? This will be a web application with a lot of traffic. Is the Office COM object the right way to go, or are there other options I should look at?


Solution

  • I agree with Joseph about diffing the string. I would also recommend a purpose-built diffing engine (several are listed here: Any decent text diff/merge engine for .NET?), which can help you avoid some of the usual pitfalls of diffing.
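
To make the string-diffing idea concrete, here is a minimal sketch of a line-level diff built on the longest common subsequence (LCS). It assumes the text of each document has already been extracted and split into paragraphs. The `LineDiff` class and its `Compare` method are hypothetical names made up for this illustration, not part of any library. A purpose-built engine will likely handle large inputs and edge cases better; this only shows the basic technique.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only (not a library API):
// a minimal line-level diff based on the longest common subsequence.
public static class LineDiff
{
    // Returns one entry per line, prefixed with
    // "  " (unchanged), "- " (only in oldLines) or "+ " (only in newLines).
    public static List<string> Compare(string[] oldLines, string[] newLines)
    {
        int n = oldLines.Length, m = newLines.Length;

        // lcs[i, j] = length of the LCS of oldLines[i..] and newLines[j..].
        var lcs = new int[n + 1, m + 1];
        for (int i = n - 1; i >= 0; i--)
            for (int j = m - 1; j >= 0; j--)
                lcs[i, j] = oldLines[i] == newLines[j]
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);

        // Walk both inputs from the start, emitting matches,
        // deletions and insertions as guided by the table.
        var result = new List<string>();
        int a = 0, b = 0;
        while (a < n && b < m)
        {
            if (oldLines[a] == newLines[b])
            {
                result.Add("  " + oldLines[a]);
                a++; b++;
            }
            else if (lcs[a + 1, b] >= lcs[a, b + 1])
            {
                result.Add("- " + oldLines[a]);
                a++;
            }
            else
            {
                result.Add("+ " + newLines[b]);
                b++;
            }
        }

        // Whatever remains on either side is a pure deletion or insertion.
        while (a < n) result.Add("- " + oldLines[a++]);
        while (b < m) result.Add("+ " + newLines[b++]);
        return result;
    }
}
```

For example, comparing { "a", "b", "c" } with { "a", "x", "c" } yields the four entries "  a", "- b", "+ x", "  c". The table costs O(n·m) time and memory, which is one reason a dedicated diff engine is the better choice for a busy web application.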