I'm trying to obtain the text shown in an MS Word window in C# using Microsoft.Office.Interop.Word. Note that I don't want the whole document or even the whole page, only the content the user currently sees.
The following code seems to work with simple documents:
Application word = new Application();
word.Visible = true;
object fileName = @"example.docx";
word.Documents.Add(ref fileName, Type.Missing, Type.Missing, true);
Rect rect = AutomationElement.FocusedElement.Current.BoundingRectangle;
Range r1 = word.ActiveWindow.RangeFromPoint((int)rect.Left, (int)rect.Top);
Range r2 = word.ActiveWindow.RangeFromPoint((int)rect.Left + (int)rect.Width, (int)rect.Top + (int)rect.Height);
r1.End = r2.Start;
Console.WriteLine(r1.Text.Replace("\r", "\r\n"));
However, when the document includes other structures such as headers, only parts of the text are returned.
So, what's the correct way to achieve this?
Thanks a lot!
Updated Code
// Screen rectangle of the focused element (assumed to be the Word document pane).
Rect rect = AutomationElement.FocusedElement.Current.BoundingRectangle;
foreach (Range r in word.ActiveDocument.StoryRanges) {
    int left = 0, top = 0, width = 0, height = 0;
    try {
        try {
            // Screen position of this story; throws if the story is not on screen.
            word.ActiveWindow.GetPoint(out left, out top, out width, out height, r);
        } catch {
            // Fall back to the whole visible rectangle.
            left = (int)rect.Left;
            top = (int)rect.Top;
            width = (int)rect.Width;
            height = (int)rect.Height;
        }
        Rect newRect = new Rect(left, top, width, height);
        Rect inter = Rect.Intersect(rect, newRect);
        if (!inter.IsEmpty) {
            // Map the corners of the visible part back to document positions.
            Range r1 = word.ActiveWindow.RangeFromPoint((int)inter.Left, (int)inter.Top);
            Range r2 = word.ActiveWindow.RangeFromPoint((int)inter.Right, (int)inter.Bottom);
            r.SetRange(r1.Start, r2.Start);
            Console.WriteLine(r.Text.Replace("\r", "\r\n"));
        }
    } catch { }
}
There may be a problem with how the story ranges are enumerated:
IEnumerator enumerator = word.ActiveDocument.StoryRanges.GetEnumerator();
while (enumerator.MoveNext())
{
    Range current = (Range)enumerator.Current;
}
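One possible reading of the concern above: as far as I know, enumerating StoryRanges yields only the first range of each story type, and further linked stories of the same type (for example headers of later sections, or chained text boxes) are reached through Range.NextStoryRange. The following is a minimal sketch under that assumption, not a tested solution; the class name StoryWalker and its two helper methods are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class, not part of the Word object model.
static class StoryWalker
{
    // Word uses a lone '\r' as its paragraph mark; expand it for console output.
    public static string ToConsoleText(string wordText)
    {
        return (wordText ?? string.Empty).Replace("\r", "\r\n");
    }

    // Yields every story in the document, following the NextStoryRange chain
    // so that linked stories of the same type are visited as well.
    public static IEnumerable<Word.Range> AllStories(Word.Document doc)
    {
        foreach (Word.Range first in doc.StoryRanges)
        {
            // NextStoryRange is null once the chain for this story type ends.
            for (Word.Range story = first; story != null; story = story.NextStoryRange)
            {
                yield return story;
            }
        }
    }
}
```

With this in place, the loop in the updated code could iterate over StoryWalker.AllStories(word.ActiveDocument) instead of word.ActiveDocument.StoryRanges, and print with StoryWalker.ToConsoleText(r.Text). Whether that recovers the missing header text still depends on GetPoint and RangeFromPoint behaving as expected for each story.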
Have you looked at "How to programmatically extract the text of the currently viewed page of an Office.Interop.Word.Document object"?