Tags: c#, ms-word, office-interop

Get text under Paragraph programmatically


I have a large Word document that contains several headings. Each heading has exactly one table beneath it (shown in the screenshot).

[Screenshot: a heading followed by its table]

To read the document I am using the Microsoft.Office.Interop.Word library. My code is below. How can I get the content under a heading paragraph, in particular its table? Maybe there's a better way of doing this.

Application word = new Application();
object missing = System.Type.Missing;
// Open returns the Document, so there is no need to create one with "new" first.
Document doc = word.Documents.Open(ref m_FileName,
        ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing);

foreach (Paragraph paragraph in doc.Paragraphs)
{
    Style style = paragraph.get_Style() as Style;
    string text = paragraph.Range.Text;
    Tables tables = paragraph.Range.Tables; // does not get the table under the paragraph
}

Solution

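  • I'd do it using ranges. Find the first title, find the next title (or anything else you can use as the end of the chapter), and get the content in between. Find.Execute searches for text, so "Heading 1" and "Heading 2" below stand for the text of the two titles, not for style names: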

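    // Both ranges start out covering the whole document.
    Range r1 = doc.Content;
    Range r2 = doc.Content;
    // On success, Find.Execute shrinks the range to the matched text.
    r1.Find.Execute("Heading 1");
    r2.Find.Execute("Heading 2");

    // Everything from the start of the first title up to the start of the next one.
    Range chapter = doc.Range(r1.Start, r2.Start);
    //Console.WriteLine(chapter.Text);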
    
    foreach (Table t in chapter.Tables)
    {
        foreach(Row r in t.Rows)
        {
            foreach (Cell c in r.Cells)
            {
                //Console.WriteLine(c.Range.Text);
            }
        }
    }
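
If the heading texts are not known in advance, the same range idea can probably be applied to every heading in turn. The following is only a rough sketch, not tested against a real document. It assumes doc is an already opened Document (as in the question) and that headings can be recognized by an outline level other than body text. The class name HeadingTables and the method name Dump are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Office.Interop.Word;

static class HeadingTables   // hypothetical helper, not part of the interop library
{
    // Prints the cell text of every table located between a heading and the next heading.
    public static void Dump(Document doc)
    {
        // Assumption: a heading is any paragraph whose outline level is not body text.
        var headings = new List<Paragraph>();
        foreach (Paragraph p in doc.Paragraphs)
        {
            if (p.OutlineLevel != WdOutlineLevel.wdOutlineLevelBodyText)
                headings.Add(p);
        }

        for (int i = 0; i < headings.Count; i++)
        {
            // The section runs from the end of this heading to the start of the
            // next heading, or to the end of the document for the last heading.
            int start = headings[i].Range.End;
            int end = i + 1 < headings.Count
                ? headings[i + 1].Range.Start
                : doc.Content.End;
            if (end <= start) continue; // nothing between two consecutive headings

            object s = start, e = end;
            Range section = doc.Range(ref s, ref e);

            Console.WriteLine("Heading: " + headings[i].Range.Text.Trim());
            foreach (Table t in section.Tables)
            {
                // Range.Cells is used instead of Rows because row access is known
                // to throw on tables that contain vertically merged cells.
                foreach (Cell c in t.Range.Cells)
                {
                    // Cell text ends with an end-of-cell marker ("\r\a"); strip it.
                    Console.WriteLine("  " + c.Range.Text.TrimEnd('\r', '\a'));
                }
            }
        }
    }
}
```

Calling HeadingTables.Dump(doc) after Documents.Open should list each heading followed by the cells of its table. If the document marks headings some other way (for example by a custom style name), the check inside the first loop would need to be adapted.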