c# .net ms-word find office-interop

How to get all strings matching wildcards from a Word document


I am trying to write an application that searches a Word document for all occurrences of <some_text>, where some_text is any string of characters between < and >. As I find each match, I'd like to store, display, or otherwise process the matched string.

Here's what I have so far:

Word._Application word = new Word.Application();
Word.Documents d = word.Documents;
Word._Document doc;

doc = d.Open(strFileName);
doc.Activate();

foreach (Word.Range myStoryRange in doc.StoryRanges)
{
    myStoryRange.Find.MatchWildcards = true;
    myStoryRange.Find.Text = "[<]*[>]";
    myStoryRange.Find.Execute();

    // Somehow get the result string that matched the wildcard
}

Solution

  • It turns out that each successful call to Find.Execute() redefines the Range so that it covers only the string that was just found. You can read the found text with:

    rng.Text
    

    You can also get the start and end character positions of the found text:

    rng.Start
    rng.End
    

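    So inside the Find loop I declare a local Range that contains only the string just found. In my case I replace each match with a DocProperty field, but you could do anything you like with it. (In the code, this is the document object, presumably because it runs inside the document class of a VSTO project.)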

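        // start from the main body of the document
        Word.Range rng = this.Content;
        rng.Find.MatchWildcards = true;
        // < and > are special characters in Word wildcards, so each is wrapped in [ ] to match it literally
        rng.Find.Text = "[<]*[>]";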
    
        while (rng.Find.Execute())
        {
            // create a local Range containing only a single found string
            object cstart = rng.Start;
            object cend   = rng.End;
            Word.Range localrng = this.Range(ref cstart, ref cend);
    
            // replace the text with a custom DocProperty
            Word.Field newfld = localrng.Fields.Add(localrng, Word.WdFieldType.wdFieldDocProperty, "MyDocProp", false);
            localrng.Fields.Update();
        }
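
    To answer the original question directly (collecting the matched strings instead of replacing them), the same loop can be run over every story range, as the question's code tries to do. What follows is only a sketch and has not been tested against every document layout. WildcardSearch and FindAll are made-up names, and it assumes a project reference to Microsoft.Office.Interop.Word and a document that is already open. Each story is searched through a Duplicate so that the redefinition of the Range by Find does not disturb the walk over the stories.

```csharp
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class; the names are illustrative, not part of the Word API.
static class WildcardSearch
{
    // Returns the text of every match of a Word wildcard pattern,
    // looking at all story ranges (body, headers, footers, ...).
    public static List<string> FindAll(Word._Document doc, string pattern)
    {
        var matches = new List<string>();

        foreach (Word.Range story in doc.StoryRanges)
        {
            // StoryRanges yields the first story of each type;
            // NextStoryRange follows any further linked stories of that type.
            for (Word.Range current = story; current != null; current = current.NextStoryRange)
            {
                // Search a copy, because Find redefines the range it runs on.
                Word.Range search = current.Duplicate;
                search.Find.ClearFormatting();
                search.Find.Text = pattern;
                search.Find.MatchWildcards = true;
                search.Find.Forward = true;
                // Stop at the end of the story instead of wrapping around.
                search.Find.Wrap = Word.WdFindWrap.wdFindStop;

                // After each successful Execute, "search" covers only the match.
                while (search.Find.Execute())
                {
                    matches.Add(search.Text);
                }
            }
        }

        return matches;
    }
}
```

    With the doc variable from the question, this would be called as WildcardSearch.FindAll(doc, "[<]*[>]"), and the returned list can then be stored or displayed as needed.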