Tags: c#, ms-word, netoffice

How can I get desired range in the protected word document (NetOffice.Word C#)?


I have a protected document (doc.ProtectionType == wdAllowOnlyFormFields). It has areas that can be edited; everything else is protected, even from copying. I'm using the NetOffice.Word library and trying to programmatically find text and create a bookmark on the found range. The problem is that calling wordDoc.Content.Duplicate.Find.Execute(smthParams) throws "COMException: This method or property is not available because the object refers to a protected area of the document." However, I can get any range of text manually without problems:

var range = doc.Content.Duplicate;
range.SetRange(start, end);

In a range obtained this way, I can create a bookmark with no problem. What I can't do this way is find the range that corresponds to the text I'm looking for. This is how I am trying to create the bookmark:

public void CreateBookmarkTest()
{
    Document doc = Context.WordDocument;

    var searchText = "smth text";
    var bookmarkName = "newBookmark";
    
    using Range docRange = doc.Content.Duplicate;

    foreach (var paragraph in docRange.Paragraphs)
    {
        using Range paragraphRange = paragraph.Range;
        var text = paragraphRange.Text;
        var startParagraph = paragraphRange.Start;
        var endParagraph = paragraphRange.End;

        var startIndex = text.IndexOf(searchText);
        if (startIndex >= 0)
        {
            text = GetParagraphTextWithHiddenSymbols(paragraphRange, text);
            startIndex = text.IndexOf(searchText);
            var startFoundRange = startParagraph + startIndex;
            var end = startFoundRange + searchText.Length;

            paragraphRange.SetRange(startFoundRange, end);

            var foundText = paragraphRange.Text;
            if (foundText == searchText)
            {
                doc.Bookmarks.Add(bookmarkName, paragraphRange);
                break;
            }
        }
    }
}

private string GetParagraphTextWithHiddenSymbols(Range paragraphRange, string initialText)
{
    var text = initialText;
    foreach (Field field in paragraphRange.Fields)
    {
        int index = text.IndexOf(field.Result.Text);
        if (index >= 0)
        {
            text = text.Replace(field.Result.Text, $"{{{field.Code.Text}}} {field.Result.Text}{(char)21}");
        }
    }
    return text;
}

The problem is that foundText does not always equal searchText. Sometimes foundText is offset, and I can't figure out how to fix that yet. This approach also seems slow and suboptimal to me. Perhaps there is a way to implement search and text replacement correctly (ideally through Find.Execute). I'm also wondering whether there is any way to get the areas that allow editing (or just to find out whether the current Range allows editing)?

I tried to adapt my code using Oscar's idea from the answer below. The code works much better, but it still fails on large paragraphs with many unprotected input fields.

Thanks a lot for your help, friend!


Solution

  • The likely cause is hidden text, such as a field's code text, inside the range. This applies whether you use NetOffice, Microsoft.Office.Interop.Word, VBA, or anything else. You can try my code first. It is not a perfect solution yet, but note this snippet:

    if (range.Text != searchText)
    {
        Console.WriteLine(range.Text);
        System.Diagnostics.Debugger.Break();
    }
    

    At least it shows where to start debugging and what the mismatch looks like. You can refine the approach from there.
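When the range text and the search text differ, the difference is often an invisible character. The following is a minimal sketch of a debugging aid, not part of NetOffice: `ShowControlChars` is a hypothetical helper name, and the sample string is made up. It assumes, as the code in this answer does, that field markers appear in the text as control characters such as (char)21.

```csharp
using System;
using System.Text;

// Hypothetical debugging helper (not a NetOffice API): renders every control
// character as <code> so hidden markers in a range's text become visible.
static string ShowControlChars(string text)
{
    var sb = new StringBuilder();
    foreach (char c in text)
    {
        if (char.IsControl(c))
            sb.Append('<').Append((int)c).Append('>');
        else
            sb.Append(c);
    }
    return sb.ToString();
}

// Made-up sample; in practice pass range.Text where the mismatch is detected.
Console.WriteLine(ShowControlChars("total\u0015 due\r")); // prints: total<21> due<13>
```

Calling it instead of printing range.Text directly in the mismatch branch makes it easier to see how many hidden characters shifted the range.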

    using NetOffice.WordApi.Enums;
    using Word = NetOffice.WordApi;
    
    Test();
    
    //The following code applies only to the main body of the document; it does not cover footnotes, comments, headers, footers, or other parts of the document.
    void Test()
    {
        //just test file for me
        //const string fFullnameStr = @"C:\Users\oscar\Dropbox\VS\VBA\stackoverflow.docm";
        const string fFullnameStr = @"C:\Users\oscar\Dropbox\VS\stackoverflow\VBA\Naive Bayes classifier.docx";
        Word.Application wordApplication = new Word.Application();
        wordApplication.DisplayAlerts = WdAlertLevel.wdAlertsNone;
        wordApplication.Visible = true; //just for test to watch
        Word.Document doc = wordApplication.Documents.Open(fFullnameStr);//Context.WordDocument;
    
        /* for test
        if(doc.ProtectionType!= WdProtectionType.wdAllowOnlyFormFields)
            Console.WriteLine(doc.ProtectionType);
        doc.Close();
        doc.Protect(WdProtectionType.wdAllowOnlyFormFields);
        just for test */
        int i = 0;
    
        //var searchText = "smth text";
        // https://github.com/Aldman/ProtectedRangeSearch/blob/main/FindTextTests.cs#L15
        var searchText = "based on a common";//"diameter features";//"based on a common";//"assume that the value";
        var bookmarkName = "newBookmark";
    
        Word.Range rng = doc.Content;//doc.Content.Duplicate;
    
        if (doc.ProtectionType != WdProtectionType.wdAllowOnlyFormFields)
        {
            if (doc.ActiveWindow.View.ShowFieldCodes)
                doc.ActiveWindow.View.ShowFieldCodes = false;
            while (rng.Find.Execute(findText: searchText, matchCase: true, matchWholeWord: true, matchWildcards: false,
                    matchSoundsLike: false, matchAllWordForms: false, forward: true, wrap: WdFindWrap.wdFindStop))
            {
                rng.Bookmarks.Add(bookmarkName + i++.ToString()); //rng.Select();//just for test
            }
    
        }
        else
        {
            foreach (var paragraph in rng.Paragraphs)//http://msdn.microsoft.com/en-us/en-us/Iibrary/office/ff837006.aspx redirects to: https://learn.microsoft.com/en-us/office/vba/api/Word.Range.Paragraphs
            {
                Word.Range range = paragraph.Range;
                var text = range.Text;
                var index = text.IndexOf(searchText); int indexPre = index;
                var start = 0;
    
    
                #region GetParagraphTextWithHiddenSymbols
                foreach (Word.Field item in range.Fields)
                {
    
                    index = text.IndexOf(item.Result.Text, start);
                    if (index >= 0)
                    {
                        text = text.Substring(0, index) + "{" + item.Code.Text + "}" + item.Result.Text + ((char)21).ToString()
                            + text.Substring(index + item.Result.Text.Length);
                        start = (text.Substring(0, index) + "{" + item.Code.Text + "}" + item.Result.Text + ((char)21).ToString()).Length;
                    }
                    //text = text.Replace(item.Result.Text, 
                    //"{" +item.Code.Text+"}"+ item.Result.Text + (char)21);
                    //fieldsResultLength += item.Result.Text.Length + 2 + 1;//2="{}" of field code,1=chr(21) placehold of the fields
                }
    
                start = 0;
                //a ContentControl has a placeholder character at both its start and its end, so add 2 for the two placeholders
                foreach (Word.ContentControl item in range.ContentControls)
                {
                    text = text.Substring(start, item.Range.Start - 1) + " " + item.Range.Text + " " + text.Substring(item.Range.End - 1);
                }
                #endregion
    
    
                while (index >= 0)
                {
    
                    index = text.IndexOf(searchText);
    
                    start = range.Start;
                    var end = range.End;
    
                    start += index; //+ fieldsResultLength;
                    end = start + searchText.Length;
                    range.SetRange(start, end);
    
                    while (range.Text != searchText && end <= range.End)
                    {
                        range.SetRange(++start, ++end);
                        if (range.Text == searchText) break;
                    }
    
                    if (range.Text != searchText)
                    {
                        Console.WriteLine(range.Text);
                        System.Diagnostics.Debugger.Break();
                    }
    
                    range.Bookmarks.Add(bookmarkName + i++.ToString());
    
                    text = paragraph.Range.Text; start = 0;
                    index = text.IndexOf(searchText, indexPre + 1);
                    indexPre = index;
                }
            }
        }
    
    
        wordApplication.Visible = true; //just for test to watch
        doc.ActiveWindow.View.ReadingLayout = false;//just for test to watch
        if (doc.ProtectionType != WdProtectionType.wdNoProtection)
            doc.Unprotect(123.ToString());//just for test
    
    }
    

    It makes sense that a Find object cannot execute a search when the protection type is wdAllowOnlyFormFields. I think this is because the Find object is not just a search facility; it also includes replace (edit) functionality. You need to unprotect the document, change the way it is protected, or use the paragraph-based alternative. The code above branches on the protection type: it uses Find.Execute when the document is not protected with wdAllowOnlyFormFields and falls back to the paragraph loop when it is. Besides locating text with this foreach-paragraph approach, you could also consider a regular expression. Whichever method you use, you have to account for hidden text, such as field code text, to get accurate results.
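The regular-expression idea can be sketched as follows. This is only an illustration under stated assumptions: `FindOffsets` is a hypothetical helper, it works on a plain string rather than on a Word range, and the sample text is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the start index of every non-overlapping
// occurrence of searchText inside paragraphText.
static List<int> FindOffsets(string paragraphText, string searchText)
{
    var offsets = new List<int>();
    if (string.IsNullOrEmpty(searchText))
        return offsets; // an empty pattern would match at every position

    // Regex.Escape makes the search text match literally.
    foreach (Match m in Regex.Matches(paragraphText, Regex.Escape(searchText)))
        offsets.Add(m.Index);
    return offsets;
}

// Made-up sample text; in practice this would be paragraph.Range.Text.
var sample = "a based on a common b based on a common";
foreach (int offset in FindOffsets(sample, "based on a common"))
    Console.WriteLine(offset); // prints 2, then 22
```

Each offset is relative to the paragraph text. To turn it into a document position you would still add the paragraph's Range.Start and correct for hidden characters, then confirm with range.Text == searchText as the code above does.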

    • .csproj file:
    <Project Sdk="Microsoft.NET.Sdk">
    
      <PropertyGroup>
        <OutputType>Exe</OutputType>
        <TargetFramework>net6.0</TargetFramework>
        <ImplicitUsings>enable</ImplicitUsings>
        <Nullable>enable</Nullable>
      </PropertyGroup>
    
      <ItemGroup>
        <PackageReference Include="NetOfficeFw.Core" Version="1.9.3" />
        <PackageReference Include="NetOfficeFw.Word" Version="1.9.3" />
      </ItemGroup>
    
      <ItemGroup>
        <FrameworkReference Include="Microsoft.WindowsDesktop.App.WindowsForms" />
      </ItemGroup>
    
    </Project>
    
    
    void Test_ShowFieldCodes()
    {
        //just test file for me
        const string fFullnameStr = @"C:\Users\oscar\Dropbox\VS\VBA\stackoverflow.docm";
        Word.Application wordApplication = new Word.Application();
        wordApplication.DisplayAlerts = WdAlertLevel.wdAlertsNone;
        //wordApplication.Visible = true; //just for test to watch
        Word.Document doc = wordApplication.Documents.Open(fFullnameStr);//Context.WordDocument;
    
    
        int i = 0;
        var searchText = "smth text";
        var bookmarkName = "newBookmark";
    
        Word.Range rng = doc.Content;//doc.Content.Duplicate;
    
        if (doc.ProtectionType != WdProtectionType.wdAllowOnlyFormFields)
        {
    
            while (rng.Find.Execute(findText: searchText, matchCase: true, matchWholeWord: true, matchWildcards: false,
                    matchSoundsLike: false, matchAllWordForms: false, forward: true, wrap: WdFindWrap.wdFindStop))
            {
    
                if ((bool)rng.Information(WdInformation.wdInContentControl))
                    rng.SetRange(rng.Paragraphs[1].Range.ContentControls[1].Range.End + 1,
                        rng.Paragraphs[1].Range.ContentControls[1].Range.End + 1);
                rng.Bookmarks.Add(bookmarkName + i++.ToString());
            }
    
        }
        else
        {        //rng = doc.Content.Duplicate;
            foreach (var paragraph in rng.Paragraphs)//http://msdn.microsoft.com/en-us/en-us/Iibrary/office/ff837006.aspx redirects to: https://learn.microsoft.com/en-us/office/vba/api/Word.Range.Paragraphs
            {
                Word.Range range = paragraph.Range;
                var text = range.Text;
                var index = text.IndexOf(searchText); int indexPre = 0;
                var start = 0;
    
                while (index >= 0)
                {
    
                    if (paragraph.Range.Fields.Count > 0)
                    {
    
                        doc.ActiveWindow.View.ShowFieldCodes = true;
                        text = paragraph.Range.Text;
                        //if there are fields, this is the index found with ShowFieldCodes=false plus the index found with ShowFieldCodes=true, plus 1
                        index = index + text.IndexOf(searchText, indexPre) + 1;
                        doc.ActiveWindow.View.ShowFieldCodes = false;
                    }
    
                    start = range.Start;
                    var end = range.End;
    
                    start += index;
                    end = start + searchText.Length;
                    range.SetRange(start, end);
    
                    while (range.Text != searchText && end <= range.End && range.End < doc.Content.End - 1)
                    {
                        //range.Select();//just for test
                        range.SetRange(++start, ++end);
                        if (range.Text == searchText) break;
                    }
    
                    if (range.Text != searchText && range.End < doc.Content.End - 1)
                    {
                        Console.WriteLine(range.Text);
                        System.Diagnostics.Debugger.Break();
                    }
    
                    if (range.Text == searchText)
                    {
                        if ((bool)range.Information(WdInformation.wdInContentControl))
                            range.SetRange(range.Paragraphs[1].Range.ContentControls[1].Range.End + 1,
                                range.Paragraphs[1].Range.ContentControls[1].Range.End + 1);
                        range.Bookmarks.Add(bookmarkName + i++.ToString());
                    }
                    text = paragraph.Range.Text; start = 0;
                    index = text.IndexOf(searchText, indexPre + 1);
                    indexPre = index;
                }
            }
        }
    
    
        wordApplication.Visible = true; //just for test to watch
        //doc.Unprotect(1.ToString());//just for test
    
    }
    

    Update (2023-07-12): ContentControls need handling as well

    So the answer is that your file contains no fields at all; what it has is content controls, not fields. ActiveDocument.ContentControls.Count is 3 and ActiveDocument.Fields.Count is 0. The code above has been updated accordingly.