
Docx - Removing section of document


Is there a way to remove sections of a document where I can specify the beginning and ending tags?

I need a way to remove a section of the document by passing in both my start and end markers (@@DELETEBEGIN and @@DELETEEND).

For example, I have this in my document:

Hello, welcome to this document

@@DELETEBEGIN{Some values to check in the code}

Some text that will be removed if the value is true

@@DELETEEND

Final Line


Solution

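  • If you need to delete the text from @@DELETEBEGIN to @@DELETEEND, even when @@DELETEBEGIN is not at the beginning of a paragraph and @@DELETEEND is not at the end of one, this code should work. It collects the words between the two markers paragraph by paragraph, then replaces each collected run of text with an empty string.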

    // Requires: using System.Collections.Generic; using Novacode;
    DocX document = DocX.Load("C:\\Users\\phil\\Desktop\\text.docx");
    bool insideSection = false; // stays true across paragraphs until @@DELETEEND is seen
    var textToRemove = new List<string>();
    foreach (Paragraph paragraph in document.Paragraphs)
    {
        // Words of this paragraph that fall inside the marked section
        var words = new List<string>();
        foreach (string word in paragraph.Text.Split(' '))
        {
            if (word.Contains("@@DELETEBEGIN")) insideSection = true;
            if (word.Contains("@@DELETEEND"))
            {
                insideSection = false;
                words.Add(word); // the end marker itself is removed too
            }
            if (insideSection) words.Add(word);
        }
        // Rebuild the exact text to remove from this paragraph
        textToRemove.Add(string.Join(" ", words));
    }
    foreach (string text in textToRemove)
    {
        // Skip paragraphs that contributed nothing to the marked section
        if (text.Length > 0) document.ReplaceText(text, "");
    }
    document.Save();
    

    I have to give some credit to this post for looping through each word.
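
Since the question is tagged regex, here is a minimal sketch of the same idea on a plain string, assuming the document text has already been extracted into one string. It is not part of the original answer and does not touch the DocX API. The class name `DeleteSections` and the method `Remove` are hypothetical names chosen for illustration. Note that it leaves behind whatever whitespace surrounded the markers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DeleteSections
{
    // Matches "@@DELETEBEGIN", then as little text as possible (including
    // line breaks, thanks to Singleline) up to and including "@@DELETEEND".
    private static readonly Regex Section = new Regex(
        @"@@DELETEBEGIN.*?@@DELETEEND",
        RegexOptions.Singleline);

    // Hypothetical helper: returns the text with every marked section removed.
    public static string Remove(string text)
    {
        return Section.Replace(text, "");
    }
}
```

For example, `DeleteSections.Remove("Hello @@DELETEBEGIN{x} gone @@DELETEEND Final")` returns `"Hello  Final"`. The lazy quantifier `.*?` matters when a document contains several marked sections, because a greedy `.*` would remove everything from the first @@DELETEBEGIN to the last @@DELETEEND.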