I'm trying to automate edits to Word 2010 documents (about 40-50 files) using Python and the win32com component. Specifically, I need to select part of a line and replace it with new content. For example, if the original file contains "Label: 096-4296-05A ", I want it replaced by "Label: ___________ ". A plain search and replace works only if the number is the same in every file, which it is not, so I need a generic approach.
What I'm thinking is: if I could somehow select the line containing "Label: 096-4296-05A ", delete it, and then write a new line such as "Label: _______", that would do the job.
To do this I looked at the Selection object documentation, http://msdn.microsoft.com/en-us/library/bb221235%28v=office.12%29.aspx and http://msdn.microsoft.com/en-us/library/bb208865%28v=office.12%29.aspx, and tried to write the Python equivalent of the VB code.
Here is what I have written so far:
...///
########################
#
# Purpose : Replace all occurrences of `find_str` with `replace_str`
# in `word_file`
#
#######################
def delete_and_add_line(word_file, find_str, replace_str):
    wdFindContinue = 1
    wdReplaceAll = 2
    # Dispatch() attempts to do a GetObject() before creating a new one.
    # DispatchEx() just creates a new one.
    app = win32com.client.DispatchEx("Word.Application")
    app.Visible = 0
    app.DisplayAlerts = 0
    app.Documents.Open(IP_Directory_Dest + "\\" + word_file) ## (word_file)
    # expression.Execute(FindText, MatchCase, MatchWholeWord,
    # MatchWildcards, MatchSoundsLike, MatchAllWordForms, Forward,
    # Wrap, Format, ReplaceWith, Replace)
    app.Selection.Find.Execute(find_str, True, True, \
                               False, False, False, True, \
                               wdFindContinue, False, replace_str, wdReplaceAll)
    app.Selection.EndKey(Extend=win32com.client.constants.wdExtend)##.Select()
    # determine if the text is selected or not
    if (app.Selection.Type == win32com.client.constants.wdSelectionIP ):
        print 'Nothing is selected'
    else:
        print 'Text Selected '
    # to delete the selected line
    app.Selection.Delete()
    app.ActiveDocument.Close(SaveChanges=True)
    app.Quit()
...///
When I run this code, app.Selection.Find.Execute successfully finds and replaces the text passed to it. It even prints "Text Selected", which suggests that the text up to the end of the line is selected, but the selected line is never deleted.
Also, I'm not sure this is the correct way to select a line up to its end. (Appending .Select() to the call gives "AttributeError: 'int' object has no attribute 'Select'".)
**Is this the correct way to select a line up to its end?**
app.Selection.EndKey(Extend=win32com.client.constants.wdExtend)##.Select()
Let me know if I'm missing something here. Any suggestions are welcome.
Notice that you are replacing every match that Selection.Find returns (wdReplaceAll) and only then trying to extend the selection, after the last match. I don't think that is what you want. I also got an error from the way you extend the selection, because the constant wdExtend was not accepted by Word.
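The order "find one match, then extend, then overwrite" could look roughly like the sketch below. The helper name `replace_rest_of_line` is hypothetical, and `selection` is assumed to be `app.Selection`. The numeric values stand in for the Word enumerations wdLine, wdExtend, wdFindStop and wdReplaceNone; they are written as literals because `win32com.client.constants` is only populated once the Word type library has been generated (for example via `win32com.client.gencache.EnsureDispatch`).

```python
# Literal values of the Word enumerations used below (assumed from the
# Word object model: WdUnits, WdMovementType, WdFindWrap, WdReplace).
WD_LINE = 5          # wdLine
WD_EXTEND = 1        # wdExtend
WD_FIND_STOP = 0     # wdFindStop
WD_REPLACE_NONE = 0  # wdReplaceNone


def replace_rest_of_line(selection, find_str, new_text):
    """Find the next `find_str`, select from it to the end of the line,
    and overwrite that selection with `new_text`.

    `selection` is expected to be Word's `app.Selection` COM object.
    Returns True if a match was found and replaced, False otherwise.
    """
    # Execute(FindText, MatchCase, MatchWholeWord, MatchWildcards,
    #         MatchSoundsLike, MatchAllWordForms, Forward, Wrap,
    #         Format, ReplaceWith, Replace)
    found = selection.Find.Execute(find_str, True, False, False, False,
                                   False, True, WD_FIND_STOP, False,
                                   "", WD_REPLACE_NONE)
    if not found:
        return False
    # The match is now selected; extend the selection to the end of the line.
    selection.EndKey(WD_LINE, WD_EXTEND)
    # Overwrite everything from the start of the match to the end of the line.
    selection.Text = new_text
    return True
```

Note that wdLine refers to a line as laid out on the page, which can be shorter than the paragraph, so this only fits documents where the label sits on a single line.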
It is also good practice to close the document in a finally clause, so that Word is not left running in memory in an unknown state.
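One way to make that cleanup reusable is a small context manager. This is only a sketch: `word_application` and `_start_word` are hypothetical names, and the `factory` parameter exists so the helper can be exercised without Word installed.

```python
from contextlib import contextmanager


def _start_word():
    # Imported lazily so this module can be loaded on a machine without
    # pywin32; DispatchEx always starts a fresh Word instance.
    import win32com.client
    return win32com.client.DispatchEx("Word.Application")


@contextmanager
def word_application(factory=_start_word):
    """Yield a Word application object and always quit it afterwards."""
    app = factory()
    try:
        yield app
    finally:
        # Runs on normal exit and when the body raises, so no hidden
        # Word process is left behind.
        app.Quit()
```

The body of a `with word_application() as app:` block would then open, edit and close the document, and Word is shut down even if one of those steps fails.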
I think the right solution is to iterate over all the paragraphs in the document and use regular expressions to match and replace the text you are after. Regular expressions are much more powerful than Word's find function. You can read the text of a paragraph through the Text property of its Range object. Something like:
import win32com.client
import re

# This is the regular expression to match the text you are after
regexp = r"Label: [0-9A-Z-]+"

def replace_label(word_file):
    app = win32com.client.DispatchEx("Word.Application")
    app.Visible = 0
    app.DisplayAlerts = 0
    try:
        doc = app.Documents.Open("C:\\" + word_file)
        try:
            # Iterate over all the paragraphs. The collection is 1-based,
            # so the last valid index is Count itself.
            for parNo in range(1, doc.Paragraphs.Count + 1):
                paragraph = doc.Paragraphs(parNo)
                # Get the text of the paragraph.
                current_text = paragraph.Range.Text
                # Check if there is a match in the paragraph
                if re.search(regexp, current_text):
                    # We found a match... do the replace
                    paragraph.Range.Text = re.sub(regexp, "Label: ___________", current_text)
        finally:
            # Save and close the document even if the loop fails
            doc.Close(SaveChanges=True)
    finally:
        # Always shut Word down, even if the document could not be opened
        app.Quit()
I am not sure the regular expression I am suggesting is exactly right, so you may have to tweak it. The best guides to regular expressions that I know are:
http://www.zytrax.com/tech/web/regex.htm and http://docs.python.org/2/library/re.html
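Since the pattern may need tweaking, it can help to try it on plain strings before touching any document. The sketch below does that with a hypothetical helper `blank_label`; it assumes every label has the form "Label: " followed by digits, capital letters and hyphens, as in the example from the question.

```python
import re

# Assumed label format: "Label:" plus optional whitespace, then digits,
# capital letters and hyphens. The group keeps the prefix untouched.
LABEL_RE = re.compile(r"(Label:\s*)[0-9A-Z-]+")


def blank_label(text, blank="___________"):
    """Replace the number after "Label:" with `blank`, keeping the prefix."""
    return LABEL_RE.sub(lambda m: m.group(1) + blank, text)


# Word paragraph text ends with "\r", which the pattern leaves alone.
sample = "Label: 096-4296-05A \r"
result = blank_label(sample)  # "Label: ___________ \r"
```

Text that is already blanked is left unchanged, because the underscore is not in the character class, so running the replacement twice on the same document should be harmless.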