I have 3000 .docx files spread across several directories and subdirectories. I need to build a list that contains each filename together with information extracted from the tables in that file. I have already collected all the relevant .docx files in the list targets_in_dir, separating them from the irrelevant files.
Question: I would like to iterate through targets_in_dir and extract all tables from each .docx:
len_target =len(targets_in_dir)
file_processed=[]
string_tables=[]

for i in len_target:

    doc = docx.Document(targets_in_dir[i])
    file_processed.append(targets_ind[i])
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                str.split('MANUFACTURER')
                string_tables.append(cell.text)
I get the error 'int' object is not iterable
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-4847866a9234> in <module>
      4 string_tables=[]
      5
----> 6 for i in len_target:
      7
      8     doc = docx.Document(targets_in_dir[i])

TypeError: 'int' object is not iterable
What am I doing wrong?
It looks like you are trying to iterate over len_target = len(targets_in_dir), which is an int. Because an int is not an iterable object, your for loop fails. A for loop needs an iterable object to work.
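Changing it to
for i in range(len_target):
    # do stuff
or
for i in targets_in_dir:
    # do stuff
is a good place to start. Note that in the second form i is the list element itself (the file path), not an index, so use i directly instead of targets_in_dir[i].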
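To make the difference concrete, here is a minimal sketch that reproduces the error and shows both working loop forms. The file names in targets_in_dir are made-up placeholders, not the asker's real paths.

```python
# Hypothetical stand-in for the real list of .docx paths.
targets_in_dir = ["a.docx", "b.docx", "c.docx"]

# Looping over the length (an int) raises the TypeError from the question.
try:
    for i in len(targets_in_dir):
        pass
except TypeError as exc:
    error_message = str(exc)

# Option 1: loop over indices; i is an int, so index into the list.
by_index = []
for i in range(len(targets_in_dir)):
    by_index.append(targets_in_dir[i])

# Option 2: loop over the list itself; path is already the element.
direct = []
for path in targets_in_dir:
    direct.append(path)
```

Both working loops visit the same elements in the same order. The second form is usually preferred when the index itself is not needed.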
Also, file_processed.append(targets_ind[i]) has a typo: targets_ind should be targets_in_dir.
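Putting both fixes together, the whole loop might look like the sketch below. It assumes the python-docx package, which provides docx.Document. The function name collect_table_text and the open_document parameter are hypothetical, added only so the loop can be tried without real files. The str.split('MANUFACTURER') line from the question is left out: as written it does not act on cell.text, and what it was meant to do is not clear from the question.

```python
def collect_table_text(paths, open_document=None):
    """Return (files_processed, cell_texts) for all table cells in the given .docx paths."""
    if open_document is None:
        # Assumption: python-docx is installed; imported lazily so that a
        # substitute loader can be passed in instead.
        import docx
        open_document = docx.Document

    files_processed = []
    cell_texts = []
    for path in paths:  # iterate over the list itself, not its length
        doc = open_document(path)
        files_processed.append(path)
        for table in doc.tables:
            for row in table.rows:
                for cell in row.cells:
                    cell_texts.append(cell.text)
    return files_processed, cell_texts
```

With python-docx installed, a call such as file_processed, string_tables = collect_table_text(targets_in_dir) would fill the two lists from the question.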