I am using a modified version of this scikit-image demo to create contours from the edges resulting from watershed segmentation of an image. In this result, each level has one contour only, made of row-column index pairs.
It is easy to display contours as in the demo. But what I'd like to do is use the enumerate loop to append each vertex of each contour to a pandas DataFrame, with the row and column indices in separate columns, and then add a level/contour index in a third column.
To illustrate, I will start with a small toy example where each contour has one index only. With this code:
import numpy as np
import pandas as pd

np.random.seed(131)
test = np.random.randint(50, size=5)
n_list = []
t_list = []
for n, t in enumerate(test):
    n_list.append(n)
    t_list.append(t)
# the two columns need distinct names: a repeated dict key keeps only the last value
contours_df = pd.DataFrame({'contour': n_list, 'points': t_list})
contours_df
I get this DataFrame:
A more representative example would be something like this:
np.random.seed(131)
test1 = np.random.randint(50, size=(5, 2, 2))
n_list1 = []
t_list1 = []
for n1, t1 in enumerate(test1):
    n_list1.append(n1)
    t_list1.append(t1)
contours_df1 = pd.DataFrame({'contour': n_list1, 'points': t_list1})
contours_df1
which gives me this DataFrame:
I can export this to an Excel file using XlsxWriter, like this:
# using XlsxWriter documentation example
writer = pd.ExcelWriter('contours_df1.xlsx', engine='xlsxwriter')
contours_df1.to_excel(writer, sheet_name='Sheet1')
writer.close()  # writes the file; ExcelWriter.save() was removed in pandas 2.0
To get this:
But what I would really like is to split the contours so as to get something like this as a final Excel output:
I would use pandas concatenation. For reasonably sized data, it's a matter of taste whether you do this or build up one list per column (although the list approach needs a second, nested loop to handle contours of arbitrary size). For larger data, I think concatenation should be faster because it makes use of NumPy/pandas vectorization where possible.
Here's an example:
import numpy as np
import pandas as pd
contours = [np.random.random((i, 2))
            for i in np.random.randint(3, 10, size=5)]
dataframes = []
for contour_id, contour in enumerate(contours):
    current_dataframe = pd.DataFrame(contour, columns=['row', 'column'])
    current_dataframe['contour'] = contour_id
    dataframes.append(current_dataframe)
contours_data = pd.concat(dataframes)
contours_data.to_excel('filename.xlsx', sheet_name='Sheet1')
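Applied to the test1 array from the question (5 contours of 2 points each), the same pattern might look like the sketch below. The name contours_df1_long, the ignore_index=True argument and the column order are my own choices for illustration, not something the question requires. Without ignore_index, each contour's rows keep their own 0-based index.

```python
import numpy as np
import pandas as pd

np.random.seed(131)
test1 = np.random.randint(50, size=(5, 2, 2))

frames = []
for contour_id, contour in enumerate(test1):
    # each contour is an (n_points, 2) array of row-column pairs
    frame = pd.DataFrame(contour, columns=['row', 'column'])
    frame['contour'] = contour_id
    frames.append(frame)

# ignore_index=True renumbers the rows 0..N-1 across all contours
contours_df1_long = pd.concat(frames, ignore_index=True)
# put the contour index first, then the two coordinates
contours_df1_long = contours_df1_long[['contour', 'row', 'column']]
print(contours_df1_long.shape)  # (10, 3): one row per vertex
```

Each vertex ends up on its own row, which should be close to the split layout asked for.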
Side note: you don't need to create an ExcelWriter if you are only writing a single sheet.
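For comparison, the list-per-column alternative mentioned above could be sketched roughly as follows. The contour lengths (3, 5 and 4 points) are made up for illustration; the inner loop is what lets each contour have a different number of vertices.

```python
import numpy as np
import pandas as pd

# hypothetical contours of different lengths, each an (n_points, 2) array
contours = [np.random.random((n_points, 2)) for n_points in (3, 5, 4)]

contour_ids, rows, columns = [], [], []
for contour_id, contour in enumerate(contours):
    # the inner loop visits every vertex, so contours may have any length
    for row, column in contour:
        contour_ids.append(contour_id)
        rows.append(row)
        columns.append(column)

contours_data = pd.DataFrame(
    {'contour': contour_ids, 'row': rows, 'column': columns}
)
```

This gives the same kind of table as the concatenation approach, but it appends one Python object per vertex, so I would expect it to be slower on large contours.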