Sorry for the vague title. My problem is a strange one; here it is:
I am a PhD student and I am writing a Tkinter interface to manipulate my data (using matplotlib, and pandas to build DataFrames from CSV files) so that I can visualize it and create figures easily. Each opened CSV file is loaded as an element (which I call a widget) in a list on the right of the window. Here is how it looks: (screenshot: interface with two CSV files loaded)
It already works well for basic plots, but I need to be able to select specific zones of a plot (using matplotlib's RectangleSelector). The goal is to add as many rectangle selections as I want by clicking "add selection" after drawing each rectangle, then click "plot selection" to open a new instance of the interface containing only the selected data. I can also click "cancel selection" to discard the selection. Here is the new instance that appears after clicking "plot selection": (screenshot: second instance called by the first one)
Problems:
Here is the initialization of the two classes, "PlotWindow" (a poor name for the main window interface) and "Data_Widget" (each widget represents one DataFrame, shown on the right of the main window). It is long because I left the comments in; the whole code is about 600 lines:
class PlotWindow(tk.Frame):
    """ This Class creates the main window with the curve plot and general buttons like open files,
    Clear all, Plot all.
    Arguments:
        -master: a parent window, not necessary
        -existing_data_widgets_list: used by the button "plot multiple selection" to create a new
         instance of PlotWindow with already selected data
        -existing_filepath_list: the corresponding list of filepath for the existing_data_widgets_list
    Methods:
        -create_general_controls : creates general controls like Open file, clear all, plot all...
        -matplotlib_spec : defines default matplotlib specifications
        -display_error : displays a message of error, used by other methods (example: "no data loaded"
         or "no selection"); can also be used for informative messages (example: "plot cleared")
        -load_folder : allows to select multiple folders and detects all CSV file in them
        -load_CSVfile : allows to select multiple CSV files
        -plot_widget : called by each widget method master_plot to plot widget curve with selected
         widget variables
        -clear_widget_plots : called by each widget method master_clear_plot to clear the widget plot(s)
        -clear_all_plot : clear all plots
        -add_selection : allows to select a range of data from multiple curves with rectangle selector;
         can be called again to add selection to already selected data
        -cancel_selection : empty the selection of the add_selection method
        -plot_add_selection : plot the selection of the add_selection method in another PlotWindow
         instance; creates a widget in this new instance for each widget that have been selected in
         previous PlotWindow
        -create_createvar_popup : create popup that allows to the create a new variable as function of
         existing variables
        -linear_regression_selection : make a linear regression of rectangle selection; plot the line
         on the rectangle selection
    """
    def __init__(self, master=None, existing_data_widgets_list=[], existing_filepath_list=[]):
        super().__init__(master)
        root.title("its plotin time")
        self.master = master
        self.pack()
        self.widgets_list = []  # stores all instances of Data_Widget class created
        self.lines = []  # stores the plt.plot objects (the curves) to be able to hide/show them
        self.lines_ids = []  # stores a number for each plot (each plot_id = widget_id it is plotted from)
        self.current_widget_id = 0  # number for each widget created so each widget is identifiable
        self.widget_ids_list = []  # stores each widget ids (1 data widget = 1 id)
        self.selected_widget_data_list = []  # used for multiple selection
        self.selected_widget_filepath_list = []  # used for multiple selection
        self.existing_data_widgets_list = existing_data_widgets_list  # only used if PlotWindow is created with existing widgets
        self.existing_filepath_list = existing_filepath_list  # only used if PlotWindow is created with existing widgets
        self.create_general_controls()  # create all buttons and graphic elements
        self.matplotlib_spec()
    ...
class Data_Widget():
    """ This Class creates one data widget: a loaded dataframe with its own controls, shown on the
    right of the main window.
    Arguments:
        -master: the parent window, automatically the PlotWindow class
        -data: data loaded into the widget self.data used to plot the curve
        -filepath: the filepath of the loaded data
        -color: color of the curve
    Methods:
        -create_widget_controls : create the controls (buttons...)
        -set_widget_color : set the color of the associated curve
        -set_x_var and set_y_var : set the x and y var of the plotted curve
        -master_plot : gives the order of plotting the curve to the master class (PlotWindow)
         (method plot_widget)
        -master_clear_plot : gives the order of clearing the plot to the master class (PlotWindow)
         (method clear_widget_plots)
        -delete_widget : delete the widget
        -create_exportload_popup : export the data contained in the widget as CSV file.
         Used where widget has been created by selecting data from another widget with PlotWindow
         multiple selection method
    """
    def __init__(self, master, data = pd.DataFrame(), filepath='Nothing loaded', color="blue"):
        super().__init__()
        self.master = master
        # Create variables for data and plot
        self.widget_id = 0  # id of the widget
        self.filepath = filepath  # filepath of the loaded data
        self.data = data  # dataframe loaded into the widget
        self.selected_data = pd.DataFrame()  # data selected by the user
        self.x_var = None  # x variable of the plotted curve
        self.y_var = None  # y variable of the plotted curve
        self.color = color  # color of the plotted curve
        self.create_widget_controls()  # create the controls (buttons...)
    ...
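A side note on the two signatures above: existing_data_widgets_list=[], existing_filepath_list=[] and data = pd.DataFrame() are mutable default arguments. Python evaluates a default value once, when the function is defined, so every instance created without that argument receives the same object. The sketch below uses two hypothetical classes (SharedDefault and FreshDefault, not part of the interface) to show the effect and the usual None workaround. Whether this plays any role in the behaviour described in this post is only an assumption.

```python
class SharedDefault:
    """Hypothetical class: the list default is created once and shared."""
    def __init__(self, items=[]):
        self.items = items


class FreshDefault:
    """Hypothetical class: a None sentinel gives each instance its own list."""
    def __init__(self, items=None):
        self.items = [] if items is None else items


a = SharedDefault()
b = SharedDefault()
a.items.append("selection 1")
shared_leak = b.items  # b sees the data appended through a

c = FreshDefault()
d = FreshDefault()
c.items.append("selection 1")
fresh_leak = d.items  # d has its own, still empty, list

print(shared_leak)  # ['selection 1']
print(fresh_leak)   # []
```

The same pattern would apply to the DataFrame default: use data=None in the signature and create pd.DataFrame() inside __init__.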
Here is the code for the functions "add selection", "plot selection" and "cancel selection":
def add_selection(self):
    if not self.widgets_list:
        self.display_error('no data selected')
        return
    onselect_x1, onselect_y1, onselect_x2, onselect_y2 = self.selection_coords
    selected_data_index=[]
    mask=[]
    present_selected_data=[]
    idx=[]
    for i in self.widgets_list:
        # this variable will be the list of index where the row corresponds to the selection
        mask = (i.data[i.x_var] >= onselect_x1) & (i.data[i.x_var] <= onselect_x2) & \
               (i.data[i.y_var] >= onselect_y1) & (i.data[i.y_var] <= onselect_y2)
        selected_data_index = np.where(mask)[0]
        present_selected_data=i.data.iloc[selected_data_index]
        # tests if we already have selected data and, if yes (.empty is False), we put the
        # selection in growing order with a row of NaN values in between
        if i.selected_data.empty:
            i.selected_data = i.data.iloc[selected_data_index]
        else:
            # present_selected_data = i.data.iloc[selected_data_index]
            idx = np.searchsorted(i.selected_data.index, present_selected_data.index[0])
            # I don't quite understand this but chatgpt is too strong
            i.selected_data = pd.concat([
                pd.DataFrame(np.nan, index=[0], columns=i.data.columns),
                i.selected_data.iloc[:idx],
                pd.DataFrame(np.nan, index=[0], columns=i.data.columns),
                present_selected_data,
                i.selected_data.iloc[idx:],
                pd.DataFrame(np.nan, index=[0], columns=i.data.columns)
            ])
            i.selected_data.reset_index(drop=True, inplace= True)
        self.selected_widget_data_list.append(i.selected_data)
        self.selected_widget_filepath_list.append(i.filepath)
    if self.selected_widget_data_list == []:
        self.display_error('no data selected')
        return
    if self.cancel_selection_button.winfo_viewable() == 0:
        self.cancel_selection_button.pack()
    if self.plot_add_selection_button.winfo_viewable() == 0:
        self.plot_add_selection_button.pack()
    self.display_error('added to selection')

def cancel_selection(self):
    self.selected_widget_data_list = []
    self.selected_widget_filepath_list = []
    self.cancel_selection_button.pack_forget()
    self.plot_add_selection_button.pack_forget()
    self.display_error('selection canceled')

def plot_add_selection(self):
    self.display_error('opening another window')
    root = tk.Tk()  # creates a new window to put the new PlotWindow into
    PlotWindow(root, self.selected_widget_data_list, self.selected_widget_filepath_list)
    root.mainloop()
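The rectangle test used in add_selection can be tried in isolation. This is a minimal sketch, assuming a small made-up DataFrame with columns named x and y and hard-coded rectangle corners (in the interface the corners come from self.selection_coords). select_in_rectangle is a hypothetical helper, not a function from the interface.

```python
import pandas as pd


def select_in_rectangle(data, x_var, y_var, x1, y1, x2, y2):
    """Return the rows of `data` whose (x_var, y_var) point lies inside the rectangle."""
    mask = (
        (data[x_var] >= x1) & (data[x_var] <= x2)
        & (data[y_var] >= y1) & (data[y_var] <= y2)
    )
    # Boolean indexing keeps the original row labels of the selected rows
    return data[mask]


# Made-up data: five points on a straight line
data = pd.DataFrame({"x": [0, 1, 2, 3, 4], "y": [0, 10, 20, 30, 40]})
selected = select_in_rectangle(data, "x", "y", x1=1, y1=5, x2=3, y2=25)
print(selected)
```

With a boolean Series as mask, data[mask] should return the same rows as data.iloc[np.where(mask)[0]].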
I tried resetting all the temporary variables used by the "add_selection" function by placing
selected_data_index=[]
mask=[]
present_selected_data=[]
idx=[]
at the end of the function but it changed nothing.
I am still learning a lot in Python and consider myself a beginner. There may be obvious issues I didn't see, but I feel a bit overwhelmed by all this code. This is my first post here, so I hope I was clear enough.
Thank you @TheLizzard for your answer, but I just found the solution! The problem was that I needed to reset each widget's selected_data. I added the line i.selected_data=pd.DataFrame() at the end of the for loop in the add_selection function and it worked perfectly. I then moved this reset into the cancel_selection function for clarity.
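To make the cause concrete: the selection is stored on each widget, so clearing only the lists held by the window leaves the old rows in place. A minimal sketch, using a hypothetical FakeWidget stand-in (the real Data_Widget has many more attributes) and a simplified accumulation without the NaN separator rows:

```python
import pandas as pd


class FakeWidget:
    """Hypothetical stand-in for Data_Widget: only data and selected_data."""
    def __init__(self, data):
        self.data = data
        self.selected_data = pd.DataFrame()


def add_rows(widget, rows):
    # Simplified accumulation: new rows are appended to whatever is already selected
    new_rows = widget.data.iloc[rows]
    if widget.selected_data.empty:
        widget.selected_data = new_rows
    else:
        widget.selected_data = pd.concat([widget.selected_data, new_rows])


def cancel(widgets):
    # The fix: reset the per-widget state, not only the window-level lists
    for widget in widgets:
        widget.selected_data = pd.DataFrame()


widget = FakeWidget(pd.DataFrame({"x": [0, 1, 2, 3], "y": [5, 6, 7, 8]}))
add_rows(widget, [0, 1])
rows_before_cancel = len(widget.selected_data)

cancel([widget])
add_rows(widget, [3])
rows_after_cancel = len(widget.selected_data)

print(rows_before_cancel, rows_after_cancel)  # 2 1
```

Without the loop in cancel, the second add_rows call would find the two old rows still on the widget and add the new row to them.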
My intuition was also that instances of Data_Widget or PlotWindow were not being destroyed, and the problem could perhaps have been solved that way too, but that is beyond me for now.
After fixing this I noticed another bug: after clicking "plot selection", making another selection created a second widget for the same DataFrame. The first widget contained the first selection, and the second widget contained both the first and second selections. I fixed it by moving the population of self.selected_widget_data_list and self.selected_widget_filepath_list into the plot_add_selection function.
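The duplicate-widget bug follows from appending to a window-level list on every "add selection" click. A minimal sketch with plain lists standing in for DataFrames; Widget, add_appending and collect_at_plot_time are hypothetical names used only for illustration:

```python
class Widget:
    """Hypothetical stand-in: `selected` holds the rows chosen so far."""
    def __init__(self, filepath):
        self.filepath = filepath
        self.selected = []


def add_appending(widget, rows, window_list):
    # Old behaviour: every click appends the widget's current selection again
    widget.selected = widget.selected + rows
    window_list.append(widget.selected)


def collect_at_plot_time(widgets):
    # New behaviour: build the list once, when "plot selection" is clicked
    return [w.selected for w in widgets if w.selected]


w = Widget("a.csv")
old_list = []
add_appending(w, [1, 2], old_list)
add_appending(w, [7, 8], old_list)
new_list = collect_at_plot_time([w])

print(old_list)  # [[1, 2], [1, 2, 7, 8]]
print(new_list)  # [[1, 2, 7, 8]]
```

The old list ends up with two entries for one file, which matches the two widgets that appeared in the new window; rebuilding the list at plot time gives one entry per widget.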
I also added a helper function inside add_selection for readability.
Here is the new version of the 3 functions:
def add_selection(self):
    # print('before add, selected widget data list: \n', self.selected_widget_data_list)
    if not self.widgets_list:
        self.display_error('no data selected')
        return

    def process_widget_dataselection(widget):
        onselect_x1, onselect_y1, onselect_x2, onselect_y2 = self.selection_coords
        mask = (widget.data[widget.x_var] >= onselect_x1) & \
               (widget.data[widget.x_var] <= onselect_x2) & \
               (widget.data[widget.y_var] >= onselect_y1) & \
               (widget.data[widget.y_var] <= onselect_y2)
        selected_data_index = np.where(mask)[0]
        present_selected_data = widget.data.iloc[selected_data_index]
        if widget.selected_data.empty:
            widget.selected_data = present_selected_data
        else:
            idx = np.searchsorted(widget.selected_data.index, present_selected_data.index[0])
            widget.selected_data = pd.concat([
                pd.DataFrame(np.nan, index=[0], columns=widget.data.columns),
                widget.selected_data.iloc[:idx],
                pd.DataFrame(np.nan, index=[0], columns=widget.data.columns),
                present_selected_data,
                widget.selected_data.iloc[idx:],
                pd.DataFrame(np.nan, index=[0], columns=widget.data.columns)
            ])
            widget.selected_data.reset_index(drop=True, inplace=True)

    nb_widget_selected = 0
    for widget in self.widgets_list:
        process_widget_dataselection(widget)
        if not widget.selected_data.empty:
            nb_widget_selected += 1
    if nb_widget_selected == 0:
        self.display_error('no data selected')
        return
    if self.cancel_selection_button.winfo_viewable() == 0:
        self.cancel_selection_button.pack()
    if self.plot_add_selection_button.winfo_viewable() == 0:
        self.plot_add_selection_button.pack()
    self.display_error('added to selection')

def cancel_selection(self):
    self.selected_widget_data_list = []
    self.selected_widget_filepath_list = []
    for widget in self.widgets_list:
        widget.selected_data = pd.DataFrame()
    self.cancel_selection_button.pack_forget()
    self.plot_add_selection_button.pack_forget()
    self.display_error('selection canceled')

def plot_add_selection(self):
    self.display_error('opening another window')
    self.selected_widget_data_list=[]
    self.selected_widget_filepath_list=[]
    widgets_x_var=[]
    widgets_y_var=[]
    for widget in self.widgets_list:
        if widget.selected_data.empty:
            self.display_error('no data selected in ' + str(widget.filepath))
        else:
            self.selected_widget_data_list.append(widget.selected_data)
            self.selected_widget_filepath_list.append(widget.filepath)
            widgets_x_var.append(widget.x_var)
            widgets_y_var.append(widget.y_var)
    print('before plot cancel, number of data selected_widget_data_list: \n', len(self.selected_widget_data_list))
    root = tk.Tk()  # creates a new window to put the new PlotWindow into
    PlotWindow(root, self.selected_widget_data_list, self.selected_widget_filepath_list, widgets_x_var, widgets_y_var)
    root.mainloop()
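About the pd.concat call in add_selection: the all-NaN rows act as separators between selections. matplotlib does not draw a line segment through NaN points, so each selection should show up as its own piece of curve instead of being joined to the next one. A minimal sketch of the separator idea only, assuming two made-up chunks of data (it does not reproduce the searchsorted ordering):

```python
import pandas as pd

# Two made-up selections taken from the same curve
first = pd.DataFrame({"x": [0.0, 1.0], "y": [5.0, 6.0]})
second = pd.DataFrame({"x": [4.0, 5.0], "y": [9.0, 10.0]})

# One row filled with NaN, with the same columns as the data
separator = pd.DataFrame(float("nan"), index=[0], columns=first.columns)

combined = pd.concat([first, separator, second]).reset_index(drop=True)

# Positions of the rows where every column is NaN (the gaps in the plotted line)
gap_rows = combined.index[combined.isna().all(axis=1)].tolist()

print(combined)
print(gap_rows)  # [2]
```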
Sorry for the useless post; somebody please tell me whether I should delete it.