I am working on an NLP project analyzing the words spoken by characters in The Office. Part of this project involves making a network diagram of which characters talk to each other for a given episode.
This will be shown in a Dash app by allowing a user to select dropdowns for 4 parameters: season, episode, character1, and character2.
Here is a relevant snippet of my code so far:
#Import libraries
import pandas as pd
import numpy as np
import dash
import dash_core_components as dcc
import dash_html_components as html
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output, State
#Load data
sheet_url = 'https://docs.google.com/spreadsheets/d/18wS5AAwOh8QO95RwHLS95POmSNKA2jjzdt0phrxeAE0/edit#gid=747974534'
url = sheet_url.replace('/edit#gid=', '/export?format=csv&gid=')
df = pd.read_csv(url)
#Set parameters
choose_season = df['season'].unique()
choose_episode = df['episode'].unique()
choose_character = ['Andy','Angela', 'Darryl', 'Dwight', 'Jan', 'Jim','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Roy','Ryan','Stanley','Toby']
#Define app layout
app = dash.Dash()
server = app.server
app.layout = html.Div([
dbc.Row([
dbc.Col(
dcc.Dropdown(
id='dropdown1',
options=[{'label': i, 'value': i} for i in choose_season],
value=choose_season[0]
), width=3
),
dbc.Col(
dcc.Dropdown(
id='dropdown2',
options=[{'label': i, 'value': i} for i in choose_episode],
value=choose_episode[0]
), width=3
),
dbc.Col(
dcc.Dropdown(
id='dropdown3',
options=[{'label': i, 'value': i} for i in choose_character],
value=choose_character[0]
), width=3
),
dbc.Col(
dcc.Dropdown(
id='dropdown4',
options=[{'label': i, 'value': i} for i in choose_character],
value=choose_character[1]
), width=3
)
])
])
if __name__=='__main__':
app.run_server()
In order to have this work efficiently, I would like to have the following dependencies in the dropdown menus:
1.) The selection of the first dropdown menu updates the dropdown menu ie: Season updates possible episodes
2.) The selection of the first two dropdown menus updates the 3rd and 4th dropdown menus ie: Season, Episode updates possible characters (if a character was not in that episode, they will not appear)
3.) The selection of the third dropdown menu updates the fourth dropdown menu ie: If a character is selected in the third dropdown menu, they can not be selected in the fourth (can't select the same character twice)
I understand one way to do this is to make a massive season to episode dictionary and then an even larger season to episode to character dictionary.
I've already made the code to process the season to episode dictionary:
@app.callback(
Output('dropdown2', 'options'), #--> filter episodes
Output('dropdown2', 'value'),
Input('dropdown1', 'value') #--> choose season
)
def set_episode_options(selected_season):
return [{'label': i, 'value': i} for i in season_episode_dict[selected_season]], season_episode_dict[selected_season][0]
I can definitely build these dictionaries, but this seems like a really inefficient use of time. Does anyone know of a way to build these dictionaries with just a few lines of code? Not sure how to approach building these in the easiest way possible. Also, if you have an idea for a better way to approach this problem, please let me know that too.
I think I see what you're asking about now. Something like this should get you a basic dictionary, which you could then modify for the options
param for the dropdowns.
df = pd.read_csv(url)
season_episode_character_dictionary = {}
for season in df['season'].unique.tolist():
df_season = df[df['season'].eq(season)]
season_episode_character_dictionary[season] = {}
for episode in df_season['episode'].unique.tolist():
df_episode = df_season[df_season['episode'].eq(episode)]
characters = df_episode['characters'].unique.tolist()
season_episode_character_dictionary[season][episode] = characters