I'm trying to load and edit a dataframe from an xlsx file. The file is located in the path stored in the variable einlesen. Once the bug is fixed, I want to delete a row and save the new dataframe to a new xlsx file in a specific path.
import os
import re
import pandas as pd
import glob
import time
def setwd():
    from pathlib import Path
    import os
    home = str(Path.home())
    os.chdir(home + r'\...\...\Staffing Report\Input\...\Raw_Data')
    latest = home + r'\...\...\Staffing Report\Input\MyScheduling\Raw_Data'
    folders = next(os.walk(latest))[1]
    creation_times = [(folder, os.path.getctime(folder)) for folder in folders]
    creation_times.sort(key=lambda x: x[1])
    most_recent = creation_times[-1][0]
    print('test' + most_recent)
    os.chdir(latest + '\\' + most_recent + '\\')
    print('current cwd is: ' + os.getcwd())
    save_dir = home + '\...\...\Staffing Report\Input\MyScheduling\Individual Status All\PBI\\' + 'Individual_Status.xlsx'
def rowdrop():
    einlesen = os.getcwd()
    print('test einlesen: ' + einlesen)
    df = pd.DataFrame()
    df = pd.read_excel('Individual Status.xls', sheet_name = 'Individual Status Raw Data')
    df = pd.DataFrame(df)
#main
setwd()
rowdrop()
df.to_excel(save_dir, index = False)
print(df)
When I try to run the code, it always fails with:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-92-060708f6b065> in <module>
2 rowdrop()
3
----> 4 df.to_excel(save_dir, index = False)
5
6 print(df)
NameError: name 'df' is not defined
You get the error because you only defined df inside the rowdrop function. A variable assigned inside a function is local to that function and cannot be accessed from outside it unless you do something to change that.
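As a minimal sketch of that scoping rule, here is the same situation reduced to a few lines. The names `load_number` and `number` are made up for illustration and are not part of your code:

```python
def load_number():
    # `number` is local: it only exists while load_number() is running.
    number = 42
    return number


load_number()  # the return value is thrown away here

try:
    print(number)  # `number` was never defined at module level
except NameError as exc:
    message = str(exc)
    print(message)  # name 'number' is not defined
```

This is the same NameError you see for df, just without pandas involved.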
Change your function to return the df:
def rowdrop():
    einlesen = os.getcwd()
    print('test einlesen: ' + einlesen)
    df = pd.DataFrame()
    df = pd.read_excel('Individual Status.xls', sheet_name = 'Individual Status Raw Data')
    df = pd.DataFrame(df)
    return df
And assign the returned value of the function call to a variable:
df = rowdrop()
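Putting the return and the assignment together, the whole flow might look roughly like the sketch below. The in-memory DataFrame is only a stand-in for the `pd.read_excel` call so that the example runs without your Excel file; the column names, the `drop_first_row` helper, and the choice of which row to drop are assumptions for illustration.

```python
import pandas as pd


def rowdrop():
    # Stand-in for pd.read_excel('Individual Status.xls', ...):
    # hypothetical data so the sketch runs without the real file.
    df = pd.DataFrame({"Name": ["A", "B", "C"], "Status": ["x", "y", "z"]})
    return df  # hand the DataFrame back to the caller


def drop_first_row(df):
    # Hypothetical helper: drop the row with index label 0,
    # then renumber the remaining rows from 0.
    return df.drop(index=0).reset_index(drop=True)


df = rowdrop()           # df now exists at module level
df = drop_first_row(df)  # reassign, since drop() returns a new DataFrame
print(df)

# With the real data you would then save it, for example:
# df.to_excel(save_dir, index=False)
```

Note that save_dir would run into the same NameError if it is assigned inside setwd(); returning it from that function and assigning the result in the same way would avoid that.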
Another way, which is considered bad practice, is to use the global statement to make the df variable global:
def rowdrop():
    global df
    einlesen = os.getcwd()
    print('test einlesen: ' + einlesen)
    df = pd.DataFrame()
    df = pd.read_excel('Individual Status.xls', sheet_name = 'Individual Status Raw Data')
    df = pd.DataFrame(df)
With the above method, you won't need to assign the result of the function call to a variable, but please do not use it; see "Why are global variables evil?"
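One concrete reason, sketched with made-up names: a function that writes to a global can silently replace a value that other code relies on, and nothing at the call site shows that it happened.

```python
counter = 10  # module-level value that some other code depends on


def reset_with_global():
    global counter
    counter = 0  # rebinds the module-level name as a hidden side effect


def reset_with_return():
    return 0  # the caller decides what to do with the result


before = counter
reset_with_global()  # looks harmless, but counter has been overwritten
after = counter
print(before, after)
```

With the returning version, the overwrite only happens if the caller explicitly writes `counter = reset_with_return()`, so the change is visible where it occurs.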