Intro
I am writing a piece of python code that uses pcycopg2, pandas and datetime that retrieves information from a PGAdmin database, which filters the data to the last 14 days of activity.
The columns I am pulling from the database are: id (Unique Number), createdby (Name), created (Date)
I pull all of that information as I thought I could transfer it into a dataframe and then use a mask to get the bits that I want.
Code
import psycopg2
import pandas as pd
import datetime
Today = datetime.datetime.now() #gets todays date
TimeFrameInDays = datetime.timedelta(days = 14) #gets int amount of days
TwoWeeksAgo = Today - TimeFrameInDays #calculates date two weeks ago
Query = 'SELECT createdby, created, id FROM "02 Planning"."fibre"'
print("Robo-Cop Initialised") #DEV
try:
connection = psycopg2.connect(database = DB_Name,
user = DB_User,
password = DB_Pass,
host = DB_Host,
port = DB_Port)
print("Database connected")
except (Exception, psycopg2.Error) as error:
#if error occurs, message is returned
print("Error Occured Trying to Connect")
finally:
cursor = connection.cursor() #makes refrencing the cursor easier
cursor.execute(Query) #executes query
Data = cursor.fetchall() #saves results
DataFrame = pd.DataFrame(Data) #assembles into dataframe
DataFrame.rename(columns={0:'Created By', 1:'Created On', 2:'Database Id'}) #renames columns
#print(DataFrame)
mask = (DataFrame['Created On'] > TwoWeeksAgo.date) & (DataFrame['Created On'] <= Today.date)
DataFrame = DataFrame.loc[mask]
print(DataFrame)
Error
Traceback (most recent call last):
File "c:\Users\\Documents\GitHub\\main.py", line 42, in <module>
mask = (DataFrame['Created On'] > TwoWeeksAgo.date) & (DataFrame['Created On'] <= Today.date)
File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\frame.py", line 3805, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\indexes\range.py", line 395, in get_loc
raise KeyError(key)
KeyError: 'Created On'
Why my question is not a duplicate
My question has nothing to do with renaming columns, I am not trying to edit how I rename my columns. What i am trying to do is apply a mask to a dataframe that only shows me dates in the last 14 days.
New Error
With this piece of code
DataFrame = pd.DataFrame(Data, columns=['Created By','Created On','Database Id'])
,
i get
File "c:\Users\\Documents\GitHub\\main.py", line 42, in <module>
mask = (DataFrame['Created On'].dt.date > TwoWeeksAgo.date) & (DataFrame['Created On'].dt.date <= Today.date)
File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\generic.py", line 5902, in __getattr__
return object.__getattribute__(self, name)
File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\accessor.py", line 182, in __get__
accessor_obj = self._accessor(obj)
File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\indexes\accessors.py", line 512, in __new__
raise AttributeError("Can only use .dt accessor with datetimelike values")
AttributeError: Can only use .dt accessor with datetimelike values. Did you mean: 'at'?
Complete solution:
#pass columns names to DataFrame constructor
DataFrame = pd.DataFrame(Data, columns=['Created By','Created On','Database Id'])
Today = datetime.datetime.now() #gets todays date
TimeFrameInDays = datetime.timedelta(days = 14) #gets int amount of days
#calculates date two weeks ago
TwoWeeksAgo = Today - TimeFrameInDays
#convert column to datetimes
DataFrame['Created On'] = pd.to_datetime(DataFrame['Created On'])
#compare dates
mask = (DataFrame['Created On'].dt.date > TwoWeeksAgo.date()) &
(DataFrame['Created On'].dt.date <= Today.date())
DataFrame1 = DataFrame.loc[mask]
print(DataFrame1)
Or use Timestamp.normalize
for datetimes without times (times are 00:00:00
), so possible compare datetimes columns:
Today = pd.Timestamp.now().normalize()
TwoWeeksAgo = Today - pd.Timedelta(days = 14)
mask = (DataFrame['Created On'] > TwoWeeksAgo) & (DataFrame['Created On'] <= Today)
DataFrame1 = DataFrame.loc[mask]
print(DataFrame1)
Or:
Today = pd.Timestamp.now().normalize()
TwoWeeksAgo = Today - pd.Timedelta(days = 14)
mask = DataFrame['Created On'].between(TwoWeeksAgo, Today, inclusive='right')
DataFrame1 = DataFrame.loc[mask]