I have two columns, one with values that represent a time and another with values that represent a date (both are stored as floats). Each column contains data like the following:
df['Time']
540.0
630.0
915.0
1730.0
2245.0
df['Date']
14202.0
14202.0
14203.0
14203.0
I need to create new columns with the correct format for these two columns, so that I can analyze the data with date and time in separate columns.
For ['Time'], I need to convert the format to:
540.0 = 5h40, or 5.40 am
2245.0 = 22h45, or 10.45 pm
For ['Date'], I need to convert the format as follows.
Each number represents a count of days,
where 0 ("days") = 01-01-1980
So 01-01-1980 + 14202.0 = 19-11-2018
and 01-01-1980 + 14203.0 = 20-11-2018.
This is possible in Excel, but I need a way to do it in Python.
I tried several approaches but nothing worked. For example, one of the attempts was the code below:
# Creating a variable from the data in column ['Date'], hoping to turn the day counts into dates:
Time1 = pd.to_datetime(df["Date"])
# Printing row 55384 shows that 14203 ends up at the end of the timestamp, as part of the time, which is not what I want:
print(Time1.loc[[55384]])
55384 1970-01-01 00:00:00.000014203
Name: Date, dtype: datetime64[ns]
# Printing the same row (55384) of the original column to check the value 14203.0 used above:
print(df["Date"].loc[[55384]])
55384 14203.0
Name: Date, dtype: float64
For ['Time'], I have the same problem: I can't get a time without a date. I also tried inserting ':', but that doesn't work even after converting the data to strings.
I hope someone can help me with this. If anything is unclear, please let me know; it is not always easy to explain.
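For the Time column, one possible approach is to read each float as an HHMM number: integer division by 100 gives the hour and the remainder gives the minutes. The following is only a sketch, assuming every value really is of the form HHMM (so 540.0 means 05:40) and that missing values should stay missing. The helper names `hhmm_to_time`, `as_24h` and `as_12h` and the new column names are made up for illustration.

```python
import datetime as dt

import pandas as pd


def hhmm_to_time(value):
    """Turn an HHMM-style float such as 2245.0 into datetime.time(22, 45)."""
    if pd.isna(value):
        return None  # leave missing values missing
    hours, minutes = divmod(int(value), 100)
    # dt.time raises ValueError if hours > 23 or minutes > 59
    return dt.time(hours, minutes)


def as_24h(t):
    """Format a time as e.g. '22h45'."""
    return None if pd.isna(t) else f"{t.hour}h{t.minute:02d}"


def as_12h(t):
    """Format a time as e.g. '10.45 pm'."""
    if pd.isna(t):
        return None
    suffix = "am" if t.hour < 12 else "pm"
    return f"{t.hour % 12 or 12}.{t.minute:02d} {suffix}"


# Sample data copied from the question
df = pd.DataFrame({"Time": [540.0, 630.0, 915.0, 1730.0, 2245.0]})
df["Time_obj"] = df["Time"].apply(hhmm_to_time)   # datetime.time objects, no date attached
df["Time_24h"] = df["Time_obj"].apply(as_24h)     # strings like '5h40'
df["Time_12h"] = df["Time_obj"].apply(as_12h)     # strings like '5.40 am'
print(df[["Time", "Time_24h", "Time_12h"]])
```

Keeping the intermediate `datetime.time` column is a design choice: it holds a time with no date attached and can still be compared or sorted, while the string columns are only for display.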
from datetime import datetime, timedelta
import math

# Define the start date (day 0) in string format
startdate_string = "1980/01/01"
# Convert the string to a date object using the strptime function
startdate_object = datetime.strptime(startdate_string, "%Y/%m/%d").date()
# Inspect startdate_object to check the date
startdate_object

datenew = []
# Data from the original column 'UTS_Date'
dates = df['UTS_Date']
for values in dates:
    # Use an if statement to accept null values and append a 'NaN' placeholder to the new list
    if math.isnan(values):
        datenew.append('NaN')
        continue
    # Add the delta (the value in each row of the column, in days) to the reference date (startdate_object)
    currentdate1 = startdate_object + timedelta(days=float(values))
    # Convert the date to a string, so the list holds plain text rather than datetime.date(...) objects, and append it
    datenew.append(str(currentdate1))

# Check the length of the new list datenew, to ensure that all rows of the data are in it
print(len(datenew))
# Create a new column in the data frame for the formatted date
df.insert(3, 'Date', datenew)
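The same day-count conversion can probably also be done without an explicit loop, by letting pandas do the arithmetic. A minimal sketch, assuming the raw column is called 'UTS_Date' as above and that day 0 is 1980-01-01. The sample frame and the new column names are only for illustration.

```python
import pandas as pd

# Small stand-in for the real data, including one missing value
df = pd.DataFrame({"UTS_Date": [14202.0, 14202.0, 14203.0, None]})

# Interpret each value as a number of days counted from the origin date;
# missing values become NaT instead of raising an error
df["Date"] = pd.to_datetime(
    df["UTS_Date"], unit="D", origin=pd.Timestamp("1980-01-01")
)

# Optional: a text version in day-month-year order
df["Date_str"] = df["Date"].dt.strftime("%d-%m-%Y")
print(df)
```

With day 0 at 1980-01-01, the value 14202.0 comes out as 2018-11-19. The 'Date' column here is a real datetime column, so it can be filtered and sorted by date, and `df["Date"].dt.date` gives plain date objects if the time part should not be shown.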