I have a data frame column that has dates in the format %Y-%m-%d %H:%M:%S
. What I want to do is add +8h to every date and then change the format to %Y-%m-%d
, so that everything at or after 16:00 will show the next day. Here is an excerpt of my data frame:
Tweets:
Company,Datetime_UTC,Negative,Neutral,Positive,Volume
AXP,2013-06-01 16:00:00+00:00,0,2,0,2
AXP,2013-06-01 17:00:00+00:00,0,2,0,2
AXP,2013-06-01 22:00:00+00:00,0,1,0,1
AXP,2013-06-02 05:00:00+00:00,0,1,0,1
My code:
Tweets$Datetime_UTC <- as.POSIXct(Tweets$Datetime_UTC, format='%Y-%m-%d %H:%M:%S')
Tweets$Datetime_UTC <- as.integer(Tweets$Datetime_UTC)
Tweets$Datetime_UTC <- (Tweets$Datetime_UTC + 28800) / 86400
Tweets$Datetime_UTC <- as.Date(Tweets$Datetime_UTC, origin="1970-01-01")
As you can see, I first change the date into POSIXct and then into integer, so that it can be manipulated. After that, I add 28800 (8h = 28800sec) and divide it by 86400 (24h = 86400sec). Then, I change it back into date.
Desired outcome:
Company,Datetime_UTC,Negative,Neutral,Positive,Volume
AXP,2013-06-02,0,2,0,2
AXP,2013-06-02,0,2,0,2
AXP,2013-06-02,0,1,0,1
AXP,2013-06-02,0,1,0,1
Actual outcome:
Company,Datetime_UTC,Negative,Neutral,Positive,Volume
AXP,2013-06-01,0,2,0,2
AXP,2013-06-01,0,2,0,2
AXP,2013-06-02,0,1,0,1
AXP,2013-06-02,0,1,0,1
Somehow, the first two rows have the wrong date, while the 3rd was correctly changed to 2013-06-02
. What can I improve so that it works correctly for every row?
Thanks in advance!! :-)
I don't believe you need to go through that whole process, would something as simple as the below work? Note I already did the Tweets$Datetime_UTC <- as.POSIXct(Tweets$Datetime_UTC, format='%Y-%m-%d %H:%M:%S')
Tweets$Datetime_UTC <- as.Date(Tweets$Datetime_UTC + 28800)
Tweets
#Company Datetime_UTC Negative Neutral Positive Volume
#1 AXP 2013-06-02 0 2 0 2
#2 AXP 2013-06-02 0 2 0 2
#3 AXP 2013-06-02 0 1 0 1
#4 AXP 2013-06-02 0 1 0 1