I have a data.frame representing the time sheet for several staff over a period of several months spanning 2 years. The data looks like:
Name Month 1 2 3 ... 31 Total Job ... [more columns]
John Smith Aug 2017 1:20 1:20 Typing
Mary Jones Sep 2017 Prooing
John Smith Oct 2017 0:15 1:10 1:25 Typing
...
Jim Miles Feb 2018 1:30 2:10 3:40 Admin
There are 31 columns, each representing a date in the corresponding month. There will be multiple rows with the same Name.
So looking at the first entry, John Smith did 1 hour and 20 minutes of work on 1 August 2017.
What I want to do is to analyse these data in a granular way, e.g.
I am a bit stuck on how to proceed in order to have the data to analyse. Suggestions appreciated.
Added for clarification:
Having read three very helpful replies and looked at tidyr
, I have clarified my thoughts and think that I need to modify the data so there is one row for each entry, so the example table will become:
Name Date Duration Job ... [more columns]
John Smith 01 Aug 2017 1:20 Typing
John Smith 02 Oct 2017 0:15 Typing
John Smith 31 Oct 2017 0:15 Typing
...
Jim Miles 02 Feb 2018 1:30 Admin
Jim Miles 03 Feb 2018 2:10 Admin
Date will need to be formatted correctly but that is not major. The problem is matching the day of month to the relevant Month and year to produce the composite date. Any ideas welcome.
After following up on the suggestions of @Khlick, I succeeded in using gather()
:
mydata <- mydata %>% gather(new_date, time_spent, "1":"31")
This produced two new columns, new_date
and time_spent
, then created a new row for each data point of columns 1–31.
So now I had, for each data point, the month, e.g. Aug 2017
in one column, the day the work was done, e.g. 12
in another. I changed the month to a date in the original spreadsheet, so it became 2017-08-01
(all dates now have 01). Then in R
I used substr()
and paste()
to replace the day to the correct one, i.e. 2017-08-12
.
Finally, I was left with a large number of rows with no value in time_spent
. I removed those rows.
I now have:
Name Date Duration Job ... [more columns]
John Smith 2017-08-01 1:20 Typing
John Smith 2017-10-02 0:15 Typing
John Smith 2017-10-31 0:15 Typing
...
Jim Miles 2018-02-02 1:30 Admin
Jim Miles 2018-02-03 2:10 Admin
I did a few spot checks and seems data was transformed correctly. Thanks to all, especially to @Khlick.