r dataframe data-manipulation organization

How to reorganize a column based on another column of repeating numbers in a data frame in R?


My data frame (1,700,000 rows) has 4 columns:

  • date: from 2011 to 2019 (random days and months within that range)
  • light: runs from 350 to 2299 in steps of 1, then restarts
  • ID: the target
  • V: the value

This is what my data looks like (quick example):

date light ID V
2013-06-17 350 p01 0.1
2013-06-17 351 p01 0.1
2013-06-17 352 p01 0.2
2013-06-17 353 p01 0.3
2013-06-17 354 p01 0.1
2013-06-17 355 p01 0.1
2013-04-18 ... p01 0.1
2013-06-17 2297 p01 0.2
2013-06-17 2298 p01 0.3
2013-06-17 2299 p01 0.2
2014-04-18 350 r03 0.1
2014-04-18 351 r03 0.4
2014-04-18 352 r03 0.1
2014-04-18 353 r03 0.6
2014-04-18 354 r03 0.2
2014-04-18 355 r03 0.1
2014-04-18 ... r03 0.1
2014-04-18 2297 r03 0.5
2014-04-18 2298 r03 0.5
2014-04-18 2299 r03 0.6

All good up to here! The problem is that in the middle of the data frame the light column runs from 2299 down to 350 instead of from 350 up to 2299. I have confirmed this for several IDs. A section of the data frame looks like this:

date light ID V
2014-07-31 2299 s01 0.1
2014-07-31 2298 s01 0.1
2014-07-31 2297 s01 0.2
2014-07-31 2296 s01 0.3
2014-07-31 2295 s01 0.1
2014-07-31 2294 s01 0.1
2014-07-31 ... s01 0.1
2014-07-31 352 s01 0.2
2014-07-31 351 s01 0.3
2014-07-31 350 s01 0.2
2014-07-31 2299 x03 0.1
2014-07-31 2298 x03 0.4
2014-07-31 2297 x03 0.1
2014-07-31 2296 x03 0.6
2014-07-31 2295 x03 0.2
2014-07-31 2294 x03 0.1
2014-07-31 ... x03 0.1
2014-07-31 352 x03 0.5
2014-07-31 351 x03 0.5
2014-07-31 350 x03 0.6

What I want is for the light column to ALWAYS run from 350 to 2299, then restart at 350 and run to 2299 again, and so on. The reordering has to carry the other columns (date, V, and ID) along with it, so every row stays intact.

IMPORTANT NOTE: the same code (ID) appears on different dates!

I'm having trouble finding an answer to such a specific question.

Any help will be much appreciated.


Solution

  • Just sort it by ID, then by light?

    library(dplyr); library(magrittr)  # arrange() and the %<>% assignment pipe
    your.data %<>% arrange( ID, light )
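Because the question notes that the same ID can occur on different dates, sorting on ID and light alone would interleave rows from different dates. A minimal sketch of two date-aware variants follows. The toy data frame, the shortened 350 to 352 light range, and the names `toy`, `by_date`, `in_place`, and `block` are all made up for illustration, and the second variant assumes each (date, ID) pair forms exactly one block.

```r
library(dplyr)

# Hypothetical toy data: light runs 350..352 instead of 350..2299.
# The same ID appears on two dates; the first block is descending.
toy <- data.frame(
  date  = as.Date(c(rep("2014-07-31", 3), rep("2013-06-17", 3))),
  light = c(352, 351, 350, 350, 351, 352),
  ID    = "p01",
  V     = c(0.3, 0.2, 0.1, 0.1, 0.1, 0.2)
)

# Variant 1: sort by date, then ID, then light. Each (date, ID) block
# ends up ascending in light, and the blocks themselves are ordered by date.
by_date <- toy %>% arrange(date, ID, light)

# Variant 2: keep the blocks in their original order of appearance.
# 'block' numbers each (date, ID) pair by where it first occurs
# (assumption: one block per pair), then rows are sorted within the block.
in_place <- toy %>%
  mutate(block = match(paste(date, ID), unique(paste(date, ID)))) %>%
  arrange(block, light) %>%
  select(-block)
```

In both variants whole rows move together, so each V value stays attached to its original date, light, and ID. Which variant fits depends on whether the original block order matters.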