Apologies: whenever I try to format these as tables rather than as code, the site seems to think I have code embedded and won't let me post.
Here's an example of what I have:
ID | File | Period | Begin | End | Laser1 | Laser2 | Lead |
---|---|---|---|---|---|---|---|
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Recovery | 68500 | 158000 | qfin | plethh | plethi |
A01 | longname2.zip | Baseline | 2000 | 43000 | qfin | plethh | plethi |
A01 | longname2.zip | Run | 45000 | 135000 | qfin | plethh | plethi |
A01 | longname2.zip | Recovery | 135000 | 305000 | qfin | plethh | plethi |
Here's an example of what I want:
ID | File | Period | Begin | End | Laser1 | Laser2 | Lead |
---|---|---|---|---|---|---|---|
A01 | longname.zip | Baseline | 0 | 6000 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 1000 | 7000 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 2000 | 8000 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 3000 | 9000 | qfin | plethh | plethi |
etc.
ID | File | Period | Begin | End | Laser1 | Laser2 | Lead |
---|---|---|---|---|---|---|---|
A01 | longname.zip | Baseline | 24000 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 36500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 31500 | 37500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 32500 | 38500 | qfin | plethh | plethi |
I've managed to filter by the unique file names and duplicate the rows required.
What I can't seem to do is change the Begin and End values and segment them by Period. What I currently end up with, likely due to my row duplication, is something like this:
ID | File | Period | Begin | End | Laser1 | Laser2 | Lead |
---|---|---|---|---|---|---|---|
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
In both Python and R I get stuck in the same place: I can't fix the numbers in the Begin and End columns. I'm more comfortable with R at the moment, but I started trying with Python.
In R, it seems to think I want to loop over 1000 columns, which I don't have, rather than add 1000 to every row. Unfortunately, not all files start at 0, and there may be a gap between one period's End and the next period's Begin.
In R:
Period = dupdf$Period
for (period in Period) {
  End_Final = max(dupdf$End)
  dupdf_period <- dupdf %>%
    filter(Period == period)
  for (i in 2:nrow(dupdf_period)) {
    dupdf_period[i, Begin] <- dupdf_period[i, Begin] + 1000
    dupdf_period[i, End] <- dupdf_period[i, Begin] + 6000
    if (dupdf_period$End < End_Final) {
      dupdf_period$End
    } else {
      End_Final
      break
    }
  }
  dupdf_period[1, End] <- dupdf_period[1, Begin] + 6000
  dupdf <- rbind(dupdf_period)
}
write.csv(dupdf, filename)
}
In Python:
for period in Period:
    row_index = 2
    for row_index in concat_df.index:
        #for row in concat_df.itertuples:
        concat_df.at[row_index, "Begin"] += 1000
    row_index2 = 1
    for row_index2 in concat_df.index:
        concat_df.at[row_index2, "End"] += (Begin + 6000)
    concat_df['End'] = np.where((concat_df.End >= End_Final), concat_df.End.replace(End_Final), concat_df.End)
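The rule implied by the "what I want" table can be written down for a single Begin/End interval before touching the data frame at all. The following is only a sketch of that rule as read from the example rows: each window is 6000 long, the next one starts 1000 later, and the final window is stretched to the interval's End when the leftover tail is shorter than one step. The function name `sliding_windows` and the parameters `width` and `step` are hypothetical, not from the original attempts.

```python
def sliding_windows(begin, end, width=6000, step=1000):
    """Return a list of (begin, end) windows covering [begin, end].

    Assumed rule (inferred from the example tables): windows are `width`
    long, start `step` apart, and the last window is extended to `end`
    if less than one full step remains after it.
    """
    windows = []
    start = begin
    # Start a new window only while a full-width window still fits.
    while start + width <= end:
        stop = start + width
        # If another step would overshoot, absorb the tail into this window.
        if stop + step > end:
            stop = end
        windows.append((start, stop))
        start += step
    return windows


baseline = sliding_windows(0, 30500)
print(baseline[:2])   # [(0, 6000), (1000, 7000)]
print(baseline[-1])   # (24000, 30500)
```

Because each interval is handled on its own, a file that does not start at 0 or a gap between periods needs no special case. An interval shorter than `width` yields an empty list under this rule.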
Edit: thanks to r2evans, now without rowwise().
Perhaps this is what you are looking for:
library(dplyr)
library(tidyr)
df %>%
  mutate(Begin_New = Map(seq, Begin, End - 6000, list(by = 1000))) %>%
  unnest(Begin_New) %>%
  group_by(ID, File, Period) %>%
  mutate(End_New = ifelse(Begin_New + 7000 > End, End, Begin_New + 6000))
returns
# A tibble: 428 x 10
ID File Period Begin End Laser1 Laser2 Lead Begin_New End_New
<chr> <chr> <chr> <dbl> <dbl> <chr> <chr> <chr> <dbl> <dbl>
1 A01 longname.zip Baseline 0 30500 qfin plethh plethi 0 6000
2 A01 longname.zip Baseline 0 30500 qfin plethh plethi 1000 7000
3 A01 longname.zip Baseline 0 30500 qfin plethh plethi 2000 8000
4 A01 longname.zip Baseline 0 30500 qfin plethh plethi 3000 9000
5 A01 longname.zip Baseline 0 30500 qfin plethh plethi 4000 10000
6 A01 longname.zip Baseline 0 30500 qfin plethh plethi 5000 11000
7 A01 longname.zip Baseline 0 30500 qfin plethh plethi 6000 12000
8 A01 longname.zip Baseline 0 30500 qfin plethh plethi 7000 13000
9 A01 longname.zip Baseline 0 30500 qfin plethh plethi 8000 14000
10 A01 longname.zip Baseline 0 30500 qfin plethh plethi 9000 15000
11 A01 longname.zip Baseline 0 30500 qfin plethh plethi 10000 16000
12 A01 longname.zip Baseline 0 30500 qfin plethh plethi 11000 17000
13 A01 longname.zip Baseline 0 30500 qfin plethh plethi 12000 18000
14 A01 longname.zip Baseline 0 30500 qfin plethh plethi 13000 19000
15 A01 longname.zip Baseline 0 30500 qfin plethh plethi 14000 20000
16 A01 longname.zip Baseline 0 30500 qfin plethh plethi 15000 21000
17 A01 longname.zip Baseline 0 30500 qfin plethh plethi 16000 22000
18 A01 longname.zip Baseline 0 30500 qfin plethh plethi 17000 23000
19 A01 longname.zip Baseline 0 30500 qfin plethh plethi 18000 24000
20 A01 longname.zip Baseline 0 30500 qfin plethh plethi 19000 25000
21 A01 longname.zip Baseline 0 30500 qfin plethh plethi 20000 26000
22 A01 longname.zip Baseline 0 30500 qfin plethh plethi 21000 27000
23 A01 longname.zip Baseline 0 30500 qfin plethh plethi 22000 28000
24 A01 longname.zip Baseline 0 30500 qfin plethh plethi 23000 29000
25 A01 longname.zip Baseline 0 30500 qfin plethh plethi 24000 30500
26 A01 longname.zip Run 30500 68500 qfin plethh plethi 30500 36500
27 A01 longname.zip Run 30500 68500 qfin plethh plethi 31500 37500
28 A01 longname.zip Run 30500 68500 qfin plethh plethi 32500 38500
29 A01 longname.zip Run 30500 68500 qfin plethh plethi 33500 39500
30 A01 longname.zip Run 30500 68500 qfin plethh plethi 34500 40500
I named the new columns Begin_New and End_New; you could easily rename them to Begin and End.
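Since the question also mentions Python, here is a rough pandas counterpart of the dplyr pipeline above. It is a sketch under assumptions, not a tested drop-in: the sample frame holds only the three rows of the first file, and the constants `WIDTH` and `STEP` are hypothetical names for the 6000 and 1000 used in the R code. `DataFrame.explode` plays the role of `unnest`, and `Series.where` the role of `ifelse`.

```python
import pandas as pd

WIDTH, STEP = 6000, 1000  # assumed window length and shift, as in the R code

# Sample data: the three periods of the first file from the question.
df = pd.DataFrame({
    "ID": ["A01"] * 3,
    "File": ["longname.zip"] * 3,
    "Period": ["Baseline", "Run", "Recovery"],
    "Begin": [0, 30500, 68500],
    "End": [30500, 68500, 158000],
})

# One list of window starts per row, like seq(Begin, End - 6000, by = 1000).
df["Begin_New"] = [
    list(range(b, e - WIDTH + 1, STEP)) for b, e in zip(df["Begin"], df["End"])
]

# explode() gives every start its own row; rows whose interval is too short
# for a single window come out as NaN and are dropped here.
out = (
    df.explode("Begin_New")
      .dropna(subset=["Begin_New"])
      .reset_index(drop=True)
)
out["Begin_New"] = out["Begin_New"].astype("int64")

# Keep Begin_New + WIDTH unless fewer than STEP units would be left over,
# in which case the window runs to the period's End.
full_end = out["Begin_New"] + WIDTH
out["End_New"] = full_end.where(full_end + STEP <= out["End"], out["End"])

print(out[["Period", "Begin_New", "End_New"]].head())
```

No `groupby` is needed in this version because every row already carries its own period's Begin and End after the explode step.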