I have a data frame of just one column that looks like this:
>df
Sample_Name
1 GW16F1_A-1
2 GW16F1_A-10
3 GW16F1_A-12
4 GW16F2_A-2
5 GW16F2_A-3
6 GW16F2_A-5
7 GW16V1_A-6
8 GW16V1_A-7
9 GW16V2_A-8
10 GW16V2_A-9
I want to append a second column to this data frame based on the contents of the Sample_Name column, so the output would look like this:
>df
Sample_Name SampleGroup
1 GW16F1_A-1 F1
2 GW16F1_A-10 F1
3 GW16F1_A-12 F1
4 GW16F2_A-2 F2
5 GW16F2_A-3 F2
6 GW16F2_A-5 F2
7 GW16V1_A-6 V1
8 GW16V1_A-7 V1
9 GW16V2_A-8 V2
10 GW16V2_A-9 V2
Is there a function that will read through the contents of a column and output a new vector based on it?
substr should be sufficient for this, given your sample input, since the group code always sits at characters 5 and 6 of Sample_Name.
Try:
> transform(df, SampleGroup = substr(Sample_Name, 5, 6))
Sample_Name SampleGroup
1 GW16F1_A-1 F1
2 GW16F1_A-10 F1
3 GW16F1_A-12 F1
4 GW16F2_A-2 F2
5 GW16F2_A-3 F2
6 GW16F2_A-5 F2
7 GW16V1_A-6 V1
8 GW16V1_A-7 V1
9 GW16V2_A-8 V2
10 GW16V2_A-9 V2
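Note that transform returns a new data frame rather than modifying df, so the result has to be assigned to keep the column. The sketch below rebuilds the sample data from the question, stores the substr result back into df, and also shows a pattern-based alternative with sub for cases where the group code is not always at positions 5 and 6. The regular expression is an assumption about the naming scheme (the literal "GW", then digits, then a letter-plus-digit group code, then an underscore), and the column name SampleGroupRegex is only illustrative.

```r
# Rebuild the sample data shown in the question
df <- data.frame(
  Sample_Name = c("GW16F1_A-1", "GW16F1_A-10", "GW16F1_A-12",
                  "GW16F2_A-2", "GW16F2_A-3", "GW16F2_A-5",
                  "GW16V1_A-6", "GW16V1_A-7", "GW16V2_A-8", "GW16V2_A-9"),
  stringsAsFactors = FALSE
)

# Fixed-position extraction: characters 5 and 6, assigned back into df
df$SampleGroup <- substr(df$Sample_Name, 5, 6)

# Pattern-based extraction (assumed naming scheme, see the note above):
# capture the letters+digits that follow "GW<digits>" and precede "_"
df$SampleGroupRegex <- sub("^GW[0-9]+([A-Za-z]+[0-9]+)_.*$", "\\1",
                           df$Sample_Name)

print(df)
```

If a name does not match the pattern, sub returns it unchanged, so it is worth checking the new column for values that still look like full sample names.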