I have a problem with unique command. When I run unique for each variable I have no problems , but when I try to run for more than one variable I get trouble.
This is my database:
structure(list(M1 = c("AMZN PE Equity", "ANDAHUC1 PE", "ANDAHUC1 PE",
"ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity"), M2 = c("AMZN PE Equity",
"ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity")), row.names = c(NA, 13L), class = "data.frame")
Both variables( M1, M2) have the same characters but are repeated in different proportions( ALICPRLl1 Pe Equity is repeated six times in M1 , but is repeated seven times in M2).
This is my code for more than one variable:
morethan1<- unique(testrep[c("M1", "M2")])
and this is my output:
M1 M2
1 AMZN PE Equity AMZN PE Equity
2 ANDAHUC1 PE ANDAHUC1 PE
7 ANDAHUC1 PE ALICPRLl1 Pe Equity
8 ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity
As you can see there are some repeated names yet.
This is my code for just one variable:
just1 <- unique(testrep[c("M1")])
I have no problem in this case:
M1
1 AMZN PE Equity
2 ANDAHUC1 PE
8 ALICPRLl1 Pe Equity
Also I have the numbers (1 , 2 , 8), I want numbers to appear in order(1 ,2 3)
Finally, If I have 100 columns( M1 to M100) how can I modify my first code(M1:M00)?
Another way:
bothvariables <- subset(unique(testrep), M1 != M2)
This is my output:
M1 M2
7 ANDAHUC1 PE ALICPRLl1 Pe Equity
But I want this output:
M1 M2
AMZN PE Equity AMZN PE Equity
ANDAHUC1 PE ANDAHUC1 PE
ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity
No matter if the names are repeated different number of times in M1, M2 I want an output that show me the unique names inside M1 and M2 columns.
Now I try with a one more column : M3
This is my new database:
structure(list(M1 = c("AMZN PE Equity", "ANDAHUC1 PE", "ANDAHUC1 PE",
"ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity"), M2 = c("AMZN PE Equity",
"ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity"), M3 = c("AMZN PE Equity", "AMZN PE Equity",
"AMZN PE Equity", "ANDAHUC1 PE", "ANDAHUC1 PE", "ANDAHUC1 PE",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity", "ALICPRLl1 Pe Equity",
"ALICPRLl1 Pe Equity")), row.names = c(NA, 13L), class = "data.frame")
This is the code:
testrep %>%
distinct(M1, .keep_all = TRUE)
This is the output:
M1 M2 M3
1 AMZN PE Equity AMZN PE Equity AMZN PE Equity
2 ANDAHUC1 PE ANDAHUC1 PE AMZN PE Equity
3 ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity
I have the same problem now, just because I added a new column.
this is my expected output:
M1 M2 M3
1 AMZN PE Equity AMZN PE Equity AMZN PE Equity
2 ANDAHUC1 PE ANDAHUC1 PE ANDAHUC1 PE
3 ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity
We can use paste
testrep[paste0("M", 1:100)]
Or with startsWith
testrep[startsWith(names(testrep), "M")]
Or using grep
testrep[grep("^M\\d+$", names(testrep))]
Regarding the row number issue, it can be set to NULL
to get the default numbering
row.names(just1) <- NULL
We can use distinct
with .keep_all
library(dplyr)
testrep %>%
distinct(M1, .keep_all = TRUE)
# M1 M2
#1 AMZN PE Equity AMZN PE Equity
#2 ANDAHUC1 PE ANDAHUC1 PE
#3 ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity
We could get the unique
from. each column separately and then unnest
it
library(tidyr)
testrep2 %>%
summarise_all(list(~ list(unique(.)))) %>%
unnest(everything())
# A tibble: 3 x 3
# M1 M2 M3
# <chr> <chr> <chr>
#1 AMZN PE Equity AMZN PE Equity AMZN PE Equity
#2 ANDAHUC1 PE ANDAHUC1 PE ANDAHUC1 PE
#3 ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity ALICPRLl1 Pe Equity