# Data sub-setting using strings that include greater-than or less-than signs

I would like to generate all non-duplicate combinations (row-wise and column-wise) of strings that contain greater-than and less-than conditions (possibly with other mathematical operators added later).

How can I do it? Below is an example with a partial solution. The partial solution, however, loses the `>` and `<` signs. The result should keep the variable name (here `a:e`) plus the sign used for sub-setting, depending on whether the variable is less than or greater than 0.

The `comb` object holds the variables together with the desired sign for sub-setting.

```
# Conditions to combine: each variable with "> 0" and with "< 0"
comb <- data.frame(in1 = c("a > 0", "b > 0", "c > 0", "d > 0", "e > 0"),
                   in2 = c("a < 0", "b < 0", "c < 0", "d < 0", "e < 0"))
# All pairings of in1 with in2
comb.vars <- with(comb, expand.grid(in1, in2, stringsAsFactors = FALSE))
# Prepend each condition on y
comb.vars <- rbind(data.frame(Var3 = "y > 0", comb.vars),
                   data.frame(Var3 = "y < 0", comb.vars))
comb.vars
```

This does not give the desired outcome, because the same variable can appear in one row with opposing signs. For example, the first row is `y > 0 a > 0 a < 0`, and row 7 is `y > 0 b > 0 b < 0`.

```
# Drop rows that contain the exact same string twice
dup <- apply(comb.vars, 1, function(x) length(which(duplicated(x))) > 0)
remdup1 <- comb.vars[!dup, ]
# Keep only the variable name from each condition
onlyvars <- apply(remdup1, 2, function(x) substr(x, 1, regexpr("\\>", x) - 1))
# Remove row-wise duplicates
dup <- apply(onlyvars, 1, function(x) length(which(duplicated(x))) > 0)
remdup2 <- onlyvars[!dup, ]
# Remove duplicates across rows
uniq <- remdup1[!duplicated(apply(remdup2, 1, function(row) paste(sort(row), collapse = ""))), ]
uniq
```

A base `R` solution only, please.

Solution

You can count how many times the first character (the variable name) is repeated within a row, and then keep only the rows in which no variable name is duplicated.
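As a minimal illustration of that check on a single row, the first character of each condition can be passed to `duplicated()`. The vectors `bad` and `ok` and the helper `count_repeats()` are made up for this sketch and do not come from the question.

```r
# Hypothetical example rows, each holding three condition strings
bad <- c("y > 0", "a > 0", "a < 0")  # variable `a` appears twice
ok  <- c("y > 0", "b > 0", "a < 0")  # every variable appears once

# Hypothetical helper: count how many variable names (first character)
# repeat within one row
count_repeats <- function(row) sum(duplicated(substr(row, 1, 1)))

count_repeats(bad)  # 1
count_repeats(ok)   # 0
```

Rows for which the count is 0 are the ones to keep.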

Using `tidyverse`:

```
library(tidyverse)
comb.vars %>%
  rowwise() %>%
  mutate(
    repvals = sum(duplicated(str_extract(c(Var1, Var2, Var3), "^\\w")))
  ) %>%
  ungroup() %>%
  filter(repvals == 0) %>%
  select(-repvals)
```

Returns:

```
# A tibble: 40 × 3
   Var3  Var1  Var2
   <chr> <chr> <chr>
 1 y > 0 b > 0 a < 0
 2 y > 0 c > 0 a < 0
 3 y > 0 d > 0 a < 0
 4 y > 0 e > 0 a < 0
 5 y > 0 a > 0 b < 0
 6 y > 0 c > 0 b < 0
 7 y > 0 d > 0 b < 0
 8 y > 0 e > 0 b < 0
 9 y > 0 a > 0 c < 0
10 y > 0 b > 0 c < 0
# ℹ 30 more rows
```

A base R version to do the same:

```
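# Count repeated variable names (first character of each condition) per row
comb.vars$rep <- apply(comb.vars, 1, function(x) {
  sum(duplicated(sapply(regmatches(x, gregexec("^\\w", x)), function(m) m[[1]])))
})
# Keep rows without a repeated variable, then drop the helper column
comb.vars <- comb.vars[comb.vars$rep == 0, names(comb.vars) != "rep"]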
```
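If `gregexec()` is not available (it was added in R 4.1.0), or if variable names may be longer than one character, a variant along the following lines may work. This is only a sketch. It rebuilds `comb.vars` from the question so that it runs on its own. The names `var_name()`, `has_repeat` and `result` are introduced here for illustration. It assumes every condition has the form `name operator value`, separated by spaces.

```r
# Rebuild the combinations from the question
in1 <- paste(letters[1:5], "> 0")
in2 <- paste(letters[1:5], "< 0")
comb.vars <- expand.grid(Var1 = in1, Var2 = in2, stringsAsFactors = FALSE)
comb.vars <- rbind(
  data.frame(Var3 = "y > 0", comb.vars, stringsAsFactors = FALSE),
  data.frame(Var3 = "y < 0", comb.vars, stringsAsFactors = FALSE)
)

# Hypothetical helper: keep everything before the first space
var_name <- function(x) sub(" .*$", "", x)

# Flag rows in which any variable name occurs more than once
has_repeat <- apply(comb.vars, 1, function(row) anyDuplicated(var_name(row)) > 0)

# Keep only the rows without a repeated variable
result <- comb.vars[!has_repeat, ]
nrow(result)  # 40
```

Because the flag is kept in a separate vector instead of a column, `result` has the same three columns as `comb.vars`.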
