Was exploring ways to avoid the for loop and instead make use of the any() function to implement a function that gives true when the passed argument is prime and false other wise.
Here is what I have:
prime2 <- function(n) {
rangeOfNumbers <- range(2:(n-1))
if(any(n%%rangeOfNumbers == 0)){
return(FALSE)
}
else return(TRUE)
}
Looks straight forward, but prime(55)
gives TRUE
instead of false.
What am I doing wrong?
As you have stated, we can easily call an isprime
function from many packages to obtain your answer quickly, but that is not your aim here. You are learning and trying to understand basic principals in R
and programming in general. Many kudos!!
As many have pointed out, the use of range
here is inappropriate in this situation. This function simply returns the minimum and maximum of a given vector (See ?range
for more info). So for your example of 55, since range(2:54)
returns 2
and 54
, 55 is not divisible by either of these numbers, hence the return of TRUE
.
As noted by @LAP, you need the colon :
operator. So if you change range(2:(n-1))
to simply 2:(n-1)
, your algorithm will work (for the most part... it won't work for n <= 2
).
Since you state that you are in the process of learning, you should also think about ways of coding in a more efficient and thoughtful manner. In order to check whether a number is prime, we only need to check up to the square root of n. This will reduce many checks.
Let's also think a little harder on how we can avoid further checks. We know when checking for primality, we only need to consider prime numbers. Since 2 is the only even prime number, we could first check if the input is even, and if it isn't, only check divisibility by odd numbers only. We can achieve the last bit by generating a sequence with fixed steps by calling the function seq
or seq.int
.
Let's see this in action:
## Pseudo-Corrected OP function (2 should return TRUE, 1 should return FALSE, etc.)
prime2 <- function(n) {
rangeOfNumbers <- 2:(n-1)
if(any(n%%rangeOfNumbers == 0)){
return(FALSE)
}
else return(TRUE)
}
prime2Enhanced <- function(n) {
if (n < 2)
return(FALSE)
else if (n %in% c(2, 3, 5, 7))
return(TRUE)
else if (n %% 2 == 0)
return(FALSE)
else if (any(n %% seq.int(3, sqrt(n), 2) == 0))
return(FALSE)
else
return(TRUE)
}
all.equal(sapply(3:1000, prime2), as.logical(gmp::isprime(3:1000)))
[1] TRUE
all.equal(sapply(3:1000, prime2Enhanced), as.logical(gmp::isprime(3:1000)))
[1] TRUE
We can do even better by looking at extending our programming concepts to vectorization. This is when we can pass a vector and get the result all at one time instead of using a loop. This is a very important concept in R
and I highly recommend learning about it. The R Inferno is an excellent resource for such a topic (see Circle 3).
prime2Vectorized <- function(v) {
result <- rep(TRUE, length(v))
testVec <- as.integer(c(2, seq(3, sqrt(max(v)), 2)))
## Although we are using a loop here, we are taking
## advantage of vectorization concepts with each x
for (x in testVec) {
s <- which(v >= x^2)
result[s[v[s] %% x == 0]] <- FALSE
}
result
}
all.equal(prime2Vectorized(3:1000), as.logical(gmp::isprime(3:1000)))
[1] TRUE
I will save it as an exercise to handle edge cases and further optimization.
Now, let's see how much improvements we have in efficiency using the library microbenchmark
:
library(microbenchmark)
microbenchmark(orig = sapply(3:1000, prime2),
improved = sapply(3:1000, prime2Enhanced),
vectorize = prime2Vectorized(3:1000))
Unit: microseconds
expr min lq mean median uq max neval
orig 3863.279 4198.5595 5470.2178 4403.5050 4723.485 15775.377 100
improved 1670.418 1740.1240 1937.2396 1809.6680 1935.365 9912.629 100
vectorize 202.006 218.5505 306.4068 233.9045 249.696 6243.121 100
Hopefully you can see that there are many fundamental concepts that can be taken away from this simple exercise. If you really want to learn about primality tests, you should check out Miller-Rabin primality test which is implemented in gmp
, numbers
, and other packages.