I would like to create my own Heapsort algorithm in R.
That is my code
heapify <- function(array, n, i)
{
parent <- i
leftChild <- 2 * i + 1
rightChild <- 2 * i + 2
if ((leftChild < n) & (array[parent] < array[leftChild]))
{
parent = leftChild
}
if ((rightChild < n) & (array[parent] < array[rightChild]))
{
parent = rightChild
}
if (parent != i)
{
array = replace(array, c(i, parent), c(array[parent], array[i]))
heapify(array, n, parent)
}
}
heapSort <- function(array)
{
n <- length(array)
for (i in (n+1):1)
{
heapify(array, n, i)
}
for (i in n:1)
{
array = replace(array, c(i, 0), c(array[0], array[i]))
heapify(array, i, 1)
}
print(array)
}
However that implementation seems to be incorrect. That's an example of an input and output.
array <- c(5, 14, 3, 70, 64)
heapSort(array)
Output: [1] 5 14 3 70 64
I have spent quite a while and I have no idea where the problem is. I would appreciate any hints or tips.
My guess is that you were trying to convert the algorithm posted on GeeksforGeeks where they implement this in many zero based languages. This is one of the sources of your problem (R starts indexing at 1 instead of 0).
Example 1:
## We also need to swap these indices
array = replace(array, c(i, 0), c(array[0], array[i]))
heapify(array, i, 1)
Should be:
array <- replace(array, c(i, 1), array[c(1, i)])
array <- heapify(array, i, 1)
Example 2:
leftChild <- 2 * i + 1
rightChild <- 2 * i + 2
Should be:
leftChild <- 2 * (i - 1) + 1
rightChild <- 2 * (i - 1) + 2
In R, you cannot pass an object by reference (see this question and answers Can you pass-by-reference in R?). This means that when we call a recursive function we must assign it and the recursive function must return something.
In heapify
we must return array
. Also every call to heapify
we must assign array
to the output.
Here is the amended code:
heapify <- function(array, n, i)
{
parent <- i
leftChild <- 2 * (i - 1) + 1
rightChild <- 2 * (i - 1) + 2
if ((leftChild < n) & (array[parent] < array[leftChild]))
{
parent <- leftChild
}
if ((rightChild < n) & (array[parent] < array[rightChild]))
{
parent <- rightChild
}
if (parent != i) {
array <- replace(array, c(i, parent), array[c(parent, i)])
array <- heapify(array, n, parent)
}
array
}
heapSort <- function(array)
{
n <- length(array)
for (i in floor(n / 2):1) {
array <- heapify(array, n, i)
}
for (i in n:1) {
array <- replace(array, c(i, 1), array[c(1, i)])
array <- heapify(array, i, 1)
}
array
}
Here are some tests (note this algorithm is extremely inefficient in R.. do not try on vectors much larger than below):
array <- c(5, 14, 3, 70, 64)
heapSort(array)
[1] 3 5 14 64 70
set.seed(11)
largerExample <- sample(1e3)
head(largerExample)
[1] 278 1 510 15 65 951
identical(heapSort(largerExample), 1:1e3)
[1] TRUE