I'm trying to find the median of two different sized sorted arrays. But there are a few situations where it doesn't work and I wasn't able to figure out why. I've included my implementation below.
I do realize there are similar solutions online. But I'm just starting to learn algorithms and so want to do as much as possible by myself. Thanks a lot in advance for your help!
public double median(Point[] arr, int start, int end) {
int n = end - start + 1;
if (n % 2 == 0) {
return (double) (arr[start + (n/2)].getSz() + arr[start + (n/2) - 1].getSz())/2;
}
else {
return (double) arr[start + (n/2)].getSz();
}
}
public double getMedian(int aStart, int aEnd, int bStart, int bEnd) {
int m = aEnd - aStart + 1;
int n = bEnd - bStart + 1;
double median = 0;
if (m == 0 && n > 0) {
return median(arr2, 0, bEnd);
}
if (m > 0 && n == 0) {
return median(arr1, 0, aEnd);
}
if (m == 1 && n == 1) {
return (double) (arr1[0].getSz() + arr2[0].getSz())/2;
}
if (m == 2 && n == 2) {
median = (double) (Math.max(arr1[aStart].getSz(), arr2[bStart].getSz()) + Math.min(arr1[aEnd].getSz(), arr2[bEnd].getSz()))/2;
return median;
}
double m1 = median(arr1, aStart, aEnd);
double m2 = median(arr2, bStart, bEnd);
if (m1 == m2) {
return m1;
}
if (m1 < m2) {
if (m % 2 == 0) {
aStart = aStart + (m/2) - 1;
index = 1;
}
else {
index = 2;
aStart = aStart + m/2;
}
bEnd = bStart + n/2;
}
else {
if (n % 2 == 0) {
index = 3;
bStart = bStart + n/2 - 1;
}
else {
index = 4;
bStart = bStart + n/2;
}
aEnd = aStart + m/2;
}
return (getMedian(aStart, aEnd, bStart, bEnd));
}
Here's a example for which the logic breaks:
arr1 = 6, 20, 28, 29, 36, 41
arr2 = 14, 25, 33, 47, 53, 66, 79, 98
Correct median = 34.5
Estimated median = 31
Looks like there are a few problems in the algorithm.
Firstly 0 is being used instead of aStart and bStart in a couple of places:
if (m == 0 && n > 0) {
return median(arr2, ->0<-, bEnd);
}
if (m > 0 && n == 0) {
return median(arr1, ->0<-, aEnd);
}
if (m == 1 && n == 1) {
return (double) (arr1[->0<-].getSz() + arr2[->0<-].getSz())/2;
}
Secondly; in the last block you must be careful to throw out as many values above the median as below it.
if (m1 < m2) {
if (m % 2 == 0) {
aStart = aStart + (->m<-/2) - 1;
index = 1;
}
else {
index = 2;
aStart = aStart + ->m<-/2;
}
bEnd = bStart + ->n<- /2;
}
and also you must not throw out the values closest to the median where the median is calculated on an even number of data. Hope that helps.