
Quartile Array: Inclusive or Exclusive of Median Elements


Let's say that we have the following array:

[3.0, 4.0, 4.0, 4.0, 7.0, 10.0, 11.0, 12.0, 14.0, 16.0, 17.0, 18.0]

The quartile cutoffs (first quartile, median, third quartile) would be as follows:

  • 25%: 4.0
  • 50%: 10.5
  • 75%: 15.0
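The cutoffs above match the "median of each half" convention. The following is a minimal sketch of that convention, not a definitive quartile implementation; the function name `quartile_cutoffs` is hypothetical, and other conventions (for example, interpolation-based ones) can give a different 75% value for the same data.

```python
from statistics import median

def quartile_cutoffs(values):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    Assumption: for an odd-length input, the middle element is left out
    of both halves. Other conventions exist and give other results.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # Skip the middle element when the length is odd.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)

data = [3.0, 4.0, 4.0, 4.0, 7.0, 10.0, 11.0, 12.0, 14.0, 16.0, 17.0, 18.0]
print(quartile_cutoffs(data))
```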

If one were then to count the elements in each quartile, would an element equal to a quartile cutoff typically be counted in the lower bin (inclusive) or in the upper bin (exclusive)?

For this example, would the counts typically be listed as option A (inclusive) or option B (exclusive)?

Option A (inclusive):

  • 0-25%: 4
  • 25-50%: 2
  • 50-75%: 3
  • 75-100%: 3

Option B (exclusive):

  • 0-25%: 1
  • 25-50%: 5
  • 50-75%: 3
  • 75-100%: 3
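The two options can be reproduced with a short sketch. It assumes that "inclusive" means a value equal to a cutoff is counted in the lower bin and "exclusive" means it is counted in the upper bin; the function name `bin_counts` is hypothetical.

```python
from bisect import bisect_left, bisect_right

def bin_counts(values, cutoffs, inclusive):
    """Count elements in the bins delimited by the given cutoffs.

    inclusive=True  -> a value equal to a cutoff goes in the lower bin.
    inclusive=False -> a value equal to a cutoff goes in the upper bin.
    """
    data = sorted(values)
    locate = bisect_right if inclusive else bisect_left
    # Index in the sorted data where each bin ends.
    edges = [0] + [locate(data, c) for c in cutoffs] + [len(data)]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]

data = [3.0, 4.0, 4.0, 4.0, 7.0, 10.0, 11.0, 12.0, 14.0, 16.0, 17.0, 18.0]
cutoffs = [4.0, 10.5, 15.0]
print(bin_counts(data, cutoffs, inclusive=True))   # option A
print(bin_counts(data, cutoffs, inclusive=False))  # option B
```

Only the first two bins differ between the options, because 4.0 is the only cutoff that coincides with values in the array.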

Solution

    1. Yes
    2. No. The dictionary definition of a quartile is "each of four equal groups into which a population can be divided ...", so you don't start by finding the quartile values 4.0, 10.5, and 15 and then count the elements in each quartile. You start by dividing the sorted array into four equal groups, and then find the quartile values. As an extreme example, what if all 12 samples have the value 4.0? Counting against the cutoffs would then put all 12 samples into a single bin, whichever option you choose. The correct answer is that for a sample size of 12, there are 3 samples per quartile.

    Credit to user @user3386109
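The approach described in the answer (split by rank first, derive the cutoffs afterwards) can be sketched as follows. This is an illustration under stated assumptions rather than the answerer's own code: the function names are hypothetical, the input is assumed to have at least four elements, and when the length is not divisible by four the group sizes differ by at most one.

```python
def split_into_quartiles(values):
    """Split the sorted values into four groups by rank, not by value."""
    data = sorted(values)
    n = len(data)
    # Group boundaries as indices into the sorted data.
    bounds = [i * n // 4 for i in range(5)]
    return [data[lo:hi] for lo, hi in zip(bounds, bounds[1:])]

def cutoffs_from_groups(groups):
    """Take each cutoff as the midpoint between neighbouring groups.

    Assumption: every group is non-empty (at least four input values).
    """
    return [(a[-1] + b[0]) / 2 for a, b in zip(groups, groups[1:])]

data = [3.0, 4.0, 4.0, 4.0, 7.0, 10.0, 11.0, 12.0, 14.0, 16.0, 17.0, 18.0]
groups = split_into_quartiles(data)
print([len(g) for g in groups])     # group sizes
print(cutoffs_from_groups(groups))  # cutoffs derived from the groups

# The extreme case from the answer: twelve identical samples.
print([len(g) for g in split_into_quartiles([4.0] * 12)])
```

Because membership is decided by position in the sorted array, ties at a cutoff never change the group sizes.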