python, spss, significance, quantile-regression

How to find out whether coefficients of different quantiles are significantly different in a quantile regression? (SPSS or Python)


I'm examining whether the rates of increase in income within a certain profession are significantly different in different parts of the income distribution, to see if the income gap is significantly stretching or closing.

THE QUANTREG MODEL

I have performed a quantile regression in SPSS (I'm new to coding and have only very basic knowledge of Python, so I need your help). The dependent variable is indexed income; the independent variables are time (quarter in this dataset), demographic groups, and segments of the profession. I have also added the interaction term of each dummy with the time variable.

So (at least the way I see it), this model allows comparison of changes in income on three levels:

  1. How does belonging to a certain demographic group or segment impact income (e.g., compared with data entry jobs: data analytics adds 100€, data science adds 200€)
  2. How are the effects of each different category/dummy changing over time (e.g., compared with data entry jobs, the positive effect of being a data scientist has increased 10% and now adds 220€)
  3. How do these changing effects differ between different parts of the income distribution (e.g., the coefficient of time*data_scientist is much larger in the 90%Q than in the 10%Q, indicating that the higher-earning data scientists have seen a bigger increase in income over time than the lower-earning data scientists)
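To make the three levels concrete, the model can be written out roughly as follows. This is only a sketch: the symbols are my own shorthand rather than SPSS output names, with y for indexed income, t for the quarter, D_j for the group/segment dummies, and τ for the quantile.

```latex
Q_{\tau}(y \mid t, D) = \beta_0(\tau) + \beta_1(\tau)\, t
    + \sum_j \gamma_j(\tau)\, D_j
    + \sum_j \delta_j(\tau)\, (t \times D_j)

% Level 1: \gamma_j(\tau)  (effect of belonging to group/segment j)
% Level 2: \delta_j(\tau)  (how that effect changes per quarter)
% Level 3: compare \delta_j(0.9) with \delta_j(0.1), i.e. test
H_0 : \delta_j(0.9) = \delta_j(0.1)
```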

QUESTION

So I've got my output of this quantile regression on SPSS, a huge table with all the coefficients and their significance and confidence intervals.

Now I want to find out whether the differences between the 90%Q and the 10%Q coefficients are statistically significant, in order to make statements about whether the income gap in this profession has significantly increased or decreased. I thought of doing this in Python instead of SPSS, and I've searched for how to cut the data into quantiles and how to perform a quantile regression. But how does one test whether the difference between the 90%Q and 10%Q coefficients is significant?


Solution

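  • I found a method for testing whether two regression coefficients differ significantly: the 50% rule, which uses the standardised beta weights and their 95% confidence intervals (these can be estimated via bias-corrected bootstrap; quantile regression output usually includes them already). The rule states that if the 95% confidence intervals of two independent estimates (sample means in the original formulation) overlap by less than 50% of their average margin of error, where the margin of error is half the width of a confidence interval, the difference is significant at roughly p = 0.05. If the overlap is less than 14%, it is significant at roughly p = 0.01. Because the rule assumes independent estimates, and the 10%Q and 90%Q coefficients come from the same sample, the result should be treated as an approximation.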

    This is the YouTube video in which I found this method:

    https://www.youtube.com/watch?v=qKnpiGwNDMk

    And the paper that the YouTube video refers to:

    Cumming, G. (2009). Inference by eye: reading the overlap of independent confidence intervals. Statistics in Medicine, 28(2), 205-220.
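
The overlap check itself is easy to do in Python once the confidence intervals have been copied from the SPSS output. The following is a minimal sketch, not a formal test: the function names and the example intervals are hypothetical, overlap is measured as a proportion of the average margin of error, and the intervals are assumed to be symmetric around the coefficient.

```python
def ci_overlap_proportion(ci_a, ci_b):
    """Overlap of two CIs as a proportion of their average margin of error.

    Each CI is a (lower, upper) tuple. A negative result means there is a
    gap between the intervals rather than an overlap.
    """
    (lo_a, hi_a), (lo_b, hi_b) = ci_a, ci_b
    # Margin of error = half the CI width (assumes symmetric intervals).
    moe_a = (hi_a - lo_a) / 2
    moe_b = (hi_b - lo_b) / 2
    avg_moe = (moe_a + moe_b) / 2
    # Length of the shared region; negative if the intervals do not touch.
    overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
    return overlap / avg_moe


def overlap_rule(ci_a, ci_b):
    """Apply the 50% / 14% thresholds described above (approximate only)."""
    prop = ci_overlap_proportion(ci_a, ci_b)
    if prop < 0.14:
        return "p < 0.01 (approx.)"
    if prop < 0.50:
        return "p < 0.05 (approx.)"
    return "not significant at 0.05 by this rule"


# Hypothetical 95% CIs for the time*data_scientist coefficient.
ci_q10 = (0.5, 1.5)  # 10%Q (made-up numbers)
ci_q90 = (1.3, 2.3)  # 90%Q (made-up numbers)
print(ci_overlap_proportion(ci_q10, ci_q90))
print(overlap_rule(ci_q10, ci_q90))
```

With these made-up intervals the overlap is 0.2 against an average margin of error of 0.5, so the proportion is about 0.4 and the rule reports significance at roughly the 0.05 level.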