Calculate decile limits in Stata


Here is my problem: I have monthly income data and have used the "xtile" command to split it into 20 quantile groups (5% steps):

xtile income_decile=bbh5101, nq(20)

How can I find out which cutpoints Stata used to assign each observation to a quantile bin, e.g. first bin from 0 to 800€, second bin from 801 to 1600€, and so on?


Solution

  • I believe you just want the percentiles that xtile uses as cutpoints. The corresponding _pctile command computes them and leaves them in r(). For example:

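    clear all
    set more off
    
    * load the example dataset shipped with Stata
    sysuse auto
    
    * assign each observation to one of 10 quantile bins
    xtile q = weight, nq(10)
    
    * compute the 9 cutpoints between those bins; stored in r(r1)-r(r9)
    _pctile weight, nq(10)
    
    * show the bin assigned to each weight, in ascending order
    sort weight
    list weight q
    
    * show the stored cutpoints
    return list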
    

    Comparing the two listings should be useful: each value in r() is the largest weight that still falls into the corresponding bin. See also the Methods and formulas section in [D] pctile.

    The result:

    . list weight q
    
         +-------------+
         | weight    q |
         |-------------|
      1. |  1,760    1 |
      2. |  1,800    1 |
      3. |  1,800    1 |
      4. |  1,830    1 |
      5. |  1,930    1 |
         |-------------|
      6. |  1,980    1 |
      7. |  1,990    1 |
      8. |  2,020    1 |
      9. |  2,040    2 |
     10. |  2,050    2 |
         |-------------|
     11. |  2,070    2 |
     12. |  2,110    2 |
     13. |  2,120    2 |
     14. |  2,130    2 |
     15. |  2,160    2 |
         |-------------|
     16. |  2,200    3 |
     17. |  2,200    3 |
     18. |  2,230    3 |
     19. |  2,240    3 |
     20. |  2,280    3 |
         |-------------|
     21. |  2,370    3 |
     22. |  2,410    3 |
     23. |  2,520    3 |
     24. |  2,580    4 |
     25. |  2,640    4 |
         |-------------|
     26. |  2,650    4 |
     27. |  2,650    4 |
     28. |  2,670    4 |
     29. |  2,690    4 |
     30. |  2,730    4 |
         |-------------|
     31. |  2,750    5 |
     32. |  2,750    5 |
     33. |  2,830    5 |
     34. |  2,830    5 |
     35. |  2,930    5 |
         |-------------|
     36. |  3,170    5 |
     37. |  3,180    5 |
     38. |  3,200    6 |
     39. |  3,210    6 |
     40. |  3,220    6 |
         |-------------|
     41. |  3,250    6 |
     42. |  3,260    6 |
     43. |  3,280    6 |
     44. |  3,300    6 |
     45. |  3,310    6 |
         |-------------|
     46. |  3,330    7 |
     47. |  3,350    7 |
     48. |  3,370    7 |
     49. |  3,370    7 |
     50. |  3,400    7 |
         |-------------|
     51. |  3,420    7 |
     52. |  3,420    7 |
     53. |  3,430    8 |
     54. |  3,470    8 |
     55. |  3,600    8 |
         |-------------|
     56. |  3,600    8 |
     57. |  3,670    8 |
     58. |  3,690    8 |
     59. |  3,690    8 |
     60. |  3,700    8 |
         |-------------|
     61. |  3,720    9 |
     62. |  3,740    9 |
     63. |  3,830    9 |
     64. |  3,880    9 |
     65. |  3,900    9 |
         |-------------|
     66. |  4,030    9 |
     67. |  4,060    9 |
     68. |  4,060    9 |
     69. |  4,080   10 |
     70. |  4,130   10 |
         |-------------|
     71. |  4,290   10 |
     72. |  4,330   10 |
     73. |  4,720   10 |
     74. |  4,840   10 |
         +-------------+
    
    . 
    . return list
    
    scalars:
                     r(r1) =  2020
                     r(r2) =  2160
                     r(r3) =  2520
                     r(r4) =  2730
                     r(r5) =  3190
                     r(r6) =  3310
                     r(r7) =  3420
                     r(r8) =  3700
                     r(r9) =  4060
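
    The returned scalars can be turned into explicit bin limits. The following is a minimal sketch, assuming the bins are half-open intervals of the form (lower, upper]; that reading is consistent with the listing above (2,020 is in bin 1, 2,040 is in bin 2), but check it against [D] pctile for your own data. The local names c1 to c9 are chosen here for illustration only.

    ```stata
    sysuse auto, clear
    _pctile weight, nq(10)

    * Copy the nine cutpoints into locals right away,
    * because a later r-class command would overwrite r().
    forvalues k = 1/9 {
        local c`k' = r(r`k')
    }

    * Bin 1 has no lower cutpoint and bin 10 has no upper cutpoint.
    display "bin  1: weight <= `c1'"
    forvalues k = 2/9 {
        local j = `k' - 1
        display "bin " %2.0f `k' ": `c`j'' < weight <= `c`k''"
    }
    display "bin 10: weight > `c9'"
    ```

    For the income data in the question, the same loop would run over 1/19 after _pctile bbh5101, nq(20).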
    

    You can put the percentiles in a variable. Just use:

    pctile p = weight, nq(10)
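
    Once the cutpoints are in a variable, they can be listed next to their percentages and used to cross-check the bins. This is a sketch under the assumption that xtile with cutpoints() applies the same "less than or equal to the cutpoint" rule as xtile with nq(); the variable names pct and q_check are made up for this example.

    ```stata
    sysuse auto, clear
    xtile q = weight, nq(10)

    * p holds the nine cutpoints in observations 1-9;
    * pct holds the matching percentages 10, 20, ..., 90.
    pctile p = weight, nq(10) genp(pct)
    list pct p in 1/9

    * Rebuild the bins from the stored cutpoints and
    * confirm that they agree with the bins from nq(10).
    xtile q_check = weight, cutpoints(p)
    assert q == q_check

    * Smallest and largest observed value in each bin.
    tabstat weight, by(q) statistics(min max n)
    ```

    The tabstat table reports the observed range within each bin, which is often what is wanted when describing the bins, whereas p holds the cutpoints themselves.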