Return a matrix from distinct command

I have a simple question about the distinct command in Stata.

When used with a by prefix, can it return a one-dimensional matrix of r(N)?

For example:

sysuse auto,clear
bysort foreign: distinct rep78

Can I store a [2,1] matrix, with each row representing the number of distinct values of rep78?

The manual seems to suggest that it only stores the number of distinct values for the last by-group.


  • You can easily create your own wrapper for that:

    sysuse auto, clear
    sort foreign
    * collect the levels of the by-variable and count them
    levelsof foreign, local(foreign_levels)
    local number_of_foreign_levels : word count `foreign_levels'
    * one row per level of foreign, initialised to 0
    matrix distinct_mat = J(`number_of_foreign_levels', 1, 0)
    forvalues i = 1 / `number_of_foreign_levels' {
         * foreign is coded 0/1, so row i corresponds to foreign == i - 1
         quietly distinct rep78 if foreign == `i' - 1
         matrix distinct_mat[`i', 1] = r(ndistinct)
    }
    matrix list distinct_mat

    r1   5
    r2   3

    Note that the number of distinct values is stored in r(ndistinct), not r(N); r(N) holds the number of observations.
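
    If you would rather not depend on distinct (it is a community-contributed command), a rough sketch of the same idea with only official commands follows. It also loops over the actual levels of foreign instead of assuming they are coded 0/1. The names first_seen and D are just placeholders I made up for illustration.

    ```stata
    sysuse auto, clear
    * flag one observation per (foreign, rep78) combination;
    * egen's tag() leaves observations with missing rep78 untagged by default
    egen byte first_seen = tag(foreign rep78)
    levelsof foreign, local(levels)
    local k : word count `levels'
    * one row per level of foreign, initialised to missing
    matrix D = J(`k', 1, .)
    local i = 0
    foreach l of local levels {
        local ++i
        * number of flagged observations = number of distinct rep78 values
        quietly count if first_seen & foreign == `l'
        matrix D[`i', 1] = r(N)
    }
    matrix list D
    ```

    On the auto data this should give the same counts as the wrapper above, but treat it as a sketch and check it against distinct on your own data, especially if the variable has missing values.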