I have a simple question about the distinct command in Stata. When used with a by prefix, can it return a one-dimensional matrix of r(N)?
For example:
sysuse auto,clear
bysort foreign: distinct rep78
Can I store a [2,1] matrix, with each row holding the number of distinct values of rep78 for one value of foreign?
The manual seems to suggest that it only stores the number of distinct values for the last by-group.
You can easily create your own wrapper for that:
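sysuse auto, clear
sort foreign
* collect the values taken by the by-variable
levelsof foreign, local(foreign_levels)
local number_of_foreign_levels : word count `foreign_levels'
* one row per level of foreign, filled in by the loop
matrix distinct_mat = J(`number_of_foreign_levels', 1, 0)
forvalues i = 1/`number_of_foreign_levels' {
    * look up the i-th level rather than assuming the levels are 0 and 1
    local level : word `i' of `foreign_levels'
    quietly distinct rep78 if foreign == `level'
    matrix distinct_mat[`i', 1] = r(ndistinct)
}
matrix list distinct_mat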
r1 5
r2 3
Note that the number of distinct values is stored in r(ndistinct), not r(N); r(N) holds the number of observations.
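If you want to confirm which stored results are available, a quick check is to run the command once and list what it leaves in r(). This is only a small sketch; it assumes the community-contributed distinct command is already installed (it is not part of official Stata).

```stata
sysuse auto, clear
* run distinct without printing its table; r() results are still saved
quietly distinct rep78
* list every result the command left behind
return list
* the two scalars discussed above
display "observations:    " r(N)
display "distinct values: " r(ndistinct)
```

Inside a loop over groups, the same r(ndistinct) scalar is overwritten on every call, which is why it has to be copied into the matrix right after each call.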