By household, keep data only if observations started after Feb. 2000 - Stata

I am working in Stata and have data that list portfolios (houseID), the year and month, the stockID, and the stock's return. The data span several years and look like this:

[Image: My data]

I am trying to isolate a sub-sample of the data. I would like to keep only those houses, with all of their data, whose first portfolio observation was in February 2000. In the above data, I'd like to drop houses 223 and 382 and keep only the data for 448.

My first attempt was to do something like:

by HouseID: keep if....

but I am continually botching it. Does anyone have any ideas? Thanks for the help!!

Solution

clear all
set more off

input ///
houseid year month
223 1997 1
223 1997 2
223 1998 1
223 2000 1
223 2000 2
223 2000 3
448 2000 2
448 2000 3
end

list

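* sort by date within household, so year[1] and month[1] refer to
* each household's first observation
bysort houseid (year month): keep if year[1] == 2000 & month[1] == 2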

list

keep deletes the unwanted observations from the data in memory. Alternatively, you could mark the subsample of interest with an indicator variable and work with that. For example:

bysort houseid (year month): gen ok = year[1] == 2000 & month[1] == 2

<some command> if ok

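As one concrete instance of the marking approach, the sketch below counts observations and households in the subsample without dropping anything. The variable name tag and the shortened example data are my own choices for illustration; egen's tag() function flags one observation per household.

```stata
clear all
input houseid year month
223 1997 1
223 2000 2
448 2000 2
448 2000 3
end

* 1 for every observation of a household whose first observation is Feb 2000
bysort houseid (year month): gen ok = year[1] == 2000 & month[1] == 2

* flag exactly one observation per household, so households can be counted
egen tag = tag(houseid)

count if ok          // observations in the subsample
count if ok & tag    // households in the subsample
list if ok, sepby(houseid)
```
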
For more advanced date manipulations, try working with date variables. See, for example:

http://www.stata.com/help.cgi?dates_and_times

http://www.stata.com/support/faqs/data-management/handling-date-information/
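
As a minimal sketch of the date-variable route, assuming year and month are numeric as in the example above, the two can be combined into a single monthly date with ym(). The variable name mdate and the shortened example data are my own choices for illustration.

```stata
clear all
input houseid year month
223 1997 1
223 2000 2
448 2000 2
448 2000 3
end

* build a monthly date from year and month (months since 1960m1)
gen mdate = ym(year, month)
format mdate %tm

* keep households whose earliest observation is February 2000
bysort houseid (mdate): keep if mdate[1] == ym(2000, 2)

list, sepby(houseid)
```

With a single date variable, the condition is easy to adjust. For instance, replacing == with >= would keep households first observed in February 2000 or later, if that is what is wanted.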