Modelling the feature space for a text document is quite easy.
For example, I can take every word from a text (training data) as a feature.
If a particular word (e.g. "dog") occurs multiple times in training examples with a given label (e.g. classified as spam), then I can use this word to classify new data.
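As a sketch of what I mean (hypothetical spam example, plain Python, made-up documents), counting word occurrences per class would look like this:

```python
from collections import Counter

# Hypothetical labelled training data: (label, document) pairs
training_data = [
    ("spam", "buy cheap dog food now"),
    ("spam", "cheap dog pills buy now"),
    ("ham", "let us meet for lunch"),
]

# Count how often each word occurs in examples of each class
word_counts = {"spam": Counter(), "ham": Counter()}
for label, text in training_data:
    word_counts[label].update(text.split())

# A word frequent in one class but not the other is a useful feature
print(word_counts["spam"]["dog"])
print(word_counts["ham"]["dog"])
```

Here "dog" shows up only in the spam examples, so its presence in a new document is evidence for the spam class.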
How do I model my features if they aren't only words?
In my specific case, I have features like name, age and family size.
I don't think it is the right way to make an entry for every possible age in my feature vector.
If I assume that humans live no longer than 100 years, I would need 100 entries just for the age feature.
So I thought about data binning: partition the feature "age" into ranges, maybe 1-20, 21-40, 41-60, ...
To model a person with the age of 30, I would then need only 5 entries (01000).
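To make the binning idea concrete, a minimal sketch (bin edges chosen to match the 20-year ranges above, function name `bin_age` is my own):

```python
def bin_age(age, edges=(20, 40, 60, 80, 100)):
    """Map an age to a one-hot vector over the bins 1-20, 21-40, ..., 81-100."""
    vector = [0] * len(edges)
    for i, upper in enumerate(edges):
        if age <= upper:
            vector[i] = 1  # mark the first bin whose upper edge covers the age
            break
    return vector

print(bin_age(30))  # the 21-40 bin: [0, 1, 0, 0, 0]
```

The feature vector length is now the number of bins (5) instead of the number of possible ages (100).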
Is there a better way to model features like these?
It seems that I found an answer: answer1 answer2. Hence one can model such a feature either with data binning or with a (normal) distribution fitted to the continuous feature.
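A minimal sketch of the distribution approach (Gaussian-naive-Bayes style; the per-class ages are made up): fit a normal distribution to the continuous feature separately for each class, then score a new value by its likelihood under each class's distribution.

```python
import statistics
from math import exp, pi, sqrt

# Hypothetical ages observed in the training data, per class
ages = {"spam": [22, 25, 30, 28], "ham": [55, 60, 58, 62]}

# Fit a normal distribution (mean, standard deviation) to each class
params = {c: (statistics.mean(v), statistics.stdev(v)) for c, v in ages.items()}

def gaussian_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2) at x."""
    return exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * sqrt(2 * pi))

# Score a new person's age under each class's fitted distribution
new_age = 27
scores = {c: gaussian_pdf(new_age, m, s) for c, (m, s) in params.items()}
print(max(scores, key=scores.get))  # "spam": 27 is far more likely there
```

This avoids choosing bin edges entirely; the continuous feature stays a single number and the fitted distribution does the work at classification time.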