I would like to examine a large collection of emails that are known to be spam, to discover whether there are sub-types of spam.
Should I use a supervised or an unsupervised learning algorithm?
Thank you.
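Supervised learning applies only if you already have labels for the sub-types. In that case, look into Naive Bayes, which has been used with great success in the past for spam filtering (spam vs. non-spam). If the sub-types are not known in advance and you want to discover them from the emails themselves, that is a clustering task, so use an unsupervised learning algorithm.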
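
If the emails carry no sub-type labels, grouping them is usually framed as clustering. Below is a minimal sketch of that idea, assuming a bag-of-words representation and a simple k-means loop with cosine similarity, using only the standard library. The sample emails, the function names (vectorize, cosine, mean, kmeans) and the choice of k = 2 are all hypothetical and only for illustration; a real collection would need better preprocessing, a smarter initialization, and some way of choosing k.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Turn an email into a unit-length bag-of-words vector (sparse dict)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between unit vector a and (unnormalized) vector b."""
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_b = math.sqrt(sum(v * v for v in b.values())) or 1.0
    return dot / norm_b

def mean(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for vec in vectors:
        for word, value in vec.items():
            total[word] += value
    return {word: value / len(vectors) for word, value in total.items()}

def kmeans(vectors, k, iterations=10):
    """Assign each vector to one of k clusters; returns a list of cluster ids."""
    # Naive initialization (an assumption of this sketch): first k vectors.
    centroids = [dict(v) for v in vectors[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: pick the most similar centroid for each email.
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                  for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels

# Hypothetical, unlabeled spam emails.
emails = [
    "Cheap pills online, discount pharmacy pills",
    "You won the lottery, claim your prize money now",
    "Discount pharmacy: cheap pills shipped online",
    "Claim your lottery prize, you won money",
]

labels = kmeans([vectorize(e) for e in emails], k=2)
for label, email in zip(labels, emails):
    print(label, email)
```

No sub-type labels are passed in anywhere: the groups come only from word overlap between the emails, which is what makes this unsupervised. A Naive Bayes classifier, by contrast, would need each training email to be tagged with its sub-type first.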