
"Learning" filter engines


Are there any "intelligent" or "learning" engines out there that can identify "evil" phrases in text (maybe something like a learning spam filter, e.g. the one used in Thunderbird)?

For example, if I want to filter texts containing email addresses:

asdasd asd as d dgfdgfdgfdg sadasd(at)asfsdf.com

At first the tool wouldn't recognize this as an email address. But if the user "taught" the tool several times (by clicking a "text contains an email address" button, for example) that text containing phrases like "xxxxx(at)xxxxx.xx" is suspicious, it would "learn" to mark such texts automatically in the future.

Question: Is there anything like this on the market? I found some libraries (like SpamAssassin, etc.), but these are specialized for email.


Solution

  • Yeah, this seems to be a good start: http://nbayes.codeplex.com/ (a C# implementation of the Bayesian algorithm)
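
To give a feel for how this kind of Bayesian "learning" filter could handle the "xxxxx(at)xxxxx.xx" case from the question, here is a minimal naive Bayes sketch in Python. It is not the nBayes API; the class name `NaiveBayesFilter`, the `features` helper, the "shape" feature (runs of letters collapsed to `x`, runs of digits to `9`) and the tiny training set are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Return each lowercase token plus a coarse 'shape' of that token.

    The shape feature is an assumption of this sketch: letter runs become
    'x' and digit runs become '9', so 'joe(at)example.org' -> 'x(x)x.x'.
    """
    feats = []
    for token in text.lower().split():
        feats.append("tok:" + token)
        shape = re.sub(r"[a-z]+", "x", token)
        shape = re.sub(r"[0-9]+", "9", shape)
        feats.append("shape:" + shape)
    return feats


class NaiveBayesFilter:
    """Hypothetical multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.feature_counts = defaultdict(Counter)  # label -> feature -> count
        self.doc_counts = Counter()                 # label -> number of texts
        self.vocabulary = set()                     # all features seen so far

    def train(self, text, label):
        """What the user's "this text is suspicious" button click would call."""
        self.doc_counts[label] += 1
        for feat in features(text):
            self.feature_counts[label][feat] += 1
            self.vocabulary.add(feat)

    def classify(self, text):
        """Return the label with the highest log-probability for this text."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # class prior
            total = sum(self.feature_counts[label].values())
            denom = total + len(self.vocabulary)
            for feat in features(text):
                count = self.feature_counts[label][feat]
                score += math.log((count + 1) / denom)  # add-one smoothing
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data standing in for the user's button clicks.
flt = NaiveBayesFilter()
flt.train("asdasd asd as d dgfdgfdgfdg sadasd(at)asfsdf.com", "suspicious")
flt.train("write me at joe(at)example.org please", "suspicious")
flt.train("contact bob(at)mail.net", "suspicious")
flt.train("the weather is nice today", "clean")
flt.train("please call me tomorrow", "clean")
flt.train("see you at the meeting", "clean")

print(flt.classify("reach me via alice(at)somewhere.com"))
print(flt.classify("the weather is nice tomorrow"))
```

The shape feature is what lets the filter generalize from a few marked examples to addresses it has never seen; with plain word tokens alone, every new obfuscated address would look like an unknown word. A real deployment would need far more training data and a probability threshold rather than a simple "best label" decision.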