I need to implement an automated email reply system.
The system has to check incoming emails and reply to each one in the same language in which it was written.
How can I do this? Please suggest some ideas. Thanks in advance.
One more question:
The email headers also include a Content-Type header such as:
Content-Type: text/plain; charset=ISO-8859-1
How reliable is this header for determining the language of the email body?
For example (all headers taken from Gmail):
Chinese subject and body: Content-Type: text/plain; charset=GB2312
Korean subject and body: Content-Type: text/plain; charset=EUC-KR
French/Italian subject and body: Content-Type: text/html; charset=ISO-8859-1
Also, can anyone point me to a list that maps charsets to languages?
Thanks in advance.
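To make the charset idea concrete, here is a rough Python sketch that reads the charset from the Content-Type header and looks it up in a small hint table. The table CHARSET_HINTS and the function language_hint are hypothetical names made up for illustration, not part of any library. The sketch assumes only single-language charsets such as GB2312 or EUC-KR are worth mapping; ISO-8859-1 is shared by many Western European languages (French and Italian both appear above), so it yields no hint.

```python
from email import message_from_string

# Hypothetical hint table (an assumption for illustration): only charsets
# that are tied to a single language give a usable hint.
CHARSET_HINTS = {
    "gb2312": "zh",  # Simplified Chinese
    "euc-kr": "ko",  # Korean
}

def language_hint(raw_email):
    """Return (charset, language hint or None) for a raw email string."""
    msg = message_from_string(raw_email)
    # get_content_charset() returns the charset parameter lowercased,
    # or None when the Content-Type header carries no charset.
    charset = msg.get_content_charset()
    return charset, CHARSET_HINTS.get(charset)

if __name__ == "__main__":
    sample = "Content-Type: text/plain; charset=EUC-KR\n\nbody"
    print(language_hint(sample))
```

A charset such as UTF-8 or ISO-8859-1 gives no hint here, so the header can only narrow things down in some cases; the body text still has to be inspected.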
Google Translate can guess the language of a sample text. Have a look at its API; it could solve your problem, provided you are connected to the internet anyway and don't mind sending fragments of emails to Google's servers.
For offline evaluation, I found the Java Text Categorizing Library.
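To give a feel for what offline detection involves, here is a deliberately simplified Python sketch that scores text against tiny stopword lists. This is not the algorithm or API of the Java Text Categorizing Library, just a stand-in technique for illustration. The names STOPWORDS and guess_language and the word lists themselves are assumptions; a real system would use much larger language profiles.

```python
import re

# Tiny illustrative stopword lists (assumed for this sketch only).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "you", "for"},
    "fr": {"le", "la", "les", "et", "est", "de", "vous", "pour", "je"},
    "it": {"il", "la", "che", "di", "e", "per", "non", "sono", "grazie"},
}

def guess_language(text):
    """Return the language whose stopwords occur most often, or None."""
    # Split into runs of letters (Unicode-aware), ignoring digits and punctuation.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No stopword matched at all: refuse to guess.
    return best if scores[best] > 0 else None

if __name__ == "__main__":
    print(guess_language("Je vous remercie pour la réponse et le message"))
```

Short emails may contain too few common words for this to work, so a reply system would want a fallback language for the None case.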