I generate URL slugs by replacing characters using a lookup table (similar to this). My problem is that Korean joins letters into combined characters, which my table does not handle:
// Korean
'ㄱ'=>'k','ㅋ'=>'kh','ㄲ'=>'kk','ㄷ'=>'t','ㅌ'=>'th','ㄸ'=>'tt','ㅂ'=>'p',
'ㅍ'=>'ph','ㅃ'=>'pp','ㅈ'=>'c','ㅊ'=>'ch','ㅉ'=>'cc','ㅅ'=>'s','ㅆ'=>'ss',
'ㅎ'=>'h','ㅇ'=>'ng','ㄴ'=>'n','ㄹ'=>'l','ㅁ'=>'m', 'ㅏ'=>'a','ㅓ'=>'e','ㅗ'=>'o',
'ㅜ'=>'wu','ㅡ'=>'u','ㅣ'=>'i','ㅐ'=>'ay','ㅔ'=>'ey','ㅚ'=>'oy','ㅘ'=>'wa','ㅝ'=>'we',
'ㅟ'=>'wi','ㅙ'=>'way','ㅞ'=>'wey','ㅢ'=>'uy','ㅑ'=>'ya','ㅕ'=>'ye','ㅛ'=>'yo',
'ㅠ'=>'yu','ㅒ'=>'yay','ㅖ'=>'yey',
The problem is that Korean characters combine to form new characters: 및, for example, is made of three characters. So how can I slug Korean URLs?
First of all, you need to extract the three characters that make up the combined one. For example, ('ㅁ', 'ㅣ', 'ㅊ') is extracted from 및.
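As a rough sketch of that extraction step: precomposed Hangul syllables occupy U+AC00 to U+D7A3 in Unicode and are laid out arithmetically, so the three parts can be recovered with integer division. The function name `decompose_hangul` and the constant names are my own, and the code assumes PHP 7.2+ with the mbstring extension and UTF-8 input. Note that the final-consonant table contains clusters such as 'ㄳ' that are not in your replacement array.

```php
<?php
// Jamo tables in Unicode order; positions match the syllable composition formula
// code = 0xAC00 + (lead * 21 + vowel) * 28 + tail
const LEADS  = ['ㄱ','ㄲ','ㄴ','ㄷ','ㄸ','ㄹ','ㅁ','ㅂ','ㅃ','ㅅ','ㅆ','ㅇ','ㅈ','ㅉ','ㅊ','ㅋ','ㅌ','ㅍ','ㅎ'];
const VOWELS = ['ㅏ','ㅐ','ㅑ','ㅒ','ㅓ','ㅔ','ㅕ','ㅖ','ㅗ','ㅘ','ㅙ','ㅚ','ㅛ','ㅜ','ㅝ','ㅞ','ㅟ','ㅠ','ㅡ','ㅢ','ㅣ'];
const TAILS  = ['','ㄱ','ㄲ','ㄳ','ㄴ','ㄵ','ㄶ','ㄷ','ㄹ','ㄺ','ㄻ','ㄼ','ㄽ','ㄾ','ㄿ','ㅀ','ㅁ','ㅂ','ㅄ','ㅅ','ㅆ','ㅇ','ㅈ','ㅊ','ㅋ','ㅌ','ㅍ','ㅎ'];

// Hypothetical helper: split one precomposed syllable into its jamo.
// Anything outside the syllable block is returned unchanged.
function decompose_hangul(string $char): array
{
    $code = mb_ord($char, 'UTF-8');
    if ($code === false || $code < 0xAC00 || $code > 0xD7A3) {
        return [$char];
    }
    $index = $code - 0xAC00;
    $lead  = intdiv($index, 21 * 28);        // initial consonant
    $vowel = intdiv($index % (21 * 28), 28); // medial vowel
    $tail  = $index % 28;                    // final consonant, 0 means none

    $parts = [LEADS[$lead], VOWELS[$vowel]];
    if ($tail !== 0) {
        $parts[] = TAILS[$tail];
    }
    return $parts;
}

// Usage: decompose_hangul('및') gives ['ㅁ', 'ㅣ', 'ㅊ']
```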
I found some useful links (assuming Unicode); however, I haven't tested any of the code listed below:
If you manage to extract the three characters, I think the rest is simple. Here is a Google link to start searching on your own.
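For the remaining part, here is one possible sketch that goes straight from each syllable to Latin letters, using the romanizations from the question's table indexed by position within the syllable. Several things here are assumptions of mine rather than anything from the question: the name `korean_slug`, treating an initial 'ㅇ' as silent (empty string) instead of 'ng', spelling final clusters such as 'ㄳ' as the concatenation of their parts ('ks'), and using 'yo' for 'ㅛ'. It also assumes PHP 7.2+ with mbstring and valid UTF-8 input, and I haven't tried it on real-world data.

```php
<?php
// Romanizations taken from the question's table, arranged in Unicode jamo order.
// Assumption: initial 'ㅇ' is silent, so its entry is an empty string.
const ROMAN_LEADS  = ['k','kk','n','t','tt','l','m','p','pp','s','ss','','c','cc','ch','kh','th','ph','h'];
const ROMAN_VOWELS = ['a','ay','ya','yay','e','ey','ye','yey','o','wa','way','oy','yo','wu','we','wey','wi','yu','u','uy','i'];
// Assumption: final clusters are written as their component letters joined together.
const ROMAN_TAILS  = ['','k','kk','ks','n','nc','nh','t','l','lk','lm','lp','ls','lth','lph','lh','m','p','ps','s','ss','ng','c','ch','kh','th','ph','h'];

// Hypothetical helper: romanize Hangul syllables, then reduce the result to a slug.
function korean_slug(string $text): string
{
    $out = '';
    // Split into individual UTF-8 characters
    foreach (preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $index = mb_ord($char, 'UTF-8') - 0xAC00;
        if ($index < 0 || $index > 0xD7A3 - 0xAC00) {
            $out .= $char; // not a precomposed syllable, keep as-is
            continue;
        }
        $out .= ROMAN_LEADS[intdiv($index, 588)]   // 588 = 21 vowels * 28 tails
              . ROMAN_VOWELS[intdiv($index % 588, 28)]
              . ROMAN_TAILS[$index % 28];
    }
    // Lower-case, then collapse every run of other bytes into a single hyphen
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($out));
    return trim($slug, '-');
}

// Usage: korean_slug('한국어 및 URL') gives 'hankwuke-mich-url'
```

Indexing by position rather than by jamo character lets the same consonant be spelled differently at the start and at the end of a syllable, which a flat replacement array cannot do.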