I want a regex that matches only characters satisfying both of these:
/\p{Script=Han}/gu
AND /\p{Alphabetic}/gu
This would mean that:
﨑 matches, it's both Han and Alphabetic
〆 doesn't match, it's Alphabetic but not Han
⺁ doesn't match, it's Han but not Alphabetic (it's a radical)

Ideally someone can show me how to do it with browser-based JavaScript.
PS:
I was using this before: /[\u4e00-\u9faf\u3400-\u4dbf]/g
but the issue is that it won't match all Han characters, such as 﨑,
so I'd rather use /\p{Script=Han}/gu
but exclude any non-alphabetic characters such as radicals.
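You can use a positive lookahead assertion to match only Alphabetic characters that are also Han, with something like the following:
const expr = /(?=\p{Script=Han})\p{Alphabetic}/gu;
The lookahead checks that the next character is Han without consuming it, and \p{Alphabetic} then has to match that same character, so only characters with both properties match. I believe this gives your desired output.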
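To check it against the three characters from the question, here is a small sketch. The helper name hanAlphabeticChars and the sample string are mine, purely for illustration, and it assumes an engine with Unicode property escape support (ES2018 or later):

```javascript
// Matches one character that is both Script=Han and Alphabetic.
// The lookahead tests Script=Han without consuming anything;
// \p{Alphabetic} then consumes that same character.
const hanAlphabetic = /(?=\p{Script=Han})\p{Alphabetic}/gu;

// Hypothetical helper: returns every matching character in `text`.
function hanAlphabeticChars(text) {
  // String.prototype.match with a /g regex returns null when nothing matches.
  return text.match(hanAlphabetic) || [];
}

console.log(hanAlphabeticChars("﨑〆⺁漢字")); // → ["﨑", "漢", "字"]
console.log(hanAlphabeticChars("〆⺁"));      // → []
```

I used match() rather than test() here because a regex with the g flag keeps state in lastIndex between test() calls, which can give surprising results when the same regex object is reused on several strings.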