javascript html regex fonts html-parsing

How to Parse Fancy Text From a Textarea


I am running into issues when fancy text and emojis are copied and pasted into a textarea,

Like 😋 and 🅵🅰🅽🅲🆈 🆃🅴🆇🆃 🅶🅴🅽🅴🆁🅰🆃🅾🆁

I have removed the emojis with the following code:

e.content.replace(/([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g, '')

I also want to remove the special fonts and fancy text, but I cannot find a way to do it.

Is there a way to do this, like I did for the emojis?


Solution

  • An ECMAScript 6 regex that matches the squared letters is

    .replace(/[\u{1F170}-\u{1F189}]+/gu, '')
    

    To also match punctuation and symbols (math symbols included), you can use the following ECMAScript 2018+ compliant regex:

    .replace(/[\u{1F170}-\u{1F189}\p{P}\p{S}]+/gu, '')
    

    The u flag is required to make \u{XXXX} notation and \p{X} Unicode categories work.

    Pattern details

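    • \u{1F170}-\u{1F189} - negative squared Latin capital letters, 🅰 (U+1F170) through 🆉 (U+1F189)
    • \p{P} - any punctuation character, including ordinary ASCII punctuation such as commas and periods
    • \p{S} - any symbol character: math, currency, modifier, and other symbols (which covers many emojis)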
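
As a rough sketch of how the two patterns above behave on a sample string, something like the following could be used. The helper name stripFancyText and its aggressive flag are made up for illustration and are not part of any library; the sample string is also an assumption.

```javascript
// Sketch only: stripFancyText and its `aggressive` option are hypothetical names.

// Negative squared Latin capital letters, U+1F170 through U+1F189.
const SQUARED_LETTERS = /[\u{1F170}-\u{1F189}]+/gu;

// Same range plus every punctuation (\p{P}) and symbol (\p{S}) character.
// The \p{...} escapes need an ECMAScript 2018+ engine and the u flag.
const SQUARED_PUNCT_SYMBOLS = /[\u{1F170}-\u{1F189}\p{P}\p{S}]+/gu;

function stripFancyText(input, aggressive = false) {
  const pattern = aggressive ? SQUARED_PUNCT_SYMBOLS : SQUARED_LETTERS;
  // replace() returns a new string; the input itself is left unchanged.
  return input.replace(pattern, '');
}

// The escapes spell "FANCY" in negative squared letters.
const sample = 'Hello, \u{1F175}\u{1F170}\u{1F17D}\u{1F172}\u{1F188} world!';

console.log(stripFancyText(sample));       // "Hello,  world!"
console.log(stripFancyText(sample, true)); // "Hello  world"
```

The second call also drops the comma and the exclamation mark, so the broader pattern is probably only appropriate when ordinary punctuation does not need to survive.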