jquery regex twitter arabic

How can I use regex to exclude URLs from a jQuery script that matches mixed Arabic and English text?


This is a follow-up question to "How to detect text language with jQuery and XRegExp to display mixed RTL and LTR text correctly".

I'm using the Custom Twitter Feed plugin to display a client's Twitter feed on a website. He tweets in both English and Arabic. To test whether a tweet is in English or Arabic, and then style it as ltr or rtl, I'm using this simple jQuery script suggested by the developer.

jQuery('.tweet').each(function() {

  if (jQuery(this).find('.tweet-text').text().match(/[a-z]/i)) {
    jQuery(this).find('.tweet-text').addClass("ltr");
  } else {
    jQuery(this).find('.tweet-text').addClass("rtl");
  }
});

This works well for most tweets. However, it fails when the client includes a URL in an Arabic tweet: the Latin letters in the URL match /[a-z]/i, so the whole tweet is styled as ltr. For example:

انه أن جمعت والديون, مما الأمم وسمّيت و, ٣٠ الشرقية الفرنسي دنو. وحتى اتّجة أي عدد, ٠٨٠٤ بالفشل العمليات بين أن, صفحة شعار اليميني عرض من. يتبقّ وكسبت عدم و. ومن مسارح المضي أم, أمدها لأداء يتم إذ. https://www.google.co.uk/
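
To illustrate the failure in isolation, here is a minimal sketch using a shorter, made-up Arabic string (not a real tweet) and the same pattern as the script above. The variable names are only for illustration:

```javascript
// Made-up sample text, used only to reproduce the problem.
var arabicOnly = 'مرحبا بالعالم';
var arabicWithUrl = arabicOnly + ' https://www.google.co.uk/';

// Same pattern the script uses to decide a tweet is English.
var latinLetter = /[a-z]/i;

console.log(latinLetter.test(arabicOnly));        // false
console.log(latinLetter.test(arabicWithUrl));     // true
console.log(arabicWithUrl.match(latinLetter)[0]); // "h" (from "https")
```

So the first Latin letter of the URL is enough to make the script add the "ltr" class.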

How can I exclude all URLs from the text that the script tests?


Solution

  • I've found an answer that seems to work by approaching the problem from a different angle. Instead of trying to exclude the URLs, I match only the first word of the tweet against English letters and set the text direction from that. The revised script is now:

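    jQuery('.tweet').each(function() {
      // Look up the text element once and reuse it.
      var $text = jQuery(this).find('.tweet-text');
      // Trim first so leading whitespace doesn't produce an empty first word.
      var firstWord = $text.text().trim().split(/\s+/)[0];

      // Latin letters in the first word mean English (ltr); otherwise assume Arabic (rtl).
      if (/[a-z]/i.test(firstWord)) {
        $text.addClass("ltr");
      } else {
        $text.addClass("rtl");
      }
    });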
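
If you would rather exclude URLs directly, as the question originally asked, one possible approach is to strip them from the text before running the letter test. This is only a sketch under some assumptions: `URL_PATTERN`, `stripUrls` and `textDirection` are hypothetical names of my own, not part of the Custom Twitter Feed plugin, and the pattern assumes URLs begin with `http://`, `https://` or `www.` and run until the next whitespace.

```javascript
// Assumption: a URL starts with http://, https:// or www. and ends at whitespace.
var URL_PATTERN = /(?:https?:\/\/|www\.)\S+/gi;

// Hypothetical helper: replace every URL with a space.
function stripUrls(text) {
  return text.replace(URL_PATTERN, ' ');
}

// Hypothetical helper: 'ltr' if any Latin letter remains once URLs are removed,
// otherwise 'rtl'.
function textDirection(text) {
  return /[a-z]/i.test(stripUrls(text)) ? 'ltr' : 'rtl';
}

// Apply to the feed only when jQuery is available on the page.
if (typeof jQuery !== 'undefined') {
  jQuery('.tweet').each(function () {
    var $text = jQuery(this).find('.tweet-text');
    $text.addClass(textDirection($text.text()));
  });
}
```

Note that this still treats an Arabic tweet as ltr if it contains any other Latin text, such as an @mention or an English hashtag, and it will not catch links written without a scheme or `www.` prefix. The first-word approach above avoids the first of those cases.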