android, regex, highlight

Highlight a search word that contains an apostrophe on Android


I'm using FTS4 in my Android application to implement full-text search. The data in the app, which comes from an API, contains diacritics and accents. I've created two columns in the database: one stores the original data and the other stores the same data with diacritics and accents stripped (using Normalizer). Searching works when I search for words without diacritics or accents. The problem arises when I want to highlight the search query where it is found in the text.

For example, take this sentence, which I got from SO:

James asked, “’Tis Renée’s and Noël’s great‐grandparents’ 1970's-ish summer‐house, t'isn’t it?” Receiving no answer, he shook his head--and walked away.

If I search for Renee, it highlights Renée. But when I search for Renees, it successfully finds text that contains the word Renée’s, yet because of the apostrophe it does not highlight it.

    Search Term: Renee
    Highlighted Output: Renée
    
    Search Term: Renees
    Highlighted Output: <whitespace>Renée’ <-- doesn't show the expected output
    Expected Output: Renée’s

If I use replaceAll to remove all the apostrophes before matching, the word Renée’s does get highlighted, but the highlight is shifted: it covers the whitespace before the word and stops at the apostrophe, like so -> Renée’. The highlight is pushed back even further when more apostrophes earlier in the paragraph have been stripped.

Basically I want to show Renée’s in the paragraph displayed to the user and highlight the whole word even if the user searches for Renees.

Here's my code to highlight searched text:

    if (searchQuery != null) {
        String paragraph = data.getParagraph();
        SpannableStringBuilder sb = new SpannableStringBuilder(paragraph);

        String normalizedText = Normalizer.normalize(paragraph, Normalizer.Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "").toLowerCase();

        //String normalizedText = Normalizer.normalize(paragraph, Normalizer.Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "").replaceAll("'", "").toLowerCase(); //remove all apostrophes -- this works but pushes back the highlighted text color because it doesn't count all stripped apostrophes in the original paragraph.

        Pattern word = Pattern.compile(searchQuery, Pattern.CASE_INSENSITIVE);
        Matcher match = word.matcher(normalizedText);

        while (match.find()) {
            BackgroundColorSpan fcs = new BackgroundColorSpan(Color.YELLOW);
            sb.setSpan(fcs, match.start(), match.end(), Spannable.SPAN_EXCLUSIVE_EXCLUSIVE);
        }
        text.setText(sb);
    }

How do I highlight the searched word even when it contains an apostrophe?


Solution

  • You can add a ['’]? pattern (which matches an optional ' or ’ character) between each pair of characters in the searchQuery:

    Pattern word = Pattern.compile(TextUtils.join("['’]?", searchQuery.split("")), Pattern.CASE_INSENSITIVE);
    

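    This way, the search phrase will match even if the text has an apostrophe between any two of its characters. Because the apostrophes stay in the normalized text, match.start() and match.end() still line up with the original paragraph.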

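
The optional-apostrophe idea can also be tried outside Android. The following is a minimal sketch in plain Java, not the answer's exact code: the class name ApostropheHighlight and the helpers stripDiacritics, buildPattern and findSpans are hypothetical names chosen for illustration. It uses String.join in place of TextUtils.join and additionally quotes each query character with Pattern.quote, so that regex metacharacters in the query are matched literally. It assumes the paragraph stores accented letters in precomposed (NFC) form and that lowercasing does not change the text length, so offsets in the normalized text equal offsets in the original.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ApostropheHighlight {

    // An optional straight or typographic apostrophe, allowed between query characters.
    private static final String OPTIONAL_APOSTROPHE = "['\u2019]?";

    // Same normalization as in the question: decompose, drop combining marks, lowercase.
    static String stripDiacritics(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
                .toLowerCase();
    }

    // Quotes every code point of the query and joins them with the optional-apostrophe pattern.
    static Pattern buildPattern(String searchQuery) {
        List<String> parts = new ArrayList<>();
        searchQuery.codePoints().forEach(cp ->
                parts.add(Pattern.quote(new String(Character.toChars(cp)))));
        return Pattern.compile(String.join(OPTIONAL_APOSTROPHE, parts),
                Pattern.CASE_INSENSITIVE);
    }

    // Returns the [start, end) offsets of every match found in the normalized paragraph.
    static List<int[]> findSpans(String paragraph, String searchQuery) {
        List<int[]> spans = new ArrayList<>();
        if (searchQuery == null || searchQuery.isEmpty()) {
            return spans; // an empty pattern would match at every position
        }
        Matcher match = buildPattern(searchQuery).matcher(stripDiacritics(paragraph));
        while (match.find()) {
            spans.add(new int[] {match.start(), match.end()});
        }
        return spans;
    }

    public static void main(String[] args) {
        String paragraph = "It is Ren\u00e9e\u2019s house";
        for (int[] span : findSpans(paragraph, "Renees")) {
            // Print the slice of the ORIGINAL paragraph covered by the match.
            System.out.println(paragraph.substring(span[0], span[1]));
        }
    }
}
```

In the Android code from the question, each returned pair would be passed to sb.setSpan(...) in place of match.start() and match.end().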