
Highlighting numbers in JTextPane using Regex


I'm trying to highlight numbers written inside a JTextPane.

This is my code:

//Highlight.java

import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import javax.swing.event.*;

class Highlight

{

 public static void main(String[] abc)

 {

  JFrame frame = new JFrame("Highlighting");
  frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

  JTextPane textPane = new JTextPane();
  textPane.setDocument(new NumberHighlight());

  frame.add(textPane);

  frame.setSize(450,450);
  frame.setLocationRelativeTo(null);
  frame.setVisible(true);

 }

}

//NumberHighlight.java

import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import javax.swing.text.*;
import javax.swing.text.MutableAttributeSet.*;

class NumberHighlight extends DefaultStyledDocument

{
private static final  MutableAttributeSet BOLD = new SimpleAttributeSet();

 private static int findLastNonWordChar (String text, int index)
 {
  while (--index >= 0)
  {
   if (String.valueOf(text.charAt(index)).matches("\\W"))
   {
    break;
   }
  }
  return index;
 }

 private static int findFirstNonWordChar (String text, int index)
 {
  while (index < text.length())
  {
   if (String.valueOf(text.charAt(index)).matches("\\W"))
   {
    break;
   }
   index++;
  }
  return index; 
 }
  final StyleContext cont = StyleContext.getDefaultStyleContext();

  final AttributeSet attp = cont.addAttribute(cont.getEmptySet(), StyleConstants.Foreground, new Color(255,0,255));
  final AttributeSet attrBlack = cont.addAttribute(cont.getEmptySet(), StyleConstants.Foreground, Color.BLACK);

  public void insertString (int offset, String str, AttributeSet a) throws BadLocationException
  {
   super.insertString(offset, str, a);
   String text = getText(0, getLength());
   int before = findLastNonWordChar(text, offset);
   if (before < 0) before = 0;
   int after = findFirstNonWordChar(text, offset + str.length());
   int wordL = before;
   int wordR = before;
   while (wordR <= after)
   {
    if (wordR == after || String.valueOf(text.charAt(wordR)).matches("\\W"))
    {
     if (text.substring(wordL, wordR).matches("(\\W)*(\\d+$)"))
     {
       setCharacterAttributes(wordL, wordR - wordL, BOLD, false);
       setCharacterAttributes(wordL, wordR - wordL, attp, false);
     }      
     else
     {
      StyleConstants.setBold(BOLD, false);
      setCharacterAttributes(wordL, wordR - wordL, BOLD, true);
      setCharacterAttributes(wordL, wordR - wordL, attrBlack, false);
     }  
     wordL = wordR;
    }
    wordR++;
   }
  }
  public void remove (int offs, int len) throws BadLocationException
  {
   super.remove(offs, len);
   String text = getText(0, getLength());
   int before = findLastNonWordChar(text, offs);
   if (before < 0) before = 0;
   int after = findFirstNonWordChar(text, offs);
   if (text.substring(before, after).matches("(\\W)*(\\d+$)"))
   {
    setCharacterAttributes(before, after - before, BOLD, false);
    setCharacterAttributes(before, after - before, attp, false);
   } 
   else
   {
    StyleConstants.setBold(BOLD, false);
    setCharacterAttributes(before, after - before, BOLD, true);
    setCharacterAttributes(before, after - before, attrBlack, false);
   }
  }

 }

I'm facing a problem with my regex. Everything works fine, but if I type an alphanumeric character before a number, that character's attributes change as well.

I don't face this problem when an alphanumeric character is inserted after a number.

Example: [image: JTextPaneregex]

What am I doing wrong with my regex?

Thanks!


Solution

  • I don't quite understand what your goal is, but I can tell you what your regular expressions are doing, and hopefully that should help you.

    String.valueOf(text.charAt(index)).matches("\\W")
    

    This takes a single character and returns true if it is not a word character (a word character is a letter, digit, or underscore).
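
The single-character check described above can be tried in isolation. This is a minimal sketch, not part of the original code: the class name `NonWordCharDemo` and the helper `isNonWordChar` are made up for illustration, and the pattern is precompiled only because `String.matches` compiles its regex on every call.

```java
import java.util.regex.Pattern;

class NonWordCharDemo {
    // Hypothetical helper: same test as String.valueOf(c).matches("\\W"),
    // but with the pattern compiled once instead of on every call.
    private static final Pattern NON_WORD = Pattern.compile("\\W");

    static boolean isNonWordChar(char c) {
        return NON_WORD.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        // Letters, digits and the underscore are word characters;
        // the space and '@' are not.
        for (char c : new char[] {'a', '7', '_', ' ', '@'}) {
            System.out.println("'" + c + "' -> " + isNonWordChar(c));
        }
    }
}
```

Here `isNonWordChar` returns false for `a`, `7` and `_`, and true for the space and `@`.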

    text.substring(wordL, wordR).matches("(\\W)*(\\d+$)")
    

    This returns true if the substring

    1. starts with zero or more non-word characters: (\\W)*
    2. followed by one or more digits: (\\d+$)
    3. and contains nothing else. The matches function anchors the match to both the start and the end of the string, so the dollar sign in your regex is redundant, although it does no harm.

    Because the whole string has to match, @*#&$%^@( ||3 matches, as does @3.

    These strings do not match: 3@ (because it ends with a non-digit) and 3@3 (because the leading digit is followed by @, and non-word characters are only allowed before the digits).

    There is also no use for the capture groups, but again, they do no harm. In other words, you could just as well use \\W*\\d+.
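
The claims above are easy to verify with a small test harness. The sketch below is illustrative only: the class `MatchesAnchoringDemo`, its constants, and the extra samples `3` and `a3` are not from the question. It runs both the original and the simplified pattern through `String.matches` on the same inputs.

```java
class MatchesAnchoringDemo {
    // The regex from the question and the simplified form suggested above.
    static final String ORIGINAL = "(\\W)*(\\d+$)";
    static final String SIMPLIFIED = "\\W*\\d+";

    // Hypothetical helpers; String.matches requires the whole string to match.
    static boolean matchesOriginal(String s) {
        return s.matches(ORIGINAL);
    }

    static boolean matchesSimplified(String s) {
        return s.matches(SIMPLIFIED);
    }

    public static void main(String[] args) {
        String[] samples = {"@*#&$%^@( ||3", "@3", "3", "3@", "3@3", "a3"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" original=" + matchesOriginal(s)
                    + " simplified=" + matchesSimplified(s));
        }
    }
}
```

Both patterns accept `@*#&$%^@( ||3`, `@3` and `3`, and both reject `3@`, `3@3` and `a3`. The last case shows that a segment with a letter directly in front of the digits is rejected as a whole.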

    (The ^ and $ are in the below regexes to simulate the anchoring as required by matches.)

    ^(\W)*(\d+$)
    

    ^\W*\d+$
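
The remark about simulating the anchoring can be checked by comparing `Matcher.find()`, which looks for a match anywhere in the input, with `Matcher.matches()`, which requires the whole input to match. This is a rough sketch with made-up names (`FindVersusMatchesDemo` and its helpers), and it assumes inputs without a trailing line terminator, since `$` may also match just before one.

```java
import java.util.regex.Pattern;

class FindVersusMatchesDemo {
    static final Pattern UNANCHORED = Pattern.compile("\\W*\\d+");
    static final Pattern ANCHORED = Pattern.compile("^\\W*\\d+$");

    // find() succeeds if any part of the input matches the pattern.
    static boolean findUnanchored(String s) {
        return UNANCHORED.matcher(s).find();
    }

    // With explicit ^ and $, find() has to cover the whole input.
    static boolean findAnchored(String s) {
        return ANCHORED.matcher(s).find();
    }

    // matches() always requires the whole input, even without ^ and $.
    static boolean matchesUnanchored(String s) {
        return UNANCHORED.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"@3", "3@", "3@3"}) {
            System.out.println("\"" + s + "\" find=" + findUnanchored(s)
                    + " anchoredFind=" + findAnchored(s)
                    + " matches=" + matchesUnanchored(s));
        }
    }
}
```

For `3@` and `3@3`, the unanchored `find()` succeeds because it locates the digit `3`, while the anchored `find()` and `matches()` both fail. For `@3`, all three succeed.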
    


    Please consider bookmarking the Stack Overflow Regular Expressions FAQ for future reference. All the links in this answer come from it.