Access/Change JEditorPane's html loaded elements + HTMLEditorKit problem with Unicode (Java)


This is going to be a long question, so bear with me :)

My Application

I'm developing a Java (with JFrame GUI) desktop application that does the following:

  1. Scans (.txt) files.
  2. Parses some numbers from these files, performs some calculations on them, and finally stores the results in String variables.
  3. Outputs these numbers in a special (table) format. (Note: the format includes some Unicode (Arabic) characters.)

Problem

The first two parts went smoothly. However, when I came to the third part (the formatted output), I didn't know how to display this special format, so:

  • What is the best way to display a special formatted output (table) in Java?

Note: Formatter is not going to help because it has no proper support for tables.

Solution One:

I did my research and found that I could use JEditorPane, since it can display special formats such as "html". So I decided to create an "html" page with the needed (table) format and then display this page in a JEditorPane. I did that, and it went smoothly until I wanted to change some html elements' values to the numbers parsed from those (.txt) files.

  • How can I access an html element (e.g. <td></td>) and change its value?

Note that the (.html) is loaded inside JEditorPane using setPage(url).

The Unicode characters are displayed properly, but I couldn't change the values of some of the elements (e.g. I want to change <td> 000,000,000 </td> to <td> MainController.getCurrentTotalPayment() </td>).

Solution Two:

I've found a workaround for this which involves using HTMLDocument and HTMLEditorKit. That way I can create the (.html) from scratch using HTMLEditorKit and display it in the JEditorPane using kit.insertHTML.

I have successfully added the content using the above method and I also was able to add the parsed numbers from (.txt) files because I have them stored in my (MainController) class. Unfortunately, the Unicode Arabic characters were not displayed properly.

  • How can I display these Unicode characters properly?

So the first solution lacks access to the html elements and the second lacks Unicode support!

My colleagues advised me to use JSP code in the html document, which could access my MainController.java class, so that the page would be loaded into the JEditorPane with the html elements already changed. Isn't there a way to do that without the help of JSP?

Some other people recommended the use of JTidy but isn't there a way to do it within Java's JDK?

I'm open to all possible solutions. Please help.

My Code: Some code was omitted because it is not relevant

MainController.java

class MainController 
{
    private static String currentTotalPayment;

    public static void main(String[] args) 
    {
        CheckBankFilesView cbfView = new CheckBankFilesView();
        cbfView.setVisible(true);
    }

    public static void setCurrentTotalPayment(String totalPayment) {
        MainController.currentTotalPayment = totalPayment;
    }

    public static String getCurrentTotalPayment() {
        return currentTotalPayment;
    }
}

MyFormattedOuputSolutionOne.java:

public class MyFormattedOuputSolutionOne extends javax.swing.JFrame {

    private void MyFormattedOuputSolutionOne() {

        jPanel1 = new javax.swing.JPanel();
        jScrollPane1 = new javax.swing.JScrollPane();
        myFormattedOuput = new javax.swing.JEditorPane();

        myFormattedOuput.setContentType("text/html");
        //myFormattedOuput.setContentType("text/html; charset=UTF-8"); //Doesn't seem to work

        myFormattedOuput.setEditable(false);

        jScrollPane1.setViewportView(myFormattedOuput);

        myFormattedOuput.setComponentOrientation(ComponentOrientation.RIGHT_TO_LEFT);

        try{
            myFormattedOuput.setPage(getClass().getResource("resources/emailFormat2.html"));

            //How can I edit/change html elements loaded in 'myFormattedOuput'?
        }catch(Exception e){
        }
    }
}

MyFormattedOuputSolutionTwo.java:

public class MyFormattedOuputSolutionTwo extends javax.swing.JFrame {

    private void MyFormattedOuputSolutionTwo() {

        jPanel1 = new javax.swing.JPanel();
        jScrollPane1 = new javax.swing.JScrollPane();
        myFormattedOuput = new javax.swing.JEditorPane();

        myFormattedOuput.setContentType("text/html");
        //myFormattedOuput.setContentType("text/html; charset=UTF-8"); //Doesn't seem to work

        myFormattedOuput.setEditable(false);

        jScrollPane1.setViewportView(myFormattedOuput);

        HTMLEditorKit kit = new HTMLEditorKit();

        HTMLDocument doc = new HTMLDocument();

        myFormattedOuput.setEditorKit(kit);

        myFormattedOuput.setDocument(doc);

        myFormattedOuput.setComponentOrientation(ComponentOrientation.RIGHT_TO_LEFT);

        try{
            // Tried to set the charset in <head> but it doesn't work!
            //kit.insertHTML(doc, 1, "<meta http-equiv = \"Content-Type\" content = \"text/html; charset=UTF-8\">", 0, 0, HTML.Tag.META);

            kit.insertHTML(doc, doc.getLength(), "<label> السلام عليكم ورحمة الله وبركاته ,,, </label>", 0, 0, null); //Encoding problem
            kit.insertHTML(doc, doc.getLength(), "<br/>", 0, 0, null); // works fine
            kit.insertHTML(doc, doc.getLength(), MainController.getCurrentTotalPayment(), 0, 0, null); // works fine

            //How can I solve the Unicode problem above?
        }catch(Exception e){
        }
    }
}

htmlFormatTable.html:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<html>

    <head>

        <meta http-equiv = "Content-Type" content = "text/html; charset=UTF-8">

    </head>

    <body>

        <label> السلام عليكم ورحمة الله وبركاته ,,, </label>
        <br/>
        <label>  الأخوة الكرام نفيدكم بتفاصيل المدفوعات لشهر  </label> XX/143X </label>  هـ كما هو موضح ادناه  </label>
        <br/>
        <table align="right"  border="1" width="600" cellpadding="5" cellspacing="0">
            <tr char="utf-8" bgcolor="cccccc" align="center">
                <td colspan="3">   <label> تفاصيل مدفوعات بنك الرياض </label>  <img src="..\images\riyadh.gif" width="65" height="15"/> </td>
            </tr>
            <tr align="center">
                <td></td>
                <td id="cell1">0,000,000.00</td>
                <td align="right"> معاشات </td>
            </tr>
            <tr align="center">
                <td></td>
                <td id="cell2">0,000,000.00</td>
                <td align="right"> أخطار </td>
            </tr>
            <tr align="center">
                <td bgcolor="cccccc"> المجموع </td>
                <td bgcolor="cccccc">   0,000,000.00 <label> ريال سعودي </label> </td>
                <td></td>
            </tr>
        </table>
        <br/>
        <label> شاكرين لكم حسن تعاونكم ...... </label>
        <br/>
        <label> فريق العمليات بقسم الحاسب الآلي </label>

    </body>

</html>

Thank you for reading my long, multi-question thread. I look forward to your answers.

Update:

Thanks to @Howard for this insight: if I replace an Arabic character with its corresponding Unicode escape (e.g. ب = \u0628), it works fine. But there must be a way to do it without replacing each character, right?


Solution

  • Solution One

    It is possible to edit HTML loaded into JEditorPane. Here's the complete code based on your MyFormattedOuputSolutionOne.java:

    import java.awt.ComponentOrientation;
    import java.beans.PropertyChangeEvent;
    import java.beans.PropertyChangeListener;
    
    import javax.swing.JEditorPane;
    import javax.swing.JScrollPane;
    import javax.swing.SwingUtilities;
    import javax.swing.text.BadLocationException;
    import javax.swing.text.Document;
    import javax.swing.text.Element;
    import javax.swing.text.SimpleAttributeSet;
    
    public class MyFormattedOuputSolutionOne extends javax.swing.JFrame {
    
        private MyFormattedOuputSolutionOne() {
            super("MyFormattedOuputSolutionOne");
            setDefaultCloseOperation(DISPOSE_ON_CLOSE);
    
            JScrollPane jScrollPane1 = new javax.swing.JScrollPane();
            final JEditorPane myFormattedOuput = new javax.swing.JEditorPane();
    
            getContentPane().add(jScrollPane1);
    
            myFormattedOuput.setContentType("text/html");
            //myFormattedOuput.setContentType("text/html; charset=UTF-8"); //Doesn't seem to work
    
            myFormattedOuput.setEditable(false);
    
            jScrollPane1.setViewportView(myFormattedOuput);
    
            myFormattedOuput.setComponentOrientation(ComponentOrientation.RIGHT_TO_LEFT);
    
            // Register the listener before calling setPage so that the "page"
            // event cannot be missed, whether the page loads synchronously or not.
            myFormattedOuput.addPropertyChangeListener(new PropertyChangeListener() {
    
                @Override
                public void propertyChange(PropertyChangeEvent evt) {
                    if ("page".equals(evt.getPropertyName())) {
                        Document doc = myFormattedOuput.getDocument();
                        Element html = doc.getRootElements()[0];
                        Element body = html.getElement(1);   // <head> is child 0
                        Element table = body.getElement(1);
                        try {
                            // First <td> of the 2nd row
                            Element tr2 = table.getElement(1);
                            Element tr2td1 = tr2.getElement(0);
                            doc.insertString(tr2td1.getStartOffset(), "1: 123,456",
                                             SimpleAttributeSet.EMPTY);
    
                            // First <td> of the 3rd row
                            Element tr3 = table.getElement(2);
                            Element tr3td1 = tr3.getElement(0);
                            doc.insertString(tr3td1.getStartOffset(), "2: 765.123",
                                             SimpleAttributeSet.EMPTY);
                        } catch (BadLocationException e) {
                            e.printStackTrace();
                        }
                        // The edit is needed only once per load
                        myFormattedOuput.removePropertyChangeListener(this);
                    }
                }
    
            });
    
            try {
                myFormattedOuput.setPage(getClass().getResource("htmlFormatTable.html"));
            } catch (Exception e) {
                e.printStackTrace();
            }
    
            pack();
            setSize(700, 400);
            setVisible(true);
        }
    
        public static void main(String[] args) {
            SwingUtilities.invokeLater(new Runnable() {
                @Override
                public void run() {
                    new MyFormattedOuputSolutionOne();
                }
            });
        }
    }
    

    The page is loaded asynchronously, so the code waits for the "page" property change, which signals that loading has finished. It then walks the document's element tree and inserts text into the first <td> of the 2nd and 3rd rows of the table.

    By the way, your HTML is not valid! You should clean it up. Once you do, the indexes of the document elements will change, and you'll have to adjust the code which finds the insertion points.
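
    Since the cells in your HTML already carry id attributes (cell1, cell2), one way to avoid depending on child indexes is to look a cell up by id with HTMLDocument.getElement(String) and replace its content with HTMLDocument.setInnerHTML. The following is only a minimal sketch under that assumption; the class name CellById, the helper names, and the shortened sample HTML are made up for illustration and are not part of your project.

```java
import java.io.StringReader;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class CellById {

    /** Parses the given HTML into a fresh HTMLDocument (no window needed). */
    static HTMLDocument parse(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // The input is already a Java String, so a charset directive in the
        // HTML must not restart the read.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }

    /** Replaces the content of the element whose id attribute equals cellId. */
    static void setCell(HTMLDocument doc, String cellId, String value) throws Exception {
        Element cell = doc.getElement(cellId);
        if (cell == null) {
            throw new IllegalArgumentException("No element with id " + cellId);
        }
        // value is interpreted as HTML; escape it first if it may contain markup.
        doc.setInnerHTML(cell, value);
    }

    public static void main(String[] args) throws Exception {
        // Shortened, hypothetical stand-in for htmlFormatTable.html
        String html = "<html><body><table>"
                + "<tr><td>a</td><td id=\"cell1\">0,000,000.00</td></tr>"
                + "<tr><td>b</td><td id=\"cell2\">0,000,000.00</td></tr>"
                + "</table></body></html>";
        HTMLDocument doc = parse(html);
        setCell(doc, "cell1", "1,234.56");
        System.out.println(doc.getText(0, doc.getLength()));
    }
}
```

    Inside the property change listener you would presumably cast myFormattedOuput.getDocument() to HTMLDocument and call setCell on it, instead of counting <tr> and <td> children.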

    The window looks this way: [screenshot: Solution One window]

  • Solution Two

    I've found no issues with encoding; the characters display correctly. However, I had to set the encoding of the Java source files to UTF-8 in the Eclipse project.
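
    The likely reason is that the Arabic string literals live in the .java file itself: if the compiler reads a UTF-8 file with another charset, the literals are already garbled before Swing ever sees them. Outside Eclipse, the equivalent setting is the javac option -encoding UTF-8. The small sketch below only simulates that mis-decoding (the class and method names are made up); it shows how one Arabic letter turns into two unrelated characters when its UTF-8 bytes are read as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

public class SourceEncodingDemo {

    /** Simulates a tool reading UTF-8 bytes with a wrong single-byte charset. */
    static String decodeAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // ARABIC LETTER BEH, written as an escape so that this demo itself
        // does not depend on the encoding of the source file.
        String beh = "\u0628";
        String garbled = decodeAsLatin1(beh);
        System.out.println(beh.length());     // prints 1
        System.out.println(garbled.length()); // prints 2
    }
}
```

    This would also explain why the escaped form from your update works regardless of the file encoding: an escape is plain ASCII in the source file.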

    [screenshot: Solution Two with correctly displayed Arabic]

  • Solution Three

    Have you considered using JTable to display the table of results in the UI?
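
    A rough sketch of what that could look like follows. The class name, the column headers, and the English row labels are placeholders chosen for illustration (you would use your Arabic labels), and the amounts are invented sample values.

```java
import java.awt.ComponentOrientation;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTable;
import javax.swing.SwingUtilities;
import javax.swing.table.DefaultTableModel;

public class PaymentsTableSketch {

    /** Builds a read-only model with one row per payment type plus a total row. */
    static DefaultTableModel buildModel(String pensions, String hazards, String total) {
        DefaultTableModel model = new DefaultTableModel(new Object[] {"Item", "Amount"}, 0) {
            @Override
            public boolean isCellEditable(int row, int column) {
                return false; // display only
            }
        };
        model.addRow(new Object[] {"Pensions", pensions});
        model.addRow(new Object[] {"Hazards", hazards});
        model.addRow(new Object[] {"Total", total});
        return model;
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JTable table = new JTable(buildModel("1,234.56", "789.00", "2,023.56"));
            // Lay the columns out right to left, as for the HTML version
            table.setComponentOrientation(ComponentOrientation.RIGHT_TO_LEFT);
            JFrame frame = new JFrame("Payments");
            frame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
            frame.add(new JScrollPane(table));
            frame.setSize(400, 150);
            frame.setVisible(true);
        });
    }
}
```

    With this approach the values come straight from your String variables, so there is no HTML to parse or patch.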


    A cleaned-up version of your HTML might look this way:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
    
    <html>
        <head>
            <meta http-equiv = "Content-Type" content = "text/html; charset=UTF-8">
        </head>
        <body>
            <p> السلام عليكم ورحمة الله وبركاته ,,, </p>
    
            <p>  الأخوة الكرام نفيدكم بتفاصيل المدفوعات لشهر  </p>
            <p>XX/143X </p>
            <p>  هـ كما هو موضح ادناه  </p>
    
            <table align="right"  border="1" width="600" cellpadding="5" cellspacing="0">
                <tr bgcolor="cccccc" align="center">
                    <td colspan="3">تفاصيل مدفوعات بنك الرياض <img src="../images/riyadh.gif" width="65" height="15"/></td>
                </tr>
                <tr align="center">
                    <td></td>
                    <td id="cell1">0,000,000.00</td>
                    <td align="right">معاشات</td>
                </tr>
                <tr align="center">
                    <td></td>
                    <td id="cell2">0,000,000.00</td>
                    <td align="right">أخطار</td>
                </tr>
                <tr align="center">
                    <td bgcolor="cccccc">المجموع</td>
                    <td bgcolor="cccccc">0,000,000.00 ريال سعودي</td>
                    <td></td>
                </tr>
            </table>
            <p> شاكرين لكم حسن تعاونكم ...... </p>
            <p> فريق العمليات بقسم الحاسب الآلي </p>
        </body>
    </html>
    

    Since I don't understand a word of the text, I cannot propose better formatting. First of all, <label> elements are meant for labelling form controls, not for general text. You had a sequence of three <label>s above the table where only one of them had an opening <label> tag, yet there were three closing </label> tags. I made them all into <p>; however, if you meant them to be headers for table columns, you should have used a table row with three <th> elements.

    With this structure of the HTML, the <table> element would be at index 4 within <body>, i.e. you should change the line

    Element table = body.getElement(1);
    

    to

    Element table = body.getElement(4);
    

    The indexes 0–3 are now <p> elements.


    As a side note, instead of editing the HTML after it has been loaded into JEditorPane (and thus into the text model of HTMLDocument), you could edit your HTML before loading it, so that it already contains the correct data in the <td> elements. Since the JEditorPane.setPage method accepts only a URL, you would use read instead, which accepts an InputStream and an Object which describes the model (it should be an instance of HTMLDocument in your case). StringBufferInputStream looks like the obvious candidate for this task, but it is deprecated because it does not convert characters to bytes properly. With this in mind, you would rather encode the string yourself with String.getBytes("UTF-8") and wrap the result in a ByteArrayInputStream. Your HTML declares the encoding, and JEditorPane would respect it when reading.
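
    A minimal sketch of this idea is shown below. It assumes the HTML has been read into a String and contains a placeholder token where the number belongs; the token convention (e.g. "@total@"), the class name, and the helper names are all hypothetical. As described above, it relies on the HTML's own charset declaration being honoured by read.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.JEditorPane;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class PrefilledHtmlSketch {

    /** Replaces a placeholder token (hypothetical convention) with the real value. */
    static String fill(String template, String token, String value) {
        return template.replace(token, value);
    }

    /** Encodes the HTML as UTF-8 bytes and wraps them in a stream. */
    static InputStream asUtf8Stream(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }

    /** Loads the already filled-in HTML into the pane through JEditorPane.read. */
    static void show(JEditorPane pane, String html) throws IOException {
        HTMLEditorKit kit = new HTMLEditorKit();
        pane.setEditorKit(kit);
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Passing an HTMLDocument as the description makes it the pane's model
        pane.read(asUtf8Stream(html), doc);
    }
}
```

    A call could then look like show(myFormattedOuput, fill(template, "@total@", MainController.getCurrentTotalPayment())), where template holds the text of htmlFormatTable.html with the token placed in the relevant <td>.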