I'm trying to "normalise" a DefaultStyledDocument subclass, in the sense of org.w3c.dom.Node.normalize(): that is, merge adjoining text "leaves". In the case of a DefaultStyledDocument, two adjacent leaves are candidates for merging if they have the same attributes (or none).
Below is a simple version (it doesn't check the actual attributes: it covers the use case where you have either plain text or text with one possible mark-up style).
    def normalise( self ):
        # recursive function:
        def normalise_structure( el, depth = 0 ):
            indent = ' ' * depth
            start = el.startOffset
            print( '%s# el %s |%s|' % ( indent, el, self.getText( start, el.endOffset - start )))
            prev_attr_set = None
            for i in range( el.elementCount ):
                subelement = el.getElement( i )
                normalise_structure( subelement, depth + 1 )
                if subelement.leaf:
                    curr_attr_set = subelement.attributes
                    print( '%s # this is a leaf, attribs %s' % ( indent, curr_attr_set, ))
                    # this is a simple version: only works if there is only one possible attribute
                    if prev_attr_set and curr_attr_set and prev_attr_set.attributeCount == curr_attr_set.attributeCount:
                        print( '%s # %s leaf needs to be merged with previous leaf' % (
                            indent, 'marked-up' if prev_attr_set.attributeCount == 1 else 'plain'))
                        attr_set = prev_attr_set.getElement( 0 ) if prev_attr_set.attributeCount else None
                        prev_subelement = el.getElement( i - 1 )
                        prev_start = prev_subelement.startOffset
                        curr_end = subelement.endOffset
                        merged_element = javax.swing.text.AbstractDocument.LeafElement(
                            javax.swing.text.DefaultStyledDocument(), el, attr_set, prev_start, curr_end )
                        el.replace( prev_start, curr_end - prev_start, [ merged_element ] )
                    prev_attr_set = curr_attr_set
                else:
                    print( '%s # NOT a leaf...' % ( indent, ))
                    prev_attr_set = None
        for self_el in self.rootElements:
            normalise_structure( self_el )
When I run this I get this error:
    Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException
        at java.lang.System.arraycopy(Native Method)
        at javax.swing.text.AbstractDocument$BranchElement.replace(AbstractDocument.java:2290)
I hasten to add that, before trying javax.swing.text.DefaultStyledDocument() as param 1 in the LeafElement constructor, I tried "self" (i.e. the DefaultStyledDocument on which normalise, on line one, is invoked): same error.
Yes, it is possible to do. AbstractDocument.BranchElement.replace() looks like this:
public void replace(int offset, int length, Element[] elems)
...
Turns out that "offset" and "length" here refer to the sub-elements of the BranchElement (typically LeafElements), not to the offset and length of the underlying text in the StyledDocument.
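To make the index-versus-offset distinction concrete, here is a toy model in plain Python. It is not Swing's implementation: Leaf, Branch and merge_adjacent are hypothetical names invented for this sketch, and the "attributes" are reduced to a single marked_up flag, as in the simplified use case above. It only illustrates that replace() addresses child elements by index and count.

```python
# Toy model only: Leaf, Branch and merge_adjacent are hypothetical names,
# not part of javax.swing.text.

class Leaf(object):
    def __init__(self, start, end, marked_up=False):
        self.start = start          # text offset where the leaf begins
        self.end = end              # text offset where the leaf ends
        self.marked_up = marked_up  # stand-in for "has the one attribute"


class Branch(object):
    def __init__(self, children):
        self.children = list(children)

    def replace(self, index, count, new_children):
        # index and count refer to child elements, NOT to text offsets.
        if index < 0 or count < 0 or index + count > len(self.children):
            raise IndexError('child range %d..%d is outside 0..%d'
                             % (index, index + count, len(self.children)))
        self.children[index:index + count] = new_children


def merge_adjacent(branch):
    # A while loop is used because each merge shrinks the child list.
    i = 1
    while i < len(branch.children):
        prev, curr = branch.children[i - 1], branch.children[i]
        if prev.marked_up == curr.marked_up:
            merged = Leaf(prev.start, curr.end, prev.marked_up)
            # Replace the 2 children starting at child index i - 1.
            branch.replace(i - 1, 2, [merged])
        else:
            i += 1
```

Under this reading, the call in the question would presumably become el.replace( i - 1, 2, [ merged_element ] ), i.e. "replace the two children starting at child index i - 1". That line is an untested assumption derived from the explanation above, and the surrounding range( el.elementCount ) loop would also need adjusting, since the number of children changes after each merge.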
Somebody cleverer than me would have got this earlier. The API documentation (Java 7) could make it a bit clearer...