Search code examples
javaxmlhibernatesaxstax

StAX - reading base64 string from xml into db


I'm using StAX to read my file, which has some Base64 data in it, and saving it into the db using Hibernate.

XML:

<root>
  <base64>lololencoded12</base64>
  <base64>encodedlolos32</base64>
  ...............................
</root>

Code to read and save:

xmlif = (XMLInputFactory2) XMLInputFactory2.newInstance();
xmlif.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
xmlif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
xmlif.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE); 
xmlif.configureForLowMemUsage();

List<Entity> entities = new ArrayList();
FileInputStream fis = new FileInputStream(filename);
XMLStreamReader2 xmlr = (XMLStreamReader2) xmlif.createXMLStreamReader(filename, fis);
int eventType = xmlr.getEventType();
String curElement = "";
while (xmlr.hasNext()) {
     eventType = xmlr.next();
     switch (eventType) {
         case XMLEvent.START_ELEMENT:
             curElement=xmlr.getName().toString();
             if ("base64".equals(curElement)) {
                 Entity entity = new Entity();
                 entity.setBase64(xmlr.getElementText().getBytes());
                 session.save(entity);
                 session.flush();
             }
             break;
     }
 }
 iterator itr = entities.iterator();
 while (itr.hasNext()) {
     Entity e = (Entity)itr.next();                    
     session.saveOrUpdate(e);
 }

This approach gobbles memory in amount that is 6-9 times size of my xml. How can i improve this?

EDIT

If i comment out entity.setBase64() everything is fine. When saving byte[] to db memory usage goes bonkers. Why?

EDIT Entity getters and setters:

    //for me
    public byte[] getBase64() {
        return base64;
    }

    public void setBase64(byte[] base64) {
        this.base64= base64;
    }


    //for hibernate
    public Blob getBase64Blob() {
        if (this.base64!=null) {
            LobCreator lobok =Hibernate.getLobCreator(MainFrame.sessionFactory.getCurrentSession());
            return lobok.createBlob(base64);
        } else {
            return null;
        }
    }

    public void setBase64Blob(Blob dataBlob) {
        if (dataBlob!=null) {
            this.base64= toByteArray(dataBlob);
        }
    }   

    //utilities methods from blob to byte array 
    private byte[] toByteArray(Blob fromBlob) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            return toByteArrayImpl(fromBlob, baos);
        } catch (SQLException e) {
            throw new RuntimeException(e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        } finally {
            if (baos != null) {
                try {
                    baos.close();
                } catch (IOException ex) {
                }
            }
        }
    }

    private byte[] toByteArrayImpl(Blob fromBlob, ByteArrayOutputStream baos)
            throws SQLException, IOException {
        byte[] buf = new byte[4000];
        InputStream is = fromBlob.getBinaryStream();
        try {
            for (;;) {
                int dataSize = is.read(buf);
                if (dataSize == -1)
                    break;
                baos.write(buf, 0, dataSize);
            }
        } finally {
            if (is != null) {
                try {
                    is.close();
                } catch (IOException ex) {
                }
            }
        }
        return baos.toByteArray();
    }

EDIT xmlr.getElementText().getBytes() causes a lot of memory usage for some reason.


Solution

  • How often are you flushing out your entities?

    I think you're building too many entities and not flushing your session frequently enough which results in all those objects being created to be kept in the current session.

    May want to check that.

    Edit

    The comments below indicated that the submitter wanted a way to insert binary blob data directly in to the database.

    Instead of doing this with hibernate, this can be achieved with just JDBC.

    java.sql.Connection conn = ...
    java.sql.PreparedStatement pstmt= conn.prepareStatement("insert into ENTITY_TABLE (BASE64) VALUES (?)");
    InputStream is= ... // byte data
    pstmt.setBinaryStream(1, is);
    int numUpdated= pstmt.executeUpdate();
    

    Now be advised, this is REAL rough and tumble. This is making the assumption that the ENTITY_TABLE is using a database generated identifier for the row and that columns other than BASE64 are allowed to have nulls or have reasonable defaults. The executeUpdate will run the insert statement taking the value of is as the value of the blob data.

    I hope this gets you closer to your solution.