I have a database and am trying to find a safe way to export its data into an XML file. Reading the vertices is not the problem, but when reading the edges the app runs for a while and then fails with a heap space exception.
g.V().has(Timestamp)
  .order().by(Timestamp, Order.asc)
  .range(startIndex, endIndex)
  .outE()
Data to be taken: e.g. label, ID, properties.
I use the timestamp of the vertices as a reference for sorting to avoid capturing duplicates.
But because some vertices account for a very large share of all out-edges, the program runs out of heap. My question is: can I use the alphanumeric ID (UUID) of the edges to sort them safely, or is there another or better way to reach the goal?
Something like this:
g.E().order().by(T.id, Order.asc).range(startIndex, endIndex)
T.id would be a fine property to order the edges by. However, the problem is not related to the property chosen; it is related to the sheer number of edges being exported. Neptune, like other databases, has to retrieve all the edges before it can order them, and retrieving all these edges is why you are running out of heap.

To solve this problem you can either increase the instance size to get additional memory for the query, or export the data differently. The neptune-export utility at https://github.com/awslabs/amazon-neptune-tools/tree/master/neptune-export shows the current recommended best practice. This utility exports the entire graph as CSV. In your use case you may be able to either run this utility and then transform the CSV into an XML document, or modify its code to export XML directly.
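As a rough illustration of the "transform the CSV into an XML document" route, here is a minimal Python sketch that streams an edge CSV into XML one row at a time, so memory use stays flat regardless of how many edges there are. It assumes the edge file uses the `~id`, `~label`, `~from`, `~to` headers of Neptune's Gremlin CSV format and that property columns are named like `name:type`; check this against the files you actually get. The function name `edges_csv_to_xml` and the XML element names are made up for this example.

```python
import csv
import io
from xml.sax.saxutils import XMLGenerator

# Assumed system columns of the edge CSV, mapped to XML attribute names.
SYSTEM_COLUMNS = {"~id": "id", "~label": "label", "~from": "from", "~to": "to"}


def edges_csv_to_xml(csv_file, xml_file):
    """Stream edge rows from csv_file into xml_file; return the edge count."""
    reader = csv.DictReader(csv_file)
    writer = XMLGenerator(xml_file, encoding="utf-8")
    writer.startDocument()
    writer.startElement("edges", {})
    count = 0
    for row in reader:
        attrs = {name: row.get(col) or "" for col, name in SYSTEM_COLUMNS.items()}
        writer.startElement("edge", attrs)
        for column, value in row.items():
            # Skip system columns, empty cells, and malformed extra cells.
            if column is None or column in SYSTEM_COLUMNS or not value:
                continue
            # Assumed header shape "name:type"; keep only the property name.
            prop_name = column.split(":", 1)[0]
            writer.startElement("property", {"name": prop_name})
            writer.characters(value)
            writer.endElement("property")
        writer.endElement("edge")
        count += 1
    writer.endElement("edges")
    writer.endDocument()
    return count


# Small in-memory demo; for real data, pass open file objects instead.
sample = io.StringIO("~id,~label,~from,~to,weight:double\ne1,knows,v1,v2,0.5\n")
out = io.StringIO()
edges_csv_to_xml(sample, out)
print(out.getvalue())
```

Because each row is written as soon as it is read, nothing has to be sorted or held in memory, which is the part of the original query that exhausted the heap.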