janusgraph, google-cloud-bigtable

Migrate Graphs in JanusGraph Between Instances


My team is looking at migrating JanusGraph data between instances (we run JanusGraph on top of Google Cloud Bigtable), using two separate approaches:

  1. Export the graph as a GraphML file and import it into the other instance (sketched below)
  2. Export the underlying Bigtable table and import it into the table underlying the other instance
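
For reference, approach 1 is essentially TinkerPop's io() step run against each instance. A minimal sketch in Java, assuming the graph is opened embedded against each backend (the two .properties file names and the export path below are placeholders for your own JanusGraph-on-Bigtable configuration):

```java
import org.apache.tinkerpop.gremlin.process.traversal.IO;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class GraphMlMigration {
    public static void main(String[] args) {
        // Placeholder config for the source instance (JanusGraph over Bigtable).
        JanusGraph source = JanusGraphFactory.open("janusgraph-bigtable-source.properties");
        GraphTraversalSource g = source.traversal();
        // Write the whole graph out as GraphML.
        g.io("/tmp/graph-export.graphml").with(IO.writer, IO.graphml).write().iterate();
        source.close();

        // Placeholder config for the target instance; read the same file back in.
        JanusGraph target = JanusGraphFactory.open("janusgraph-bigtable-target.properties");
        GraphTraversalSource h = target.traversal();
        h.io("/tmp/graph-export.graphml").with(IO.reader, IO.graphml).read().iterate();
        target.tx().commit();
        target.close();
    }
}
```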

However, for each of the approaches, we are facing issues:

  1. Our graph is quite large, and during the export we kept hitting java.io.IOException: Connection reset by peer, even after raising the Gremlin Server timeout beyond 20 minutes (see the timeout sketch after this list)
  2. We tried exporting the Bigtable table via Cloud Dataflow in three separate formats (as advised here), each of which failed in a different way:
    • Avro format: after exporting the Avro files, re-importing them into the new table fails with Error message from worker: java.io.IOException: At least 8 errors occurred writing to Bigtable. First 8 errors: Error mutating row <binary row key> with mutations [set cell ....] .... Caused by: java.lang.NullPointerException. Since JanusGraph stores binary data in Bigtable, perhaps the Dataflow job is unable to export the Avro files properly
    • SequenceFile format: when re-importing these files, we get Error message from worker: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 310 actions: StatusRuntimeException: 310 times, servers with issues: batch-bigtable.googleapis.com
    • Parquet format: this proved to be the most promising. The import job mostly completed (except for an error during the downscaling of the Dataflow workers: Root cause: The worker lost contact with the service.), and after re-importing into the target table the data is generally intact. However, the indexes appear to be "cranky" after the import (e.g. querying a particular node with a has() filter on an indexed property completes quickly but returns no results)
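
On issue 1: when the export is driven through a remote Gremlin Server connection, the per-request evaluation timeout can also be raised from the client side, in addition to the server-side setting. A rough sketch of where those knobs live (host, port, file path and the two-hour value are placeholders, and this is not a guaranteed fix for the connection reset):

```java
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection;
import org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.IO;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;

public class RemoteGraphMlExport {
    public static void main(String[] args) {
        // Placeholder host/port for the Gremlin Server fronting JanusGraph.
        Cluster cluster = Cluster.build("gremlin-server-host")
                .port(8182)
                .maxContentLength(65536000)   // allow large response frames
                .create();

        GraphTraversalSource g = AnonymousTraversalSource.traversal()
                .withRemote(DriverRemoteConnection.using(cluster, "g"));

        // "evaluationTimeout" is the per-request timeout option (in milliseconds)
        // understood by Gremlin Server 3.4+; here it is set to two hours.
        // Note: with a remote connection, io() writes the file on the server.
        g.with("evaluationTimeout", 2 * 60 * 60 * 1000L)
         .io("/data/graph-export.graphml")
         .with(IO.writer, IO.graphml)
         .write().iterate();

        cluster.close();
    }
}
```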

Would appreciate any opinions/inputs on the above issues, thanks!


Solution

  • The problem here appears to be that Dataflow fails mutation requests with more than 100k mutations per row (due to Bigtable's limitation). However, the newer version of the ParquetToBigtable template provided by Google has a parameter called "splitLargeRows", which splits up large rows so that the number of mutations stays <= 100k.
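
A rough illustration of launching the Google-provided Parquet-to-Bigtable template with that flag via the gcloud CLI (project, instance, table, bucket path and region are placeholders; the template path and parameter names should be double-checked against the current template documentation):

```
gcloud dataflow jobs run parquet-to-bigtable-import \
    --gcs-location gs://dataflow-templates/latest/GCS_Parquet_to_Cloud_Bigtable \
    --region us-central1 \
    --parameters \
bigtableProjectId=my-project,\
bigtableInstanceId=my-instance,\
bigtableTableId=my-table,\
inputFilePattern=gs://my-bucket/parquet-export/*.parquet,\
splitLargeRows=true
```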