I want to load lots of data into Orient DB with multiple threads. I'm using OrientDB 2.2.20 and Java 1.8.0_131 to run below sample test client. But when I run this client with 5 threads and 10000 samples then the client's CPU usage goes over 100% and the process becomes almost dead.
Actually I wanted to use graph APIs to create huge number of vertices and edges between them. But I read in some post that for massive inserts use document API and set the in & out pointers using doc APIs. Hence tried this program.
Could someone point what is wrong in the code?
public OrientDBTestClient(){
db = new ODatabaseDocumentTx(url).open(userName, password);
}
public static void main(String[] args) throws Exception{
int threadCnt = Integer.parseInt(args[0]);
OrientDBTestClient client = new OrientDBTestClient();
try {
db.declareIntent(new OIntentMassiveInsert());
Thread[] threads = new Thread[threadCnt];
for (int i = 0; i < threadCnt; i++) {
Thread loadStatsThread = new Thread(client.new LoadTask(Integer.parseInt(args[1])));
loadStatsThread.setName("LoadTask" + (i + 1));
loadStatsThread.start();
threads[i] = loadStatsThread;
}
}
catch(Exception e){
e.printStackTrace();
}
}
private class LoadTask implements Runnable{
public int count = 0;
public LoadTask(int count){
this.count = count;
}
public void run(){
long start = System.currentTimeMillis();
try{
db.activateOnCurrentThread();
for(int i = 0; i < count; ++ i){
storeStatsInDB(i +"");
}
}
catch(Exception e){
log.println("Error in LoadTask : " + e.getMessage());
e.printStackTrace();
}
finally {
db.commit();
System.out.println(Thread.currentThread().getName() + " loaded: " + count + " services in: " + (System.currentTimeMillis() - start) + "ms");
}
}
}
public void storeStatsInDB(String id) throws Exception{
try{
long start = System.currentTimeMillis();
ODocument doc = db.newInstance();
doc.reset();
doc.setClassName("ServiceStatistics");
doc.field("serviceID", id);
doc.field("name", "Service=" + id);
doc.save();
}
catch(Exception e){
log.println("Exception :" + e.getMessage());
e.printStackTrace();
}
}
db instances aren't sharable between threads. You have two choices:
The following example is extracted from internal tests:
pool = new OPartitionedDatabasePool("remote:localshot/test", "admin", "admin");
Runnable acquirer = () -> {
ODatabaseDocumentTx db = pool.acquire();
try {
List<ODocument> res = db.query(new OSQLSynchQuery<>("SELECT * FROM OUser"));
} finally {
db.close();
}
};
//spawn 20 threads
List<CompletableFuture<Void>> futures = IntStream.range(0, 19).boxed().map(i -> CompletableFuture.runAsync(acquirer))
.collect(Collectors.toList());
futures.forEach(cf -> cf.join());`