wso2 analytics wso2-api-manager wso2-das

WSO2 DAS - High CPU usage with oracleDB 11g

I am configuring DAS 3.1.0 + APIM 2.0.0 with Oracle database 11g (relational database).

When I enable DAS analysis statistics to integrate with the API Manager, almost everything works fine except for the part where the DAS dramatically raises the CPU consumption level of the machine with the database.

I noticed that he always runs this query:

MERGE INTO API_REQ_USER_BROW_SUMMARY dest USING( SELECT :1 api, :2
version, :3 apiPublisher, :4 tenantDomain, :5 total_request_count, :6 
year, :7 month, :8 day, :9 requestTime, :10 os, :11 browser FROM dual) 
src ON(dest.api=src.api AND dest.version=src.version AND 
dest.apiPublisher=src.apiPublisher AND dest.year=src.year AND 
dest.month=src.month AND dest.day=src.day AND dest.os=src.os AND 
dest.browser=src.browser AND dest.tenantDomain=src.tenantDomain)WHEN NOT 
MATCHED THEN INSERT(api, version, apiPublisher, tenantDomain, 
total_request_count, year, month, day, requestTime, os, browser) 
VALUES(src.api, src.version, src.apiPublisher, src.tenantDomain, 
src.total_request_count, src.year, src.month, src.day, src.requestTime, 
src.os, src.browser) WHEN MATCHED THEN UPDATE SET 
dest.total_request_count=src.total_request_count, 
dest.requestTime=src.requestTime

I would like to know if there is a way to optimize so that the cpu of the machine on which the data dase is not so beaten up causing a performance drop.

Please, has anyone had this difficulty yet and could you help me?

Solution

What happens in the above query is, the records are inserted into database if there are no records with the same primarykey values in the database, or if there are some records with the same primarykeys then we update the existing records.

The Table "API_REQ_USER_BROW_SUMMARY" has two columns "OS" and "browser" which are a part of the primary keys of that table. It is observed that when the NULL values are inserted into "OS" and "browser" the analytics server and the database hang.

What you can do is, (You might need to shutdown the analytics server and restart the db server before following steps)

Go to {Analytics_server}/repository/deployment/server/carbonapps then open org_wso2_carbon_analytics_apim-1.0.0.car as a zip file.
Go to folder APIM_USER_AGENT_STATS_1.0.0
open APIM_USER_AGENT_STATS.xml
At the end of the script (before tag), you will see a sparkSql query like below.

INSERT INTO TABLE APIUserBrowserData SELECT api,version,apiPublisher,tenantDomain,total_request_count,year,month,day,requestTime,os,browser FROM API_REQUEST_USER_BROWSER_SUMMARY_FINAL;

Replace that line with the following.

INSERT INTO TABLE APIUserBrowserData SELECT api,version,apiPublisher,tenantDomain,total_request_count,year,month,day,requestTime, if(os is null, "UNKNOWN",os), if(browser is null, "UNKNOWN", browser) FROM API_REQUEST_USER_BROWSER_SUMMARY_FINAL;

This will prevent Spark inserting NULL values to "OS" and "browser" columns into table "API_REQ_USER_BROW_SUMMARY". Please check if the CPU consumption is high after doing the above changes.

Edit: @artCampos, I cannot comment, so I am editing my original answer to reply to your comment. There will not be any side effect. But, Note that, We are replacing the NULL values with string value "UNKOWN". I dont think that will be a problem in this case. You dont need to discard any of the existing data. Please also note that, In any case, if the NULL values are inserted into DB primaryKeys, it will fail in most of the RDBMS.