hbase, geomesa

GeoMesa IN Query Performance


I am using GeoMesa as a spatio-temporal database. For one of my use cases, I need to run feature-ID-based queries (100 IDs per batch). I am seeing high latency with the IN query. Is there any way to improve the query performance?

Below is the AuditLogger output for the query:

30 Nov 2019 09:04:05,481 [DEBUG] c7e42e76-074f-47e7-84d0-22b682912f6e (Coral Endpoint : 182) org.locationtech.geomesa.utils.audit.AuditLogger$:
{
    "storeType": "hbase",
    "typeName": "OSMNodes",
    "date": 1575104645481,
    "user": "unknown",
    "filter": "IN ('FvUghvjl0iiMYMBsmQ_MoA#56144647','pY2sw3HeKtzeZJcu8u6ChQ#56144672','b7n6IdGTi1Kr_gCPMGCNGw#56144654','0O8cUGIwM5bREyJdr-e-3g#56854415','cWEPdm0iH82n9PHJjw9QgA#56144645','ls74D9mftAptPbT2-MVfdw#56854426','p9PI-J_sw78mI6bdre-DkQ#55772582','oypt0UG8kIc0Z5pZRRSXlg#56144681','0glVM6b1J9fcMu5QzxHBZg#56144648','EdF-Gs7RJNm-sd0NUhH_ew#56854451','73VBCwEWSJv0iXzhca27ZA#56854424','FKEIKipQ5jlTslj-cFqWnQ#57226766','tJj1QESy98YrFocAUsmAig#56854455','1fZLVqkokRrw1iHtKsqS7g#56854416','ynIlytnf75M80GEQmcq5xA#55772578','z3mJk8_OIetWvSfmQDe7Uw#57226757','wif0yWVwEDkfRJs6b1gkbA#56854420','hBNrWVxxWaUHbGpHX4-Jow#56854414','HMBhiho2059nBYoAlgR1YQ#56854454','nKEr_m7Vygdm0YAzHcIbEQ#57226786','nRK4k8BojfNJCHvkE0mpNQ#56854447','Eq7RmxvhEj5nnyUswNtXcQ#56144652','t23othKYQ5RBB_AwJKA4Mg#57501085','FYLA_y7cbwoQ9tOGREKjPQ#56854446','dWJA26NmvOW4R8dk_Q3oIg#55772579','IkV8h3GGrnCBBrxJ6bKXGA#56854449','XU8_WlQJsFSOehbECMuoSw#55772547','IiMN_m4zk222w4QB2vKIYg#56854453','dkJ3xvWVv4egVjRJVFDRPw#56854448','2X5UTSMeqy4PmicFGq6SOg#57226759','Wn6R1zwZlDntM030OYWAEw#56854417','eRMt9-jD6jfpfl3pWPU8-g#56854444','ENz8sxeSn_Axqno5TyvJlg#56144670','sOpAXO0OF6kpUYsCQgRk0w#57226792','FKXWV6W7J7u0LBheoYp9yA#56854452','p2fA2v4VRpr7rUKNwbkbtA#56144651','ivmsdZTFLoYoe8oMRDFRyg#56144677','a4WOED2Qdxdn_PSjszf7fQ#55772580','vYUTA-qnKSXHbg-DrBIFAw#56854422','u_KAFnJEgWbO15Vi_q68Kw#56144673','1Pg7ZOZwXdI7emfLC1AKIw#56854418','bg1yacaoD8Yd9sBZwSVkEA#56144657','s7XKcOn0oJheX7A6U1yPtw#55772545','JZgGfOHUbo0EHvXYBHjvww#56144643','pMbg6jyhLPLzA8rIYIk_sQ#55772551','YXgmVdu0vGYm8OmgGtDLvw#56144658','KiQlNIbMYqDbwM3qSTDXmA#55772544','HGb-AoEXrUU_0Kg4yu-qGw#56854419','fjMMZuleRhdfxSVcgQ7sZw#56144649','srPjx2xS_6NksOpGU8QWaA#56854427','4eLPlwOfe8qZ6CuWnKsWxg#56854413','qKfEGq-ZEwawh1RXxiU_XQ#57501084','0yzWN5nnP7TtuxVGjaxZkQ#56144656','uyT9DGm9AQdp0uemJhu_mw#55772555','MntPnTdv18HeQx2ehn9UAw#56144641','yKGpHSg_4lxMo1VwJ2EBuQ#56144642','DfKB6uXc-nlHUuCDQbm4oQ#56854429','kYy3W_Wa68xLYptUHUrAgw#55772543','FB56jakU
AMX7J5JmnH_LMA#55772572','vc_AEf3AkHui4HV-R8cGlw#56144674','sQR0I2x5GNDQlGbzpwr00Q#56854456','bDGtKXBfCkVrq36Ji1SHmA#56854421','jo4yLNxLxW5y0lNb2IkOlQ#55772581','spEzC4ktdzzRnmjAEXLvRQ#57226768','fDmL3071ktD43MCMipT7WA#56144678','n52jno8zl0zsNbF3SuaH3A#57226788','6VGVgJE9UAmpRFKVjrDI_w#56854425','NhfrsSFm3mJQ4dKvGEj-FA#56144640','AFs_HHaLGS5yi0zgNEA1MQ#57226789','OPQxBKQrEiF-GA_-RmKxrQ#56854428','hM0M4H8JED-3IOR6ZyH32w#57226754','mrfglviJkne7p1oMJm8E7Q#56144679','jcwh12ux4ya_8KCMo6jIIQ#56854450','xTAqz6vj3zOodf76J6jvEQ#55772577','P5H_eqOTr1mZJgVLtzhIaw#55772549','PUfqg5qkObW1xZUtBOXK1Q#55772553','b4bgLu84Wd4-s744CuEKWw#56144644','I2giChiupW1WEVTx6n_-Jw#57226755','Ofr5hjw2-g_gc-UVzdNt3Q#56854445','bTuTODwuDGlHvD2M6WWuQg#56144646','fWut04swSSRQtJpqU3xnng#57501087','5vvznB3w2lNsB1z9Pq5o_Q#57226785','2_5O_a3FG5KvUGcr_HwRCA#56144650','FTRdnjQxj4tv-MB9mF_8dg#57501080','fByEYKqCAzHX9FvirTIB2g#57226791','NcXAPORmUL7RCwaCAVSraA#57226794','NqdWByyC16Yn1aEpRLGqnA#56144659','0kyUz8W8ORPyLzFjklE7HA#57226765','9BilERUylnD3Ek21WRItYQ#56144653','EaaAlAuPQSGytGrACFGnHg#55772583','AXUiHq1BTJuIt_Z34EIZpA#56144676','_IHdF_AV3GWcAVn1dk_xYA#55772584','bh3gPi7cmk9Qc7vOUnKCgA#57226763','97wjsuDzOZfJ-_yI9r0Xng#57226795','q9mhtt63Abi-dHWdqWGd1g#57501088','pI9Lv3Ms45TsEV1QWtPuKg#57226783','2Ts7hrzoQCimlCjjbAINig#57501083','u9y3jjiNKj1QqPf23Rz-xw#57501081','OS6UeBPsm2RqBxxr3O786w#56854423','Kck-bicWJJXW_9sZBfZRdQ#56144655')",
    "hints": "RETURN_SFT=*geometry:Point:srid=4326,ingestionTimestamp:Timestamp,nextTimestamp:Timestamp,serializerVersion:String,featurePayload:String",
    "planTime": 1,
    "scanTime": 78584,
    "hits": 99,
    "deleted": false
}

Solution

  • Double-check the explain logging to confirm that the query is using the ID index. Even so, because of how HBase works, many single-row scans scattered randomly across the table are going to be slow. If the features you are updating share any known characteristics, you can use a custom feature ID generator to assign feature IDs so that an update hits a group of IDs that sit close together in the index; that may improve performance somewhat.
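
To illustrate why ID locality matters, here is a minimal sketch in plain Python. It does not use the GeoMesa or HBase APIs; it only models a table whose rows are sorted by key, and counts how many separate row ranges a 100-ID batch touches. The ID schemes (`random_id`, `clustered_id`), the group/sequence layout, and the table sizes are all hypothetical and chosen purely for illustration.

```python
import bisect
import random


def random_id(rng):
    """Hypothetical scheme: a random 128-bit hex ID with no locality."""
    return "%032x" % rng.getrandbits(128)


def clustered_id(group, seq):
    """Hypothetical scheme: zero-padded group prefix, then a sequence number.

    Zero padding keeps lexicographic order equal to numeric order, so all
    IDs of one group sort next to each other.
    """
    return f"{group:06d}#{seq:06d}"


def count_runs(sorted_keys, wanted):
    """Count the runs of adjacent rows that `wanted` occupies in `sorted_keys`."""
    positions = sorted(bisect.bisect_left(sorted_keys, k) for k in wanted)
    if not positions:
        return 0
    runs = 1
    for prev, cur in zip(positions, positions[1:]):
        if cur != prev + 1:  # a gap means a new, separate range
            runs += 1
    return runs


rng = random.Random(0)  # fixed seed so the sketch is repeatable
GROUPS, PER_GROUP = 50, 100  # 5000 features; one batch = one group of 100

random_table = {(g, s): random_id(rng)
                for g in range(GROUPS) for s in range(PER_GROUP)}
clustered_table = {(g, s): clustered_id(g, s)
                   for g in range(GROUPS) for s in range(PER_GROUP)}


def runs_for_group(table, group):
    """Row ranges touched when fetching every feature of one group."""
    keys = sorted(table.values())  # stands in for the sorted ID index
    wanted = [table[(group, s)] for s in range(PER_GROUP)]
    return count_runs(keys, wanted)


random_runs = runs_for_group(random_table, 7)
clustered_runs = runs_for_group(clustered_table, 7)
print("random IDs:   ", random_runs, "separate row ranges")
print("clustered IDs:", clustered_runs, "separate row range(s)")
```

With the clustered scheme the whole batch falls into a single contiguous range, whereas random IDs scatter it across nearly as many ranges as there are IDs. Whether a real deployment sees a comparable gain depends on how the ID index is laid out and on whether batches actually align with the chosen prefix.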