Search code examples
singlestore

MemSQL Weird Insert/Update Behaviour


We are using single node MemSQL and everything was working fine but when we are trying to move our MemSQL setup to use multi node the insert/update statements are behaving very weirdly

My table structures are like below , have removed many columns , to keep it short

CREATE /*!90618 REFERENCE*/ TABLE `fact_orderitem_hourly_release_update` 
(
  `order_id` int(11) NOT NULL DEFAULT '0',
  `customer_login` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `warehouse_id` int(11) DEFAULT NULL,
  `city` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `store_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`order_id`)
);

CREATE TABLE `fact_orderitem_hourly_scale` (
  `order_id` int(11) NOT NULL DEFAULT '0',
  `order_group_id` int(11) NOT NULL DEFAULT '0',
  `item_id` int(11) NOT NULL,
  `sku_id` int(11) NOT NULL DEFAULT '0',
  `sku_code` varchar(45) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `po_type` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `store_order_id` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `bi_last_modified_on` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00.000000',
  PRIMARY KEY (`item_id`,`sku_id`),
  /*!90618 SHARD */ KEY `sku_id` (`sku_id`),
  KEY `idx_fact_orderitem_hourly_lmd` (`bi_last_modified_on`),
  KEY `idx_fact_orderitem_hourly_ord` (`order_id`),
  KEY `idx_order_group_id` (`order_group_id`),
  KEY `idx_store_order_id` (`store_order_id`)
);

My Load Script :

mysql -h$LiveMemSQL_DB -u$LiveMemSQL_USER --password=$LiveMemSQL_PASS -P$LiveMemSQL_PORT --verbose reports_and_summary < /home/titan/brand_catalog/upsert_memsql_orl_update.sql

Contents of .SQL File :

--start of .sql file
    TRUNCATE TABLE reports_and_summary.fact_orderitem_hourly_release_update;

#Load data into staging

LOAD DATA LOCAL INFILE '/myntra/redshift/delta_files/live_scale_order_release_upd.txt' INTO TABLE reports_and_summary.fact_orderitem_hourly_release_update LINES TERMINATED BY '\n';


#Insert/Update statement  
INSERT INTO reports_and_summary.fact_orderitem_hourly_scale
(
item_id,
sku_id,
customer_login,
order_status,
is_realised,
is_shipped,
shipping_charge,
gift_charge,
warehouse_id,
city,
store_id
)
select
fo.item_id,
fo.sku_id,
fr.customer_login,
fr.order_status,
fr.is_realised,
fr.is_shipped,
fr.shipping_charge,
fr.gift_charge,
fr.warehouse_id,
fr.city,
fr.store_id
from fact_orderitem_hourly_release_update fr
join fact_orderitem_hourly_scale fo
on fr.order_id=fo.order_id
ON duplicate key update
customer_login=values(customer_login),
order_status=values(order_status),
is_realised=values(is_realised),
is_shipped=values(is_shipped),
shipping_charge=values(shipping_charge),
gift_charge=values(gift_charge),
warehouse_id=values(warehouse_id),
city=values(city),
store_id=values(store_id);

--End .sql file

When I trigger the above .sql through mysql command line client , it works sometimes and it doesn't many of times , and some times if I execute the same .sql file continuously 5-10 times , the updates will get effected in one of those runs , and sometimes say for example if there are 3 records with order_id 101 and status SHIPPED and we got an update in merge table say the order status has been changed to DELIVERED , ideally status of all 3 orders should be changed to DELIVERED , but only one or 2 of the rows associated with an order are getting updated but if I execute the same .sql file content through MySQLWorkbench it works perfectly fine , I may sound stupid , but this is what is happening and I am struggling from last 2 days with this weird behavior

Please find the below screen cast , where I captured this behaviour https://www.youtube.com/watch?v=v2HN-n4V0MI&feature=youtu.be


Solution

  • Your staging table is a reference table, and writes to reference tables are replicated asynchronously to the cluster. This is why sometimes your updates work as expected and sometimes they don't.

    You can

    • wait for a bit after writing into the reference table
    • make the staging table non-reference