I am using MariaDB. I have a table that I create for every IoT device at the time of its first insertion, using a stored procedure. In case anyone wonders why I create a new table per device: the devices publish data every 5 seconds, and it is impossible for me to store all of it in a single table.
So, my table structure is like below:
CREATE TABLE IF NOT EXISTS `mqttpacket_',device_serial_number,'`(
`data_type_id` int(11) DEFAULT NULL,
`data_value` int(11) DEFAULT NULL,
`inserted_date` DATE DEFAULT NULL,
`inserted_time` TIME DEFAULT NULL,
FOREIGN KEY(data_type_id) REFERENCES datatypes(id),
INDEX `index_mqttpacket`(`data_type_id`,`inserted_date`)) ENGINE = INNODB;
I have a very long SELECT query like the one below to fetch the data between a selected type, date, and time.
SELECT mqttpacket_123.data_value, datatypes.data_name, datatypes.value_mult,
CONCAT(mqttpacket_123.inserted_date, ' ',
mqttpacket_123.inserted_time) AS 'inserted_date_time'
FROM mqttpacket_123
JOIN datatypes ON mqttpacket_123.data_type_id = datatypes.id
WHERE mqttpacket_123.data_type_id IN(1,2,3,4,5,6)
AND CASE WHEN mqttpacket_123.inserted_date = '2021-11-08'
THEN mqttpacket_123.inserted_time > '12:25:00'
WHEN mqttpacket_123.inserted_date = '2021-11-15'
THEN mqttpacket_123.inserted_time < '12:25:00'
ELSE (mqttpacket_123.inserted_date BETWEEN '2021-11-08'
AND '2021-11-15')
END;
and this returns around 500k records of the sample below:
| data_value | data_name | value_mult | inserted_date_time  |
|------------|-----------|------------|---------------------|
| 271        | name_1    | 0.1        | 2021-11-08 12:25:04 |
| 106        | name_2    | 0.1        | 2021-11-08 12:25:04 |
| 66         | name_3    | 0.1        | 2021-11-08 12:25:04 |
| 285        | name_4    | 0.1        | 2021-11-08 12:25:04 |
| 61         | name_5    | 0.1        | 2021-11-08 12:25:04 |
| 454        | name_6    | 0.1        | 2021-11-08 12:25:04 |
| 299        | name_7    | 0.1        | 2021-11-08 12:25:04 |
Affected rows: 0 Found rows: 395,332 Warnings: 0 Duration for 1 query: 0.734 sec. (+ 7.547 sec. network)
I keep only the last 2 weeks' data in my tables and clean up the previous data as I have a backup system.
However, loading the query result into a DataTable also takes ~30 seconds, which is about 4 times slower than it was with MySQL.
Do you have any suggestions to improve this performance?
PS. I call this query from C# with the code below, through a stored procedure, SP_RunQuery, which takes the query string and executes it as-is.
public DataTable CallStoredProcedureRunQuery(string QueryString)
{
    DataTable dt = new DataTable();
    try
    {
        using (var conn = new MySqlConnection(_connectionString))
        {
            conn.Open();
            using (var cmd = new MySqlCommand("SP_RunQuery", conn))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.Add("@query_string", MySqlDbType.VarChar).Value = QueryString;
                using (MySqlDataAdapter sda = new MySqlDataAdapter(cmd))
                {
                    sda.Fill(dt);
                }
            }
        }
    }
    catch (Exception ex)
    {
        IoTemplariLogger.tLogger.EXC("Call Stored Procedure for RunQuery failed.", ex);
    }
    return dt;
}
EDIT: My sensors push a single MQTT packet which contains ~50 different data values. There are 12 five-second intervals in a minute, so I receive ~600 rows per minute per device.
Data insertion is done asynchronously in a stored procedure: I push the JSON content along with the device_id, and I iterate over the JSON to parse the values and insert them into the table.
PS. The following code is just for clarification. It works fine.
/*Dynamic SQL -- if the device is registered to the system but has no table, create it.*/
SET create_table_query = CONCAT('CREATE TABLE IF NOT EXISTS `mqttpacket_',device_serial_number,'`(`data_type_id` int(11) DEFAULT NULL, `data_value` int(11) DEFAULT NULL,`inserted_date` DATE DEFAULT NULL, `inserted_time` TIME DEFAULT NULL, FOREIGN KEY(data_type_id) REFERENCES datatypes(id), INDEX `index_mqttpacket`(`data_type_id`,`inserted_date`)) ENGINE = InnoDB;');
PREPARE stmt FROM create_table_query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
/*Loop into coming value array. It is like: $.type_1,$.type_2,$.type_3, to iterate in the JSON. We reach each value like $.type_1*/
WHILE (LOCATE(',', value_array) > 0)
DO
SET arr_data_type_name = SUBSTRING_INDEX(value_array,',',1); /*pick first item of value array*/
SET value_array = SUBSTRING(value_array, LOCATE(',',value_array) + 1); /*remove picked first item from the value_array*/
SELECT JSON_EXTRACT(incoming_data, arr_data_type_name) INTO value_iteration; /*extract value of first item. $.type_1*/
SET arr_data_type_name := SUBSTRING_INDEX(arr_data_type_name, ".", -1); /*Remove the $ and the . to get pure data type name*/
/*Check whether the data type name already exists in the table; if not, insert it, then assign its id to lcl_data_type_id*/
IF (SELECT COUNT(id) FROM datatypes WHERE datatypes.data_name = arr_data_type_name) > 0 THEN
SELECT id INTO lcl_data_type_id FROM datatypes WHERE datatypes.data_name = arr_data_type_name LIMIT 1;
ELSE
SELECT devices.device_type_id INTO lcl_device_type FROM devices WHERE devices.id = lcl_device_id LIMIT 1;
INSERT INTO datatypes (datatypes.data_name,datatypes.description,datatypes.device_type_id,datatypes.value_mult ,datatypes.inserted_time) VALUES(arr_data_type_name,arr_data_type_name,lcl_device_type,0.1,NOW());
SELECT id INTO lcl_data_type_id FROM datatypes WHERE datatypes.data_name = arr_data_type_name LIMIT 1;
END IF;
/*Record which device has which datatypes, so that datatypes are not retrieved unnecessarily for the selected device*/
IF (SELECT COUNT(device_id) FROM devicedatatypes WHERE devicedatatypes.device_id = lcl_device_id AND devicedatatypes.datatype_id = lcl_data_type_id) < 1 THEN
INSERT INTO devicedatatypes (devicedatatypes.device_id, devicedatatypes.datatype_id) VALUES(lcl_device_id,lcl_data_type_id);
END IF;
SET lcl_insert_mqtt_query = CONCAT('INSERT INTO mqttpacket_',device_serial_number,'(data_type_id,data_value,inserted_date,inserted_time) VALUES(',lcl_data_type_id,',',value_iteration,',''',data_date,''',''',data_time,''');');
PREPARE stmt FROM lcl_insert_mqtt_query;
EXECUTE stmt;
SET affected_data_row_count = affected_data_row_count + 1;
END WHILE;
Here and here is some extra information about the server and database, in response to the comments.
I have an SSD on the server. Nothing else of importance runs there besides my dotnet application and the database.
It is usually better to have a single DATETIME column instead of splitting it into two (DATE and TIME) columns. That might simplify the WHERE clause.
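For an existing table, the two columns can be merged in place. A minimal sketch, assuming one of the per-device tables such as `mqttpacket_123` (on a large table this rewrite can take a while, so a maintenance window may be needed):

```sql
-- Add the combined column and backfill it from the old pair.
ALTER TABLE mqttpacket_123 ADD COLUMN inserted_at DATETIME;

UPDATE mqttpacket_123
   SET inserted_at = TIMESTAMP(inserted_date, inserted_time);

-- Drop the old columns. The composite index on (data_type_id, inserted_date)
-- should then be rebuilt on (data_type_id, inserted_at).
ALTER TABLE mqttpacket_123
  DROP COLUMN inserted_date,
  DROP COLUMN inserted_time;
```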
Having one table per device is usually a bad idea. Instead, add a column for the device_id.
Not having a PRIMARY KEY is a bad idea. Do you ever get two readings in the same second for a specific device? Probably not.
Rolling those together plus some other likely changes, start by changing the table to
CREATE TABLE IF NOT EXISTS `mqttpacket`(
`device_serial_number` SMALLINT UNSIGNED NOT NULL,
`data_type_id` TINYINT UNSIGNED NOT NULL,
`data_value` SMALLINT NOT NULL,
`inserted_at` DATETIME NOT NULL,
-- Note: InnoDB requires the FK column and datatypes.id to have matching types,
-- so datatypes.id would need to be changed to TINYINT UNSIGNED as well.
FOREIGN KEY(data_type_id) REFERENCES datatypes(id),
PRIMARY KEY(device_serial_number, `data_type_id`,`inserted_at`)
) ENGINE = INNODB;
That PK will make the query faster.
This may be what you are looking for after the change to DATETIME:
AND inserted_at >= '2021-11-08 12:25:00'
AND inserted_at < '2021-11-08 12:25:00' + INTERVAL 7 DAY
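Putting the suggestions together, the original query could then be written against the single combined table roughly like this (a sketch, assuming the suggested `mqttpacket` schema above, with the old `mqttpacket_123` data identified by `device_serial_number = 123`):

```sql
SELECT  mp.data_value, dt.data_name, dt.value_mult,
        mp.inserted_at AS inserted_date_time
    FROM mqttpacket AS mp
    JOIN datatypes AS dt  ON mp.data_type_id = dt.id
    WHERE mp.device_serial_number = 123
      AND mp.data_type_id IN (1,2,3,4,5,6)
      AND mp.inserted_at >= '2021-11-08 12:25:00'
      AND mp.inserted_at <  '2021-11-08 12:25:00' + INTERVAL 7 DAY;
```

No CASE is needed any more; the PRIMARY KEY (device_serial_number, data_type_id, inserted_at) lets the optimizer do one tight range scan per data_type_id.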
To keep 2 weeks' worth of data, DROP PARTITION is an efficient way to do the delete. I would use PARTITION BY RANGE(TO_DAYS(inserted_at)) and have 16 partitions, as discussed in http://mysql.rjweb.org/doc.php/partitionmaint
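A sketch of what the partitioned layout could look like; partition names and dates are illustrative. Note that partitioned InnoDB tables cannot have FOREIGN KEYs, so the constraint on data_type_id would have to be dropped first:

```sql
ALTER TABLE mqttpacket
PARTITION BY RANGE (TO_DAYS(inserted_at)) (
    PARTITION p20211101 VALUES LESS THAN (TO_DAYS('2021-11-02')),
    PARTITION p20211102 VALUES LESS THAN (TO_DAYS('2021-11-03')),
    -- ... one partition per day, ~16 in total ...
    PARTITION pfuture   VALUES LESS THAN (MAXVALUE)
);

-- Nightly maintenance: drop the oldest day and carve the next day out of pfuture.
ALTER TABLE mqttpacket DROP PARTITION p20211101;
ALTER TABLE mqttpacket REORGANIZE PARTITION pfuture INTO (
    PARTITION p20211117 VALUES LESS THAN (TO_DAYS('2021-11-18')),
    PARTITION pfuture   VALUES LESS THAN (MAXVALUE)
);
```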
If you are inserting a thousand rows every 5 seconds: with table-per-device, you would need a thousand threads each doing one insert, which would be a nightmare for the architecture. With a single table (as I suggest), and if you can gather the 1000 rows together in one process at the same time, do a single multi-row INSERT every 5 seconds. I discuss other high-speed ingestion techniques elsewhere.
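For illustration, one 5-second batch could be collapsed into a single statement like this (the values are made up):

```sql
-- One statement per 5-second window instead of ~1000 single-row inserts.
INSERT INTO mqttpacket
    (device_serial_number, data_type_id, data_value, inserted_at)
VALUES
    (123, 1, 271, '2021-11-08 12:25:04'),
    (123, 2, 106, '2021-11-08 12:25:04'),
    (124, 1, 305, '2021-11-08 12:25:04');
```

Batching on the order of 100-1000 rows per INSERT is typically around an order of magnitude faster than inserting the same rows one statement at a time.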