Tags: mysql, spring, hibernate, spring-boot, innodb

Reducing number of write operations for a logging entity


I am running into an issue with a Spring Boot application that uses Hibernate/MySQL.

The application receives sensor data (up to several times per second) and saves each incoming reading as a Hibernate entity. You can find the Java code for the sensor log entity below.

The system is deployed on a Raspberry Pi, which uses an SD card as its storage. It appears that the very(!) high number of write operations stresses the SD card so much that it fails. The last SD card used on the development Pi lasted about 6 months despite being of high quality.

I want to reduce the number of writes per second/minute. Ideally, the data would be saved only once every 5 or 10 minutes. Data loss in case of a power outage is not an issue. I have already looked into hardware solutions, such as an external hard drive for the database, but using the RPi's SD card is my only option. So my approach is to reduce the write operations by configuring Spring/Hibernate/MySQL. After some research, however, I found that there are several ways to achieve this, and I am not sure which one is best suited to my needs.

1. Naive

I could write a Spring component that accepts entities, stores them in a list, and saves the list's contents periodically; a rough sketch is shown below. This still seems like a fair amount of programming work, and I suspect there is a smarter way to achieve it.
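
To illustrate, here is a rough sketch of what I have in mind (the names SensorDataBuffer and SensorDataPointRepository are made up; it assumes a Spring Data repository and that scheduling is enabled via @EnableScheduling):

import java.util.ArrayList;
import java.util.List;

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class SensorDataBuffer {

    private final SensorDataPointRepository repository;
    private final List<SensorDataPoint> buffer = new ArrayList<>();

    public SensorDataBuffer(SensorDataPointRepository repository) {
        this.repository = repository;
    }

    // Called for every incoming sensor reading; only touches memory.
    public synchronized void add(SensorDataPoint point) {
        buffer.add(point);
    }

    // Every 5 minutes, persist the collected readings in one batch.
    @Scheduled(fixedRate = 5 * 60 * 1000)
    public void flush() {
        List<SensorDataPoint> batch;
        synchronized (this) {
            if (buffer.isEmpty()) {
                return;
            }
            batch = new ArrayList<>(buffer);
            buffer.clear();
        }
        repository.saveAll(batch); // one transaction instead of hundreds
    }
}

As far as I understand, saveAll() still issues one INSERT per entity unless hibernate.jdbc.batch_size is also configured, but at least all writes would happen in a single transaction and flush.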

2. Hibernate/MySQL/InnoDB

I found the InnoDB system variable innodb_flush_log_at_trx_commit, which appears to do what I want, sort of. I could use this setting to have InnoDB flush the log only about once per second instead of at every commit, but this is not optimal:

  • an interval of 1 second is still far too frequent
  • the behaviour would apply to all of my tables, whereas I only want it for this specific table

Rather than setting the configuration on MySQL directly, I am sure there is a way to define this behaviour in Hibernate, but I haven't found it yet.
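
For reference, this is the setting I mean; it is a global server variable, so as far as I can tell it cannot be scoped to a single table:

SET GLOBAL innodb_flush_log_at_trx_commit = 2;  -- write the log at each commit, fsync it only ~once per second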

Conclusion/Question

Is there any way to achieve the desired behaviour? Do my suggestions, based on quick research, make any sense, or do you have better suggestions for things to look into? Any input on the matter is greatly appreciated!

Appendix: Entity code:

import java.util.Date;

import javax.persistence.*;

import org.springframework.data.annotation.CreatedDate;
import org.springframework.data.jpa.domain.support.AuditingEntityListener;

@Table(
        name = "sensor_data_point",
        indexes = { @Index(name = "sdp_idx", columnList = "sensor_uid,name,created") }
)
@Entity
@EntityListeners({AuditingEntityListener.class})
public class SensorDataPoint {

    @Id
    @GeneratedValue
    private long id;

    @Column(name = "sensor_uid", length = 191)
    private String sensorUid;

    @Column(name = "name", length = 191)
    private String name;

    @Column(name = "value", length = 191)
    private String value;

    @CreatedDate
    @Column(name = "created")
    @Temporal(TemporalType.TIMESTAMP)
    private Date createdDate;

    // Constructor, Getters, Setters
    // ...

}

Solution

  • sync_binlog=0

    Since you can allow some amount of data to be lost, consider the following. For discussion, let's say you can lose 1 minute's data.

    Have an extra MEMORY table called Gather where you collect the sensor data, instead of writing to the real InnoDB table (called Real below). Then have an EVENT or cron job come along every minute and do the following:

    CREATE TABLE Tmp LIKE Gather;                    -- copy schema
    RENAME TABLE Gather TO Inserter, Tmp TO Gather;  -- atomic, fast
    INSERT INTO Real SELECT * FROM Inserter;         -- batched, single transaction, etc
    DROP TABLE Inserter;                             -- so next minute's RENAME finds the name free
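
    For completeness, a sketch of the supporting DDL (the event name is made up, and the column list assumes your sensor_data_point schema; the event scheduler must be enabled with SET GLOBAL event_scheduler = ON):

    CREATE TABLE Gather LIKE sensor_data_point;   -- same columns as the real table
    ALTER TABLE Gather ENGINE=MEMORY;             -- staging table lives in RAM

    DELIMITER //
    CREATE EVENT flush_gather
    ON SCHEDULE EVERY 1 MINUTE
    DO BEGIN
        CREATE TABLE Tmp LIKE Gather;
        RENAME TABLE Gather TO Inserter, Tmp TO Gather;
        -- copy everything except id, so the real table assigns fresh ids
        INSERT INTO sensor_data_point (sensor_uid, name, value, created)
            SELECT sensor_uid, name, value, created FROM Inserter;
        DROP TABLE Inserter;
    END //
    DELIMITER ;

    Your application then inserts into Gather instead of sensor_data_point (e.g. by pointing the entity's @Table at it).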
    

    Notes:

    • Probably you are writing only 1 (or a few) blocks per minute.
    • During the SELECT, you could transform the data (if necessary).

    Also:

    • Get rid of unnecessary indexes; they cause extra writes to an SD card that has inadequate "wear leveling".
    • Consider writing to a Summary table in the same event, or wait until the end of the day.
    • More discussion: http://mysql.rjweb.org/doc.php/staging_table -- It focuses on how to ingest data faster than the disk can handle, which is (in a perverse way) your situation. I added MEMORY to this Answer because it fits your situation.
    • Minimize the size of the datatypes you use. (To decrease the number of blocks to write.)
    • Normalize where practical. (Also to decrease the number of blocks to write.) A sketch of these last two points follows this list.
    • Once you have followed my suggestions, you will no longer need to worry about innodb_flush_log_at_trx_commit.
    • It seems like MySQL 8.0 will have some techniques that cut back on disk hits, but I am not ready to point out specifics.
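
    As a sketch of those last two points (table and column names are only suggestions): move the repeating sensor_uid/name strings into a lookup table and keep a small integer id in the log table, so every logged row costs a few bytes instead of hundreds:

    -- Each distinct (sensor_uid, name) pair is stored exactly once
    CREATE TABLE sensor (
        sensor_id   SMALLINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        sensor_uid  VARCHAR(191) NOT NULL,
        name        VARCHAR(191) NOT NULL,
        UNIQUE KEY (sensor_uid, name)
    ) ENGINE=InnoDB;

    -- The high-volume log table references the lookup table by id
    CREATE TABLE sensor_data_point (
        id        INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        sensor_id SMALLINT UNSIGNED NOT NULL,
        value     VARCHAR(191) NOT NULL,
        created   TIMESTAMP NOT NULL,
        KEY (sensor_id, created)   -- the one index you actually need
    ) ENGINE=InnoDB;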

    Please provide the generated SHOW CREATE TABLE output and the SQL of your INSERT code; there may be more suggestions to give you. How many sensors are there? Are different sensor values in different rows or different columns? Etc.