csv microcontroller communication semantics iot

Extra-light data format for IoT


I'm looking at how to design a data format for sending sensor data (temperature, GPS, accelerometer, and others) from a microcontroller-based device to a backend service over GSM.

I created a simple JSON HTTP API, but the payload is really heavy and I want something as light as possible to improve my device's battery life: less processing time on the device to build the data, less data to submit, less time spent sending it, etc.

I would send the messages as binary payloads over MQTT, but how should I format the message itself? I could, for example, use a CSV format or a fixed number of bytes for each sensor. I could also send only the sensor data that changed: if the GPS coordinates are the same, I don't send them again; for the date/time, I send only the seconds if the rest hasn't changed since the previous reading.

I was expecting to find a protocol that meets all those needs, but the only standards I found are XML- or JSON-based.

There is a specific protocol used by drones to exchange commands that we could rework, but my needs are slightly different: I just want to send groups (of 10 or 100) of sensor readings taken every few seconds.

Do you know of anything that could answer this, so that we don't reinvent the wheel?


Solution

  • Here are a couple of thoughts:

    Skip the HTTP protocol. HTTP sends extra data in its headers, which is unnecessary since you already know what data your device is sending. Just connect to a TCP server and send your data.
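
    A minimal sketch of the raw TCP idea, written in Python for brevity (on the device this would be the equivalent calls in your firmware's socket or modem API). The 2-byte length prefix is my own assumption, not something TCP requires: TCP is a byte stream and does not preserve message boundaries, so the receiver needs some way to know where each packet ends. The names `frame`, `send_packet`, `recv_exact` and `recv_packet` are hypothetical.

```python
import socket
import struct


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 2-byte big-endian integer."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length prefix")
    return struct.pack(">H", len(payload)) + payload


def send_packet(sock: socket.socket, payload: bytes) -> None:
    """Send one length-prefixed packet over an already connected socket."""
    sock.sendall(frame(payload))


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv() may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-packet")
        buf.extend(chunk)
    return bytes(buf)


def recv_packet(sock: socket.socket) -> bytes:
    """Read the 2-byte length prefix, then the payload it announces."""
    (length,) = struct.unpack(">H", recv_exact(sock, 2))
    return recv_exact(sock, length)
```

    The per-packet overhead here is 2 bytes, compared with the much larger text headers an HTTP request carries.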

    Representing your data in ASCII is inefficient. For example, the number 4294967295 takes 10 bytes as ASCII text, but only 4 bytes as an unsigned 32-bit binary integer.
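
    To make the size difference concrete, here is a small check in Python using the standard `struct` module. The little-endian byte order (`<`) is an arbitrary choice for the illustration; what matters is that both ends agree on it.

```python
import struct

value = 4294967295  # largest unsigned 32-bit integer

as_text = str(value).encode("ascii")  # b"4294967295", one byte per digit
as_binary = struct.pack("<I", value)  # b"\xff\xff\xff\xff", fixed 4 bytes

print(len(as_text), len(as_binary))  # prints "10 4"
```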

    Without compression, the size of your raw data is the lower bound on the number of bytes required to send your message. Typically, a highly efficient protocol is as simple as a few bits or bytes describing what kind of data is being sent, followed by the data in predefined locations.

    What I would suggest, if you are certain that improving your efficiency outweighs the trouble, is to define one or several packet formats that are known to both your sender and receiver, and transmit them over a TCP connection. Looking at the transaction layer of protocols such as PCI Express might help. They are more complicated than you need, but they will give you the general idea, which is to keep it pretty simple.
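
    A sketch of what "a type byte followed by data in predefined locations" could look like, again in Python. Everything specific here is an assumption made for illustration: the packet type values, the field choices, the scaling (coordinates as degrees × 1e7, temperature in hundredths of a degree Celsius), and the names `PKT_GPS`, `PKT_TEMPERATURE`, `LAYOUTS`, `encode` and `decode`.

```python
import struct

# Hypothetical packet types; sender and receiver must share this table.
PKT_GPS = 0x01          # int32 lat, int32 lng (degrees * 1e7), uint32 Unix time
PKT_TEMPERATURE = 0x02  # int16 temperature (0.01 degC), uint32 Unix time

# "<" means little-endian with no padding between fields.
LAYOUTS = {
    PKT_GPS: struct.Struct("<iiI"),
    PKT_TEMPERATURE: struct.Struct("<hI"),
}


def encode(pkt_type: int, *fields: int) -> bytes:
    """One type byte, then the fields at fixed offsets for that type."""
    return bytes([pkt_type]) + LAYOUTS[pkt_type].pack(*fields)


def decode(packet: bytes):
    """Look up the layout from the type byte and unpack the rest."""
    pkt_type = packet[0]
    return pkt_type, LAYOUTS[pkt_type].unpack(packet[1:])


gps = encode(PKT_GPS, 488583700, 22944800, 1481273411)
temp = encode(PKT_TEMPERATURE, 2150, 1481273411)  # 21.50 degC
print(len(gps), len(temp))  # prints "13 7"
```

    The receiver never parses field names or delimiters; it reads one byte and then knows exactly how many bytes follow and what they mean. On a microcontroller the same layouts would typically be packed C structs.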

    As an example, the following JSON data consumes 273 bytes, plus maybe 50 bytes for the HTTP header:

    {"sensor1":{"coordinates":[12345,12345],"time":"2016-12-9T08:50:11"},"sensor2":{"coordinates":[12345,12345],"time":"2016-12-9T08:50:11"},"sensor3":{"coordinates":[12345,12345],"time":"2016-12-9T08:50:11"},"sensor4":{"coordinates":[12345,12345],"time":"2016-12-9T08:50:11"}}
    

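    The same data could be represented with a data stream like so:

    number of sensors (1 byte)
    sensor1 lat (4 bytes)
    sensor1 lng (4 bytes)
    sensor1 time (4 bytes)
    sensor2 lat (4 bytes)
    sensor2 lng (4 bytes)
    sensor2 time (4 bytes)
    sensor3 lat (4 bytes)
    sensor3 lng (4 bytes)
    sensor3 time (4 bytes)
    sensor4 lat (4 bytes)
    sensor4 lng (4 bytes)
    sensor4 time (4 bytes)
    

    for a total of 49 bytes (1 + 4 × 12), including the full time in every transmission.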
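
    The stream layout above can be sketched in Python as follows. The field types are my assumptions (signed 32-bit integers for lat/lng, an unsigned 32-bit Unix timestamp for the time, little-endian), and `RECORD`, `pack_readings` and `unpack_readings` are hypothetical names. With four readings, matching the JSON example, the payload comes to 1 + 4 × 12 = 49 bytes.

```python
import struct

# One reading: lat, lng (signed 32-bit), time (unsigned 32-bit Unix seconds).
RECORD = struct.Struct("<iiI")  # 12 bytes, little-endian, no padding


def pack_readings(readings) -> bytes:
    """Count byte followed by fixed-size records, as in the layout above."""
    if len(readings) > 255:
        raise ValueError("count must fit in one byte")
    out = bytearray([len(readings)])
    for lat, lng, ts in readings:
        out += RECORD.pack(lat, lng, ts)
    return bytes(out)


def unpack_readings(data: bytes):
    """Inverse of pack_readings: read the count, then each record in turn."""
    count = data[0]
    return [RECORD.unpack_from(data, 1 + i * RECORD.size) for i in range(count)]


# 1481273411 is 2016-12-09T08:50:11 UTC, the timestamp from the JSON example.
readings = [(12345, 12345, 1481273411)] * 4
payload = pack_readings(readings)
print(len(payload))  # prints 49
```

    Because every record has the same size, the receiver can index straight into the buffer without scanning for delimiters.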

    If you tell us which language you are using, you may get more help with the actual implementation.