Search code examples
cbig-ostring.h

String manipulation & algorithmic complexity


Have written 2 implementations of a program that should copy words to an output buffer and print the information out. I'm looking to optimise my program wrt operation time (algorithmic complexity).

/*cat implementation*/
#include <stdio.h>
#include <string.h>

char command_string[2][70] = {
  "Henry",
  "Julie",
};

int main()
{
    char output_transmit_buffer[255];
    int counter = 0;
    while (1)
    {
      char *command = command_string[counter];
      strcat (output_transmit_buffer, command);
      printf ("%s\r\n", output_transmit_buffer);
      //clearing output buffer
      memset (output_transmit_buffer, 0, sizeof (output_transmit_buffer));
      if (counter == 1)
      {
          return 0;
      }
      counter++;
    }
    return 0;
}

/*copy implementation*/
#include <stdio.h>
#include <string.h>

char command_string[2][70] = {
  "Henry",
  "Julie",
};

int main()
{
    
    char output_transmit_buffer[255];
    int counter = 0;
    
    while (1)
    {
    char *ptr = output_transmit_buffer;
    strcpy(ptr, command_string[counter]);
    ptr += strlen(command_string[counter]);
    printf("%s\r\n",output_transmit_buffer);
    
    memset (output_transmit_buffer, 0, sizeof (output_transmit_buffer));
    if (counter == 1)
    {
        return 0;
    }
    counter++;
    }
    
    return 0;
}

Which implementation is better? Is there a better implementation for MCU specific applications.


Solution

  • strcat (output_transmit_buffer, command); is a bug since you didn't initialize the destination buffer. strcat expects a null terminated string as input, unlike strcpy which just writes to memory no matter what was stored there before. So you need to do char output_transmit_buffer[255] = ""; in order to use strcat.

    As for efficiency, both versions are quite naive. There is lots of room for improvement:

    • In this specific case you actually know the size of the strings at compile-time, since they are string literals. So you need not call strlen at all. "Henry" is an array with size sizeof "Henry", which is 5 bytes + 1 null terminator = 6 bytes.
    • Calling memset over and over to clear the whole memory is pointless. In case you were to use strcat, you only need to write a null terminator to the first byte output_transmit_buffer[0] = '\0'.
    • You can save memory by turning the 2D array into an array of pointers instead: const char* command_string[] =. Now only as much memory as is occupied by the string literals are used. It should be const to make it read-only. On a MCU, const means the data ends up in flash instead of RAM which is generally what you want to save RAM.

    However, what you actually want to do is none of this, but to transfer some strings over UART...

    In case all the UART stuff is embedded into your printf implementation (like the termios lib or similar) then just print directly, with no middle man. However, using printf on embedded systems (or systems in general) is very expensive, so if you are truly looking to increase performance you'd write directly to the UART driver without the burden of the bloated stdio.h.

    Code to be used on an actual MCU might look something like this:

    #include <stdint.h>
    
    #define SIZEOF_ARRAY(arr) (sizeof(arr)/sizeof(*arr))
    
    const char* command_string[] = {
      "Henry",
      "Julie",
    };
    
    const uint16_t command_size[] = {
      sizeof("Henry") - 1,
      sizeof("Julie") - 1,
    };
    
    void UART_Tx (const uint8_t* data, uint16_t size)
    {
      /* 
        Pseudo code for some generic UART hardware peripheral.
        SR = status register, DR = data register
        UART_SR_TX is some TX ready flag
      */
    
      for(uint16_t i=0; i<size; i++)
      {
        UART_DR = data[i];
        while((UART_SR & UART_SR_TX) == 0 )
        {}
      }
    }
    
    int main()
    {
        for(uint16_t i = 0; i < SIZEOF_ARRAY(command_string); i++)
        {
          UART_Tx((uint8_t*)command_string[i], command_size[i]);
          UART_Tx((uint8_t*)"\r\n", 2);
        }
    }
    

    Note that sizeof returns an integer of type size_t which is what's commonly used in hosted systems. In small embedded systems, we typically micro-manage sizes of variables though, so I used uint16_t instead, which will result in better code on 8 or 16 bit MCUs. If all strings are shorter than 256 bytes you can even use uint8_t.