
How to get an integer from stdin? Fast. Can trade accuracy for performance


I'm trying to come up with a fast and simple

"Get a string from stdin and convert to an integer. If you can't, just pretend we got zero".

This is a Linux embedded system, CPU and memory are at a premium. Performance is important, accuracy not so much. It should handle multiple ingests per second. I will eventually turn it into a daemon and store the latest 1024 values in an array.

Here's my take using atoi:

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[] ) {
  char *c = argv[1];
  unsigned int i = 1; /* on atoi() failure, i = 0 */

  if (i = atoi(c)) {
      puts ("atoi() success");
  }
  else {
      puts ("atoi() FAILED");
  }

  printf("argv[1] = %s\n", argv[1]);
  printf("      i = %d\n", i);
}

A few test runs / fuzzing:

# ./test_atoi 3
atoi() success
argv[1] = 3
      i = 3

# ./test_atoi 99999999999999999999
atoi() success
argv[1] = 99999999999999999999
      i = 2147483647

# ./test_atoi 3.14159
atoi() success
argv[1] = 3.14159
      i = 3

# ./test_atoi $(echo -ne "\u2605")
atoi() FAILED
argv[1] = ★
      i = 0

This fails:

# ./test_atoi $(echo -e "\0")
Segmentation fault

I'll add a check for NUL then:

if (argv[1] == '\0') {
    i = 0;
}

Will this be enough? Have I just (badly) re-implemented strtol? Should I just go ahead and use strtol? If yes, is there anything I should be checking for that strtol isn't already?

What I really, really care about is not dying because of bad input. I can happily live with getting occasional garbage from the conversion.

EDIT: int i = 1 just because I want to see if atoi() makes it 0.

Ghetto profiling with time

EDIT: I've dropped the print statements and wrapped the read from stdin plus the atoi/strtol call in a for loop.

# time seq 0 999888 | ./test_atoi
real    0m5.245s
user    0m5.870s
sys     0m0.030s

# time seq 0 999888 | ./test_atoi
real    0m5.230s
user    0m5.960s
sys     0m0.050s

# time seq 0 999888 | ./test_atoi
real    0m5.395s
user    0m5.920s
sys     0m0.080s

# time seq 0 999888 | ./test_strtol    
real    0m5.332s
user    0m5.860s
sys     0m0.030s

# time seq 0 999888 | ./test_strtol
real    0m5.023s
user    0m5.790s
sys     0m0.060s

# time seq 0 999888 | ./test_strtol
real    0m5.286s
user    0m5.970s
sys     0m0.010s

Alright, this is insane. I should do something more productive with my time, and yours!


Solution

  • This is a Linux embedded system, CPU and memory are at a premium.

    Yes. Err, no. If you're running a normal Linux, the kernel itself does atoi-style conversions and the inverse in a few thousand places. Your single number parser will hardly make any impact unless you intend to call it several thousand times per second.

    Should I just go ahead and use strtol?

    For the reasons above: yes.

    If yes, is there anything I should be checking for that strtol isn't already?

    You should check strtol's result. The return value alone cannot tell a parsed 0 from a failed conversion, so also look at the end pointer and at errno. I really don't sympathize with your "don't need precision" approach. Something like this is either done right or catastrophically wrong.
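One way to read "check strtol's result" is sketched below. It assumes base 10 and keeps the asker's "bad input becomes zero" rule; `parse_int_or_zero` is a hypothetical name, not something from the original post. Unlike atoi, whose behavior is undefined when the value does not fit in an int, strtol reports overflow through errno.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): parse a base-10 integer,
 * returning 0 for anything that is not a usable, in-range number. */
static int parse_int_or_zero(const char *s)
{
    char *end;
    long v;

    if (s == NULL)              /* e.g. argv[1] when no argument was given */
        return 0;

    errno = 0;                  /* strtol only sets errno, it never clears it */
    v = strtol(s, &end, 10);

    if (end == s)               /* no digits were consumed at all */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;               /* does not fit in a long or in an int */

    /* Trailing characters such as ".14159" or "\n" are ignored here,
     * like atoi() does; inspect *end if that matters to you. */
    return (int)v;
}
```

With this, `parse_int_or_zero(argv[1])` is safe even when no argument was passed, because argv[argc] is guaranteed to be a null pointer.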

    EDIT: You said:

    don't need precision = I only care about values 0 - 100

    This means you just need atoi, not atol/strtol; there, CPU cycles saved. Next: do you actually need to convert strings that might look like 13.288 to integers, or can you assume that all strings are one to three characters long? In that case, and for raw performance, maybe:

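    #include <string.h>  /* strnlen (POSIX.1-2008) */

    /* Map one character to its digit value; anything that is not 0-9 counts as 0. */
    static inline unsigned char char2digit(const char *c) {
        unsigned char v = (unsigned char)(*c - '0');
        return (v > 9) ? 0 : v;
    }

    /* Convert a string of 1 to 3 characters to a number; returns -1 on bad length. */
    static inline int characters2number(const char *string)
    {
        size_t len = strnlen(string, 4);
        if (len < 1 || len > 3)
            return -1;
        int val = 0;           /* int, because three digits (up to 999) overflow a signed char */
        int power_of_ten = 1;
        /* Walk from the last character to the first, least significant digit first. */
        for (size_t idx = 1; idx <= len; ++idx)
        {
            val += power_of_ten * char2digit(string + len - idx);
            power_of_ten *= 10;
        }
        return val;
    }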
    

    I mean, if you're on a toaster. Otherwise atoi has your back. You might still want to check the input length with strnlen first.
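For the "read from stdin, keep the latest 1024 values" part of the question, a minimal sketch could look like the following. Everything here is an assumption made for illustration: the `struct ring` layout, the names `ring_push` and `ingest`, one number per line, and a 32-byte line buffer. It uses strtol with a null end pointer as a drop-in for atoi, so unparsable lines simply become 0.

```c
#include <stdio.h>
#include <stdlib.h>

#define RING_SIZE 1024   /* "latest 1024 values" from the question */

/* Hypothetical ring buffer; names and layout are illustrative only. */
struct ring {
    int    values[RING_SIZE];
    size_t next;    /* slot the next value is written to */
    size_t count;   /* number of valid entries, at most RING_SIZE */
};

/* Store one value, overwriting the oldest one once the buffer is full. */
static void ring_push(struct ring *r, int v)
{
    r->values[r->next] = v;
    r->next = (r->next + 1) % RING_SIZE;
    if (r->count < RING_SIZE)
        r->count++;
}

/* Read chunks from `in` until EOF and store one value per chunk.
 * A chunk is one line, as long as lines fit in the buffer; longer
 * lines are split and each piece is converted separately.
 * Returns the number of chunks consumed. */
static size_t ingest(FILE *in, struct ring *r)
{
    char line[32];   /* assumed upper bound on line length */
    size_t n = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        /* No digits -> 0. Huge values are clamped by strtol and then
         * narrowed to int, which may yield garbage but does not crash. */
        ring_push(r, (int)strtol(line, NULL, 10));
        n++;
    }
    return n;
}
```

A daemon would declare one `static struct ring r;` (zero-initialized) and call `ingest(stdin, &r)`; fgets bounds every read, so malformed or oversized input cannot overrun the buffer.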