I have always assumed that calling putc multiple times is faster than calling puts or printf. To print "hello" in a real program I would always use puts or printf, but I'm now writing a program that generates C code, so I was wondering whether to generate code such as putchar('h'); putchar('e'); ... because I first thought it would be faster. But I ran a test that gave a very interesting result. The compiler is GCC.
    #include <stdio.h>
    #include <time.h>

    #define START_TIMER(TIMER) ((TIMER) = clock())
    #define ELAPSED_TIME(TIMER) \
        ((double)(clock() - (TIMER)) / (double)CLOCKS_PER_SEC)

    enum { NUM_ITERS = 9999999 };

    int main(void) {
        clock_t timer;  /* clock() returns clock_t, not time_t */
        FILE *f;
        int i;

        f = fopen("out.txt", "w");
        if (f == NULL) {
            perror("out.txt");
            return 1;
        }

        /* Test 1: six putc calls per iteration */
        START_TIMER(timer);
        for (i = 0; i < NUM_ITERS; i++) {
            putc('h', f);
            putc('e', f);
            putc('l', f);
            putc('l', f);
            putc('o', f);
            putc('\n', f);
        }
        printf("%.3f\n", ELAPSED_TIME(timer));

        /* Test 2: one fputs call per iteration (no trailing newline,
           so one byte fewer than the other two loops) */
        START_TIMER(timer);
        for (i = 0; i < NUM_ITERS; i++) {
            fputs("hello", f);
        }
        printf("%.3f\n", ELAPSED_TIME(timer));

        /* Test 3: one fprintf call per iteration */
        START_TIMER(timer);
        for (i = 0; i < NUM_ITERS; i++) {
            fprintf(f, "hello\n");
        }
        printf("%.3f\n", ELAPSED_TIME(timer));

        fclose(f);
        return 0;
    }
Results in seconds (putc, fputs, fprintf), without optimization:
4.247 1.013 1.195
With optimization (-O2):
0.910 1.184 1.315
With optimization (-O3):
0.920 1.158 1.311
So calling putc multiple times is slower than puts or printf when compiled without optimization. First, I'm curious why that is. Second, which approach should I use for the program-generated C code?
Your intuition for what should be faster is wrong. Generally, putc/putchar is going to have a lot of overhead per byte written, since there's a whole cycle of function call, potential locking of the stdio stream (stdout) being targeted, etc. per byte. On the other hand, functions like printf or puts have more overhead per call than putc (e.g. printf has to process the format string, and puts has to call strlen or equivalent), but that overhead only happens once, no matter how many bytes you're writing. The actual write can take place as a bulk copy to the FILE's buffer, or a bulk write to the underlying file (if the FILE is unbuffered).
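To make the per-byte versus per-call distinction concrete, here is a minimal sketch. The names write_bytewise and WRITE_LITERAL are hypothetical, introduced only for illustration. For a code generator the string is known at generation time, so one option (an assumption about what suits your generator, not a measured recommendation) is to emit a single fwrite with a compile-time length, which avoids both the format-string parsing of printf and the strlen of puts.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes s one byte at a time, so the call
   overhead of putc is paid once per byte. Returns bytes written. */
static size_t write_bytewise(FILE *f, const char *s)
{
    size_t n = 0;
    assert(f != NULL && s != NULL);
    while (s[n] != '\0' && putc(s[n], f) != EOF)
        n++;
    return n;
}

/* Hypothetical macro for generated code: one fwrite call per string
   literal. sizeof(lit) counts the terminating '\0', hence the -1.
   Only valid for string literals or arrays, not for char pointers. */
#define WRITE_LITERAL(f, lit) fwrite((lit), 1, sizeof(lit) - 1, (f))
```

A generator could then emit WRITE_LITERAL(f, "hello\n"); in place of six putc calls. Whether this beats fputs in practice depends on the C library and compiler, so it is worth timing on the target system.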
As for how optimization levels affect this, -O0 probably adds a lot more overhead around each function call, and that overhead gets optimized out at higher optimization levels.
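If you want to see how much of the per-byte cost comes from stream locking rather than from the call itself, one rough experiment is to take the lock once and use the unlocked variant inside it. This sketch assumes a POSIX system (flockfile, funlockfile and putc_unlocked are POSIX, not ISO C), and write_locked_once is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L /* assumption: POSIX; exposes flockfile etc. */
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: locks the stream once, then writes each byte
   with putc_unlocked, so no lock is taken per byte.
   Returns the number of bytes written. */
static size_t write_locked_once(FILE *f, const char *s)
{
    size_t n = 0;
    assert(f != NULL && s != NULL);
    flockfile(f);
    while (s[n] != '\0' && putc_unlocked(s[n], f) != EOF)
        n++;
    funlockfile(f);
    return n;
}
```

Timing this against the plain putc loop at the same optimization level should give an indication of how much locking contributes on a given C library; the result will vary between implementations.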