When to use static inline instead of regular functions

When I read other people's code, I sometimes encounter static inline functions implemented in header files, as opposed to regular function implementations in C files.

For example, the cache.h header file (https://github.com/git/git/blob/master/cache.h) of git contains many such functions. One of them is copied below:

static inline void copy_cache_entry(struct cache_entry *dst,
                    const struct cache_entry *src)
{
    unsigned int state = dst->ce_flags & CE_HASHED;

    /* Don't copy hash chain and name */
    memcpy(&dst->ce_stat_data, &src->ce_stat_data,
            offsetof(struct cache_entry, name) -
            offsetof(struct cache_entry, ce_stat_data));

    /* Restore the hash state */
    dst->ce_flags = (dst->ce_flags & ~CE_HASHED) | state;
}

I was wondering what the advantages of static inline functions are compared to regular functions. Is there any guideline one can use to choose which style to adopt?

Solution

Inlining is done for optimization. However, a little-known fact is that inline can also hurt performance: your CPU has an instruction cache with a fixed size, and inlining replicates the function body at every call site, which makes the instruction cache less efficient.

So, from a performance point of view, it's generally not advisable to declare functions inline unless they are so short that their call is more expensive than their execution.

To put this in perspective: a function call takes somewhere between 10 and 30 cycles of CPU time (depending on the number of arguments). Arithmetic operations generally take a single cycle, whereas memory loads from the first-level cache take something like three to four cycles. So, if your function is more complex than a simple sequence of at most three memory accesses and some arithmetic, there is little point in inlining it.

I usually take this approach:

1. If a function is as simple as incrementing a single counter, and if it is used all over the place, I inline it. Examples of this are rare, but one valid case is reference counting.
2. If a function is used only within a single file, I declare it as static, not inline. This has the effect that the compiler can see when such a function is used precisely one time. And if it sees that, it will very likely inline it, no matter how complex it is, since it can prove that there is no downside of inlining.
3. All other functions are neither static nor inline.
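
As a rough sketch of the first two cases, here is how that might look. The names struct object, object_ref, object_unref and count_live are made up for illustration and do not come from git or any real library. The first two functions are the kind of thing I would put in a header as static inline; the third would stay in a single C file as plain static.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted object; would live in a header. */
struct object {
    unsigned int refcount;
};

/* Trivial and called all over the place: a candidate for static inline. */
static inline void object_ref(struct object *obj)
{
    obj->refcount++;
}

/* Returns the remaining count so the caller can release the object at zero. */
static inline unsigned int object_unref(struct object *obj)
{
    assert(obj->refcount > 0);
    return --obj->refcount;
}

/*
 * Used only within one C file: plain static, no inline keyword.
 * The compiler sees every caller and decides about inlining itself.
 */
static size_t count_live(const struct object *objs, size_t n)
{
    size_t live = 0;

    for (size_t i = 0; i < n; i++)
        if (objs[i].refcount > 0)
            live++;
    return live;
}
```

Whether the compiler actually inlines any of these depends on the compiler and the optimization level; the keywords only affect linkage and act as a hint.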

The example in your question is a borderline example: It contains a function call, thus it seems to be too complex for inlining at first sight.

However, the memcpy() function is special: it is seen more as a part of the language than as a library function. Most compilers will inline it and optimize it heavily when the size is a small compile-time constant, which is the case in the code in question.

With that optimization, the function is indeed reduced to a short, simple sequence. I cannot say whether it touches a lot of memory because I don't know the structure that is copied. If that structure is small, adding the inline keyword seems to be a good idea in this case.
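
To make the memcpy() point concrete, here is a reduced version of the same pattern. struct entry, ENTRY_HASHED and copy_entry are hypothetical stand-ins; I have not reproduced git's real struct cache_entry layout. The point is only that the size argument is computed from offsetof() and is therefore known at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENTRY_HASHED 0x1u   /* hypothetical flag, stands in for CE_HASHED */

/* Hypothetical stand-in for struct cache_entry. */
struct entry {
    struct entry *next;     /* hash chain: not copied */
    unsigned int mtime;     /* first copied member */
    unsigned int size;
    unsigned int flags;     /* last copied member */
    char name[16];          /* not copied */
};

/* The copied range must be non-empty for the subtraction below to make sense. */
static_assert(offsetof(struct entry, name) > offsetof(struct entry, mtime),
              "name must come after mtime");

static inline void copy_entry(struct entry *dst, const struct entry *src)
{
    unsigned int state = dst->flags & ENTRY_HASHED;

    /*
     * The size is a small compile-time constant, so an optimizing
     * compiler will typically expand this into a few moves instead
     * of emitting a call.
     */
    memcpy(&dst->mtime, &src->mtime,
           offsetof(struct entry, name) - offsetof(struct entry, mtime));

    /* Restore the destination's own hash state. */
    dst->flags = (dst->flags & ~ENTRY_HASHED) | state;
}
```

If you want to check what your compiler does with it, compile with optimization enabled and look at the generated assembly (for example with gcc -O2 -S) to see whether a call to memcpy remains.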