Search code examples
cstrlen

Is it faster to use strlen() than just check for null character in a for loop?


I've only been able to find two posts related to this topic. In one post, the code in question was written such that strlen() was called in every iteration of the loop, which many users pointed out would cause the loop to be much slower but didn't discuss the case in which you store the result of strlen() in a variable and then use it in the condition of the loop. In the other post, (Why is strlen() about 20 times faster than manually looping to check for null-terminated character?) a user pointed out that the strlen() function has been optimized over time, making it faster, but I'm still a bit confused as to how exactly storing strlen() in a variable and using it in a loop condition could make it so much faster and was hoping someone with more knowledge on how strlen() is optimized could explain it to me.

Edit: From @n. 1.8e9-where's-my-share m. 's comment, I see now that the other post is referring to using strlen() rather manually looping through the string to find its length. But, I would still like to know if there is a difference in speed between using strlen() in the loop condition or just checking for the null character.

Side Note: I raise the question because I was working on a rather straightforward problem (https://leetcode.com/problems/ransom-note/) in which you are tasked with writing a function that takes two strings, ransomNote and magazine, and determines whether or not the ransomNote can be constructed using only the characters from the magazine (each character in magazine may only be used once), and came up with two possible solutions:

bool canConstruct(char * ransomNote, char * magazine){
    int magCharCount[26] = {0};
    for (int i = 0, magLen = strlen(magazine); i < magLen; ++i) ++magCharCount[magazine[i] - 'a']; 
    for (int i = 0, rNotelen = strlen(ransomNote); i < rNotelen; ++i) if (--magCharCount[ransomNote[i] - 'a'] < 0) return false; 
    return true;
}

bool canConstruct(char * ransomNote, char * magazine){
    int magCharCount[26] = {0};
    for (int i = 0; *magazine; ++magazine) ++magCharCount[*magazine - 'a'];
    for (int i = 0; *ransomNote; ++ransomNote) if (--magCharCount[*ransomNote - 'a'] < 0) return false;
    return true;
}

In most of my CS classes, it seems my professors have favored the second implementation and other solutions to the problem do as well.


Solution

  • No matter if you're using strlen() or not, you need to loop through the strings anyway.

    Therefore the solution using strlen() simply adds a second going through the loop. It might be fast, but it's not needed.

    In the second solution, without strlen(), *magazine and *ransomNote are accessed twice each; for checking \0 and for the actual work. A compiler can optimise this to one access (which you needed anyway).

    To conclude: Using strlen() seems less elegant because it's not needed, and will probably be slower.

    EDIT: Some more background: strlen() is typically optimised for each platform using machine language. CPUs can have special instructions that make tasks like this very fast. You can never beat this hand-optimised machine code with a regular loop in C (unless a compiler could/would understand what you're doing and replace it with optimised code).

    Also a compiler, that of course 'knows' strlen() and similar library functions, could replace it with a chunk of instructions and don't even do a proper function call. (This is one type of optimisation that could also be done for small(er) user functions.)

    General advice: Always write complete for/if/... blocks with { and } because the one-liners you have at the moment are confusing to the eye. The less visual forms of the same source code structure, the easier it becomes to process for us people.

    bool canConstruct(char * ransomNote, char * magazine)
    {
        int magCharCount[26] = {0};
        for (int i = 0; *magazine; ++magazine)
        {
            ++magCharCount[*magazine - 'a'];
        }
    
        for (int i = 0; *ransomNote; ++ransomNote)
        {
            return (--magCharCount[*ransomNote - 'a'] < 0)
            {
                return false;
            }
        }
    
        return true;
    }