Search code examples
cperformancecs50strcmpstrlen

Will comparing string length before checking if string is the same yield me non-negligible speed increases for C?


Very new to programming in C so sorry if I have badly misunderstood something. I am currently doing the speller problem set from CS50 if anyone is familiar with this, and I given words from a text to check if they are spelled correctly by comparing them to a given dictionary. I have sorted this dictionary into a hash table with about 17,000 buckets which point to on average, a linked list with a length of around 100 nodes. There may be several hundred thousand words that need to be spell checked.

My question is this, will checking to see if the length of each word from my dictionary matches the length of the word required to be spellchecked, using strlen(), before then using strcmp() only if the lengths match, be faster than just checking if the strings match using strcmp().

I do potentially see that if there are lots of words that have the same length as your word you want to check, checking the length will disadvantage you but I am wondering if the speed increase, if there is one, by checking the length for words with a more uncommon length will make up for this.


Solution

  • strcmp is an O(n) operation - it iterates over both strings until one of them ends or a mismatching pair of characters is encountered, so at first glance comparing the lengths sounds like a good idea. However, strlen in C is also an O(n) operation - it takes a char* and iterates until it hits a \0 character. So just using strlen naively would in fact probably make your program slower.