I am beginning a personal project of converting an interpreter written in python into C. It is purely for learning purposes.
The first thing I have come across is trying to convert the following:
if __name__ == "__main__":
    if not argv[-1].endswith('.py'):
        ...
And I have done the following conversion thus far for the endswith method:
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

bool endswith(char* str, char* substr)
{
    // case1: one of the strings is empty
    if (!str || !substr) return false;

    char* start_of_substring = strstr(str, substr);

    // case2: not in substring
    if (!start_of_substring) return false;

    size_t length_of_string = strlen(str);
    size_t length_of_substring = strlen(substr);
    size_t index_of_match = start_of_substring - str;

    // case2: check if at end
    return (length_of_string == length_of_substring + index_of_match);
}

int main(int argc, char* argv[])
{
    char *last_arg = argv[argc-1];
    if (endswith(last_arg, ".py")) {
        // ...
    }
}
Does this look like it's covering all the cases in an endswith, or am I missing some edge cases? If so, how can this be improved and such? Finally, this isn't a criticism but more a genuine question in writing a C application: is it common that writing C will require 5-10x more code than doing the same thing in python (or is that more because I'm a beginner and don't know how to do things properly?)
And related: https://codereview.stackexchange.com/questions/54722/determine-if-one-string-occurs-at-the-end-of-another/54724
Does this look like it's covering all the cases in an endswith, or am I missing some edge cases?
You are missing at least the case where the substring appears twice or more, one of the appearances at the end.
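To make that case concrete, here is a minimal sketch that reproduces the logic from the question under a hypothetical name, endswith_first_match (the name and the const qualifiers are mine, not from the question). It only illustrates the failure and is not meant as a fix:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same check as in the question: locate substr with strstr(), then test
 * whether that match ends exactly at the end of str. */
static bool endswith_first_match(const char *str, const char *substr)
{
    if (!str || !substr) return false;

    /* strstr() returns only the FIRST occurrence of substr in str. */
    const char *first = strstr(str, substr);
    if (!first) return false;

    /* True only if the first occurrence happens to be the trailing one. */
    return strlen(str) == strlen(substr) + (size_t)(first - str);
}
```

With this function, "main.py" is reported as ending in ".py", but "setup.py.py" is not: strstr() stops at the first ".py" (index 5), and 5 + 3 does not equal the string length of 11.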
I wouldn't use strstr() for this. Instead, I would determine from the relative lengths of the two strings where in the main string to look, and then use strcmp(). Example:
bool endswith(char* str, char* substr) {
    if (!str || !substr) return false;

    size_t length_of_string = strlen(str);
    size_t length_of_substring = strlen(substr);

    if (length_of_substring > length_of_string) return false;

    return (strcmp(str + length_of_string - length_of_substring, substr) == 0);
}
With regard to that return statement: str + length_of_string - length_of_substring is equivalent to &str[length_of_string - length_of_substring] -- that is, a pointer to the first character of the trailing substring the same length as substr. The strcmp() function compares two C strings, returning an integer less than, equal to, or greater than zero depending on whether the first argument is lexicographically less than, equal to, or greater than the second. In particular, strcmp() returns 0 when its arguments are equal, and this function returns the result of exactly such a test.
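The pointer arithmetic can be looked at in isolation. The following is a small sketch using a hypothetical helper, tail_of (not part of the code above), which returns a pointer to the last n characters of a string, or NULL when the string is shorter than n:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pointer to the last n characters of str,
 * or NULL if str has fewer than n characters. str must not be NULL. */
static const char *tail_of(const char *str, size_t n)
{
    size_t len = strlen(str);

    if (n > len) return NULL;   /* would point before the start of str */

    return str + len - n;       /* same address as &str[len - n] */
}
```

For "main.py" and n = 3, the result points at index 4, so comparing it against ".py" with strcmp() yields 0. The length check matters: without it, the subtraction on a too-short string would produce a pointer before the start of the array, which is undefined behavior.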
is it common that writing C will require 5-10x more code than doing the same thing in python
Python is a higher-level language than C, so it is common for C code for a task to be lengthier than Python code for the same task. Also, the fact that C blocks are explicitly delimited makes C code a little longer than Python code. I'm not sure that 5-10x is a good estimate, though, and I think that in this case you're comparing apples to oranges. The code analogous to your Python code is simply
int main(int argc, char* argv[]) {
    if (endswith(argv[argc-1], ".py")) {
        // ...
    }
}
That C has no built-in endswith() function is a separate matter.