
Checking for a blank line in C - Regex


Goal:

  • Determine whether a string contains a blank line, whether it appears as '\n\n', '\r\n\r\n', '\r\n\n', or '\n\r\n'.

Issues:

  • I don't think my current regex for finding '\n\n' is right. This is my first time really using regex beyond simple uses of * when removing files on the command line.

  • Is it possible to check for all of these cases (listed above) in one regex? Or do I have to make 4 separate calls to compile_regex?

Code:

int checkForBlankLine(char *reader) {
    regex_t r;
    compile_regex(&r, "*\n\n");
    match_regex(&r, reader);

    return 0;
}

void compile_regex(regex_t *r, char *matchText) {
    int status;
    regcomp(r, matchText, 0); 
}

int match_regex(regex_t *r, char *reader) {
    regmatch_t match[1];
    int nomatch = regexec(r, reader, 1, match, 0);
    if (nomatch) {
        printf("No matches.\n");
    } else {
        printf("MATCH!\n");
    } 
    return 0;
}

Notes:

  • I only need to worry about finding one blank line, which is why my regmatch_t match[1] is only one item long.

  • reader is the char array containing the text I am checking for a blank line.

  • I have seen other examples and tried to base the code off of those examples, but I still seem to be missing something.

Thank you kindly for the help/advice.

If anything needs to be clarified please let me know.


Solution

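  • Check what the * in a regex means. It is not the "match anything" wildcard you know from the command line. In a regex, * means that the previous component can appear any number of times (including zero). Your pattern "*\n\n" has nothing before the *, so in basic syntax it is treated as a literal asterisk. The regex wildcard is the dot (.), which matches any single character, so .* matches anything, any number of times.

    So in your case you can use .*\n\n.*, which matches any text that contains \n\n. Since regexec already looks for a match anywhere in the string, the plain pattern \n\n works just as well.

    Finally, you can use | for "or" and ( ) to group, so something like .*(\n\n|\r\n\r\n).* matches anything that contains \n\n or \r\n\r\n. Note that | and ( ) only have these meanings in POSIX extended syntax, so pass REG_EXTENDED to regcomp instead of 0. To cover all four cases from your list in one pattern, make each \r optional: \r?\n\r?\n.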

    Hope that helps.
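
As a rough sketch of the points above, here is one way the check could look, assuming POSIX <regex.h> is available. The helper name has_blank_line and its return convention are made up for illustration and are not from the original post. It compiles a single extended pattern in which each \r is optional, so one call covers all four sequences from the question.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper (name and return values are assumptions):
 * returns 1 if text contains a blank line, 0 if it does not,
 * and -1 if the pattern fails to compile. */
int has_blank_line(const char *text) {
    regex_t re;
    int rc;

    assert(text != NULL);

    /* The C compiler turns "\r" and "\n" into real CR and LF characters,
     * so the regex engine sees them as ordinary literals.
     * REG_EXTENDED makes ? mean "optional"; REG_NOSUB skips recording
     * match offsets, since only a yes/no answer is needed. */
    rc = regcomp(&re, "\r?\n\r?\n", REG_EXTENDED | REG_NOSUB);
    if (rc != 0) {
        char msg[128];
        regerror(rc, &re, msg, sizeof msg);
        fprintf(stderr, "regcomp failed: %s\n", msg);
        return -1;
    }

    /* regexec searches the whole string, so no leading or trailing .* */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Calling has_blank_line(reader) from checkForBlankLine would let that function return a real result instead of always returning 0. Unlike the original code, this sketch also checks the return value of regcomp and frees the compiled pattern with regfree.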