Regex: matching up to the first occurrence of a character


I am looking for a pattern that matches everything until the first occurrence of a specific character, say a ";" - a semicolon.

I wrote this:

/^(.*);/

But it actually matches everything (including the semicolon) until the last occurrence of a semicolon.
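
The behaviour described above can be reproduced with a short sketch. It uses Python's `re` module purely for illustration, and the sample string `text` is a made-up input, not something from the original question. The `.*` is greedy: it consumes the whole line first and then backtracks only as far as needed to find a semicolon, which lands on the last one.

```python
import re

# Hypothetical sample input with several semicolons.
text = "first;second;third"

# Greedy .* grabs as much as possible, then backtracks to the LAST semicolon.
greedy = re.match(r"^(.*);", text)

whole_match = greedy.group(0)  # includes the final semicolon that was matched
captured = greedy.group(1)     # the part inside the parentheses

print(whole_match)  # first;second;
print(captured)     # first;second
```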


Solution

  • You need

    /^[^;]*/
    

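    [^;] is a negated character class: it matches any single character except a semicolon. The * repeats it zero or more times, so the match stops just before the first semicolon.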

    ^ (start of line anchor) is added to the beginning of the regex so only the first match on each line is captured. This may or may not be required, depending on whether possible subsequent matches are desired.

    To cite the perlre manpage:

    You can specify a character class, by enclosing a list of characters in [], which will match any character from the list. If the first character after the "[" is "^", the class matches any character not in the list.

    This should work in most regex dialects.

    Notes:

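    • The pattern matches everything up to, but excluding, the first semicolon. If there is no semicolon, it matches the whole line; note that in most dialects [^;] also matches newline characters, so on multi-line input the match continues past the line break unless you use [^;\n]* instead. If you want the semicolon included in the match, add a semicolon at the end of the pattern; the pattern then no longer matches input that contains no semicolon.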
    • This pattern only works for matching up to the first occurrence of a single character. If you want to match up to the first occurrence of a (multi-character) string, we've got you covered, too :-). See Matching up to first occurrence of two characters.
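
As a rough check of the points in this answer, here is a minimal sketch in Python's `re` module. The helper name `up_to_first_semicolon` and the sample strings are hypothetical and chosen only for illustration. The last example drops the anchor and uses `+` instead of `*` (a small deviation from the pattern above, to avoid empty matches) to show what "subsequent matches" look like.

```python
import re

# Up to, but excluding, the first semicolon.
FIRST_FIELD = re.compile(r"^[^;]*")
# Same, but with the semicolon included in the match.
WITH_SEMICOLON = re.compile(r"^[^;]*;")


def up_to_first_semicolon(line: str) -> str:
    """Return the text before the first ';' (hypothetical helper)."""
    # [^;]* can match the empty string, so this match never fails.
    return FIRST_FIELD.match(line).group(0)


print(up_to_first_semicolon("first;second;third"))  # first
print(up_to_first_semicolon("no semicolon here"))   # no semicolon here

# With the trailing ';' the semicolon is part of the match ...
print(WITH_SEMICOLON.match("first;second;third").group(0))  # first;
# ... and input without any semicolon does not match at all.
print(WITH_SEMICOLON.match("no semicolon here"))  # None

# Without the ^ anchor (and with + instead of *), every field is found.
print(re.findall(r"[^;]+", "first;second;third"))  # ['first', 'second', 'third']
```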