javascript regex palindrome

JavaScript Regex start of string clarification + str.replace()


I have a question about the start-of-string regex anchor ^. I was trying to sanitize a string to check whether it's a palindrome and found a solution that uses regex, but I couldn't wrap my head around the explanations I found for the start-of-string anchor:

To my understanding:

^ denotes that whatever expression that follows must match, starting from the beginning of the string.

Question:

Why, then, is there a difference between the three outputs below?

1)

let x = 'A man, a plan, a canal: Panama';
const re = new RegExp(/[^a-z]/, 'gi');
console.log(x.replace(re, '*'));

Output: A*man**a*plan**a*canal**Panama

VS.

2)

let x = 'A man, a plan, a canal: Panama';
const re = new RegExp(/[a-z]/, 'gi');
console.log(x.replace(re, '*'));

Output: * ***, * ****, * *****: ******

VS.

3)

let x = 'A man, a plan, a canal: Panama';
const re = new RegExp(/^[a-z]/, 'gi');
console.log(x.replace(re, '*'));

Output: * man, a plan, a canal: Panama

Please let me know if my explanation for each of the cases above is off:

1) Confused about this one. If it matches the character class [a-z], case insensitive and with global find, and the start-of-string anchor ^ denotes that it must match at the start of each string, should it not return all the words in the sentence? Each word is a match of case-insensitive [a-z] characters that occurs at the start of the string on each global find iteration.

i.e.:

  • finds "A" at the start
  • then on the next iteration, it should start search on the remaining string " man"
  • finds a space...and moves on to search "man"?
  • and so on and so forth...

Q: Why, then, does calling replace only target the non-alpha characters? Should I be treating ^ as inverting [a-z] in this case?

2) This seems pretty straightforward: it finds all occurrences of [a-z] and replaces them with the star. The inverse of case 1)?

3) Also confused about this one. I'm not sure how this is different from 1).

/^[a-z]/gi to me means: "starting at the start of the string being looked at, match all alpha characters, case insensitive. Repeat for global find".

Compared to:

1) /[^a-z]/gi to me means: "match all character class that starts each line with alpha character. case insensitive, repeat search for global find."

To me they mean exactly the same @_@. Please let me know how my understanding is off for the above cases.


Solution

    • Your first expression, [^a-z], on its own matches any character other than a lowercase letter from a to z, which is why replacing with * changes all the other characters, such as whitespace, commas and colons.

    • Your second expression, [a-z], on its own matches any lowercase letter from a to z, so the letters are replaced by * and the other characters mentioned above are left untouched.

    • Your third expression, ^[a-z], matches a lowercase letter only at the start of the string, so only the first letter is replaced by *.

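    For the first two expressions, the global flag g ensures that all characters that match the specified pattern, regardless of their position in the string, are replaced. For the third pattern, however, ^ anchors the pattern at the beginning of the string, so only the first letter is replaced. The g flag does not change this: the search does not restart on "the remaining string" after each match. Without the m (multiline) flag, ^ can only match at position 0 of the original string, so there is at most one match.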
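To make the positions concrete, here is a minimal sketch that lists where each of the three patterns matches in the string from the question. The helper name matchIndexes is made up for this illustration; it relies on String.prototype.matchAll, so it assumes an ES2020-capable runtime.

```javascript
const x = 'A man, a plan, a canal: Panama';

// Hypothetical helper (not from the question): returns the start index
// of every match of a global regex in the given string.
function matchIndexes(re, str) {
  return Array.from(str.matchAll(re), (m) => m.index);
}

const nonLetters = matchIndexes(/[^a-z]/gi, x); // every space, comma and colon
const letters = matchIndexes(/[a-z]/gi, x);     // every letter
const anchored = matchIndexes(/^[a-z]/gi, x);   // only a letter at index 0

console.log(nonLetters.length); // 9
console.log(letters.length);    // 21
console.log(anchored);          // [ 0 ]
```

Even with the g flag, the anchored pattern reports a single match at index 0, which is why only the leading "A" becomes "*".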

    As you mentioned, the i flag ensures case insensitivity, so that all three patterns operate on both lower and upper case alphabetic letters, from a to z and A to Z.
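As a quick illustration of what the i flag contributes, the following sketch runs the first pattern with and without it on the same input. The variable names withoutI and withI are only for this example.

```javascript
const x = 'A man, a plan, a canal: Panama';

// Without i, the uppercase "A" and "P" are not in a-z,
// so the negated set [^a-z] matches them too.
const withoutI = x.replace(/[^a-z]/g, '*');

// With i, uppercase letters count as letters and are left alone.
const withI = x.replace(/[^a-z]/gi, '*');

console.log(withoutI); // **man**a*plan**a*canal***anama
console.log(withI);    // A*man**a*plan**a*canal**Panama
```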

    The character ^ therefore has two meanings:

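    • Inside a character set, as the first character after the opening bracket, it negates the set: [^a-z] matches any character that is not in a-z.
    • Outside a character set, it asserts position at the start of the string: ^[a-z] matches a letter only if it is the very first character.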
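
The two meanings can be seen side by side in the sketch below, which also applies the negated set to the palindrome check the question was aiming at. The function name isPalindrome and its approach (strip non-letters, lowercase, compare with the reversed string) are assumptions for illustration, not necessarily the solution the asker found.

```javascript
const x = 'A man, a plan, a canal: Panama';

// Meaning 1: ^ inside the brackets negates the set.
// Removing every non-letter leaves only the letters.
const lettersOnly = x.replace(/[^a-z]/gi, '');
console.log(lettersOnly); // AmanaplanacanalPanama

// Meaning 2: ^ outside the brackets anchors to the start of the string.
console.log(/^[a-z]/i.test(x));      // true  (starts with "A")
console.log(/^[a-z]/i.test(' man')); // false (starts with a space)

// Hypothetical palindrome check built on the negated set.
function isPalindrome(str) {
  const cleaned = str.replace(/[^a-z]/gi, '').toLowerCase();
  return cleaned === [...cleaned].reverse().join('');
}

console.log(isPalindrome(x));       // true
console.log(isPalindrome('regex')); // false
```

For sanitizing the input, the negated set [^a-z] with the g and i flags is the form that fits, since it targets every non-letter anywhere in the string rather than one character at its start.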