Tags: c#, string, special-characters, stringcollection

How to remove unknown formatted strings from string array?


I'm trying to remove strings that contain unrecognized characters from a string collection. What is the best way to accomplish this?


Solution

  • To remove strings that contain any characters you don't recognize (e.g., if you accept only lowercase letters, "foo@bar" would be rejected):

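    1. Create a regular expression that defines the set of "recognized" characters as a character class, repeats it with +, and is anchored with ^ at the start and $ at the end. For example, if your "recognized" characters are uppercase A through Z, it would be ^[A-Z]+$ (without the +, the pattern would match only one-character strings)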
    2. Reject strings that don't match

    Note: In .NET, $ also matches just before a trailing newline, so a string such as "ABC\n" would still be accepted. If that matters, use \z in place of $ so the match must end at the very end of the string
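
    The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: the "recognized" set is uppercase A through Z (taken from the example pattern), the input is a StringCollection, and the names RecognizedFilter and RemoveUnrecognized are made up for this example. It uses \z rather than $ so that a trailing newline is not silently accepted.

```csharp
using System;
using System.Collections.Specialized;
using System.Text.RegularExpressions;

public static class RecognizedFilter
{
    // Assumption: "recognized" means uppercase A-Z only; change the class as needed.
    // \z anchors at the very end of the input, unlike $, which also
    // matches just before a trailing newline.
    private static readonly Regex OnlyRecognized = new Regex(@"^[A-Z]+\z");

    // Removes, in place, every string that is null, empty, or contains
    // at least one character outside the recognized set.
    public static void RemoveUnrecognized(StringCollection items)
    {
        // Walk backwards so that removing an item does not shift
        // the indexes that are still to be visited.
        for (int i = items.Count - 1; i >= 0; i--)
        {
            string s = items[i];
            if (s == null || !OnlyRecognized.IsMatch(s))
            {
                items.RemoveAt(i);
            }
        }
    }
}
```

    For instance, calling RecognizedFilter.RemoveUnrecognized on a collection holding "ABC", "foo@bar", "AB1" and "ABC\n" should leave only "ABC". Note that the + also causes empty strings to be removed; use * instead if empty strings should be kept.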

  • To remove strings that consist entirely of characters you don't recognize (e.g., if you accept lowercase letters, "foo@bar" would be kept because it contains at least one lowercase letter):

    1. Create a regular expression that defines the set of "recognized" characters as a character class, negates it with a ^ immediately inside the square brackets, repeats it with +, and is anchored with ^ at the start and $ at the end. For example, if your "recognized" characters are uppercase A through Z, it would be ^[^A-Z]+$
    2. Reject strings that DO match
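
    A rough sketch of this second variant, this time for a plain string array, might look like the following. The same assumptions apply: uppercase A through Z stands in for the "recognized" set, and the names UnrecognizedFilter and KeepPartlyRecognized are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class UnrecognizedFilter
{
    // Assumption: "recognized" means uppercase A-Z only; change the class as needed.
    // The negated class [^A-Z] matches any character that is NOT recognized,
    // so the whole pattern matches strings made up solely of such characters.
    private static readonly Regex NoneRecognized = new Regex(@"^[^A-Z]+\z");

    // Returns a new array without nulls and without strings that consist
    // entirely of unrecognized characters. Because of the +, an empty
    // string does not match the pattern and is therefore kept.
    public static string[] KeepPartlyRecognized(string[] items)
    {
        return items
            .Where(s => s != null && !NoneRecognized.IsMatch(s))
            .ToArray();
    }
}
```

    With the input { "ABC", "a@B", "foo@bar", "123" }, this should return { "ABC", "a@B" }: the last two strings contain no uppercase letter at all, so they match the pattern and are dropped.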