Regular expression - regex - matching


How do I write an expression that matches the invoice numbers below, including any spaces, other whitespace, or stray characters inside the numbers? Examples:

asdasd (S)FS-980/24/BS/02 , asdasd

ffff (S)FS-9525/23/BS/12ss, IP000003587

FSW-952/24/BSB/02

FS-2/24/F/02

FS-4444/23/F/11

That is, if the invoice number appears as FS -444 4/23/F/11 or FS-4444g/23g/F/1g1, it should still be matched and corrected to FS-4444/23/F/11.

I tried this: ...FS(W)-../.{2}/..(.)/\s.{2}

Demo: https://regex101.com/r/bq7lIM/3


Solution

  • You could update your pattern to get the matches first. Then do a replacement over those matches, and remove the characters that you don't want.

    To match the strings:

    (?:\([A-Z]+\))?FS[A-Z ]*-\d(?:[^\n,()-]*\w)?
    

    The pattern matches:

    • (?:\([A-Z]+\))? Optionally match ( followed by 1+ chars A-Z and then )
    • FS[A-Z ]*- Match FS followed by optional chars A-Z or spaces, and then -
    • \d Match a digit
    • (?:[^\n,()-]*\w)? Optionally match any char except what is listed in the character class, and end on a word character

    If you want to match a space without a newline in C# you can also use \p{Zs}
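
As a quick check of the matching step, here is a minimal sketch that runs the pattern over the sample lines from the question. It assumes C# 9+ top-level statements, and FindInvoice is a hypothetical helper name introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Matching pattern taken from the answer above.
const string PatternMatch = @"(?:\([A-Z]+\))?FS[A-Z ]*-\d(?:[^\n,()-]*\w)?";

// FindInvoice is a hypothetical helper: it returns the first match in a line,
// or an empty string when nothing matches.
string FindInvoice(string line) => Regex.Match(line, PatternMatch).Value;

string[] lines =
{
    "asdasd (S)FS-980/24/BS/02 , asdasd",
    "ffff (S)FS-9525/23/BS/12ss, IP000003587",
    "FSW-952/24/BSB/02",
    "FS -444 4/23/F/11",
};

foreach (var line in lines)
{
    Console.WriteLine(FindInvoice(line));
}
// Prints:
// (S)FS-980/24/BS/02
// (S)FS-9525/23/BS/12ss
// FSW-952/24/BSB/02
// FS -444 4/23/F/11
```

The match stops at the comma (or at a parenthesis, hyphen or newline) and backtracks to the last word character, which is why the trailing space before the comma in the first line is not part of the result.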

    To replace the unwanted characters over the matched parts:

    (?<=^FS-[a-zA-Z0-9]*)[a-z]|[\p{Zs}\ta-z](?=[a-zA-Z0-9\p{Zs}\t]*$)
    

    The first alternative matches lowercase letters in the first alphanumeric run directly after a leading FS- (the lookbehind only succeeds when the matched string starts with exactly FS-). The second alternative matches lowercase letters a-z, spaces or tabs at the end of the string, without crossing any characters other than those listed in the character class in the positive lookahead assertion.

    Both steps combined in C#:

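    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    // Step 1: find the invoice-like substrings.
    string patternMatch = @"(?:\([A-Z]+\))?FS[A-Z ]*-\d(?:[^\n,()-]*\w)?";
    // Step 2: strip the unwanted lowercase letters, spaces and tabs from each match.
    string patternReplace = @"(?<=^FS-[a-zA-Z0-9]*)[a-z]|[\p{Zs}\ta-z](?=[a-zA-Z0-9\p{Zs}\t]*$)";
    string input = @"Fakturasadas : (S)FSK-69/23/GFK/12hjkhjkddddd
    (S)FSK-69/23/GFK/ 1 2 
    (S)FSK-69/23/GFK/12";
    
    // Clean every match by replacing the unwanted characters with an empty string.
    var strings = Regex
        .Matches(input, patternMatch)
        .Cast<Match>()
        .Select(match => Regex.Replace(match.Value, patternReplace, ""));
    
    foreach (var s in strings) {
        Console.WriteLine(s);
    }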
    

    Output

    (S)FSK-69/23/GFK/12
    (S)FSK-69/23/GFK/12
    (S)FSK-69/23/GFK/12
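
To see how the two alternatives of the replacement pattern behave on their own, the following sketch applies it directly to a few already-matched strings. It assumes C# 9+ top-level statements, and Clean is a hypothetical helper name used only here.

```csharp
using System;
using System.Text.RegularExpressions;

// Replacement pattern taken from the answer above.
const string PatternReplace =
    @"(?<=^FS-[a-zA-Z0-9]*)[a-z]|[\p{Zs}\ta-z](?=[a-zA-Z0-9\p{Zs}\t]*$)";

// Clean is a hypothetical helper: it deletes every character the pattern matches.
string Clean(string matched) => Regex.Replace(matched, PatternReplace, "");

// Lowercase letter right after "FS-" (first alternative) and in the last segment (second alternative).
Console.WriteLine(Clean("FS-4444g/23/F/1g1"));      // FS-4444/23/F/11
// Spaces in the last segment (second alternative).
Console.WriteLine(Clean("(S)FSK-69/23/GFK/ 1 2"));  // (S)FSK-69/23/GFK/12
// A lowercase letter in a middle segment is matched by neither alternative.
Console.WriteLine(Clean("FS-4444g/23g/F/1g1"));     // FS-4444/23g/F/11
```

As the last call suggests, stray characters in the middle segments are left in place by this pattern, so inputs such as FS-4444g/23g/F/1g1 would need an additional cleanup step if those should be removed too.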