regex

A regular expression for finding numbers and spaces in a specific configuration


I need a regular expression that finds a sequence of 12 digits with no more than 2 spaces (spaces only, not newlines) between the digits. Here are some examples:

Example 1:3345624 90812
Example 2:908 7865123 23
Example 3:908765312334
Example 4:90878903556 1
Example 5:67555123452  5

An example of a pattern that does not work is /[0-9|\s]{12,14}/: it allows trailing spaces, single spaces, newlines, and alternating spaces and digits.
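To make the failure modes concrete, here is a minimal sketch that runs a close equivalent of that pattern through `grep -E`. The helper name `loose_match` is hypothetical, and `[[:space:]]` is used as a stand-in for `\s`, which POSIX bracket expressions do not support. Note also that `|` inside a character class is a literal pipe, not alternation, so the class accepts pipe characters as well.

```bash
# Sketch only: loose_match is a hypothetical helper, not part of the question.
# Assumption: [[:space:]] stands in for \s inside the bracket expression.
loose_match() {
    printf '%s\n' "$1" | grep -Eq '[0-9|[:space:]]{12,14}'
}

# Each of these should be rejected, yet the loose class accepts them.
loose_match '1 2 3 4 5 6 7'  && echo 'alternating digits and spaces: matched'
loose_match '908765312334  ' && echo 'trailing spaces: matched'
loose_match '||||||||||||'   && echo 'twelve pipe characters: matched'
```

Because `grep` works line by line, the newline case is not shown here.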


Solution

  • I would match [\d ]{12,14} and then check in the calling code that the result has exactly 12 digits and at most 2 spaces. Otherwise, brute-force generate a regex along these lines:

    \d{12}|              # 1 group of digits
    
    \d{1} {1,2}\d{11}|   # 2 groups, separated by 1 or 2 spaces
    ...
    \d{11} {1,2}\d{1}|
    
    \d{1} \d{1} \d{10}|  # 3 groups, separated by single spaces
    ...
    \d{10} \d{1} \d{1}
    

    Here's a suitable bash script to generate the regex:

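    sep='|'
    #sep=$'\n'    # uncomment to print one alternative per line instead
    # One group: 12 digits with no spaces.
    printf '\\d{12}%s' "$sep"
    # Two groups split by one or two spaces.
    for ((i = 1; i <= 11; i++))
    do
        printf '\\d{%d} {1,2}\\d{%d}%s' "$i" "$((12 - i))" "$sep"
    done
    # Three groups split by single spaces.
    for ((i = 1; i <= 10; i++))
    do
        for ((j = 1; i + j <= 11; j++))
        do
            # End the last alternative with a newline rather than a separator.
            [ "$i" -eq 10 ] && [ "$j" -eq 1 ] && sep=$'\n'
            printf '\\d{%d} \\d{%d} \\d{%d}%s' "$i" "$j" "$((12 - i - j))" "$sep"
        done
    done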
    

    An example run:

    \d{12}|\d{1} {1,2}\d{11}|\d{2} {1,2}\d{10}|\d{3} {1,2}\d{9}|\d{4} {1,2}\d{8}|\d{5} {1,2}\d{7}|\d{6} {1,2}\d{6}|\d{7} {1,2}\d{5}|\d{8} {1,2}\d{4}|\d{9} {1,2}\d{3}|\d{10} {1,2}\d{2}|\d{11} {1,2}\d{1}|\d{1} \d{1} \d{10}|\d{1} \d{2} \d{9}|\d{1} \d{3} \d{8}|\d{1} \d{4} \d{7}|\d{1} \d{5} \d{6}|\d{1} \d{6} \d{5}|\d{1} \d{7} \d{4}|\d{1} \d{8} \d{3}|\d{1} \d{9} \d{2}|\d{1} \d{10} \d{1}|\d{2} \d{1} \d{9}|\d{2} \d{2} \d{8}|\d{2} \d{3} \d{7}|\d{2} \d{4} \d{6}|\d{2} \d{5} \d{5}|\d{2} \d{6} \d{4}|\d{2} \d{7} \d{3}|\d{2} \d{8} \d{2}|\d{2} \d{9} \d{1}|\d{3} \d{1} \d{8}|\d{3} \d{2} \d{7}|\d{3} \d{3} \d{6}|\d{3} \d{4} \d{5}|\d{3} \d{5} \d{4}|\d{3} \d{6} \d{3}|\d{3} \d{7} \d{2}|\d{3} \d{8} \d{1}|\d{4} \d{1} \d{7}|\d{4} \d{2} \d{6}|\d{4} \d{3} \d{5}|\d{4} \d{4} \d{4}|\d{4} \d{5} \d{3}|\d{4} \d{6} \d{2}|\d{4} \d{7} \d{1}|\d{5} \d{1} \d{6}|\d{5} \d{2} \d{5}|\d{5} \d{3} \d{4}|\d{5} \d{4} \d{3}|\d{5} \d{5} \d{2}|\d{5} \d{6} \d{1}|\d{6} \d{1} \d{5}|\d{6} \d{2} \d{4}|\d{6} \d{3} \d{3}|\d{6} \d{4} \d{2}|\d{6} \d{5} \d{1}|\d{7} \d{1} \d{4}|\d{7} \d{2} \d{3}|\d{7} \d{3} \d{2}|\d{7} \d{4} \d{1}|\d{8} \d{1} \d{3}|\d{8} \d{2} \d{2}|\d{8} \d{3} \d{1}|\d{9} \d{1} \d{2}|\d{9} \d{2} \d{1}|\d{10} \d{1} \d{1}
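The first suggestion in the answer (match loosely, then verify in the calling code) could look roughly like the sketch below. It is not the answer author's code: the helper name `is_valid_candidate` is hypothetical, and it assumes the whole argument is the candidate string rather than a span found inside longer text. The loose pattern is tightened slightly so that the first and last characters must be digits, which rules out leading and trailing spaces.

```bash
# Sketch only: is_valid_candidate is a hypothetical helper name.
# Assumption: $1 is the whole candidate, with no surrounding text.
# Note: the variables below are global; wrap in a subshell if that matters.
is_valid_candidate() {
    candidate=$1
    # Loose shape: 12 to 14 characters of digits and spaces, digit at both ends.
    printf '%s\n' "$candidate" | grep -Eq '^[0-9][0-9 ]{10,12}[0-9]$' || return 1
    # Strict part: exactly 12 digits remain once the spaces are removed.
    # With a total length of 12 to 14 that leaves 0, 1 or 2 spaces.
    digits=$(printf '%s' "$candidate" | tr -d ' ')
    [ "${#digits}" -eq 12 ]
}

# Usage: the five examples from the question, then two inputs to reject.
for candidate in '3345624 90812' '908 7865123 23' '908765312334' \
                 '90878903556 1' '67555123452  5' \
                 '9 0 8 765312334' '908765312334 '; do
    if is_valid_candidate "$candidate"; then
        printf 'valid:   [%s]\n' "$candidate"
    else
        printf 'invalid: [%s]\n' "$candidate"
    fi
done
```

Under these assumptions the check accepts the same strings as the generated alternation: 12 digits with zero, one, or two spaces, all of them between digits. The five examples are reported as valid and the last two inputs (three spaces, and a trailing space) as invalid.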