Search code examples
regexregular-languagelexical-analysisdfanfa

Generating a regular expression from a language


I need to find a regular expression to define the language of all binary strings with at most one 1 in each substring of length four.

Accepted string: 0001000100

Rejected string: 100010100

My current attempt is: ((0*)(1){0,1}(0*)){4}. Though per various regex-testing sites that is incorrect, something I don't find surprising as I'm new to regular expressions.

I believe this language to be regular and as such I am asked to find a regular expression to define it, along with and NFA and DFA, the process for each I am okay with. However, I am struggling to come up with a regular expression that defines the language.


Solution

  • This regex seems to define the language you want, where no two 1's will be present within four characters,

    ^0*(?:10{3,})*1?0*$
    

    Explanation:

    • ^ - Start of string
    • 0* - One or more zeroes
    • (?:10{3,})* - This matches a literal 1 then at least three or more zeroes and whole of it zero or more times
    • 1?0* - Optionally follows a literal 1 and then zero or more zeroes
    • $ - End of string

    Demo