Struggling with Perl regex: Extracting scenario blocks from text


I'm struggling with a regex. First of all, here is my input:

 #
  #------------------------------------------- spaceholder ---------------------------------------------------------------------------
  #

  #@E2E-1 @id:1 
  Scenario: Login & Search: B2B_PKG_IN >> BE
    Given I am on the login page
    When I enter <username> and incorrect password multiple times
    Then I should be locked out of my account
    And I should see a lockout message



  #@E2E-2 @id:32 
  Scenario: Login & Search: B2B_PKG_IN >> NL
    Given I am on the login page
    When I enter <username> and incorrect password multiple times
    Then I should be locked out of my account
    And I should see a lockout message


  #
  #------------------------------------------- B2B_PKG_3PA ---------------------------------------------------------------------------


  #

  @E2E-3 @id:3
  Scenario: Login & Search: B2B_PKG_3PA >> BE
    Given I am on the login page
    When I enter <username> and incorrect password multiple times
    Then I should be locked out of my account
    And I should see a lockout message

And this is the output I'm trying to achieve:

 #@E2E-1 @id:1 
  Scenario: Login & Search: B2B_PKG_IN >> BE
    Given I am on the login page
    When I enter <username> and incorrect password multiple times
    Then I should be locked out of my account
    And I should see a lockout message



  #@E2E-2 @id:32 
  Scenario: Login & Search: B2B_PKG_IN >> NL
    Given I am on the login page
    When I enter <username> and incorrect password multiple times
    Then I should be locked out of my account
    And I should see a lockout message



  @E2E-3 @id:3
  Scenario: Login & Search: B2B_PKG_3PA >> BE
    Given I am on the login page
    When I enter <username> and incorrect password multiple times
    Then I should be locked out of my account
    And I should see a lockout message

I tried this pattern on a regex testing website and it worked well:

((@[^\n]+|#@[^\n]+)?\s*)?Scenario:[^\n]*\n(?:[^\n]*\n)*?\n 

Yet when I use it in Perl, it doesn't seem to work at all.

This is how I want to use it inside my code:

while (my $block = <$initial_fh>) {
    push @scenarios, $block if ($block =~ /$pattern/);
}

Please help


Solution

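  • With the default settings, your loop reads the file one line at a time, so a pattern that spans several lines can never match a single $block. Instead, you could set the input record separator, $/, to paragraph mode, so that each read returns a whole block of lines delimited by blank lines, and then keep only the paragraphs that begin with /^\s*#?@/. The example below reads from a string so that it is self-contained; with real data, put your actual filename in the open(...) call.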

    #!/usr/bin/perl
    use strict;
    use warnings;
    
    
    my $file =<<'EOF';
    #
      #------------------------------------------- spaceholder ---------------------------------------------------------------------------
      #
    
      #@E2E-1 @id:1 
      Scenario: Login & Search: B2B_PKG_IN >> BE
        Given I am on the login page
        When I enter <username> and incorrect password multiple times
        Then I should be locked out of my account
        And I should see a lockout message
    
    
    
      #@E2E-2 @id:32 
      Scenario: Login & Search: B2B_PKG_IN >> NL
        Given I am on the login page
        When I enter <username> and incorrect password multiple times
        Then I should be locked out of my account
        And I should see a lockout message
    
    
      #
      #------------------------------------------- B2B_PKG_3PA ---------------------------------------------------------------------------
    
    
      #
    
      @E2E-3 @id:3
      Scenario: Login & Search: B2B_PKG_3PA >> BE
        Given I am on the login page
        When I enter <username> and incorrect password multiple times
        Then I should be locked out of my account
        And I should see a lockout message
    EOF
    
    local $/ = ''; # enable 'paragraph' mode (blocks separated by 2 or more \n)
    my @scenarios;
    
    open my $fh, '<', \$file or die "Cannot open in-memory file: $!";
    while (<$fh>) {
        chomp;
        push (@scenarios, $_) if /^\s*#?@/;
    }
    
    print join "\n\n", @scenarios;
    

    Prints:

      #@E2E-1 @id:1
      Scenario: Login & Search: B2B_PKG_IN >> BE
        Given I am on the login page
        When I enter <username> and incorrect password multiple times
        Then I should be locked out of my account
        And I should see a lockout message
    
      #@E2E-2 @id:32
      Scenario: Login & Search: B2B_PKG_IN >> NL
        Given I am on the login page
        When I enter <username> and incorrect password multiple times
        Then I should be locked out of my account
        And I should see a lockout message
    
      @E2E-3 @id:3
      Scenario: Login & Search: B2B_PKG_3PA >> BE
        Given I am on the login page
        When I enter <username> and incorrect password multiple times
        Then I should be locked out of my account
        And I should see a lockout message
    

    (NOTE: your code, corrected:)

    #!/usr/bin/perl
    use strict;
    use warnings;
    
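    open my $initial_fh, '<', "initial_state.txt_formatted"
        or die "Cannot open initial_state.txt_formatted: $!";
    open my $current_fh, '<', "current_state.txt_formatted"
        or die "Cannot open current_state.txt_formatted: $!";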
    
    my @scenarios;
    
    {
        local $/ = ''; # enable 'paragraph' mode (restricted to this block)
        while (<$initial_fh>) {
            chomp;
            push (@scenarios, $_) if /^\s*\#?\@/;
        }
    }
    
    # chomp removed the trailing newlines, so join with a blank line between blocks
    print join("\n\n", @scenarios), "\n";
    
    close $initial_fh;
    close $current_fh;
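
If you would rather keep a single regex, as in your original attempt, another option is to read the whole file into one string and match against that, since a multi-line pattern needs multi-line input. The following is only a sketch under some assumptions: the shortened sample in $text stands in for your file, and the pattern is a simplified one of my own (not the one from the question). It expects the tag line to sit directly above the Scenario: line and treats the first blank line as the end of a block.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened sample input (an assumption for illustration);
# with real data, open your actual file instead of \$text.
my $text = <<'EOF';
  #
  #------------------------------------------- spaceholder ----------
  #

  #@E2E-1 @id:1
  Scenario: Login & Search: B2B_PKG_IN >> BE
    Given I am on the login page
    Then I should see a lockout message


  #
  #------------------------------------------- B2B_PKG_3PA ----------

  @E2E-3 @id:3
  Scenario: Login & Search: B2B_PKG_3PA >> BE
    Given I am on the login page
    Then I should see a lockout message
EOF

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $content = do { local $/; <$fh> };   # $/ undefined = slurp mode: read everything at once
close $fh;

# One block = a tag line (optionally commented out with '#'),
# the Scenario: line, then every following line that is not blank.
# /m lets ^ match at the start of each line; /g collects all blocks.
my @scenarios = $content =~ /^([ \t]*\#?\@[^\n]*\n[ \t]*Scenario:[^\n]*\n(?:[^\n]*\S[^\n]*(?:\n|\z))*)/mg;

s/\s+\z// for @scenarios;   # drop the trailing newline from each block

print join("\n\n", @scenarios), "\n";
```

Paragraph mode is the simpler choice when the blocks are reliably separated by blank lines. Slurping is more flexible about what counts as a block, but it holds the whole file in memory.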