Tags: regex, ansible, regex-greedy

How to read a block of text from a file in ansible


Hi, I am reading content from a file.

The file contains the following content.

======

interface And

public void and(int, int);

public void nand(int, int);

======

interface Or

 public void or(int, int);

 public void nor(int, int);

======

interface Xor

 public void xor(int, int);

 public void xnor(int, int);

======

interface Not

 public void not(int);

======

class BitWise extends And, Or, Xor, Not

// Implementation of the Interfaces goes here

======

I am trying to read only the interfaces.

I went through the question "How to read a particular part of file in ansible" and came up with this playbook:

---
 - name: Read the Interfaces
   hosts: 127.0.0.1
   connection: local
   vars:
      - file_path: " {{ playbook_dir }}/input_file.txt"
      - my_interfaces: []
   tasks:
         - name: Reading the interfaces
           set_fact:
                   my_interfaces: "{{ my_interfaces + [ item ] }}"
           with_lines: "cat {{file_path}}"
           when: item is search('^interface.+?=*')
         - name: Printing all the interfaces
           debug:
                  var: my_interfaces

The program's output is

ok: [127.0.0.1] => {
    "my_interfaces": [
        "interface And",
        "interface Or",
        "interface Xor",
        "interface Not"
    ]
}

But the desired output is

ok: [127.0.0.1] => {
    "my_interfaces": [
        "interface And \n public void and(int, int) \n public void nand(int, int)",
        "interface Or \n public void or(int, int) \n public void nor(int, int)",
        "interface Xor \n public void xor(int, int) \n public void xnor(int, int)",
        "interface Not \n public void not(int)",
    ]
}

I think I am doing something wrong in the regular expression, but I don't know how to correct it to get the desired output. Could anyone help me solve the problem? And is there another way to do the same task?


Solution

  • In your pattern ^interface.+?=* the part .+? is non-greedy, so the engine matches one or more characters but as few as possible. The part =* matches an equals sign zero or more times.

    Because =* can also match nothing, the lazy .+? stops after a single character, so the pattern matches only interface followed by a space. It never reaches the following lines, and by default the dot does not match a newline anyway.

    If you want to keep your pattern, you have to let the dot match a newline (use the inline modifier (?s) if it is supported). Use a capturing group or a positive lookahead so that the newline and the equals signs are not part of the match, while still requiring them to be there.

    (?s)^interface\b.+?(?=\r?\n=)
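
As a rough illustration of the difference between the two patterns, here is a minimal Python sketch (Ansible's regex filters and tests are built on Python's re module). The sample text is a shortened, hypothetical copy of the file from the question, and the variable names are made up for the example. Note that in Python the ^ anchor only matches at the start of each line when re.MULTILINE is set, so that flag is added here as an assumption about how the pattern would be run.

```python
import re

# Shortened stand-in for the file in the question (assumption: no blank lines).
text = """======
interface And
public void and(int, int);
public void nand(int, int);
======
interface Or
 public void or(int, int);
 public void nor(int, int);
======
class BitWise extends And, Or
// Implementation of the Interfaces goes here
======
"""

# Pattern from the question: the lazy .+? takes one character, =* takes none.
original = re.compile(r"^interface.+?=*", re.MULTILINE)
original_matches = original.findall(text)

# Lookahead pattern: (?s) lets the dot cross newlines, and the lookahead
# stops the match just before the newline that precedes the "=" separator.
blocks = re.findall(r"(?s)^interface\b.+?(?=\r?\n=)", text, flags=re.MULTILINE)

print(original_matches)
for block in blocks:
    print(repr(block))
```

The first result contains only the word interface plus one space per block, while the second contains each interface together with its method lines.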
    


    Another option is to match interface and the rest of that line, then keep matching the following lines as long as the next line does not start with an equals sign, which is checked with a negative lookahead (?!=).

    ^interface\b.*(?:\r?\n(?!=).*)*
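
The line-by-line variant can be tried out the same way. This is only a sketch under the same assumptions as before: Python's re module stands in for the regex engine Ansible uses, the sample text is a shortened hypothetical copy of the input file, and re.MULTILINE is added so that ^ matches at the start of every line.

```python
import re

# Shortened stand-in for the file in the question (assumption: no blank lines).
text = """======
interface And
public void and(int, int);
public void nand(int, int);
======
interface Not
 public void not(int);
======
class BitWise extends And, Or, Xor, Not
// Implementation of the Interfaces goes here
======
"""

# Match the "interface" line, then every following line whose first
# character is not "=" (negative lookahead right after the newline).
pattern = re.compile(r"^interface\b.*(?:\r?\n(?!=).*)*", re.MULTILINE)
interfaces = pattern.findall(text)

for entry in interfaces:
    print(repr(entry))
```

Either pattern has to see the whole file as one string. The with_lines loop in the question hands the test one line at a time, so no pattern can span several lines there. In a playbook, one possible route (not tested here) would be to read the file with the file lookup and pass the result through the regex_findall filter with multiline=True.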
    
