
Filtering a string in regex


I am trying to use regex for the first time because I have a very large text file (over 2000 lines) that contains some data I need to parse and correct.

Consider the following:

|C170|14|1712||5,00000|UN|3,05|0,00|0|000|5102|1100|3,05||
|C170|15|1713||5,00000|UN|2,50|0,00|0|000|5102|1100|2,50||
|C100|1|0|000009043|55|00|1|12769|52230734631526000191550010000127691220214465|03072023|
|C170|1|2005||1,00000|UN|395,00|0,00|0|000|5102|1100|395,00||
|C100|1|0|000009316|55|00|1|12770|52230734631526000191550010000127701782306100|03072023|
|C170|1|119||1,00000|UN|245,00|0,00|0|000|6102|1100|245,00||
|C170|2|5651||1,00000|PC|299,00|0,00|0|000|6102|1100|299,00||
|C170|3|211||2,00000|UN|10,00|0,00|0|000|6102|1100|10,00|12,00||

I need to filter every line that starts with |C170|. Sometimes that line contains |6102| and other times it contains |5102|.

|C170| is always the first field in the line and |6102| or |5102| is always the 11th field in the line.

I'm using VSCode to try to filter every line that starts with |C170| and contains |5102|. How can I do that in regex? Alternatively, does VSCode have another feature that would help me find the lines I need, so that I can enter a value between the last pipes? Those values are very important in this file. I have not found any examples that involve pipes in the search.

Sorry for my English and thanks in advance.

From reading some documentation, I found that groups and lookahead might be what I need, but I can't write a pattern that matches exactly the strings I need, which are |C170| and |5102| or |6102|.

This is what I got so far


Solution

  • Here is something that could work:

    ^\|C170([|][^|]*){9}\|(6102|5102)
    

    You can try it out on https://regex101.com/. Here is how it works:

    1) ^\|C170
    
    # First we match a literal |C170 at the start of the line (^ anchors the match to the line start).
    
    2) ([|][^|]*){9}
    # Then we match, 9 times, a group made of a | followed by a cell that may contain anything except another vertical bar. This skips fields 2 through 10.
    
    3) \|(6102|5102)
    # Finally we match the | that opens the 11th field, followed by either 6102 or 5102.
    

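    I'm not too familiar with the regex engine of VS Code. In Vim, | is a literal character by default and has to be written \| to mean alternation, but most modern regex engines work the other way around: a bare | means alternation, so a literal | must be escaped as \| or wrapped in a character class as [|]. To match only the 5102 lines, replace (6102|5102) with 5102.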
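If you want to check the pattern outside the editor, here is a minimal Python sketch that applies the same regex to a few of the sample lines from the question. The names `PATTERN`, `SAMPLE` and `filter_lines` are made up for this illustration, and the optional `code` argument is my own addition for narrowing the result to one of the two values. Treat it as a rough starting point, not a finished parser.

```python
import re

# Same pattern as above: |C170 at the start of the line, nine more
# pipe-delimited fields, then 6102 or 5102 as the 11th field.
PATTERN = re.compile(r"^\|C170([|][^|]*){9}\|(6102|5102)")

# A few lines copied from the question, used here as test input.
SAMPLE = """\
|C170|14|1712||5,00000|UN|3,05|0,00|0|000|5102|1100|3,05||
|C100|1|0|000009043|55|00|1|12769|52230734631526000191550010000127691220214465|03072023|
|C170|1|119||1,00000|UN|245,00|0,00|0|000|6102|1100|245,00||
"""


def filter_lines(text, code=None):
    """Return the lines matching PATTERN.

    If code is given (for example "5102"), keep only the lines whose
    11th field equals that value. This helper is hypothetical.
    """
    matches = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        # Group 2 holds the value captured by (6102|5102).
        if m and (code is None or m.group(2) == code):
            matches.append(line)
    return matches


for line in filter_lines(SAMPLE, "5102"):
    print(line)
```

Calling `filter_lines(SAMPLE)` without a second argument keeps both the 5102 and the 6102 lines, while the |C100| line is always skipped.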