regex notepad++

regex with a limit on new lines


I have the following text:

            gvkey |
             1017  |   .9610464    1.04128     0.92   0.356    -1.079825    3.001917
             1018  |  -.0599428   1.306879    -0.05   0.963    -2.621379    2.501493
             1021  |  -.0766854   .9906029    -0.08   0.938    -2.018231     1.86486
             1034  |  -2.678616   1.308118    -2.05   0.041     -5.24248   -.1147511
             1056  |   1.694514   .9563385     1.77   0.076    -.1798751    3.568903
             1065  |   1.106467   .9584568     1.15   0.248    -.7720734    2.985008
            10001  |   .7988226   1.019213     0.78   0.433    -1.198799    2.796444
            10010  |   .8203764   .9429188     0.87   0.384     -1.02771    2.668463
            10022  |   1.590896   .9615904     1.65   0.098    -.2937862    3.475579
            10030  |   .0067641   .9798901     0.01   0.994    -1.913785    1.927313
            10039  |   3.767551   .9168058     4.11   0.000     1.970645    5.564458
            10056  |    2.29646   .9789753     2.35   0.019     .3777042    4.215217
            10066  |   2.635614   .9398462     2.80   0.005     .7935496    4.477679
            10088  |   1.679799    .930843     1.80   0.071    -.1446195    3.504218
            10089  |  -16.62772   1.017178   -16.35   0.000    -18.62135   -14.63409
            10093  |   .3149815   .9174881     0.34   0.731    -1.483262    2.113225
            10097  |   2.976634   .9224759     3.23   0.001     1.168615    4.784654
            10107  |  -.1184532   .9405728    -0.13   0.900    -1.961942    1.725036
            10115  |   1.899066   .9165281     2.07   0.038      .102704    3.695428
           208068  |  -1.236473   .9326577    -1.33   0.185    -3.064448    .5915026
           209341  |   -.804362   .9516883    -0.85   0.398    -2.669637    1.060913
           213449  |  -1.248011   .9460252    -1.32   0.187    -3.102186    .6061647
           220546  |  -4.424031   .9431063    -4.69   0.000    -6.272485   -2.575576
           221821  |  -.9759739   .9240414    -1.06   0.291    -2.787062    .8351139
           222111  |  -3.733076   .9440901    -3.95   0.000    -5.583458   -1.882693
           223098  |  -2.892674   1.158793    -2.50   0.013    -5.163865   -.6214818
           242977  |  -1.324193   .9371738    -1.41   0.158    -3.161019    .5126345
                   |
             _cons |   .1156292    .915384     0.13   0.899    -1.678491    1.909749
------------------------------------------------------------------------------------


            gvkey |
             1017  |   .9610464    1.04128     0.92   0.356    -1.079825    3.001917
             1018  |  -.0599428   1.306879    -0.05   0.963    -2.621379    2.501493
             1021  |  -.0766854   .9906029    -0.08   0.938    -2.018231     1.86486
             1034  |  -2.678616   1.308118    -2.05   0.041     -5.24248   -.1147511
             1056  |   1.694514   .9563385     1.77   0.076    -.1798751    3.568903
             1065  |   1.106467   .9584568     1.15   0.248    -.7720734    2.985008
            10001  |   .7988226   1.019213     0.78   0.433    -1.198799    2.796444
            10010  |   .8203764   .9429188     0.87   0.384     -1.02771    2.668463
            10022  |   1.590896   .9615904     1.65   0.098    -.2937862    3.475579
            10030  |   .0067641   .9798901     0.01   0.994    -1.913785    1.927313
            10039  |   3.767551   .9168058     4.11   0.000     1.970645    5.564458
            10056  |    2.29646   .9789753     2.35   0.019     .3777042    4.215217
            10066  |   2.635614   .9398462     2.80   0.005     .7935496    4.477679
            10088  |   1.679799    .930843     1.80   0.071    -.1446195    3.504218
            10089  |  -16.62772   1.017178   -16.35   0.000    -18.62135   -14.63409
            10093  |   .3149815   .9174881     0.34   0.731    -1.483262    2.113225
            10097  |   2.976634   .9224759     3.23   0.001     1.168615    4.784654
            10107  |  -.1184532   .9405728    -0.13   0.900    -1.961942    1.725036
            10115  |   1.899066   .9165281     2.07   0.038      .102704    3.695428
           208068  |  -1.236473   .9326577    -1.33   0.185    -3.064448    .5915026
           209341  |   -.804362   .9516883    -0.85   0.398    -2.669637    1.060913
           213449  |  -1.248011   .9460252    -1.32   0.187    -3.102186    .6061647
           220546  |  -4.424031   .9431063    -4.69   0.000    -6.272485   -2.575576
           221821  |  -.9759739   .9240414    -1.06   0.291    -2.787062    .8351139
           222111  |  -3.733076   .9440901    -3.95   0.000    -5.583458   -1.882693
           223098  |  -2.892674   1.158793    -2.50   0.013    -5.163865   -.6214818
           242977  |  -1.324193   .9371738    -1.41   0.158    -3.161019    .5126345
                   |
             _cons |   .1156292    .915384     0.13   0.899    -1.678491    1.909749
------------------------------------------------------------------------------------

I am looking for a regular expression that removes all the gvkey rows, so that the output would look something like this:

                   |
             _cons |   .1156292    .915384     0.13   0.899    -1.678491    1.909749
------------------------------------------------------------------------------------

                   |
             _cons |   .1156292    .915384     0.13   0.899    -1.678491    1.909749
------------------------------------------------------------------------------------

I'm very new to regex, and have tried searching for the following and replacing it with nothing in Notepad++:

gvkey.*[\n].*_cons

The problem is that it matches everything from the first gvkey block through the second one as a single match, and removes all of it.

Is there a way to have the search term find each gvkey column once? (So in my example, it would find and replace the gvkey column twice in total)

Many thanks in advance.


Solution

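  • You may use this regex in Notepad++ with Search Mode set to "Regular expression". Because it uses [\s\S] rather than ., it should not depend on the ". matches newline" option, and ^ matches at the start of every line by default: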

    ^\h*gvkey[\s\S]*?\R(?=\h+\|)
    

    And replace with an empty string.


    RegEx Details:

    • ^: Line start
    • \h*: Matches 0 or more horizontal whitespace characters
    • gvkey: Matches gvkey string
    • [\s\S]*?: Matches 0 or more of any character including newlines (lazy)
    • \R: Matches any single line-break sequence (such as \n or \r\n)
    • (?=\h+\|): Positive lookahead to assert that we have 1 or more horizontal whitespace characters followed by a pipe character ahead of us, i.e. the blank "|" line just above _cons
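
To check the pattern outside Notepad++, here is a minimal Python sketch. It rests on some assumptions: Python's re module does not support \h or \R, so they are spelled out as [ \t] and (?:\r\n|\n|\r); the names PATTERN, GREEDY, BLOCK and strip_gvkey_rows are made up for this illustration; and the sample is a shortened stand-in for the output in the question. The greedy pattern from the question is run with re.DOTALL, which assumes the ". matches newline" option was switched on when it was tried.

```python
import re

# Approximate translation of ^\h*gvkey[\s\S]*?\R(?=\h+\|) for Python's re:
#   \h -> [ \t]            (horizontal whitespace, approximated as space/tab)
#   \R -> (?:\r\n|\n|\r)   (common line-break sequences)
PATTERN = re.compile(
    r"^[ \t]*gvkey[\s\S]*?(?:\r\n|\n|\r)(?=[ \t]+\|)",
    re.MULTILINE,  # make ^ match at the start of every line
)

# The attempt from the question; DOTALL is an assumption (see lead-in).
GREEDY = re.compile(r"gvkey.*[\n].*_cons", re.DOTALL)

# Shortened stand-in for one block of the regression output.
BLOCK = (
    "            gvkey |\n"
    "             1017  |   .9610464    1.04128\n"
    "           242977  |  -1.324193   .9371738\n"
    "                   |\n"
    "             _cons |   .1156292    .915384\n"
    "--------------------\n"
)


def strip_gvkey_rows(text: str) -> str:
    """Remove each gvkey header and its data rows, keeping the '|' and _cons lines."""
    return PATTERN.sub("", text)


sample = BLOCK + "\n" + BLOCK

lazy_matches = len(PATTERN.findall(sample))    # one match per gvkey block
greedy_matches = len(GREEDY.findall(sample))   # a single match spanning both blocks
cleaned = strip_gvkey_rows(sample)

print(lazy_matches, greedy_matches)
print(cleaned)
```

The lazy quantifier together with the lookahead makes each match stop at the first line that consists of whitespace followed by a pipe, whereas the greedy .* runs on to the last _cons in the text.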