I am preforming the following ack-grep
inside of a bash
script and, while it mostly works .. I am getting inconsistent results. The line I am using is:
ack-grep '(?<=imageserver).*(?=png)'
This is Perl type Regex (supported by both ack_grep
and plain grep
) -- I am searching for everything between imageserver
and png
. While it mostly works -- I get inconsistent results IE:
Why is it that you'll see it matched the first umpteen lines, then it matches something that it (in theory) should have two or three matches WITHIN. It's obvious the last "block" should have been matched after the first iteration of png
however it skipped it multiple times and finally settled --
So, the first couple returning are my desired result -- And the last highlighted block is the "bad" result. How do I get consistent results here? I'll paste some text that returns this result for copy/paste posterity (verifiable example). If you copy and paste the following into a text file, you should get the same results I am getting.
Is this a syntax error, a misunderstanding, or a bug? Hate when things should work but don't ... The banes of development.
.mobile_menu_icon { display:block;cursor:pointer;width:100%;height:40px;margin:0 auto;background-image:url('/imageserver/default_images/four_lines_40x19.png');
.button-error { display:inline-block;width:14px;height:13px;background:url('/imageserver/GlobalMedia/Icons/deleteIcon.png') no-repeat;background-size:16px 16px;background-position:center;opacity:1;transition:all ease-in-out 150ms; }
.button-finished { display:inline-block;width:14px;height:13px;background:url('/imageserver/GlobalMedia/Icons/checkmark.png') no-repeat;background-size:16px 16px;background-position:center; }
background:url('/imageserver/confirm/ie.png');
background:url('/imageserver/confirm/buttons.png') no-repeat;
background:url('/imageserver/confirm/buttons.png') no-repeat;
.capItem { width:30px;height:30px;background:url('/imageserver/styles/captchaShapesWhite.png');background-repeat:no-repeat;background-size:auto 35px;display:inline-block;margin:0 3px; }
.form_button_error { display:inline-block;width:14px;height:13px;background:url('/imageserver/GlobalMedia/Icons/deleteIcon.png') no-repeat;background-size:13px 13px;background-position:center;opacity:1;transition:all ease-in-out 150ms; }
.form_button_finished { display:inline-block;width:14px;height:13px;background:url('/imageserver/GlobalMedia/Icons/checkmark.png') no-repeat;background-size:16px 16px;background-position:center; }
#mega_slider_wrapper,.shadow{width:100%;position:relative}.nav-arrows,.nav-dots,.shadow{display:none}.nav-arrows a,.nav-dots span,.nav-options span{cursor:pointer;border-radius:50%}#mega_slider_wrapper{background:0 0;overflow:hidden}#mega_slider_wrapper img,.mega_slide_image{width:100%}.shadow{height:168px;margin-top:-110px;background:url(/imageserver/AdminMedia/moduleImages/megaslider/shadow.png) bottom center no-repeat;background-size:100% 100%;z-index:-1}.sb-description h3{text-shadow:1px 1px 1px rgba(0,0,0,.3)}.sb-description h3 a{color:#4a3c27;text-shadow:0 1px 1px rgba(255,255,255,.5)}.nav-arrows a{width:42px;height:42px;background:url(/imageserver/AdminMedia/moduleImages/megaslider/nav.png) top left no-repeat #cbbfae;position:absolute;top:50%;left:2px;text-indent:-9000px;opacity:.9;box-shadow:0 1px 1px rgba(255,255,255,.8)}.nav-arrows a:first-child{left:auto;right:2px;background-position:top right}.nav-arrows a:hover{opacity:1}.nav-dots{text-align:center;position:absolute;height:30px;width:100%;left:0}.nav-dots span{display:inline-block;width:16px;height:16px;margin:3px;box-shadow:0 1px 1px rgba(255,255,255,.6),inset 0 1px 1px rgba(0,0,0,.1)}.nav-dots span.nav-dot-current{box-shadow:0 1px 1px rgba(255,255,255,.6),inset 0 1px 1px rgba(0,0,0,.1),inset 0 0 0 3px #cbbfae,inset 0 0 0 8px #fff}.nav-options{width:70px;height:30px;position:absolute;right:70px;bottom:0;display:none}.nav-options span{width:30px;height:30px;background:url(/imageserver/AdminMedia/moduleImages/megaslider/options.png) top left no-repeat #cbbfae;text-indent:-9000px;opacity:.7;display:inline-block}.sb-slider,.sb-slider li>img{width:100%}.nav-options span:first-child{background-position:-30px 0;margin-right:3px}.nav-options span:hover{opacity:1}.sb-slider{margin:0 auto;position:relative;overflow:hidden;list-style-type:none;padding:0;max-width:2000px!important}.sb-slider li{margin:0;padding:0;display:none}.sb-slider li>a{outline:0}.sb-slider img{max-width:100%;display:block}.sb-description{width:100%;max-width:1124px;margin:0 auto;padding:30px 10px 10px;height:900px;top:0;left:10px;right:10px;z-index:10;position:absolute;color:#fff;-webkit-transition:all .2s;-moz-transition:all .2s;-o-transition:all .2s;-ms-transition:all .2s;transition:all .2s;background:rgba(40,40,40,.2);text-shadow:#000 0 0 7px}.sb-description h2,.sb-description h3{line-height:1.1;margin:4px 0;padding:4px 0}.nav-dots span,.slider_button{transition:all ease-in-out 180ms}.sb-description h2{font-size:42px}.sb-description h3{font-size:22px}.sb-perspective{position:relative}.sb-perspective>div{position:absolute;-webkit-transform-style:preserve-3d;-moz-transform-style:preserve-3d;-o-transform-style:preserve-3d;-ms-transform-style:preserve-3d;transform-style:preserve-3d;-webkit-backface-visibility:hidden;-moz-backface-visibility:hidden;-o-backface-visibility:hidden;-ms-backface-visibility:hidden;backface-visibility:hidden}.sb-side{margin:0;display:block;position:absolute;-moz-backface-visibility:hidden;-webkit-transform-style:preserve-3d;-moz-transform-style:preserve-3d;-o-transform-style:preserve-3d;-ms-transform-style:preserve-3d;transform-style:preserve-3d}.nav-arrows,.nav-arrows a,.nav-dots{z-index:11!important}.nav-arrows a{margin-top:-60px!important;background-color:rgba(0,0,0,.8);margin-left:10px;margin-right:10px}.nav-dots{bottom:0!important;background:rgba(0,0,0,.8);padding:8px}.nav-dots span{background:#777}.nav-dots span:hover{background:#aaa}.slider_button{position:relative;display:inline-block;line-height:1;width:auto;padding:10px 16px;background:#1F1E1E;border-radius:5px;color:#fff;text-decoration:none;margin:12px 0 0;font-size:16px}.slider_button:hover{background:#333}@media (max-width:1170px){.sb-description{width:85%!important;min-width:auto!important;margin:0 60px;box-sizing:border-box}}@media (max-width:850px){.sb-description{width:80%!important;min-width:auto!important;margin:0 60px;box-sizing:border-box}.sb-description h2,.sb-description h3{line-height:1.1;margin:4px 0;padding:4px 0}.sb-description h2{font-size:28px}.sb-description h3{font-size:14px}}@media (max-width:650px){.sb-description{width:75%!important;min-width:auto!important;margin:0 60px;box-sizing:border-box}}@media (max-width:600px){.hide_in_mobile{display:none}}
<div class="logo"><a href="/"><img src="/imageserver/UserMedia/zakattack/Logo.png" /></a></div>
<div class="mobile_logo"><a href="/"><img src="/imageserver/UserMedia/zakattack/mobile.png" alt="Logo" /></a></div>
<div class="powered_by">Powered by <a href="http://yourwebpro.com" target="new"><img src="/imageserver/UserMedia/ywpgallery/ywpLogo.png" style="max-height:25px;vertical-align:middle;" alt="Your Web Pro | Roofing and Contractor Websites" title="On-Line Showrooms for Roofers & Contractors"></a></div>
The problem with your version of the regex is that it is greedy, which means .*
consumes all characters until the end of the line and performs a backtracking then. That's why in your broken part (the long yellow line) the expression matches everything between 'imageserver' and the last 'png'.
A slight modification can make your regex non-greedy; just add a ?
after the quantifier. Then the new regex will also search for a preceding 'imageserver' but it directly checks for each following character if a 'png' sequence is following. So, it only consumes and matches the text until the first 'png' sequence.
The example with the new regex (?<=imageserver).*?(?=png)
and your text can be found here: https://regex101.com/r/FvSwg4/1
It is also a good idea to have a look at the regex-debugger view for the example. Then one can better understand the single steps that have to be performed for the matching.