I've simplified this question because it was getting quite long. Basically, I want to get a substring of $subject that goes from the start of $subject up to the current match the callback function is running on. Here is an example of some input (JavaScript):
$subject = "var myUrl = '<a href=\"http://google.co.uk\">click me</a>';";
I'm using a URL-matching regex in my preg_replace_callback, so it will match http://google.co.uk. I want to get a substring of $subject up to the start of that match: var myUrl = '<a href=" should be contained in the substring. How can I do this?
$subject = "var myUrl = '<a href=\"http://google.co.uk\">click me</a>';";
preg_replace_callback("MY URL MATCHING PATTERN", function($matches) use ($subject) {
    // Get length of $subject up to the current match
    $length = ?; // this is the bit I can't work out
    // Get substring
    $before = substr($subject, 0, $length);
    // Work out whether or not to escape the single quotes
    $quotes = array();
    preg_match_all("/'/", $before, $quotes);
    $quotecount = count($quotes[0]); // $quotes[0] holds the full-pattern matches
    $escape = ($quotecount % 2 == 0 ? "" : "\\");
    // Return the binary value
    return "javascript:parent.query(".$escape."'".textToBinary($matches[0]).$escape."')";
}, $subject);
- First, I recommend using PHP's DOM classes, such as DOMDocument or DOMXPath.
- Second, it is better to revise your regex (\S, which matches any non-whitespace character, is the culprit).
- Third, a quick solution to your problem is:
return "javascript:open('".str_replace("'", "\\'", $matches[0])."')";
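The first suggestion can be sketched roughly as follows, assuming the HTML fragment is available on its own rather than embedded in a JavaScript string. The function name rewriteLinks is made up for illustration, and textToBinary here is only a hypothetical stand-in for your own helper, which isn't shown in the question.

```php
<?php
// Hypothetical stand-in for the asker's textToBinary() helper (not shown in the question).
function textToBinary(string $text): string
{
    $bits = '';
    foreach (str_split($text) as $char) {
        // 8-bit binary representation of each byte
        $bits .= str_pad(decbin(ord($char)), 8, '0', STR_PAD_LEFT);
    }
    return $bits;
}

// Illustrative helper: rewrite every http(s) link in an HTML fragment via the DOM.
function rewriteLinks(string $html): string
{
    $doc = new DOMDocument();
    // The flags stop libxml from wrapping the fragment in <html>/<body> and a doctype;
    // @ suppresses warnings about imperfect markup.
    @$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    foreach ($doc->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        if (preg_match('@^https?://@', $href)) {
            $link->setAttribute('href', "javascript:parent.query('" . textToBinary($href) . "')");
        }
    }
    return $doc->saveHTML();
}

echo rewriteLinks('<a href="http://google.co.uk">click me</a>');
```

Because the attribute value is set through the DOM, no manual quote counting is needed for the HTML itself; escaping for a surrounding JavaScript string would still have to be handled separately.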
Updated:
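$subject = "var myUrl = '<a href=\"http://google.co.uk\">click me</a>';";
$pattern = "@(https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?)@";
$result = preg_replace_callback($pattern, function($matches) use ($subject) {
    // Position of the first occurrence of the matched text in $subject
    $pos = strpos($subject, $matches[0]);
    // Everything before the match
    $str = substr($subject, 0, $pos);
    // Escape the quotes only if a single quote appears before the match;
    // === is needed because strpos() returns 0 for a quote at position 0
    $escape = (strpos($str, "'") === false) ? "'" : "\\'";
    return "javascript:parent.query({$escape}".textToBinary($matches[0])."{$escape})";
}, $subject);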
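If the same URL can occur more than once in $subject, looking the match up with strpos() always finds the first occurrence, not the one the callback is currently handling. Here is a sketch of an alternative, assuming PHP 7.4 or later, where preg_replace_callback accepts a flags argument: with PREG_OFFSET_CAPTURE, each entry of $matches becomes a [text, byte offset] pair, so the offset of the current match is available directly. It also counts the quotes before the match (odd means "inside a string"), as in the question. The function name rewriteUrls is made up for illustration, and textToBinary is a hypothetical stand-in for your own helper.

```php
<?php
// Hypothetical stand-in for the asker's textToBinary() helper (not shown in the question).
function textToBinary(string $text): string
{
    $bits = '';
    foreach (str_split($text) as $char) {
        // 8-bit binary representation of each byte
        $bits .= str_pad(decbin(ord($char)), 8, '0', STR_PAD_LEFT);
    }
    return $bits;
}

// Illustrative helper: replace each URL, escaping quotes when the match sits inside a JS string.
function rewriteUrls(string $subject): string
{
    // Same pattern as in the answer above
    $pattern = '@(https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?)@';

    return preg_replace_callback($pattern, function (array $matches) use ($subject) {
        // With PREG_OFFSET_CAPTURE, $matches[0] is [matched text, byte offset in $subject]
        [$url, $offset] = $matches[0];
        // Substring from the start of $subject up to the current match
        $before = substr($subject, 0, $offset);
        // An odd number of single quotes before the match means we are inside a quoted string
        $escape = (substr_count($before, "'") % 2 === 0) ? '' : '\\';
        return "javascript:parent.query({$escape}'" . textToBinary($url) . "{$escape}')";
    }, $subject, -1, $count, PREG_OFFSET_CAPTURE);
}

$subject = "var myUrl = '<a href=\"http://google.co.uk\">click me</a>';";
echo rewriteUrls($subject), "\n";
```

The offsets refer to positions in the original $subject, not in the partially replaced result, so substr($subject, 0, $offset) gives exactly the text that precedes the current match.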