I'm trying to match variables in PHP code that are missing their leading dollar sign, so that I can repair the code.
Sample input:
foo = "bar"
$bar = foo
foo()
$foo = bar;
bar = foo() {}
$foo = array();
should match:
foo = "bar" -> match foo not bar
$bar = foo -> match foo not bar
foo() -> no match
$foo = bar; -> match bar not foo
bar = foo() {} -> match bar not foo
$foo = array(); -> no match
It should just match all words ([A-Za-z0-9_]) that are not quoted, do not begin with a $, and do not end with a (.
edit:
A little example to explain better what I'm trying to achieve:
<?php
/**
 * little script to explain better what im trying to achieve
 */
echo "\nSay Hi :P\n=========\n\n";
$reply = null;
while ("exit" != $reply) {
    // command
    echo "> ";
    // get input
    $reply = trim( fgets(STDIN) );
    // last char
    $last = substr( $reply, -1 );
    // add semicolon if missing
    if ( $last != ";" && $last != "}" ) {
        $reply .= ";";
    }
    /*
     * awesome regex that should add $ chars to words
     * to make using this more comfortable!
     */
    // output buffer
    ob_start();
    eval( $reply );
    echo $out = ob_get_clean();
    // add break
    if ( strlen( $out ) > 0 ) {
        echo "\n";
    }
}
echo "\n\nBye Bye! :D\n\n";
?>
You will have a really hard time trying to parse a programming language with a regex. When you start getting more complicated expressions, regex will become inadequate.
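One way to sidestep that is to let PHP's own tokenizer split the input and then decide token by token. The following is only a minimal sketch, assuming the tokenizer extension is available (it is enabled by default); the helper name `add_dollar_signs_tokenized` and the skip list are hypothetical and not something from your script.

```php
<?php
// Hypothetical helper: prefix bare identifiers with "$" using PHP's tokenizer.
function add_dollar_signs_tokenized(string $code): string
{
    // Assumed, deliberately incomplete list of constants to leave alone.
    $skip = ['null', 'true', 'false', 'stdin'];

    // token_get_all() only lexes PHP code after an open tag, so prepend one.
    $tokens = token_get_all('<?php ' . $code);
    array_shift($tokens); // drop the open tag we just added

    $out = '';
    foreach ($tokens as $i => $token) {
        // Single-character tokens such as "=", "(" or ";" are plain strings.
        if (!is_array($token)) {
            $out .= $token;
            continue;
        }

        [$id, $text] = $token;

        if ($id === T_STRING && !in_array(strtolower($text), $skip, true)) {
            // Find the next token that is not whitespace.
            $j = $i + 1;
            while (isset($tokens[$j]) && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                $j++;
            }
            // A bare word followed by "(" is treated as a function call.
            if (($tokens[$j] ?? null) !== '(') {
                $text = '$' . $text;
            }
        }

        $out .= $text;
    }

    return $out;
}

echo add_dollar_signs_tokenized('bar = foo() {}'), "\n";
```

Quoted strings, numbers and language constructs such as `array` or `echo` arrive as their own token types, so they are left alone without extra rules. Class names and constants outside the skip list would still be prefixed, so treat this as a starting point only.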
Nonetheless, here is a regex that matches all your examples:
(?<![^\s])\w+(?![^;\s])
You may be able to expand that to suit your needs.
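To plug it into your script, something along these lines might work. It is a sketch rather than a finished solution: the function name `add_dollar_signs` and the skip list are assumptions of mine, added because the pattern on its own would also prefix numbers and keywords such as `echo`.

```php
<?php
// Hypothetical helper: prefix bare words with "$" using the regex above.
function add_dollar_signs(string $code): string
{
    // Assumed, deliberately incomplete list of words to leave alone.
    $skip = ['echo', 'print', 'return', 'null', 'true', 'false'];

    return preg_replace_callback(
        // (?<![^\s])  preceded by whitespace or the start of the input
        // \w+         the candidate word
        // (?![^;\s])  followed by whitespace, ";" or the end of the input
        '/(?<![^\s])\w+(?![^;\s])/',
        function (array $m) use ($skip): string {
            $word = $m[0];
            // Leave plain integers and the listed keywords untouched.
            if (ctype_digit($word) || in_array(strtolower($word), $skip, true)) {
                return $word;
            }
            return '$' . $word;
        },
        $code
    );
}

// The sample lines from the question.
$samples = [
    'foo = "bar"',
    '$bar = foo',
    'foo()',
    '$foo = bar;',
    'bar = foo() {}',
    '$foo = array();',
];

foreach ($samples as $line) {
    echo add_dollar_signs($line), "\n";
}
```

In your loop you would call `$reply = add_dollar_signs( $reply );` just before the `eval`. Be aware that words inside a quoted string that contains spaces (for example the `b` in `"a b c"`) still match, which is one of the cases where the regex approach breaks down.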