I have the following RegEx (<th>Password<\/th>\s*<td>)\w*(<\/td>)
which matches <th>Password</th><td>root</td>
in this HTML:
<tr>
<th>Password</th>
<td>root</td>
</tr>
However, this Terminal command fails to find a match:
perl -pi -w -e 's/(<th>Password<\/th>\s*<td>)\w*(<\/td>)/$1NEWPASSWORD$2/g' file.html
It appears to have something to do with the whitespace between the </th> and <td>, but <\/th>\s*<td> works in the RegEx, so why not in Perl?
I have tried replacing \s* with \n*, \r*, \t* and various combinations thereof, but still no match.
Any help would be greatly appreciated.
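With -p, Perl reads the file one line at a time and applies the substitution to each line separately. Your pattern needs </th> and <td> in the same string, but in your file they sit on different lines, so it can never match.
You can read the entire file in at once by setting the -0 option to 777 (slurp mode), like this: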
perl -w -0777 -pi -e 's/(<th>Password<\/th>\s*<td>)\w*(<\/td>)/$1NEWPASSWORD$2/g' file.html
Note that it is far preferable to use a proper HTML parser, such as HTML::TreeBuilder::XPath, to process data like this, as it is very difficult to account for all possible representations of a given HTML construct using regular expressions.