I've spent four hours trying to work out why I can't get all of the data between two tags. Three of the four items are returned, but the fourth (the "35 drops" li) isn't.
$string = '<ul>
<li>
<strong>½ cup</strong> white wine </li>
<li>
<strong>½ cup</strong> extra virgin olive oil</li>
<li>
<strong>35 drops</strong> of water
</li>
<li>
<strong>½ cup</strong> golden flaky raspberries</li>
</ul>
';
preg_match_all("/<li>\n<strong>(.*?)<\/strong>(.*?)<\/li>/", $string, $matched);
This is the result that I'm getting:
0 => array(3
0 => <li>
<strong>½ cup</strong> white wine vinegar</li>
1 => <li>
<strong>½ cup</strong> extra virgin olive oil</li>
2 => <li>
<strong>½ cup</strong> golden raspberries</li>
)
1 => array(3
0 => ½ cup
1 => ½ cup
2 => ½ cup
)
2 => array(3
0 => white wine vinegar
1 => extra virgin olive oil
2 => golden raspberries
)
)
All I'm trying to retrieve is everything inside the strong tags and everything after them within each li, as in arrays 1 and 2.
The closing </li> tag for the 35 drops item is on its own line, and your regex doesn't allow for that newline (by default, . does not match a newline):
<li>\n<strong>(.*?)<\/strong>(.*?)\n?<\/li>
                                  ^^^
Slightly better would be to use a negated character class, [^<], which also matches newlines where they occur:
<li>\n<strong>([^<]*)<\/strong>([^<]*)<\/li>
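To make the difference concrete, here is a minimal sketch that runs the negated-class pattern against the list from the question. It assumes the string uses plain LF line endings. The $amounts and $names variables are illustrative names, not part of the original code, and trim() is added because [^<]* also captures the surrounding whitespace and newlines.

```php
<?php
$string = '<ul>
<li>
<strong>½ cup</strong> white wine </li>
<li>
<strong>½ cup</strong> extra virgin olive oil</li>
<li>
<strong>35 drops</strong> of water
</li>
<li>
<strong>½ cup</strong> golden flaky raspberries</li>
</ul>
';

// Single quotes keep \n and \/ intact for the regex engine to interpret.
// [^<]* matches any character except "<", including newlines, so the
// line break before the third closing </li> no longer blocks the match.
preg_match_all('/<li>\n<strong>([^<]*)<\/strong>([^<]*)<\/li>/', $string, $matched);

// Strip the whitespace and newlines that the character class captured.
$amounts = array_map('trim', $matched[1]);
$names   = array_map('trim', $matched[2]);

print_r($amounts);
print_r($names);
```

With this input, all four li elements should match, including the 35 drops one.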
And even better would be to use an HTML parser.
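As a rough sketch of the parser approach, the following uses PHP's built-in DOMDocument. It assumes the input is UTF-8. The XML encoding prefix is a commonly used workaround to make libxml read the input as UTF-8, not something from the question. The $ingredients array and its 'amount' and 'name' keys are hypothetical names chosen for illustration.

```php
<?php
$html = '<ul>
<li>
<strong>½ cup</strong> white wine </li>
<li>
<strong>½ cup</strong> extra virgin olive oil</li>
<li>
<strong>35 drops</strong> of water
</li>
<li>
<strong>½ cup</strong> golden flaky raspberries</li>
</ul>
';

$dom = new DOMDocument();
// Assumption: the input is UTF-8; the prefix hints that to libxml.
$dom->loadHTML('<?xml encoding="UTF-8">' . $html);

$ingredients = [];
foreach ($dom->getElementsByTagName('li') as $li) {
    $strong = $li->getElementsByTagName('strong')->item(0);
    if ($strong === null) {
        continue; // skip list items without a <strong> amount
    }

    // Collect the text of every child node except the <strong> element.
    $rest = '';
    foreach ($li->childNodes as $child) {
        if (!$child->isSameNode($strong)) {
            $rest .= $child->textContent;
        }
    }

    $ingredients[] = [
        'amount' => trim($strong->textContent),
        'name'   => trim($rest),
    ];
}

print_r($ingredients);
```

Because the parser works on the element tree rather than on the raw text, it does not matter where line breaks fall inside each li.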