I have a PHP script which generates an HTML email. To keep the size under Google's 102kB limit, I'm trying to squeeze as many unnecessary characters out of the code as possible.
I currently use Emogrifier to inline the CSS and then TinyMinify to minify.
The output from this still has spaces between properties and values in the inlined styles (e.g. style="color: #ffffff; font-weight: 16px").
I've developed the following regex to remove the extra whitespace, but it also affects the actual content too (eg this & that becomes this &that)
$out = preg_replace("/(;|:)\s([a-zA-Z0-9#])/", "$1$2", $newsletter);
How can I modify this regex so that it is limited to inline styles, or is there a better approach?
There is no bulletproof way to avoid matching the payload (style="" can appear anywhere) or to avoid matching actual CSS values (as in content: 'a: b'). Furthermore, consider also:
- red is shorter than #f00, which is shorter than #ff0000
- <ins> and <strong> can be effectively shorter than using inline CSS

One approach would be to match all inline style HTML attributes first and then operate on their content only, but you have to test for yourself how well this works:
$out= preg_replace_callback
( '/( style=")([^"]*)("[ >])/'  // Find all appropriate HTML attributes
, function( $aMatch ) {  // Per match
    // Kill any amount of any kind of spaces after colon or semicolon only
    $sInner= preg_replace
    ( '/([;:])\\s*([a-zA-Z0-9#])/'  // Escaping backslash in PHP string context
    , '$1$2'
    , $aMatch[2]  // Second sub match
    );
    // Kill any amount of leading and trailing semicolons and/or spaces
    $sInner= preg_replace
    ( array( '/^\\s*;*\\s*/', '/\\s*;*\\s*$/' )
    , ''
    , $sInner
    );
    return $aMatch[1]. $sInner. $aMatch[3];  // New HTML attribute
  }
, $newsletter
);
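To check that only style attributes are touched while text such as "this & that" stays intact, the same idea can be wrapped in a small function and run against a sample string. This is only a sketch under assumptions: the function name minifyInlineStyles and the sample markup are made up for illustration, the pattern is a slightly looser variant of the one above (it strips whitespace on both sides of every colon and semicolon), and it assumes double-quoted style attributes whose values contain no quoted strings or URLs with colons or semicolons.

```php
<?php
// Hypothetical helper (name is illustrative, not part of any library).
// Compacts whitespace inside double-quoted style="..." attributes only.
function minifyInlineStyles(string $html): string
{
    return preg_replace_callback(
        '/(\sstyle=")([^"]*)(")/i', // opening, attribute value, closing quote
        function (array $m): string {
            // Remove whitespace around every colon and semicolon.
            // Assumption: no quoted strings or URLs inside the value.
            $css = preg_replace('/\s*([;:])\s*/', '$1', $m[2]);
            // Drop leading/trailing semicolons and whitespace.
            $css = trim($css, "; \t\r\n");
            return $m[1] . $css . $m[3];
        },
        $html
    );
}

$sample = '<p style="color: #ffffff; font-size: 16px;">this & that: a; b</p>';
echo minifyInlineStyles($sample), "\n";
// prints: <p style="color:#ffffff;font-size:16px">this & that: a; b</p>
```

Because the replacement runs only on the captured attribute value, colons and semicolons in the visible text are left alone. Single-quoted or unquoted attributes would need an extra pattern, so test against real newsletter output before relying on it.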