Let's say I have an object of string keys and string values, and I'd like to write these as CSS custom properties to some HTML generated by the server. How could I do so safely?
By safely I mean that
For simplicity's sake, I'm going to limit keys to only allow characters within the class [a-zA-Z0-9_-].
From reading the CSS specification, and some personal testing, I think I can get quite far by taking the value through the following steps:
1. Check that every {, ( and [ outside of a string has a matching closing brace. If not, discard this key-value pair.
2. Replace all instances of < with \3C, and all instances of > with \3E.
3. Replace all instances of ; with \3B.

I came up with the steps above based on this CSS syntax specification.
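The steps above could be sketched roughly as follows. This is only an illustration: the names hasBalancedBrackets, escapeForStyleElement and preprocessValue are hypothetical, an unterminated string is treated as a reason to discard (an assumption the steps do not state), and each escape gets a trailing space because a CSS escape consumes up to six hex digits, so \3C directly followed by a letter such as "a" would otherwise be read as a different code point.

```javascript
// Hypothetical helper: true if every {, ( and [ outside a quoted string
// is closed by the matching bracket, in the right order.
function hasBalancedBrackets(value) {
  const closers = { "{": "}", "(": ")", "[": "]" };
  const stack = [];
  let quote = null; // quote character of the string we are currently inside
  for (let i = 0; i < value.length; i++) {
    const ch = value[i];
    if (ch === "\\") { i++; continue; } // skip the escaped character
    if (quote !== null) {
      if (ch === quote) quote = null; // end of the string
      continue;
    }
    if (ch === '"' || ch === "'") { quote = ch; continue; }
    if (closers[ch] !== undefined) { stack.push(closers[ch]); continue; }
    if (ch === "}" || ch === ")" || ch === "]") {
      if (stack.pop() !== ch) return false; // wrong or unexpected closer
    }
  }
  // Assumption: an unterminated string also disqualifies the value.
  return stack.length === 0 && quote === null;
}

// Hypothetical helper: replace <, > and ; with CSS escapes.
// The trailing space terminates the escape and is swallowed by the CSS parser.
function escapeForStyleElement(value) {
  return value
    .replace(/</g, "\\3C ")
    .replace(/>/g, "\\3E ")
    .replace(/;/g, "\\3B ");
}

// Returns the escaped value, or null if the pair should be discarded.
function preprocessValue(value) {
  return hasBalancedBrackets(value) ? escapeForStyleElement(value) : null;
}
```

With this sketch, preprocessValue("rgb(1, 2") returns null, while preprocessValue("</style>") returns the text \3C /style\3E followed by a space.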
For context, these properties may be used by a user's custom styling which we insert elsewhere, but the same object is also used as template data in a template, so it may contain a mix of strings intended as content and ones intended as CSS variables. I feel like the algorithm above strikes a nice balance: it is quite simple, yet it does not risk discarding too many key-value pairs that could potentially be useful in CSS (even considering future additions to CSS), but I'd like to make sure I'm not missing something.
Here's some JS code showing what I'm trying to achieve. obj is the object in question, and preprocessPairs is a function that takes the object and preprocesses it, dropping/reformatting the values as described in the steps above.
function generateThemePropertiesTag(obj) {
  obj = preprocessPairs(obj);
  return `<style>
:root {
${Object.entries(obj).map(([key, value]) => {
  return `--theme-${key}: ${value};`
}).join("\n")}
}
</style>`
}
So when given an object like this:
{
"color": "#D3A",
"title": "The quick brown fox"
}
I'd expect the CSS to look like this:
:root {
--theme-color: #D3A;
--theme-title: The quick brown fox;
}
And while --theme-title is a pretty useless custom variable in terms of use in CSS, it doesn't actually break the stylesheet, because CSS ignores properties it doesn't understand.
We can do this with just regular expressions and a few simple algorithms, without relying on one specific language; I hope that is what you need here.
Restricting the object keys to [a-zA-Z0-9_-] leaves us with the need to somehow parse the values.
So we can break this into categories and just see what we can encounter (they might be slightly simplified for clarity):
1. '.*' (string surrounded by apostrophes; greedy)
2. ".*" (string surrounded by double quotes; greedy)
3. [+-]?\d+(\.\d+)?(%|[A-Za-z]+)? (integers and decimal numbers, optionally per cent or with a unit)
4. #[0-9A-Fa-f]{3,6} (colours)
5. [A-Za-z0-9_-]+ (keywords, named colours, things like "ease-in")
6. ([\w-]+)\([^)]+\) (functions like url(), calc() etc.)

I can imagine there is some filtering you can do before trying to recognize these patterns. Maybe we trim the value string first. As you mention, < and > can be escaped at the beginning of the preprocessPairs() function, because they do not appear as a part of any of the patterns we have above. If you don't expect unescaped semicolons anywhere, you may escape them as well.
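To get a feel for the categories, here is a small sketch that tests a single, already trimmed token against each of the six patterns in turn. The names tokenPatterns and classifyToken are made up for this example, the patterns are anchored with ^ and $ so that the whole token has to match, and the letter ranges are written out as [A-Za-z] and [0-9A-Fa-f].

```javascript
// Anchored versions of the six categories, checked in list order.
const tokenPatterns = [
  ["single-quoted string", /^'.*'$/],
  ["double-quoted string", /^".*"$/],
  ["number", /^[+-]?\d+(\.\d+)?(%|[A-Za-z]+)?$/],
  ["colour", /^#[0-9A-Fa-f]{3,6}$/],
  ["keyword", /^[A-Za-z0-9_-]+$/],
  ["function", /^([\w-]+)\([^)]+\)$/],
];

// Returns the name of the first matching category, or null if none matches.
function classifyToken(token) {
  for (const [name, pattern] of tokenPatterns) {
    if (pattern.test(token)) return name;
  }
  return null;
}

console.log(classifyToken("#D3A"));            // "colour"
console.log(classifyToken("12px"));            // "number"
console.log(classifyToken("ease-in"));         // "keyword"
console.log(classifyToken("calc(1px + 2px)")); // "function"
console.log(classifyToken("<b>"));             // null
```

Because the categories are tried in order, a bare integer such as 10 is reported as a number even though it would also pass as a keyword.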
Then we can try recognizing these patterns in the value, and for each pattern we might need to run the filtering again. We expect that these patterns will be separated by some whitespace character (or two).
Including support for multi-line strings should be OK; in CSS that is a newline inside a string escaped with a backslash.
We need to recognize that we are filtering for at least two contexts: HTML and CSS. As we include the styles in the <style> element, the input must be safe for that, and at the same time it must be valid CSS. Luckily, you do not include the CSS in the style attribute of an element, so that makes it slightly easier.
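The HTML side is worth a tiny illustration: the HTML parser ends a <style> element at the closing tag no matter where it appears, even inside a CSS string. The check below is a deliberately conservative sketch (closesStyleElement is a hypothetical name, and the sample value is invented); it also shows that escaping every < removes the problem.

```javascript
// Hypothetical check: does this text contain something the HTML parser
// could treat as the end of the surrounding <style> element?
function closesStyleElement(css) {
  return /<\/style/i.test(css);
}

// Looks like a harmless CSS string, but HTML does not know about CSS strings.
const risky = 'x: "</STYLE><script>alert(1)</script>"';
// Escaping every < as a CSS escape leaves no literal </style behind.
const escaped = risky.replace(/</g, "\\3C ");

console.log(closesStyleElement(risky));   // true
console.log(closesStyleElement(escaped)); // false
```

A check like this could run on the final generated CSS as a last line of defence, in addition to escaping the individual values.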
So points 1-5 will be quite easy and, together with the simple filtering and trimming described earlier, will cover most of the values. With some additions (I have no idea what the effect on performance would be) it might even do additional checking of correct units, keywords etc.
The relatively bigger challenge, compared to the other points, is point #6. You may decide to simply forbid url() in this custom styling, which leaves you with checking the input of the remaining functions: for example, you might want to escape semicolons, and perhaps even check the patterns again inside functions, with tiny adjustments, e.g. for calc().
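One possible shape for such a function check is sketched below. Everything in it is an assumption made for illustration: the name isAllowedFunction, the decision to reject (rather than escape) semicolons in arguments, the rejection of url() nested inside other functions, and the set of characters accepted inside calc().

```javascript
// Splits "name(arguments)" into its two parts.
const FUNCTION_CALL = /^([\w-]+)\((.*)\)$/;
// Assumed to be acceptable inside calc(): digits, letters (units, var),
// whitespace, arithmetic operators, dots, per cent signs and parentheses.
const CALC_ARGUMENTS = /^[\dA-Za-z\s+\-*\/.%()]+$/;

function isAllowedFunction(token) {
  const match = FUNCTION_CALL.exec(token);
  if (match === null) return false; // not a function call at all
  const name = match[1].toLowerCase();
  const args = match[2];
  // Forbid url() outright, including when nested in another function.
  if (name === "url" || /url\(/i.test(args)) return false;
  // Assumption: reject raw semicolons here instead of escaping them.
  if (args.includes(";")) return false;
  // calc() gets a stricter check of its arguments.
  if (name === "calc") return CALC_ARGUMENTS.test(args);
  return true;
}

console.log(isAllowedFunction("calc(100% - 2 * 1.5rem)"));   // true
console.log(isAllowedFunction("rgb(1, 2, 3)"));              // true
console.log(isAllowedFunction("url(https://example.com/x)")); // false
console.log(isAllowedFunction("image-set(url(a.png) 1x)"));  // false
```

An allow-list of function names would be stricter still, at the cost of having to update it when CSS gains new functions.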
Broadly, that's my point of view. With a bit of adjustment of these regexes, this should complement what you already did and give as much flexibility to the input CSS as possible, while sparing you from having to adjust the code with every change to CSS features.
function preprocessPairs(obj) {
  // Catch-all regular expression
  // Explanation:
  // (                                       Start of alternatives
  //   \w+\(.+?\)|                           1st alternative - function
  //   ".+?(?<!\\)"|                         2nd alternative - string with double quotes
  //   '.+?(?<!\\)'|                         3rd alternative - string with apostrophes
  //   [+-]?\d+(?:\.\d+)?(?:%|[A-Za-z]+)?|   4th alternative - integer/decimal number, optionally per cent or with a unit
  //   #[0-9A-Fa-f]{3,6}|                    5th alternative - colour
  //   [A-Za-z0-9_-]+|                       6th alternative - keyword
  //   ''|                                   7th alternative - empty string
  //   ""|                                   8th alternative - empty string
  //   \\[0-9A-Fa-f]{1,6}                    9th alternative - CSS escape, e.g. the ones produced by the replacement below
  // )
  // [\s,]*                                  Trailing whitespace and commas
  // Note: the ranges are [A-Za-z] and [A-Fa-f] on purpose. [A-z] and [A-f] would
  // also match the characters between "Z" and "a", including a bare backslash.
  const regexA = /(\w+\(.+?\)|".+?(?<!\\)"|'.+?(?<!\\)'|[+-]?\d+(?:\.\d+)?(?:%|[A-Za-z]+)?|#[0-9A-Fa-f]{3,6}|[A-Za-z0-9_-]+|''|""|\\[0-9A-Fa-f]{1,6})[\s,]*/g;
  // newObj collects the key-value pairs that pass the filter
  const newObj = {};
  // Loop through all object properties
  Object.entries(obj).forEach(([key, value]) => {
    // Escape every <, > and ; (a string pattern would only replace the first occurrence)
    value = value.trim()
      .replace(/</g, '\\00003C')
      .replace(/>/g, '\\00003E')
      .replace(/;/g, '\\00003B');
    // Use catch-all regex to split value into specific elements
    const matches = [...value.matchAll(regexA)];
    // Now try to build back the original value string from regex matches.
    // If these strings are equal, the value is what we expected.
    // Otherwise it contained some unexpected markup or elements and should
    // therefore be discarded.
    // We specifically ignore all occurrences of url() and @import
    let buildBack = '';
    matches.forEach((match) => {
      const token = match[0];
      if (!/url\(.+?\)/i.test(token) && !/@import/i.test(token)) {
        buildBack += token;
      }
    });
    if (value === buildBack) {
      newObj[key] = value;
    }
  });
  return newObj;
}
Please comment, discuss, criticize, and let me know if I forgot to touch on some topic you are particularly interested in.
Disclaimer: I am not an author, owner, investor or contributor to the below mentioned sources. I just really happened to use them for some information.