Basically, I am changing every hexadecimal color value with a blue hue to its red-hue counterpart in a given stylesheet (e.g. #00f is changed to #ff0000; my function outputs six-character hexadecimal values, excluding the #).
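The conversion function itself is not shown here. Purely as an illustration, and assuming that "red counterpart" means swapping the red and blue channels (a guess that happens to be consistent with #00f becoming ff0000), a minimal sketch might look like this. The names `expand_hex` and `blue_to_red` are hypothetical.

```python
def expand_hex(value):
    """Expand a 3-digit CSS hex color ('00f') to 6 digits ('0000ff')."""
    value = value.lstrip('#').lower()
    if len(value) == 3:
        # CSS shorthand: each digit is doubled, so 'f09' means 'ff0099'.
        value = ''.join(ch * 2 for ch in value)
    return value


def blue_to_red(value):
    """Hypothetical conversion (assumption): swap the red and blue channels."""
    rgb = expand_hex(value)
    red, green, blue = rgb[0:2], rgb[2:4], rgb[4:6]
    return blue + green + red  # six characters, no leading '#'


print(blue_to_red('#00f'))  # ff0000
```

Whatever the real conversion is, expanding the three-digit shorthand first keeps the output at six characters.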
It was not a problem creating a regular expression to match hexadecimal colors (I'm not concerned about HTML color names, although I may eventually care about rgb, rgba, hsb, etc. values). This is what I ended up with: #(([0-9A-z]{3}){1,2}). It works, but I want it to be foolproof. For example, if somebody happens to set a background image whose URL contains a fragment (e.g. #top) that is also a valid hexadecimal value, I don't want to change it. I tried a negative lookbehind, but it doesn't seem to work. I was using \B#(([0-9A-z]{3}){1,2}), but if a non-word character (such as a space) precedes the '#', it still matches the URL fragment. This is what I thought should do the trick but doesn't: (?<!url\([^#)]*)#(([0-9A-z]{3}){1,2}).
I am using the desktop version of RegExr to test with the following stylesheet:
body {
background: #f09 url('images#06F');
}
span {
background=#00f url('images#889');
}
div {
background:#E4aaa0 url('images#889');
}
h1 {
background: #fff #dddddd;
}
Whenever I hover over the (?<! substring, RegExr identifies it as a "Negative lookahead matching 'url\([^#)]*'." Could there be a bug, or am I just having a bad regex day? And while we're at it, are there any other contexts in which a '#' is used for non-hexadecimal purposes?
EDIT: Alright, I can't program early in the morning. That hexadecimal regex should be #(([0-9A-Fa-f]{3}){1,2})
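To show why the character class matters, here is a small sketch comparing the two patterns. The range A-z covers every ASCII code from 'A' to 'z', which includes the letters g to z as well as punctuation such as '[', ']' and '_'. The names `loose` and `strict` are only for this illustration.

```python
import re

loose = re.compile(r'#(([0-9A-z]{3}){1,2})')       # original class
strict = re.compile(r'#(([0-9A-Fa-f]{3}){1,2})')   # corrected class

for text in ('#top', '#_[]', '#00f', '#E4aaa0'):
    # Print whether each pattern accepts the whole string.
    print(text, bool(loose.fullmatch(text)), bool(strict.fullmatch(text)))
```

The first two strings are accepted only by the loose pattern, so a fragment like #top would have been treated as a color.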
EDIT 2: Alright, so I missed the detail that most languages require static length lookbehinds.
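As a quick check of the fixed-length restriction, this sketch uses Python's `re` module (one of the engines with that requirement) to try compiling the lookbehind from above. The helper `compiles` is hypothetical and exists only for this demonstration.

```python
import re

HEX = r'#(([0-9A-Fa-f]{3}){1,2})'


def compiles(pattern):
    """Return True if `re` accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error as exc:
        print('rejected:', exc)
        return False


# [^#)]* can match any number of characters, so the lookbehind is variable-width.
print(compiles(r'(?<!url\([^#)]*)' + HEX))
# A lookbehind with a fixed number of characters is accepted.
print(compiles(r'(?<!url\()' + HEX))
```

The fixed-width version compiles, but it only excludes a '#' that directly follows "url(", so it does not solve the fragment problem on its own.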
I think one of the following solutions is what you need.
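import re

# Sample declarations; '#' also appears inside url(...) fragments.
ss = ''' background: #f09 url('images#06F');
background=#00f url('images #889');
background:#E4aaa0 url('images#890');
background: #fff #dddddd; '''
print(ss)

three = '(?:[0-9A-Fa-f]{3})'  # one group of three hex digits

# First color directly after "background" at the start of a line.
regx = re.compile(r'^ *background[ =:]*#(%s{1,2})' % three, re.MULTILINE)
print(regx.findall(ss))
print('-----------------------------------------------------')
# Also accept a color that follows a previous 3- or 6-digit color plus spaces.
regx = re.compile(r'(?:(?:^ *background[ =:]*)|(?:(?<=#%s)|(?<=#%s%s)) +)'
                  r'#(%s{1,2})' % (three, three, three, three),
                  re.MULTILINE)
print(regx.findall(ss))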
result
background: #f09 url('images#06F');
background=#00f url('images #889');
background:#E4aaa0 url('images#890');
background: #fff #dddddd;
['f09', '00f', 'E4aaa0', 'fff']
-----------------------------------------------------
['f09', '00f', 'E4aaa0', 'fff', 'dddddd']
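The second pattern uses two lookbehinds joined by `|` rather than one. A small sketch of why, assuming Python's `re` module: a single lookbehind covering both color lengths would be variable-width, whereas each of the two alternatives has a fixed width (4 and 7 characters). The names `single_ok` and `split` are only for this illustration.

```python
import re

three = r'[0-9A-Fa-f]{3}'

# One lookbehind for "#" plus 3 or 6 digits is variable-width, so it is rejected.
try:
    re.compile(r'(?<=#(?:%s){1,2}) +#' % three)
    single_ok = True
except re.error:
    single_ok = False

# Two separate fixed-width lookbehinds, alternated outside the assertions.
split = re.compile(r'(?:(?<=#%s)|(?<=#%s%s)) +#((?:%s){1,2})'
                   % (three, three, three, three))

print(single_ok)
print(split.findall('background: #fff #dddddd;'))
```

The split pattern only finds a color that comes after another color, which is why the full pattern also needs the start-of-line alternative.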
import re

# Same idea, but any property name (or other text) may precede the color.
ss = ''' background: #f09 url('images#06F');
background=#00f url('images #889');
color:#E4aaa0 url('images#890');
background: #fff #dddddd#125e88 #ae3;
Walter (Elias) Disney: #f51f51 '''
print(ss + '\n')

three = '(?:[0-9A-Fa-f]{3})'  # one group of three hex digits

# First color after any text that contains no '=' or ':'.
regx = re.compile(r'^ *[^=:]+[ =:]*#(%s{1,2})' % three, re.MULTILINE)
print(regx.findall(ss))
print('-----------------------------------------------------')
# Also accept colors chained after a previous color, with or without spaces.
regx = re.compile(r'(?:(?:^ *[^=:]+[ =:]*)|(?:(?<=#%s)|(?<=#%s%s)) *)'
                  r'#(%s{1,2})' % (three, three, three, three),
                  re.MULTILINE)
print(regx.findall(ss))
result
background: #f09 url('images#06F');
background=#00f url('images #889');
color:#E4aaa0 url('images#890');
background: #fff #dddddd#125e88 #ae3;
Walter (Elias) Disney: #f51f51
['f09', '00f', 'E4aaa0', 'fff', 'f51f51']
-----------------------------------------------------
['f09', '00f', 'E4aaa0', 'fff', 'dddddd', '125e88', 'ae3', 'f51f51']
import re

ss = ''' background: #f09 url('images#06F');
background=#00f url('images #889');
color:#E4aaa0 url('images#890');
background: #fff #dddddd#125e88 #ae3;
Walter (Elias) Disney: #f51f51
background: -webkit-gradient(linear, from(#000000), to(#ffffff));. '''
print(ss + '\n')

three = '(?:[0-9A-Fa-f]{3})'  # one group of three hex digits
# What may precede a color: the start of a line up to the first '#',
# a previous 3- or 6-digit color, or the text " to(" of a gradient.
# Raw strings keep the backslash in \( intact.
preceding = (r'(?:(?:^[^#]*)'
             r'|'
             r'(?:(?<=#%s)'
             r'|'
             r'(?<=#%s%s)'
             r'|'
             r'(?<= to\()'
             r')'
             r')') % (three, three, three)
regx = re.compile(r'%s *#(%s{1,2})' % (preceding, three), re.MULTILINE)
print(regx.findall(ss))
result
background: #f09 url('images#06F');
background=#00f url('images #889');
color:#E4aaa0 url('images#890');
background: #fff #dddddd#125e88 #ae3;
Walter (Elias) Disney: #f51f51
background: -webkit-gradient(linear, from(#000000), to(#ffffff));.
['f09', '00f', 'E4aaa0', 'fff', 'dddddd', '125e88', 'ae3', 'f51f51', '000000', 'ffffff']
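Since the goal is to replace colors rather than just list them, here is a sketch of a different technique from the lookbehind patterns above: match each url(...) token first so that any '#' inside it is consumed, and only rewrite the hex colors. The names `TOKEN`, `recolor` and `convert` are hypothetical, and the sketch assumes URLs contain no ')' and that the stylesheet has no comments or strings containing '#'.

```python
import re

# Alternative 1 swallows a whole url(...) token; alternative 2 captures a color.
TOKEN = re.compile(r'url\([^)]*\)|#((?:[0-9A-Fa-f]{3}){1,2})')


def recolor(css, convert):
    """Apply convert() to every hex color outside url(...) tokens."""
    def swap(match):
        if match.group(1) is None:
            # A url(...) token was matched: return it unchanged.
            return match.group(0)
        return '#' + convert(match.group(1))
    return TOKEN.sub(swap, css)


css = "background: #f09 url('images#06F'); color:#00f;"
# str.upper stands in for the real conversion function.
print(recolor(css, str.upper))
```

Note that this would still rewrite an ID selector whose name happens to be valid hex, such as #add, so selectors would need separate handling.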
Regexes are extremely powerful, provided that the text contains enough stable, regularly organised portions around the variable portions you intend to capture. If the structure of the analyzed text becomes too loose, it becomes impossible to write a reliable regex.
Are there still many other "Harlequin-like patchwork" structures possible for your strings?