Say I have an array of numbers: arr=[1, 2, 3, 4, 5, 4, 3, 2, 1]
.
Regular Regex has its basic units/literals as characters. I am wondering if there is a way to have "objects" as the basic unit, for an object [a,b]
.
Now I want to answer the question, "does arr
contain at least 2 intervals of length 3 that have positive gradients?". I was thinking that I could mimic Regex by replacing characters with [a,b]
objects, where a
is the size of the interval and b
is the function to run on that interval (gradient_pos
). the function would return True
or False
depending on whether or not the condition was met. For example, to answer the above question, you would query:
[3,gradient_pos]{2,}
You will see that the syntax follows all the regex rules, except that rather than looking at characters, we are looking at [a,b]
objects that effectively evaluate to True
or False
. The idea is that regex would try and match 2 or more intervals of length 3 that satisfy gradient_pos
. Does that make sense? I know this is very abstract, but if anyone could help me out with achieving something like this, I'd greatly appreciate it!
Thanks!
You can treat it like a regex problem. So your array would become:
import numpy as np
arr=[1, 2, 3, 4, 5, 4, 3, 2, 1]
arr = np.array(arr)
arr_as_str=''.join(map(str,np.sign(arr[1:]-arr[:-1])+1))
I got the gradients excluding the last element and shifted the sign
function from (-1,+1) to (0,2). The reason for that is to avoid the minus signs in the string representation.
So now you can just apply regex matching as usual where:
Your "3 consecutive positive gradients" query would look like this:
matches=list(re.finditer('(?=(2{3}))',arr_as_str))
Since there are overlapping regex matches, you have to use lookaheads like in: Python regex find all overlapping matches?
You can refer back to the original array using the starting indices plus you know how long the sequences are:
matches[1].start()
I suggested this approach because it seems like you're comfortable with regular expressions. However, if you're interested in exploring other approaches I would recommend looking at morphological filtering, it would have been my first choice