I would like to extract the date that appears in each of the file names above, but without leading zeros.
Instead of getting the whole date as I'm doing above, I want 5-1-2016 for the first file, 15-1-2016 for the second, 10-1-2016 for the third, and so on (no leading zeros).
The expected output should be like this:
I'm doing this in Python.
You could match 3 groups. For the first 2 groups, match an optional zero outside the group and then capture 1 or 2 digits followed by a dash: 0?([0-9]{1,2}-)
The third group captures the 4-digit year. You might also add a word boundary \b at the start and at the end of the date.
^.*?\b0?([0-9]{1,2}-)0?([0-9]{1,2}-)([0-9]{4})\b.*$
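Then you could use re.sub and refer to the capturing groups in the replacement. Because the pattern also consumes the text before and after the date, the whole file name is replaced by just the date: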
\1\2\3
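To see what each group captures, here is a minimal check using re.match on the sample file name from this answer. The names pattern, match and groups are only illustrative. Note that the first two groups keep their trailing dash, which is why the replacement can simply concatenate them.

```python
import re

# Same pattern as above: the optional zero sits outside each capturing group.
pattern = re.compile(r"^.*?\b0?([0-9]{1,2}-)0?([0-9]{1,2}-)([0-9]{4})\b.*$")

match = pattern.match("01 Ded.PASIVIC 05-01-2016.xlsx")
groups = match.groups()

print(groups)           # ('5-', '1-', '2016')
print("".join(groups))  # 5-1-2016
```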
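import re

# Optional leading zero before day and month, then a 4-digit year.
regex = r"^.*?\b0?([0-9]{1,2}-)0?([0-9]{1,2}-)([0-9]{4})\b.*$"
test_str = "01 Ded.PASIVIC 05-01-2016.xlsx"
subst = r"\1\2\3"

# re.sub always returns a string; if nothing matches, it returns
# the input unchanged, so there is no need to test the result.
result = re.sub(regex, subst, test_str, count=1)
print(result)  # 5-1-2016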
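If some file names might not contain a date, re.sub will hand back the original name, which can be easy to miss. A possible variation, sketched here under assumptions, uses re.search and returns None when there is no date. The helper name date_without_leading_zeros is made up for illustration, and only the first file name comes from the question; the other names are hypothetical stand-ins for the remaining files.

```python
import re

# Same idea as above, but without the ^.*? and .*$ anchors,
# and with the dashes kept outside the capturing groups.
DATE_RE = re.compile(r"\b0?([0-9]{1,2})-0?([0-9]{1,2})-([0-9]{4})\b")


def date_without_leading_zeros(filename):
    """Return the d-m-yyyy date found in filename, or None if there is none."""
    match = DATE_RE.search(filename)
    if match is None:
        return None
    return "-".join(match.groups())


filenames = [
    "01 Ded.PASIVIC 05-01-2016.xlsx",  # sample name from the answer
    "02 Example 15-01-2016.xlsx",      # hypothetical name
    "03 Example 10-01-2016.xlsx",      # hypothetical name
    "notes.txt",                       # hypothetical name with no date
]

for name in filenames:
    print(name, "->", date_without_leading_zeros(name))
```

With re.search the caller can tell "no date found" apart from a successful extraction, which the re.sub version cannot do on its own.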