I have a list of files with the following file name format:
[some unknown amount of characters][_d][yyyymmdd][some unknown amount of characters]
I want to extract the substring that contains the date (yyyymmdd), which I know will always be preceded by "_d". So basically I want to extract the first 8 characters after the "_d".
What is the best way to go about doing this?
I would use sed:
$ echo "asdfasd_d20150616asdasd" | sed -r 's/^.*_d(.{8}).*$/\1/'
20150616
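This takes the string and removes everything up to and including _d. It then captures the following 8 characters and prints them back. Because .* is greedy, if _d occurs more than once, the last occurrence that is followed by at least 8 characters is the one used.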
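sed -r is used so that groups can be captured with just () instead of \(\). (-r is the GNU sed spelling; -E is the equivalent option on BSD/macOS sed, and GNU sed accepts it too.)
The expression ^.*_d(.{8}).*$ breaks down as follows:
^ matches the beginning of the line.
.* matches any number of characters (even 0 of them).
_d is the literal _d you want to match.
(.{8}) captures 8 characters: since . matches any character, .{8} matches 8 of them, and the () capture them so that they can be reused later on.
.*$ matches any number of characters up to the end of the line.
\1 in the replacement prints back the captured group.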
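Since the question is about a list of files, here is a minimal sketch of applying the same expression to several names in a loop. The function name extract_date and the sample file names are made up for illustration, and -E is used in place of -r on the assumption that your sed accepts it.

```shell
# Hypothetical helper: print the 8 characters that follow "_d" in a name.
extract_date() {
    # Same expression as above; -E enables plain () for capture groups.
    printf '%s\n' "$1" | sed -E 's/^.*_d(.{8}).*$/\1/'
}

# Sample names invented for this example.
for name in "asdfasd_d20150616asdasd" "report_d20140101.txt"; do
    extract_date "$name"
done
```

Note that a name that does not contain _d followed by at least 8 characters does not match, so sed prints it back unchanged. If you would rather get no output in that case, sed -n combined with the p flag on the substitution prints only the lines where a replacement was made.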