I have a data file like this:
randomthingsbefore $DATAROOT/randompathwithoutanypattern randomthingsafter
randomthingsbefore $DATAROOT/randompathwithoutanypattern randomthingsafter $DATAROOT/randompathwithoutanypattern randomthingsafter
randomthingsbefore $DATAROOT/randompathwithoutanypattern randomthingsafter
(...)
I want to delete the substring $DATAROOT from each path and add blank spaces after the path to keep the columns where randomthingsafter started. Notice that there could be 2 or more paths with the $DATAROOT substring in the same line. This way, my desired output would look like this:
randomthingsbefore /randompathwithoutanypattern          randomthingsafter
randomthingsbefore /randompathwithoutanypattern          randomthingsafter /randompathwithoutanypattern          randomthingsafter
randomthingsbefore /randompathwithoutanypattern          randomthingsafter
(...)
I've tried:
VAR1=*pathtofile*
VAR2=$(\grep -oP '\$DATAROOT\K[^ ]*' $VAR1)
arr=$(echo $VAR2 | tr " " "\n")
for x in $arr
do
y="${x} "
sed -i "s:$x:$y:" $VAR1
done
sed -i 's/$DATAROOT\///g' $VAR1
but it does not seem to work. Thank you for your help!
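I believe the easiest approach is to replace your script with a single sed command:
sed 's/$DATAROOT\([^[:blank:]]*\)/\1         /g' /path/to/file
Note that there are 9 spaces after \1, which is the length of the string $DATAROOT, so everything after each path stays in its original column. The $ needs no escaping here because in a BRE it only acts as an anchor at the end of the pattern. Here we make use of what is known as a back-reference: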
Editing Commands in sed
[2addr]s/BRE/replacement/flags: Substitute the replacement string for instances of the BRE in the pattern space. Any character other than <backslash> or <newline> can be used instead of a <slash> to delimit the BRE and the replacement. Within the BRE and the replacement, the BRE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
The replacement string shall be scanned from beginning to end. An <ampersand> (&) appearing in the replacement shall be replaced by the string matching the BRE. The special meaning of & in this context can be suppressed by preceding it by a <backslash>. The characters \n, where n is a digit, shall be replaced by the text matched by the corresponding back-reference expression. If the corresponding back-reference expression does not match, then the characters \n shall be replaced by the empty string. The special meaning of \n where n is a digit in this context, can be suppressed by preceding it by a <backslash>. For each other <backslash> encountered, the following character shall lose its special meaning (if any).
source: POSIX SED
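To see the replacement rules quoted above at work, here is a minimal sketch that runs the substitution on a made-up line. The sample text and the variable names line and out are assumptions for illustration, not taken from the real data file. The $ is written as \$ only to make the literal match explicit. Because nine spaces replace the nine characters of $DATAROOT, the line length should not change.

```shell
# Hypothetical sample line; single quotes keep $DATAROOT literal.
line='before $DATAROOT/some/path after $DATAROOT/other/path end'

# \1 is the path captured by \( \); the nine spaces that follow it
# stand in for the nine characters of the removed "$DATAROOT" prefix.
out=$(printf '%s\n' "$line" | sed 's/\$DATAROOT\([^[:blank:]]*\)/\1         /g')

# Print both lines so the column alignment can be compared by eye.
printf '%s\n' "$line"
printf '%s\n' "$out"

# Print both lengths; they should be equal if the padding is right.
printf '%s %s\n' "${#line}" "${#out}"
```

If the two lengths differ, the number of spaces after \1 does not match the length of the removed prefix.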
9.3.6 BREs Matching Multiple Characters
- The back-reference expression \n shall match the same (possibly empty) string of characters as was matched by a subexpression enclosed between \( and \) preceding the \n. The character n shall be a digit from 1 through 9, specifying the nth subexpression (the one that begins with the nth \( from the beginning of the pattern and ends with the corresponding paired \)). The expression is invalid if less than n subexpressions precede the \n. The string matched by a contained subexpression shall be within the string matched by the containing subexpression. If the containing subexpression does not match, or if there is no match for the contained subexpression within the string matched by the containing subexpression, then back-reference expressions corresponding to the contained subexpression shall not match. When a subexpression matches more than one string, a back-reference expression corresponding to the subexpression shall refer to the last matched string. For example, the expression ^\(.*\)\1$ matches strings consisting of two adjacent appearances of the same substring, and the expression \(a\)*\1 fails to match a, the expression \(a\(b\)*\)*\2 fails to match abab, and the expression ^\(ab*\)*\1$ matches ababbabb, but fails to match ababbab.
source: POSIX Basic Regular Expressions
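The first example in the quoted text can be tried with grep, which also uses BREs by default. This is only a sketch: the test strings abcabc and abcabd and the variable names are made up for illustration.

```shell
# BRE from the POSIX text: a line made of the same substring twice.
pattern='^\(.*\)\1$'

# grep -c prints the number of matching lines.
doubled=$(printf '%s\n' 'abcabc' | grep -c "$pattern")

# grep exits non-zero when nothing matches, hence the "|| true".
other=$(printf '%s\n' 'abcabd' | grep -c "$pattern" || true)

printf 'abcabc: %s\nabcabd: %s\n' "$doubled" "$other"
```

Here abcabc counts as a match because \1 repeats the captured abc, while abcabd is not a doubled string and should give a count of 0.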