I need to match a pattern, e.g. "Commodity Name", and get the string on the next line between the tags "<dd>" and "</dd>".
Sample Input file:
C:\Users\rpm\Desktop\sample.txt:133: <dt>Commodity Name</dt>
C:\Users\rpm\Desktop\sample.txt:134: <dd>Grocery</dd>
C:\Users\rpm\Desktop\sample.txt:136: <dt>IP address</dt>
C:\Users\rpm\Desktop\sample.txt:137: <dd>XXX.XXX.XXX.XXX port 8000</dd>
C:\Users\rpm\Desktop\sample.txt:144: <dt>Commodity Serial #</dt>
C:\Users\rpm\Desktop\sample.txt:145: <dd>0055500000</dd>
C:\Users\rpm\Desktop\sample.txt:147: <dt>Client IP</dt>
C:\Users\rpm\Desktop\sample.txt:148: <dd>xxx.xxx.xxx.xxx</dd>
C:\Users\rpm\Desktop\sample.txt:150: <dt>Client Logged In As</dt>
C:\Users\rpm\Desktop\sample.txt:151: <dd>rpm123</dd>
C:\Users\rpm\Desktop\sample.txt:153: <dt>User is member of</dt>
C:\Users\rpm\Desktop\sample.txt:154: <dd>BP-RPM\COMD_CSO_ITM-AVAI_Def,BP-RPM\user</dd>
I need to match several such patterns and get the value on the line after each match, between the tags <dd> and </dd>.
Desired output:
Grocery | XXX.XXX.XXX.XXX port 8000 | 0055500000 | xxx.xxx.xxx.xxx | rpm123 | BP-RPM\COMD_CSO_ITM-AVAI_Def,BP-RPM\user
I would start by creating an array that defines your keywords:
$keywords = @(
    '<dt>Commodity Name</dt>'
    '<dt>IP address</dt>'
    '<dt>Commodity Serial #</dt>'
    '<dt>Client IP</dt>'
    '<dt>Client Logged In As</dt>'
    '<dt>User is member of</dt>'
)
Now you can join the keywords with | (regex alternation) and use the result as a single pattern for the Select-String cmdlet:
$file = 'C:\Users\rpm\Desktop\sample.txt'
$content = Get-Content $file
$content | Select-String -Pattern ($keywords -join '|')
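This gives you the line number of each matched keyword. LineNumber is 1-based while array indexes are 0-based, so $content[$_.LineNumber] is the line after the match. Now you can iterate over the results, access that next line by index, and strip the <dd> prefix and the </dd> suffix: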
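$content | Select-String -Pattern ($keywords -join '|') |
    ForEach-Object {
        # LineNumber is 1-based, so as a 0-based index it points at the line after the match
        [regex]::Match($content[$_.LineNumber], '<dd>(.+)</dd>').Groups[1].Value
    }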
Output:
Grocery
XXX.XXX.XXX.XXX port 8000
0055500000
xxx.xxx.xxx.xxx
rpm123
BP-RPM\COMD_CSO_ITM-AVAI_Def,BP-RPM\user
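The index trick above can be checked in isolation. The following is a minimal sketch, assuming two hypothetical in-memory lines in place of the real file; the variable name $hit is only illustrative:

```powershell
# Hypothetical in-memory lines standing in for the result of Get-Content.
$content = @(
    '<dt>Commodity Name</dt>'
    '<dd>Grocery</dd>'
)

# Piped strings are numbered starting at 1, in the order they arrive.
$hit = $content | Select-String -Pattern '<dt>Commodity Name</dt>'

$hit.LineNumber              # 1 (1-based position of the <dt> line)
$content[$hit.LineNumber]    # <dd>Grocery</dd> (0-based index 1 is the next line)
```

Note that this only holds when the <dd> line directly follows the <dt> line, as it does in the sample input.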
Finally, join the results with ' | ' to get the desired output. Here is the whole script:
$keywords = @(
    '<dt>Commodity Name</dt>'
    '<dt>IP address</dt>'
    '<dt>Commodity Serial #</dt>'
    '<dt>Client IP</dt>'
    '<dt>Client Logged In As</dt>'
    '<dt>User is member of</dt>'
)
$file = 'C:\Users\rpm\Desktop\sample.txt'
$content = Get-Content $file
($content | Select-String -Pattern ($keywords -join '|') |
    ForEach-Object {
        [regex]::Match($content[$_.LineNumber], '<dd>(.+)</dd>').Groups[1].Value
    }) -join ' | '
Output:
Grocery | XXX.XXX.XXX.XXX port 8000 | 0055500000 | xxx.xxx.xxx.xxx | rpm123 | BP-RPM\COMD_CSO_ITM-AVAI_Def,BP-RPM\user
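As a possible variation, not part of the script above, Select-String can return the following line itself through its -Context parameter, which avoids the index arithmetic. The sketch below also escapes the keywords with [regex]::Escape so that any regex metacharacters in them are matched literally. The sample data, the temporary file, and the names $sample, $tempFile, $pattern, $values and $result are assumptions made only for illustration:

```powershell
# Hypothetical sample data written to a temporary file so the sketch is self-contained.
$sample = @(
    '<dt>Commodity Name</dt>'
    '<dd>Grocery</dd>'
    ''
    '<dt>Client Logged In As</dt>'
    '<dd>rpm123</dd>'
)
$tempFile = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tempFile -Value $sample

$keywords = @(
    '<dt>Commodity Name</dt>'
    '<dt>Client Logged In As</dt>'
)

# Escape each keyword so regex metacharacters are treated as literal text.
$pattern = ($keywords | ForEach-Object { [regex]::Escape($_) }) -join '|'

# -Context 0,1 captures zero lines before and one line after every match.
$values = Select-String -Path $tempFile -Pattern $pattern -Context 0,1 |
    ForEach-Object {
        $next = $_.Context.PostContext[0]
        [regex]::Match($next, '<dd>(.+)</dd>').Groups[1].Value
    }

$result = $values -join ' | '
$result    # Grocery | rpm123

Remove-Item $tempFile
```

This reads the file only once and does not need $content in memory, but like the original it assumes that each <dd> line directly follows its <dt> line.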