I'm struggling with what should be a pretty simple CMD task.
I have a root folder (C:\folder) containing many subfolders, each of which holds different kinds of files. I want to search all .txt files in all subfolders for URLs and, at the end, put all of the links into a single file. My regex for finding URLs is:
(https?|ftp|file):\/\/\)?[-A-Za-z0-9+&@#\/%?=~_|!:,.;]+[-A-Za-z0-9+&@#\/%=~_|]
and it works.
My latest attempt was:
for /R C:\folder %%F in (*.txt) do (
findstr /r "(https?|ftp|file):\/\/\)?[-A-Za-z0-9+&@#\/%?=~_|!:,.;]+[-A-Za-z0-9+&@#\/%=~_|]" >> results.txt
)
Can you help me? What am I missing?
I am not sure that this regex is a universal URL identifier, but if you want to put it into a PowerShell command:
Get-ChildItem -Recurse -File -Filter '*.txt' |
Select-String -Pattern '(https?|ftp|file):\/\/\)?[-A-Za-z0-9+&@#\/%?=~_|!:,.;]+[-A-Za-z0-9+&@#\/%=~_|]'
As suggested by @mklement0, to output only the matched URLs rather than the whole matching lines:
Get-ChildItem -Recurse -File -Filter '*.txt' |
Select-String -Pattern '(https?|ftp|file):\/\/\)?[-A-Za-z0-9+&@#\/%?=~_|!:,.;]+[-A-Za-z0-9+&@#\/%=~_|]' |
ForEach-Object { $_.Matches.Value }
and, to redirect them to a file:
Get-ChildItem -Recurse -File -Filter '*.txt' |
Select-String -Pattern '(https?|ftp|file):\/\/\)?[-A-Za-z0-9+&@#\/%?=~_|!:,.;]+[-A-Za-z0-9+&@#\/%=~_|]' |
ForEach-Object { $_.Matches.Value } >results.txt
I would not put the results.txt file in the directory being searched, since it would be picked up by the search if the command is run again. Consider placing it in the home directory instead:
Get-ChildItem -Recurse -File -Filter '*.txt' |
Select-String -Pattern '(https?|ftp|file):\/\/\)?[-A-Za-z0-9+&@#\/%?=~_|!:,.;]+[-A-Za-z0-9+&@#\/%=~_|]' |
ForEach-Object { $_.Matches.Value } |
Out-File -FilePath '~/results.txt'
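One caveat: without `-AllMatches`, `Select-String` reports only the first match on each line, so a line holding several URLs contributes just one of them. Below is a minimal sketch that collects every match and drops duplicates. The function name `Get-UrlFromText` and its `-Root` and `-Pattern` parameters are hypothetical, introduced only for illustration; the pattern is the one from the question, unchanged.

```powershell
function Get-UrlFromText {
    param(
        # Folder to search recursively (hypothetical parameter; defaults to the current directory)
        [string] $Root = '.',
        # URL pattern copied verbatim from the question
        [string] $Pattern = '(https?|ftp|file):\/\/\)?[-A-Za-z0-9+&@#\/%?=~_|!:,.;]+[-A-Za-z0-9+&@#\/%=~_|]'
    )

    Get-ChildItem -Path $Root -Recurse -File -Filter '*.txt' |
        Select-String -Pattern $Pattern -AllMatches |   # every match on a line, not just the first
        ForEach-Object { $_.Matches.Value } |           # keep only the matched text
        Select-Object -Unique                           # drop duplicates, keeping first-seen order
}
```

It could then be called as `Get-UrlFromText -Root 'C:\folder' | Set-Content -Path (Join-Path $HOME 'results.txt')`, which again keeps the output file outside the searched tree.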