Tags: regex, url, text-editor, google-search, url-parameters

Text editor (Sublime Text, Geany, Notepad++, etc.) regex to remove all parameters from a URL string except one parameter-value pair


I am not very familiar with advanced matching patterns in Regex.

I have some Google Search URLs that I need to clean up without holding the Backspace key for 5 seconds to remove the unnecessary parameters from each URL.

Let's say I have this URL (there could be many different URLs following the same pattern):

https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw&q=laravel+crud+generator&oq=laravel+crud+generator&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5

And I want to turn it into a nice, clean search URL like this:

https://www.google.com/search?q=laravel+crud+generator

How can I achieve that using Find/Replace with regex in any of the text editors mentioned in the title?


Solution

  • I'm posting this so that others can use the solution.

    Replace

    In Notepad++, press Ctrl+H, then select "Regular expression" under Search Mode at the bottom of the dialog.

    Then, in "Find what", enter this pattern: .+&(q=[^&]+).+ and in "Replace with", enter: https://www.google.com/search?$1

    Now press the Replace button for a single replacement, or press Alt+A (or the Replace All button) to replace every match.

    Check Regex101

    Explanation:

    1- .+& matches every character up to and including the & that comes right before q=. In this example, that part is https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw&

    2- (q=[^&]+) is our target: q= and everything after it up to the next &. [^&] matches any single character that is not &, and + repeats it one or more times. This part matches q=laravel+crud+generator. Note the parentheses.

    3- .+ matches one or more of any character, so it consumes the rest of the URL: &oq=laravel+crud+generator&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5
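To see the three parts separately, here is a minimal Python sketch. It assumes Python's re module behaves like the editor's regex engine for this pattern, which should hold for these basic constructs. The named groups before, target, and after are added only for illustration and are not part of the pattern above, and the sample URL is shortened by dropping the long gs_l parameter.

```python
import re

# Shortened sample URL (the gs_l parameter is left out for readability).
url = ("https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw"
       "&q=laravel+crud+generator&oq=laravel+crud+generator"
       "&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5")

# Same pattern as in the answer, with illustrative named groups wrapped
# around parts 1 and 3 so that each piece can be inspected on its own.
pattern = re.compile(r"(?P<before>.+&)(?P<target>q=[^&]+)(?P<after>.+)")

match = pattern.fullmatch(url)
parts = match.groupdict()

for name, text in parts.items():
    print(f"{name}: {text}")
```

Note that &oq= is not picked up as the target, because the pattern requires q to come directly after the &.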

    Remember the () in part 2? That is a capturing group. You can reference a group in the replacement as $groupNumber, where groupNumber is the position of the group's opening parenthesis, counted from the left. Here there is only one group, so the replacement uses $1.

    Finally, the replacement is https://www.google.com/search?$1: the whole match is replaced by that text, with $1 standing for whatever group one captured.
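For anyone who would rather script the cleanup than use an editor, the same find/replace can be sketched in Python. This is only an illustration under assumptions: the helper name clean_search_url is made up here, the sample URL is shortened, and Python writes the group reference as \1 where Notepad++ uses $1.

```python
import re

def clean_search_url(url: str) -> str:
    """Apply the answer's find/replace to one URL (hypothetical helper)."""
    # Find what:    .+&(q=[^&]+).+
    # Replace with: https://www.google.com/search?$1  (\1 in Python)
    return re.sub(r".+&(q=[^&]+).+", r"https://www.google.com/search?\1", url)

# Shortened sample URL following the pattern from the question.
url = ("https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw"
       "&q=laravel+crud+generator&oq=laravel+crud+generator&uact=5")
cleaned = clean_search_url(url)
print(cleaned)

# The pattern needs an & right before q= and at least one character after
# the q value, so a URL where q is the first parameter is left unchanged.
already_first = "https://www.google.com/search?q=laravel&hl=en"
unchanged = clean_search_url(already_first)
print(unchanged)
```

If URLs where q is the first or last parameter also need handling, the pattern would have to be loosened; that case is outside what the answer above covers.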