I'm trying to split any URL that would show up on my website into three parts:
1. lang (an optional two-letter language code)
2. tree (the parent path between the language and the page)
3. page (the last segment)
Right now I only use 1 and 3, but I need a way to let pages share the same name when they have different parents, so that the full URL is still unique.
Here are the types of URLs I may have:
(nothing)
en
en/test
en/parent/test
test
parent/test
ggparent/gparent/parent/test
I thought about extending my current directive:
RewriteRule ^(?:([a-z]{2})(?=\/))?.*(?:\/([\w\-\,\+]+))$ /index.php?lang=$1&page=$2 [L,NC]
to the following:
(?:([a-z]{2})(?=\/))?(.*)\/([^\/]*)?$
which I could then map to index.php?lang=$1&tree=$2&page=$3.
The difficulty is that the second capturing group also captures the leading slash.
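To make the problem concrete, here is a small JavaScript check of the extended pattern against en/parent/test. It is only a sketch: it runs in JavaScript's regex engine rather than in mod_rewrite, and the variable names are made up for illustration.

```javascript
// The extended pattern from above, copied unchanged.
const extended = /(?:([a-z]{2})(?=\/))?(.*)\/([^\/]*)?$/;

// Pull the three capturing groups out of one sample URL.
const [, lang, tree, page] = extended.exec('en/parent/test');

console.log({ lang, tree, page });
// tree comes back as "/parent": the slash after "en" is only asserted by
// the lookahead, not consumed, so the greedy (.*) picks it up.
```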
From what I've found so far, I don't think I can capture a variable number of slash-separated segments as separate groups, with the last one always landing in the same group, without repeating the same sub-pattern. So my plan is to capture everything between the language and the current page as one string and process the tree in PHP.
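As a rough sketch of that post-processing step, the captured tree string can simply be split on slashes. The example is in JavaScript so it can be run quickly; the helper name splitTree is hypothetical, and a PHP version would presumably do the same with explode().

```javascript
// Hypothetical helper: turn the captured tree string into an array of
// ancestor slugs, outermost first. Empty segments (from a leading,
// trailing, or doubled slash) are dropped.
function splitTree(tree) {
  return (tree || '').split('/').filter(segment => segment !== '');
}

console.log(splitTree('ggparent/gparent/parent'));
console.log(splitTree('/parent'));
console.log(splitTree(''));
```

Dropping empty segments means a stray leading slash in the capture does not produce a bogus empty ancestor.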
However, my current regex has some problems and I can't figure them out:
Link to Regex101: https://regex101.com/r/ecHBQT/1
This should do it. It splits the URL at the slashes into lang, tree, and page, any of which may be empty:
RewriteRule ^([a-z]{2}\b)?\/?(?:\/?(.+)\/)?(.*)$ /index.php?lang=$1&tree=$2&page=$3 [L,NC]
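To see what each group receives, the same pattern can be wrapped in a small helper that returns the three parts as an object. This is a sketch under a few assumptions: it uses JavaScript's regex engine rather than Apache's, the `i` flag stands in for [NC], and the names urlPattern and splitPath are hypothetical.

```javascript
// Same pattern as the RewriteRule; the `i` flag approximates [NC].
const urlPattern = /^([a-z]{2}\b)?\/?(?:\/?(.+)\/)?(.*)$/i;

// Hypothetical helper: returns { lang, tree, page }, using '' for any
// group that did not take part in the match (exec() reports those as
// undefined). Returns null if the path does not match at all, which can
// only happen when it contains a line break.
function splitPath(path) {
  const match = urlPattern.exec(path);
  if (!match) return null;
  const [, lang = '', tree = '', page = ''] = match;
  return { lang, tree, page };
}

console.log(splitPath('en/parent/test'));
console.log(splitPath('ggparent/gparent/parent/test'));
console.log(splitPath('me'));
```

The last call shows one consequence of the pattern: a first segment that is exactly two letters, such as a page called me, is always taken as lang.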
Test case in JavaScript using this regex:
const regex = /^([a-z]{2}\b)?\/?(?:\/?(.+)\/)?(.*)$/;
[
  '',
  'en',
  'en/test',
  'en/parent/test',
  'test',
  'parent/test',
  'ggparent/gparent/parent/test'
].forEach(str => {
  // Groups that do not take part in the match are replaced with ''.
  const rewritten = str.replace(regex, '/index.php?lang=$1&tree=$2&page=$3');
  console.log('"' + str + '" ==>', rewritten);
});
Output:
"" ==> /index.php?lang=&tree=&page=
"en" ==> /index.php?lang=en&tree=&page=
"en/test" ==> /index.php?lang=en&tree=&page=test
"en/parent/test" ==> /index.php?lang=en&tree=parent&page=test
"test" ==> /index.php?lang=&tree=&page=test
"parent/test" ==> /index.php?lang=&tree=parent&page=test
"ggparent/gparent/parent/test" ==> /index.php?lang=&tree=ggparent/gparent/parent&page=test
Notes: