I'm trying to get running a GitHub action link checker that checks markdown files for URLs. I have a Jekyll site.
The link checker looks for files, rather than full URLs, as a result:
This fails - /docs/article1/
But this works - /docs/article1.md
I have the following rewrite in place that works for most use cases:
{
"pattern": "(\\S+)\/(\\s|$)",
"replacement": "$1$2"
}
However, it does not work if the trailing slash is missing from the the link.
Can someone recommend a regex update that will capture either:
/docs/article1/
/docs/article1
And rewrite it as /docs/article1.md
?
If the string you're checking will be always
/<name>/<name>[/]
then
(\S+?)\/(\S+?)\/?$
with substitution
$1/$2.md
should work. (regex101 link)
Regex explanation
( // group 1 \\S+? // any character, more than 1, lazy (until there is '/') ) // close group 1 \/ // the '/' between the names ( // group 2 \\S+? // any character, more than 1, lazy (until there is '/' or $(end of line)) ) // close group 2 \/? // ignore if there is closing '/' $ // the end of line
(Here is a snippet showing that it works)
let reg = {
"pattern": "(\\S+?)\/(\\S+?)\/?$",
"replacement": "$1/$2.md"
};
let tests = ["/a/b","/a/b/","/document/data/", "/document/data", "/docs/article1/", "/docs/article1"];
// run RegExp replace on every one from tests
tests.forEach(e => {
let replaced =
e.replace(new RegExp(reg.pattern), reg.replacement)
console.log(e, "=>", replaced)
})
From here:
Appending the
?
character to a quantifier makes it lazy; it causes the regular expression engine to match as few occurrences as possible
And from here:
By default quantifiers like
*
and+
are "greedy", meaning that they try to match as much of the string as possible. The?
character after the quantifier makes the quantifier "non-greedy": meaning that it will stop as soon as it finds a match.