This regex: \b([A-z*]+)-(?=[A-z*]+\b)
with this replacement: $1 followed by a space
Applied on:
Jean-Pierre bought "blue-green-red" product-2345 and other blue-red stuff.
Gives me:
Jean Pierre bought "blue green red" product-2345 and other blue red stuff.
While I want:
Jean Pierre bought "blue-green-red" product-2345 and other blue red stuff.
https://regex101.com/r/SJzAaP/1
EDIT:
I am using Clojure (Java)
EDIT 2:
yellow-black-white
-> yellow black white
product_a-b
-> product_a-b
EDIT 3: Accepted answer translated to Clojure
(clojure.string/replace
"Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red-green stuff yellow-black-white product_a-b"
#"(\"[^\"]*\")|\b([a-zA-Z]+)-(?=[a-zA-Z]+\b)"
(fn [[s1 s2 s3]] (if s2 s1 (str s3 " "))))
;;=> "Jean Pierre bought \"blue-green-red\" product-2345 and other blue red green stuff yellow black white product_a-b"
In Java, you may use something like
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red stuff. yellow-black-white. product_a-b";
StringBuffer result = new StringBuffer();
// Group 1: a double-quoted chunk to keep; Group 2: the letters before a hyphen
Matcher m = Pattern.compile("(\"[^\"]*\")|\\b([a-zA-Z]+)-(?=[a-zA-Z]+\\b)").matcher(s);
while (m.find()) {
    if (m.group(1) != null) {
        // quoteReplacement keeps any $ or \ inside the quoted chunk literal
        m.appendReplacement(result, Matcher.quoteReplacement(m.group(0)));
    } else {
        m.appendReplacement(result, m.group(2) + " ");
    }
}
m.appendTail(result);
System.out.println(result.toString());
// => Jean Pierre bought "blue-green-red" product-2345 and other blue red stuff. yellow black white. product_a-b
See the Java demo.
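On Java 9 or later, the same find-and-append loop can probably be written more compactly with Matcher.replaceAll and a replacer function. The following is only a sketch under that assumption; the class name HyphenReplacer and the method name dehyphenate are made up for illustration and do not come from the question or the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HyphenReplacer {
    // Group 1: a double-quoted chunk to keep as is.
    // Group 2: the letters right before a hyphen that is followed by letters only.
    private static final Pattern P =
            Pattern.compile("(\"[^\"]*\")|\\b([a-zA-Z]+)-(?=[a-zA-Z]+\\b)");

    // Hypothetical helper name; needs Java 9+ for Matcher.replaceAll(Function).
    static String dehyphenate(String s) {
        return P.matcher(s).replaceAll(mr ->
                mr.group(1) != null
                        // Quoted chunk: put it back, escaping any $ or \ it contains.
                        ? Matcher.quoteReplacement(mr.group())
                        // Word plus hyphen: keep the word, swap the hyphen for a space.
                        : mr.group(2) + " ");
    }

    public static void main(String[] args) {
        System.out.println(dehyphenate(
                "Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red stuff."));
    }
}
```

The string returned by the replacer is still treated as a replacement template, which is why the quoted chunk goes through Matcher.quoteReplacement. Group 2 can only contain ASCII letters, so it does not need quoting.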
The regex is
("[^"]*")|\b([a-zA-Z]+)-(?=[a-zA-Z]+\b)
Details
- ("[^"]*") - Group 1: a ", then 0+ chars other than ", then a "
- | - or
- \b - a word boundary
- ([a-zA-Z]+) - Group 2: 1+ letters (may be replaced with (\p{L}+) to match any letter)
- the "-" after Group 2 - a literal hyphen
- (?=[a-zA-Z]+\b) - a positive lookahead that, immediately to the right of the current location, requires 1+ letters and a word boundary.

If Group 1 matches (if (m.group(1) != null)) you just paste the match back into the result. If not, paste back Group 2 value and a space.
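To illustrate the \p{L} note above, here is a hedged sketch of the same approach for names with non-ASCII letters. It assumes you also want \b to treat those letters as word characters, so it sets Pattern.UNICODE_CHARACTER_CLASS; that flag is my addition, not part of the original pattern. The class name UnicodeHyphenReplacer and the sample input are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeHyphenReplacer {
    // Same alternation as above, with \p{L} in place of [a-zA-Z].
    // UNICODE_CHARACTER_CLASS (an assumption of this sketch) makes \b Unicode-aware too.
    private static final Pattern P = Pattern.compile(
            "(\"[^\"]*\")|\\b(\\p{L}+)-(?=\\p{L}+\\b)",
            Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper name, not taken from the answer.
    static String dehyphenate(String s) {
        StringBuffer result = new StringBuffer();
        Matcher m = P.matcher(s);
        while (m.find()) {
            // Quoted chunk: keep the whole match. Otherwise: Group 2 plus a space.
            String replacement = m.group(1) != null ? m.group(0) : m.group(2) + " ";
            m.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // \u00e9 = e with acute, \u00ed = i with acute, \u00e8 = e with grave
        System.out.println(dehyphenate(
                "Jos\u00e9-Mar\u00eda bought \"caf\u00e9-cr\u00e8me\" item-42"));
    }
}
```

With this input, only the hyphen between the two name parts should be replaced; the quoted chunk and item-42 stay as they are.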
Adding clojure code here from the question, too, for better visibility:
(def s "Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red stuff. yellow-black-white. product_a-b")
(defn append [[g1 g2 g3]] (if g2 g1 (str g3 " ")))
(clojure.string/replace s #"(\"[^\"]*\")|\b([a-zA-Z]+)-(?=[a-zA-Z]+\b)" append)
;;=> "Jean Pierre bought \"blue-green-red\" product-2345 and other blue red stuff. yellow black white. product_a-b"