So I am stuck. I have looked at tons of answers here, but none seems to resolve my last problem.
Through a JSON API, I receive an equipment list in camel case format. I cannot change that.
I need this camel case to be translated into normal language.
So far I have gotten most words separated with:
$string = "SomeEquipmentHere";
$spaced = preg_replace('/([A-Z])/', ' $1', $string);
var_dump($spaced);
string ' Some Equipment Here' (length=20)
$trimmed = trim($spaced);
var_dump($trimmed);
string 'Some Equipment Here' (length=19)
This works fine, but some of the equipment names contain abbreviations:
"ABSBrakes" - this would require ABS to be separated from Brakes
I can't just check for several uppercase letters next to each other, since that would keep ABS and Brakes together. There are more like these, e.g. "CDRadio".
So what I want is the output to be:
"ABS Brakes"
Is there a way to format it so that, if there are uppercase letters next to each other, a space is added only before the last uppercase letter of that sequence?
I am not strong in regex.
EDIT
Both contributions are awesome. People coming here later should read both answers.
The remaining problems are the following patterns:
"ServiceOK" becomes "Service O K"
"ESP" becomes "ES P"
A string consisting purely of an uppercase abbreviation is handled by a function that counts lowercase letters; if there are none, it skips the preg_replace() call.
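A minimal sketch of that guard, for illustration only. The function name spaceOutCamelCase() is hypothetical, and it checks for the presence of a lowercase letter with preg_match() instead of literally counting them; the replacement itself is the one from the top of the question.

```php
<?php
// Hypothetical helper: the name and the exact check are assumptions,
// not the code actually used in the project.
function spaceOutCamelCase(string $name): string
{
    // No lowercase letter at all: treat the string as a pure abbreviation
    // and skip the replacement entirely.
    if (preg_match('/[a-z]/', $name) !== 1) {
        return $name;
    }

    // Otherwise put a space before every uppercase letter and trim the
    // leading space this produces.
    return trim(preg_replace('/([A-Z])/', ' $1', $name));
}

foreach (['SomeEquipmentHere', 'ESP', 'ServiceOK'] as $name) {
    echo spaceOutCamelCase($name), "\n";
}
// Prints:
// Some Equipment Here
// ESP
// Service O K
```

The guard fixes "ESP", but "ServiceOK" still comes out as "Service O K", because it does contain lowercase letters.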
But as Flying wrote in the comments on his answer, there could potentially be a lot of cases not covered by his regex, and a complete answer may be impossible. I don't know if this is too much of a challenge for a regex.
Possibly it could be solved by adding a rule like "if there is no lowercase letter after the uppercase letter, do not insert a space".
Here is a single-call pattern that doesn't use any anchors, capture groups, or references in the replacement string: /(?:[a-z]|[A-Z]+)\K(?=[A-Z]|\d+)/
Code:
$tests = [
    'SomeEquipmentHere',
    'ABSBrakes',
    'CDRadio',
    'Valve14',
];
foreach ($tests as $test) {
    echo preg_replace('/(?:[a-z]|[A-Z]+)\K(?=[A-Z]|\d+)/', ' ', $test), "\n";
}
Output:
Some Equipment Here
ABS Brakes
CD Radio
Valve 14
This is a better method because there is nothing to mop up. If there are new strings to consider (that break my method), please leave them in a comment so that I can update my pattern.
Pattern Explanation:
/ #start the pattern
(?:[a-z] #match 1 lowercase letter
| #or
[A-Z]+) #1 or more uppercase letters
\K #restart the fullstring match (forget the past)
(?=[A-Z] #look-ahead for 1 uppercase letter
| #or
\d+) #1 or more digits
/ #end the pattern
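The breakdown above can also be kept inside the pattern itself by using the x (extended) modifier, which makes PCRE ignore whitespace and #-comments in the pattern. This is only a sketch of the same pattern in that style; the variable names are illustrative.

```php
<?php
// Same pattern as above, written with the x modifier so the comments
// can live inside the pattern string.
$pattern = '/
    (?:
        [a-z]      # one lowercase letter
      |            # or
        [A-Z]+     # one or more uppercase letters
    )
    \K             # forget what was matched so far
    (?=
        [A-Z]      # look ahead for one uppercase letter
      |            # or
        \d+        # one or more digits
    )
/x';

$spaced = [];
foreach (['SomeEquipmentHere', 'ABSBrakes', 'CDRadio', 'Valve14'] as $test) {
    $spaced[$test] = preg_replace($pattern, ' ', $test);
}
print_r($spaced);
```

One thing to watch for: the comments inside the pattern must not contain the delimiter (here a forward slash), or the pattern will end early.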
Edit:
There are some other patterns that may provide better accuracy including:
/(?:[a-z]|\B[A-Z]+)\K(?=[A-Z]\B|\d+)/
Granted, the above pattern will not properly handle ServiceOK
or this pattern with an anchor:
/(?!^)(?=[A-Z][a-z]+|(?<=\D)\d)/
The above pattern will accurately split SomeEquipmentHere, ABSBrakes, CDRadio, Valve14, ServiceOK, and ESP, as requested by the OP.
*Note: Pattern accuracy can be improved as more sample strings are provided.
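For reference, here is a short sketch that runs the anchored pattern over all six sample strings. It follows the rule suggested in the question (no space unless a lowercase letter follows the uppercase letter), so ServiceOK and ESP come back unchanged. The variable names are illustrative.

```php
<?php
// Zero-width match: not at the start of the string, and either before an
// uppercase letter followed by lowercase letters, or before the first
// digit of a number.
$pattern = '/(?!^)(?=[A-Z][a-z]+|(?<=\D)\d)/';

$tests = ['SomeEquipmentHere', 'ABSBrakes', 'CDRadio', 'Valve14', 'ServiceOK', 'ESP'];

$results = [];
foreach ($tests as $test) {
    $results[$test] = preg_replace($pattern, ' ', $test);
    echo $results[$test], "\n";
}
// Prints:
// Some Equipment Here
// ABS Brakes
// CD Radio
// Valve 14
// ServiceOK
// ESP
```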