Essentially, I want to replace the u between the random character and the k with an o. The output I should get from the substitution is dudok and rujok.
How can I do this in Perl? I'm very new to Perl so go easy on me.
This is what I have right now:
$text = "duduk, rujuk";
$_ = $text;
s/.uk/ok/g;
print $_; #Output: duok, ruok Expected: dudok, rujok
EDIT: Forgot to mention that the last syllable is the only one that should be changed. Also, the random character is specifically supposed to be a random consonant, not just any random character.
I should mention that this is all based on Malay language rules for grapheme to phoneme conversion.
According to this page, the Malay language uses an unaccented Latin alphabet, and it has the same consonants as English. However, its digraphs differ from English's.
So, if one wanted to find a syllable ending with uk, one would look for
<syllable_boundary>(?:[bcdfhjlmpqrtvwxyz]|gh?|kh?|n[gy]?|sy?)uk
or
<syllable_boundary>uk
The OP is specifically not interested in the latter (the character before uk must be a consonant), so we simply need to look for
<syllable_boundary>(?:[bcdfhjlmpqrtvwxyz]|gh?|kh?|n[gy]?|sy?)uk
So now, we have to determine how to find a syllable boundary. ...Or do we? All the consonant digraphs end with a consonant letter, and none of the vowel digraphs end in one, so we simply need to look for
[bcdfghjklmnpqrstvwxyz]uk
Finally, we can use \b to check for the end of the word, so we're interested in matching
[bcdfghjklmnpqrstvwxyz]uk\b
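To see what this pattern accepts and rejects, here is a minimal sketch. The helper name ends_in_consonant_uk and the test strings are only chosen for illustration (they are not from the question beyond duduk and rujuk), and the check is purely letter-based, as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the text contains a word ending in <consonant>uk.
sub ends_in_consonant_uk {
    my ($text) = @_;
    return $text =~ /[bcdfghjklmnpqrstvwxyz]uk\b/ ? 1 : 0;
}

# duduk and rujuk end in consonant + uk.
# buku has "uk" in a non-final position, so \b fails.
# bauk has a vowel before "uk"; ukur starts with "uk".
for my $word (qw(duduk rujuk buku bauk ukur)) {
    printf "%-6s %s\n", $word, ends_in_consonant_uk($word) ? "match" : "no match";
}
```

Only the first two strings should match; the other three fail either the consonant check or the end-of-word check.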
Now, let's use this in a substitution.
s/([bcdfghjklmnpqrstvwxyz])uk\b/$1ok/g
or
s/(?<=[bcdfghjklmnpqrstvwxyz])uk\b/ok/g
or
s/[bcdfghjklmnpqrstvwxyz]\Kuk\b/ok/g
The last one is the most efficient, but \K requires Perl 5.10+. (That shouldn't be a problem given how old that release is.)
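As a quick check that the three forms behave the same on the OP's input, here is a small self-contained sketch. The hash %variants and its labels are just names made up for this example; each entry works on a copy of the text so the original string is left alone.

```perl
use strict;
use warnings;

my $consonant = qr/[bcdfghjklmnpqrstvwxyz]/;

# The three substitutions from above, each applied to a copy of its argument.
my %variants = (
    capture    => sub { my $t = shift; $t =~ s/($consonant)uk\b/$1ok/g;  return $t },
    lookbehind => sub { my $t = shift; $t =~ s/(?<=$consonant)uk\b/ok/g; return $t },
    keep       => sub { my $t = shift; $t =~ s/$consonant\Kuk\b/ok/g;    return $t },  # needs 5.10+
);

for my $name (sort keys %variants) {
    printf "%-10s %s\n", $name, $variants{$name}->("duduk, rujuk");
}
```

All three lines should print dudok, rujok, which is the output the OP expected.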