I'm looking for a quick Bash script to convert British / New Zealand spellings to American in a TeX document (for working with US-based academics and journal submission). This is a formal mathematical biology paper with very little regional terminology or grammar: prior work is given as formulae rather than quotes.
e.g.,
Generalise -> Generalize
Colour -> Color
Centre -> Center
I figure there must be a sed- or awk-based script to substitute most of the common spelling differences.
See the related TeX forum question for more detail.
https://tex.stackexchange.com/questions/312138/converting-uk-to-us-spellings
N.B. I currently compile with pdfLaTeX in Kile on Ubuntu 16.04 or Elementary OS 0.3 Freya, but I can use another TeX compiler/package if there's a built-in fix elsewhere.
Thanks for your assistance.
I think you need to keep a list of substitutions handy and apply it to the document. You would have to enrich the dictionary file over time for it to translate text files effectively.
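#!/bin/bash
# Each dictionary line holds two words: <word to find> <replacement>
sourceFile=$1
dict=$2
while read -r word updatedWord rest
do
    # Skip blank or incomplete dictionary lines
    [ -n "$updatedWord" ] || continue
    # \b (GNU sed) restricts the match to whole words, so "liter"
    # does not rewrite part of "literature"
    sed -i "s/\b$word\b/$updatedWord/g" "$sourceFile" 2>/dev/null
done < "$dict"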
Run the above script like:
./scriptName source.txt dictionary.txt
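Calling sed once per dictionary line rewrites the whole file for every entry. As a minimal sketch, assuming GNU sed (for `\b` and `-i`) and a dictionary with exactly two plain words per line, the dictionary can instead be turned into a single sed script and applied in one pass. The names `demo.dict`, `demo.tex` and `rules.sed` are placeholders for illustration only.

```shell
#!/bin/bash
# Sketch: build one sed script from a two-column dictionary and apply it once.
# Assumes GNU sed and awk; all file names below are illustrative placeholders.
workdir=$(mktemp -d)

printf '%s\n' 'color colour' 'fiber fibre' 'liter litre' > "$workdir/demo.dict"
printf '%s\n' 'The color of a fiber in the literature.' > "$workdir/demo.tex"

# Emit one "s/\bfind\b/replace/g" rule per dictionary line.
# Lines without exactly two words, and identical pairs, are skipped.
awk 'NF == 2 && $1 != $2 { printf "s/\\b%s\\b/%s/g\n", $1, $2 }' \
    "$workdir/demo.dict" > "$workdir/rules.sed"

# Apply every rule in a single pass over the document.
sed -i -f "$workdir/rules.sed" "$workdir/demo.tex"
cat "$workdir/demo.tex"
```

Because the rules match whole words only, "literature" is left untouched even though it contains "liter".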
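Here is one sample dictionary I used (first column: the word to find; second column: its replacement; in this sample, American to British):
>cat dict.txt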
characterize characterise
prioritize prioritise
specialize specialise
analyze analyse
catalyze catalyse
size size
exercise exercise
behavior behaviour
color colour
favor favour
contour contour
center centre
fiber fibre
liter litre
parameter parameter
ameba amoeba
anesthesia anaesthesia
diarrhea diarrhoea
esophagus oesophagus
leukemia leukaemia
cesium caesium
defense defence
practice practice
license licence
defensive defensive
advice advice
aging ageing
acknowledgment acknowledgement
judgment judgement
analog analogue
dialog dialogue
fulfill fulfil
enroll enrol
skill skill
skillful skilful
labeled labelled
signaling signalling
propelled propelled
revealing revealing
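For the British-to-American direction asked about in the question, the two columns need to be swapped. A minimal sketch, assuming one pair of words per line; `uk2us.dict` is an illustrative file name, and the three sample entries stand in for the full list:

```shell
#!/bin/bash
# Sketch: derive a British-to-American dictionary from an
# American-to-British one by swapping the columns.
workdir=$(mktemp -d)
printf '%s\n' 'color colour' 'size size' 'center centre' > "$workdir/dict.txt"

# Print column 2 before column 1. Lines that are not exactly two words
# are skipped, as are identical pairs (they would be no-op substitutions).
awk 'NF == 2 && $1 != $2 { print $2, $1 }' "$workdir/dict.txt" > "$workdir/uk2us.dict"
cat "$workdir/uk2us.dict"
```

The resulting file can then be passed to the script as its second argument in place of the original dictionary.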
Execution result:
>cat source.txt
color of this fiber is great and we should analyze it.
>./scriptName source.txt dict.txt
>cat source.txt
colour of this fibre is great and we should analyse it.
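The substitutions are case-sensitive, so a sentence-initial word such as "Colour" would not be matched by a lowercase dictionary entry. One possible workaround, sketched here under the assumptions of GNU sed and ASCII-only words, is to add a capitalized copy of every entry before running the loop. The helper `cap` and the file names are hypothetical.

```shell
#!/bin/bash
# Sketch: extend a dictionary with capitalized variants, then apply it.
# Assumes GNU sed and ASCII words; file names are illustrative placeholders.
workdir=$(mktemp -d)
printf '%s\n' 'colour color' 'generalise generalize' > "$workdir/uk2us.dict"
printf '%s\n' 'Colour maps generalise well.' > "$workdir/demo.tex"

# cap() upper-cases the first letter of a word; each entry is written
# twice, once as given and once capitalized.
awk 'function cap(w) { return toupper(substr(w, 1, 1)) substr(w, 2) }
     NF == 2 { print $1, $2; print cap($1), cap($2) }' \
    "$workdir/uk2us.dict" > "$workdir/uk2us-caps.dict"

# Same whole-word substitution loop, now covering both cases.
while read -r word updatedWord
do
    sed -i "s/\b$word\b/$updatedWord/g" "$workdir/demo.tex"
done < "$workdir/uk2us-caps.dict"
cat "$workdir/demo.tex"
```

Inflected forms such as "generalised" are separate whole words and would still need their own dictionary entries.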