
Greek syllabification library for JavaScript


Does anyone know a good syllabification library or script for Greek, written in JavaScript? I tried Hyphenator.js, but the results were poor:

<script src="Hyphenator.js" type="text/javascript"></script>
<script src="patterns/grc.js" type="text/javascript"></script>

<script type="text/javascript">
    var hyphenchar = '|';
    Hyphenator.config({hyphenchar:hyphenchar});
    var t = 'αποκαλυψις ιησου χριστου ην εδωκεν αυτω ο θεος δειξαι τοις δουλοις αυτου α δει γενεσθαι εν ταχει και εσημανεν αποστειλας δια του αγγελου αυτου τω δουλω αυτου ιωαννη'.split(" ").map(function(word){return Hyphenator.hyphenate(word, 'grc')});
    console.log(t);
</script>

This outputs:

["απο|κα|λυ|ψις", "ιησου", "χρι|στου", "ην", "εδω|κεν", "αυτω", "ο", "θεος", "δει|ξαι", "τοις", "δου|λοις", "αυτου", "α", "δει", "γε|νε|σθαι", "εν", "ταχει", "και", "εση|μα|νεν", "απο|στει|λας", "δια", "του", "αγ|γε|λου", "αυτου", "τω", "δουλω", "αυτου", "ιω|αν|νη"]

This shows that the hyphenation patterns do not work perfectly for syllabification, although the result may be acceptable for hyphenation itself.

Later addition after comments:

I expected the library to hyphenate "iesou" (ιησου) and "theos" (θεος), but it turns out that there is a minwordlength setting that controls the minimum length of words to hyphenate. Setting it to 2 gives better results. Several sources say that automatic hyphenation and syllabification are never 100% exact, for a number of reasons, but this is good enough for me at this point.


Solution

  • As noted in the comments, short words are not hyphenated by default, since breaking them makes no typographical sense. However, hyphenation of short words can be forced by lowering minwordlength:

    Hyphenator.config({hyphenchar:hyphenchar, minwordlength:1});
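
If hyphenation patterns still turn out to be too coarse for syllabification, a small rule-based splitter is one possible alternative. The following is only a rough sketch, not part of Hyphenator.js: the function name syllabify, the diphthong list, and the onset list are all assumptions made for illustration. It assumes unaccented, lowercase input like the text in the question. It treats a diphthong as a single vowel, and it moves to the next syllable the longest run of consonants that could plausibly begin a Greek word. The onset list is hand-picked and incomplete, and without diacritics the code cannot tell a diphthong from two separate vowels.

```javascript
// Hypothetical helper for illustration; not part of Hyphenator.js.
// Assumes unaccented, lowercase Greek input.
const VOWELS = 'αεηιουω';
const DIPHTHONGS = ['αι', 'ει', 'οι', 'υι', 'αυ', 'ευ', 'ηυ', 'ου'];
// Consonant clusters assumed to be able to start a syllable (incomplete list).
const ONSETS = new Set([
  'βλ', 'βρ', 'γλ', 'γν', 'γρ', 'δρ', 'θλ', 'θν', 'θρ', 'κλ', 'κν', 'κρ',
  'κτ', 'μν', 'πλ', 'πν', 'πρ', 'πτ', 'σβ', 'σθ', 'σκ', 'σμ', 'σπ', 'στ',
  'σφ', 'σχ', 'τρ', 'φθ', 'φλ', 'φρ', 'χθ', 'χλ', 'χν', 'χρ',
  'σκλ', 'σπλ', 'σπρ', 'στρ'
]);

function syllabify(word, sep = '|') {
  // Step 1: split the word into units; a diphthong counts as one unit.
  const units = [];
  for (let i = 0; i < word.length; i++) {
    const pair = word.slice(i, i + 2);
    if (DIPHTHONGS.includes(pair)) {
      units.push(pair);
      i++; // skip the second letter of the diphthong
    } else {
      units.push(word[i]);
    }
  }
  const isVowel = (unit) => VOWELS.includes(unit[0]);

  // Step 2: locate the vowel units (syllable nuclei).
  const nuclei = [];
  units.forEach((unit, i) => { if (isVowel(unit)) nuclei.push(i); });
  if (nuclei.length < 2) return word; // nothing to split

  // Step 3: between two nuclei, give the next syllable the longest
  // consonant run that is a single consonant or a listed onset.
  const breaks = [];
  for (let n = 1; n < nuclei.length; n++) {
    const start = nuclei[n - 1] + 1;
    const end = nuclei[n];
    let split = end; // adjacent vowels: break directly between them
    for (let k = start; k < end; k++) {
      const cluster = units.slice(k, end).join('');
      if (cluster.length === 1 || ONSETS.has(cluster)) {
        split = k;
        break;
      }
    }
    breaks.push(split);
  }

  // Step 4: cut the unit list at the computed break points.
  const syllables = [];
  let prev = 0;
  for (const b of breaks) {
    syllables.push(units.slice(prev, b).join(''));
    prev = b;
  }
  syllables.push(units.slice(prev).join(''));
  return syllables.join(sep);
}

console.log(['ιησου', 'θεος', 'χριστου'].map((w) => syllabify(w)));
// → [ 'ι|η|σου', 'θε|ος', 'χρι|στου' ]
```

Unlike the pattern-based output above, this splits short words such as "ιησου" and "θεος" regardless of length. For accented or polytonic text, the vowel and diphthong handling would need to be extended.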