postgresql database-design internationalization locale collation

Best practise for storing multilingual strings

I need to store different versions of not very long strings for different languages (2-4 languages) in a Postgres table.

What is the best way of doing that? Array or JSON or something like that?

Solution

First make sure that the database locale can deal with different languages. Use a UTF-8 server-encoding. Optionally, set LC_COLLATE = 'C' to be on neutral ground, or use the collation of your main language to have a default sort order. Start by reading the chapter Collation Support in the manual.

I would strongly suggest to use the latest version of Postgres (9.1 at time of writing) for superior collation support.

As for the table structure: keep it simple. It sounds like there is a low, fixed number of languages to deal with. You could just have a column for each language then:

CREATE TABLE txt (
  txt_id serial PRIMARY KEY
, txt    text NOT NULL -- master language NOT NULL?
, txt_fr text -- others can be NULL?
, txt_es text
, txt_de text
);

This is pretty efficient, even with many languages. NULL storage is very cheap.
If you have a varying number of languages to deal with, or many updates for individual language strings, a separate table might be the better solution. This solution assumes that you have a "primary language", where the string is always present:

CREATE TABLE txt (
  txt_id serial PRIMARY KEY
, txt    text NOT NULL -- master language NOT NULL?
);

CREATE TABLE lang (
  lang_abbr text PRIMARY KEY -- de, es, fr, ...
, lang      text NOT NULL
, note      text
);

CREATE TABLE txt_trans (
  txt_id    int  REFERENCES txt(txt_id) ON UPDATE CASCADE ON DELETE CASCADE
, lang_abbr text REFERENCES lang(lang_abbr) ON UPDATE CASCADE
, txt       text NOT NULL -- master language NOT NULL?
, CONSTRAINT txt_trans_pkey PRIMARY KEY (txt_id, lang_abbr)
);

Not treating the master language special and keeping all language variants in the same table might make handling in your app simpler. But it really depends on requirements.