I need to store several language versions (2-4 languages) of fairly short strings in a Postgres table.
What is the best way to do that? An array, JSON, or something else?
First, make sure that the database locale can deal with different languages. Use a UTF8 server encoding. Optionally, set LC_COLLATE = 'C'
to be on neutral ground, or use the collation of your main language to get a default sort order. Start by reading the chapter Collation Support in the manual.
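A minimal sketch of how that looks at database creation time (the database name and locale values are placeholders; cloning template0 is needed when the locale differs from the template database):

CREATE DATABASE mydb
  TEMPLATE template0
  ENCODING 'UTF8'
  LC_COLLATE 'C'
  LC_CTYPE 'C';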
I would strongly suggest using the latest version of Postgres (9.1 at the time of writing) for its superior collation support.
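With 9.1 you can also apply a collation per expression, which is handy when you want a language-specific sort order for just one query. A sketch (table, column, and collation names are only for illustration; the collation must exist on your system):

SELECT my_column
FROM   my_table
ORDER  BY my_column COLLATE "de_DE";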
As for the table structure: keep it simple. It sounds like there is a low, fixed number of languages to deal with. You could just have a column for each language:
CREATE TABLE txt (
txt_id serial PRIMARY KEY
, txt text NOT NULL -- master language, probably NOT NULL
, txt_fr text       -- translations can be NULL
, txt_es text
, txt_de text
);
This is pretty efficient, even with many languages. NULL storage is very cheap.
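To read one language with a fallback to the master column, COALESCE does the trick. A sketch (not part of the design above; 'de' stands in for the requested language):

SELECT txt_id
     , COALESCE(txt_de, txt) AS txt_display  -- German if available, else master
FROM   txt;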
If you have a varying number of languages to deal with, or many updates to individual language strings, a separate table might be the better solution. This design assumes that you have a "master language" whose string is always present:
CREATE TABLE txt (
txt_id serial PRIMARY KEY
, txt text NOT NULL -- master language, probably NOT NULL
);
CREATE TABLE lang (
lang_abbr text PRIMARY KEY -- de, es, fr, ...
, lang text NOT NULL
, note text
);
CREATE TABLE txt_trans (
txt_id int REFERENCES txt(txt_id) ON UPDATE CASCADE ON DELETE CASCADE
, lang_abbr text REFERENCES lang(lang_abbr) ON UPDATE CASCADE
, txt text NOT NULL -- translated string
, CONSTRAINT txt_trans_pkey PRIMARY KEY (txt_id, lang_abbr)
);
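To fetch a given language from this layout with a fallback to the master string, use a LEFT JOIN. A sketch ('de' is just an example value; the alias names are mine):

SELECT t.txt_id
     , COALESCE(tt.txt, t.txt) AS txt_display  -- translation if present, else master
FROM   txt t
LEFT   JOIN txt_trans tt ON tt.txt_id = t.txt_id
                        AND tt.lang_abbr = 'de';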
Not treating the master language as special and keeping all language variants in the same table might make handling in your app simpler. But it really depends on your requirements.
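If you prefer that route, a minimal variant (my sketch, not a full design) shrinks the master table to just the identity and keeps every language version, master included, as rows in txt_trans:

CREATE TABLE txt (
  txt_id serial PRIMARY KEY  -- only the identity; all strings live in txt_trans
);

-- example rows (placeholder values); matching rows in txt and lang must exist
INSERT INTO txt_trans (txt_id, lang_abbr, txt)
VALUES (1, 'en', 'Hello')
     , (1, 'de', 'Hallo')
     , (1, 'fr', 'Bonjour');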