bash sed locale

List locale names without duplicates


I would like to list all the locale names from the file /etc/locale.gen without having duplicates. I don't really know how to do it.

I started by skipping the top of the file, like so:

sed -n -e '/aa_DJ/,$p' /etc/locale.gen

It prints every line from the first aa_DJ entry to the end of the file. I would like the output to look like this:

[...]
fr_FR
en_US
en_GB
[...]

That is, without the leading # and without whatever follows the locale name (for example, everything after fr_FR), all in a single command.

EDIT 1:

I may have found something with grep:

sed -n -e '/aa_DJ/,$p' /etc/locale.gen | grep {,1}

EDIT 2:

Here is the file: http://pastebin.com/i227sTV2


Solution

  • This should do it:

    awk -F "[ .@]" '/_|eo|ia/{sub("^# *",""); print $1}' /etc/locale.gen | sort -u
    

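    The field separator "[ .@]" splits each line at the first space, dot, or @, so $1 holds only the language_country part (for example en_US) and everything after it is dropped. The sub() call strips the leading "# " from commented-out entries, and sort -u removes the duplicates.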


    Debian packages a source locale.gen file; it is only an example of what your file should look like and is not needed to run the command above. A full list of locales is in the file /locales_2.22-5_all/usr/share/i18n/SUPPORTED, extracted from the compressed deb file; it contains 281 unique locale names.

    Updated: replaced gsub with sub, so the command should run on any awk.
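
To see how the field separator and sort -u interact, here is a minimal sketch that runs the same awk program on a few sample lines instead of the real file. The sample lines are made up for illustration, in the style of /etc/locale.gen, and the variable names sample and result are hypothetical.

```shell
# Hypothetical sample lines in the style of /etc/locale.gen (not the real file).
sample='# aa_DJ ISO-8859-1
# aa_DJ.UTF-8 UTF-8
en_US.UTF-8 UTF-8
# en_US ISO-8859-1
# fr_FR@euro ISO-8859-15
# eo UTF-8'

# Same awk program as in the answer, reading from stdin instead of the file:
#  - /_|eo|ia/ selects lines that look like locale entries
#  - sub() strips a leading "#" and any spaces after it
#  - with -F "[ .@]", $1 is the text before the first space, dot, or @
result=$(printf '%s\n' "$sample" |
  awk -F "[ .@]" '/_|eo|ia/{sub("^# *",""); print $1}' | sort -u)

printf '%s\n' "$result"
```

On this sample it should print aa_DJ, en_US, eo and fr_FR, one per line: the commented and uncommented variants of the same locale collapse into a single entry.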