List the locales without duplicates

I would like to list all the locale names from the file /etc/locale.gen without having duplicates. I don't really know how to do it.

I started by skipping the top of the file like so:

sed -n -e '/aa_DJ/,$p' /etc/locale.gen

It prints every line from that point on. I would like the output to look like this:

[...]
fr_FR
en_US
en_GB
[...]

That is, without the leading # and without anything that follows the locale name (fr_FR, for example), all in one single command.

EDIT 1 :

I may have found something with grep :

sed -n -e '/aa_DJ/,$p' /etc/locale.gen | grep {,1}

EDIT 2 :

Here is the file: http://pastebin.com/i227sTV2

Solution

This should do it:

awk -F "[ .@]" '/_|eo|ia/{sub("^# *",""); print $1}' /etc/locale.gen | sort -u

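The field separator "[ .@]" splits each line at every space, dot or @, so $1 holds only the language_country part (en_US) and everything after it is dropped. The sub() call strips the leading "# " from commented lines first, and sort -u removes the duplicates.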
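To see what each part does, here is a minimal sketch that runs the same awk and sort pipeline on a few made-up lines instead of the real file. The sample lines are only an assumption about what /etc/locale.gen typically looks like, and the variable names sample and result are just for illustration:

```shell
# Hypothetical sample in the style of /etc/locale.gen (not the real file)
sample='# en_GB.UTF-8 UTF-8
# en_GB ISO-8859-1
en_US.UTF-8 UTF-8
# fr_FR@euro ISO-8859-15
# fr_FR.UTF-8 UTF-8'

# Same pipeline as above, reading from the sample instead of the file:
#   sub() strips the leading "# ", which makes awk re-split the line,
#   so $1 is the text before the first space, dot or @.
result=$(printf '%s\n' "$sample" |
  awk -F "[ .@]" '/_|eo|ia/{sub("^# *",""); print $1}' | sort -u)

printf '%s\n' "$result"
```

With this sample the five lines collapse to three names: en_GB, en_US and fr_FR, one per line.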

A source locale.gen file is packaged by Debian here (it is only an example of the file you should have; it is not needed to run the command above). A full list of locales is in the file /locales_2.22-5_all/usr/share/i18n/SUPPORTED, extracted from the compressed deb file; it contains 281 unique locale names.

Updated: replaced gsub with sub, so the command should run on any awk.