How to sort column and row names in pandas crosstab?

I have this pandas crosstab:

I would like to reorder the column names in such a way, that it goes like [ar, cs, de, en, es, fr, hi, it, ja, ko, pl, pt, ru, tr, zh, be, gl, la, nn, sa, ur, ...] -> eg., those elemets with high numbers are at the diagonal. I tried multiindexing, rearranging those values that go into the pandas dataframe, but cannot figure it out.

I use the pd.crosstab like

orig = ['ar', 'ar', 'ar', 'ar', 'ar', 'ar', 'ar', 'ar', 'ar', 'ar', ...]
pred = ['ar', 'ar', 'ar', 'ar', 'ar', 'ar', 'ar', 'ar', 'ar', 'tr', ...]
self.df_confusion = pd.crosstab(orig , pred )

Solution

Just reindex on both axes:

order = ['ar', 'cs', 'de', 'en', 'es', 'fr', 'hi', 'it',
         'ja', 'ko', 'pl', 'pt', 'ru', 'tr', 'zh', 'be',
         'gl', 'la', 'nn', 'sa', 'ur']

df_confusion = df_confusion.reindex(index=order, columns=order)

If you want to add potentially missing values:

df_confusion = df_confusion.reindex(index=order, columns=order, fill_value=0)

Or, using a Categorical:

df_confusion = pd.crosstab(pd.Categorical(orig, categories=order, ordered=True),
                           pd.Categorical(pred, categories=order, ordered=True))