The output of my algorithm consists of certain strings. I need to visualize them in a time-height plot with colors defined by those strings. So far, so good: I convert the strings to categorical and can choose my colors freely.
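from matplotlib import pyplot as plt
from matplotlib import cm
import pandas as pd
import numpy as np
num_hydrometeor = 8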
ncar_cmap = cm.get_cmap('gist_ncar_r', num_hydrometeor)
colors = {'AG':'chocolate','IC':'orange','DN':'yellowgreen','OT':'grey','WS':'r','FZ':'rosybrown','RN':'teal','IP':'cyan',np.nan:'white'}
a = np.linspace(0,18,400)
beam_height_test = np.sort(np.random.choice(a,size=180))
times = pd.date_range('1/1/2020', periods = 288, freq ='5min')
C = np.array(['WS', 'OT', 'FZ', np.nan, 'AG', 'IC'],dtype=object)
test_dist_hca = np.random.choice(C,size=(len(beam_height_test),len(times)))
test_dist_hca_cat = pd.Series(data=test_dist_hca.flatten()).astype('category')
test_dist_hca_cat = test_dist_hca_cat.cat.codes
test_dist_hca_cat = test_dist_hca_cat.values
test_dist_hca_cat = test_dist_hca_cat.reshape((len(beam_height_test),len(times)))
cols = []
a = pd.Series(data=test_dist_hca.flatten()).sort_values().unique()
for hc in a:
    cols.append(colors[hc])
ncar_cmap = cm.colors.ListedColormap(cols)
levels = np.unique(test_dist_hca_cat)
plt.figure(figsize=(40,10))
plt.pcolormesh(times,beam_height_test,test_dist_hca_cat,cmap=ncar_cmap,norm = cm.colors.BoundaryNorm(levels, ncolors=ncar_cmap.N, clip=False))
plt.colorbar()
plt.savefig("hmc_daily_test.png")
When I apply this to my real output, it looks like this:
Does anyone have an idea what I am doing wrong? The algorithm output comes from a pandas DataFrame and is processed the same way as the pandas.Series in the minimal example.
To find out what's happening, I reduced the sizes. I also created a scatter plot where the colors are taken directly from the dictionary, without the detour via .astype('category').
It seems the nan complicates things somewhat, because it gets category number -1. Therefore, it needs to be treated separately from the rest, and the range for the colors has to start at -1.
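To illustrate the point about the -1 code, here is a minimal sketch with made-up sample labels (the `labels` array is an assumption, not part of the original data). It only shows how pandas assigns category codes when a nan is present:

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; nan stands for "no classification".
labels = np.array(['WS', 'OT', np.nan, 'AG', 'WS'], dtype=object)

cat = pd.Series(labels).astype('category')
categories = list(cat.cat.categories)  # nan is not a category
codes = cat.cat.codes.to_numpy()       # nan gets the code -1

print(categories)  # ['AG', 'OT', 'WS']
print(codes)       # [ 2  1 -1  0  2]
```

Because nan is not among the categories, a color list built from the sorted unique values does not line up with the codes unless the nan color is placed first.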
To get the ticks for the colorbar exactly in the center of each color, its range (-1 to 4 in this case) is divided into 12 equal parts, after which only every second division point is kept, starting with the first one after -1.
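The tick arithmetic can be checked in isolation. This is a small sketch using the values from this case (codes -1 to 4, so six colors); the variable names are only illustrative:

```python
import numpy as np

vmin, vmax = -1, 4          # nan code (-1) plus five label codes (0..4)
n_colors = vmax - vmin + 1  # 6 colors in the colormap

# Twice as many division points as colors; the odd-indexed ones are the band centers.
fine = np.linspace(vmin, vmax, 2 * n_colors, endpoint=False)
ticks = fine[1::2]

# The same centers computed directly from the width of one color band.
band = (vmax - vmin) / n_colors
centers = vmin + band * (np.arange(n_colors) + 0.5)

print(ticks)
print(centers)
```

Both arrays should agree, which suggests the slicing with `[1::2]` is just a compact way of writing "half a band width past each band's lower edge".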
Here is what the final test code looks like:
from matplotlib import pyplot as plt
from matplotlib import cm
import pandas as pd
import numpy as np
colors = {'AG': 'chocolate', 'IC': 'orange', 'DN': 'yellowgreen', 'OT': 'grey', 'WS': 'r', 'FZ': 'rosybrown',
          'RN': 'teal', 'IP': 'cyan', np.nan: 'white'}
# small test data: 10 heights, 12 time steps, random class labels (including nan)
a = np.linspace(0, 18, 25)
beam_height_test = np.sort(np.random.choice(a, replace=False, size=10))
times = pd.date_range('1/1/2020', periods=12, freq='5min')
C = np.array(['WS', 'OT', 'FZ', np.nan, 'AG', 'IC'], dtype=object)
test_dist_hca = np.random.choice(C, size=(len(beam_height_test), len(times)))
# reference plot: colors taken directly from the dictionary, with text labels
plt.figure(figsize=(14, 7))
plt.scatter(np.tile(times, len(beam_height_test)),
            np.repeat(beam_height_test, len(times)),
            c=[colors[h] for h in test_dist_hca.flatten()])
for i, x in enumerate(times):
    for j, y in enumerate(beam_height_test):
        plt.text(x, y, test_dist_hca[j][i])
plt.show()
# convert the labels to category codes; nan gets the code -1
test_dist_hca_cat = pd.Series(data=test_dist_hca.flatten()).astype('category')
test_dist_hca_cat = test_dist_hca_cat.cat.codes
test_dist_hca_cat = test_dist_hca_cat.values
test_dist_hca_cat = test_dist_hca_cat.reshape((len(beam_height_test), len(times)))
# the nan color comes first (code -1), followed by the colors of the sorted labels
used_colors = [colors[np.nan]]
a = pd.Series(data=test_dist_hca.flatten()).sort_values().unique()  # nan is sorted last
for hc in a:
    if isinstance(hc, str):
        used_colors.append(colors[hc])
cmap = cm.colors.ListedColormap(used_colors)
plt.figure(figsize=(14, 7))
plt.pcolormesh(times, beam_height_test, test_dist_hca_cat,
               cmap=cmap,
               norm=plt.Normalize(vmin=-1, vmax=len(a) - 2))
# ticks at the center of each color band
cbar = plt.colorbar(ticks=np.linspace(-1, len(a) - 2, 2 * len(a), endpoint=False)[1::2])
cbar.ax.set_yticklabels(['nan'] + list(a[:-1]))
plt.show()
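As a quick sanity check of the color mapping used with pcolormesh, the following sketch feeds a few codes through a ListedColormap and Normalize and prints the resulting RGBA values. The color list and codes here are a reduced, hypothetical setup (nan color first, then three label colors), not the full data:

```python
import numpy as np
from matplotlib.colors import ListedColormap, Normalize, to_rgba

# Hypothetical setup: nan color first, then one color per category code 0..2.
used_colors = ['white', 'chocolate', 'grey', 'red']
codes = np.array([-1, 0, 1, 2])

cmap = ListedColormap(used_colors)
# Same convention as in the plotting code: vmin is the nan code, vmax the last label code.
norm = Normalize(vmin=-1, vmax=len(used_colors) - 2)

rgba = cmap(norm(codes))  # one RGBA row per code
for code, color in zip(codes, rgba):
    print(code, tuple(color))
```

If the mapping is set up as intended, code -1 should come out white and codes 0, 1, 2 should get the remaining colors in order.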
Here is what the pcolormesh with the color bar looks like:
And the corresponding scatter plot with the text annotations:
Note that the colors and the names correspond. As explained in the pcolormesh docs, pcolormesh ignores the last row and column when the X and Y sizes aren't 1 larger than the mesh.