I have a DataFrame that contains the pixels of several greyscale images. It has two columns: n,
which denotes the image the pixel belongs to, and pixel, which
denotes how dark the pixel is. When I plot the pixels with
plt.figure()
ggplot(aes(x='pixel'), data=pixelDF) + \
geom_histogram(binwidth=8) + \
xlab('pixels') + \
ylab('') + \
ggtitle('Histogram of pixels') + \
scale_y_log() + \
facet_grid(y='n')
I get one picture, but when I transform the data first with
def my_historgram(to_histogram):
    histogram = np.histogram(to_histogram, bins=32, range=(0, 255), weights=None, density=False)
    return (histogram)

def get_pixel(df, i):
    return (df.loc[df['n'] == i]['pixel'])

def hist_calc(hist):
    return(np.log(hist) / sum(np.log(hist)))

imageNr = pixelDF['n'].drop_duplicates().tolist()
hist, bin_edges = my_historgram(get_pixel(pixelDF, imageNr[0]))
histograms = pd.DataFrame({
    'binNr': range(len(hist)),
    'binValue_' + str(imageNr[0]): pd.Series(hist_calc(hist))}).set_index('binNr')
for i in imageNr[1:]:
    hist, bin_edges = my_historgram(get_pixel(pixelDF, i))
    histogram = pd.DataFrame({
        'binNr': range(len(hist)),
        'binValue_' + str(i): pd.Series(hist_calc(hist))}).set_index('binNr')
    histograms = histograms.join(histogram)
histograms = histograms.reset_index()
### Print new type of Histogram
plt.figure()
plotDF = pd.melt(histograms, id_vars=['binNr'], var_name='imageNr', value_name='binValue')
ggplot(aes(x='factor(binNr)', weight='binValue'), data=plotDF) + \
geom_bar() + \
xlab('binNr') + \
ylab('') + \
ggtitle('Histograms of pixels') + \
facet_grid(y='imageNr')
I get a pretty different picture:
Why is that? What am I doing wrong in the processing for the second picture?
Thanks to jeremycg, who commented: "It looks like your version has treated the binNr as a categorical variable, and needs to be sorted."
The solution is simply to get rid of factor() in the last ggplot call, i.e. to use
aes(x='binNr', weight='binValue') so that binNr stays numeric.
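
To illustrate the comment, here is a minimal sketch of one way the bin order can get scrambled when bin numbers are handled as labels instead of numbers. I have not checked how ggplot orders factor levels internally, so this is an assumption about the mechanism, not a description of the library. The bin_nr series is a hypothetical stand-in for the binNr column above.

```python
import pandas as pd

# Hypothetical stand-in for the binNr column: 32 bins, numbered 0..31.
bin_nr = pd.Series(range(32), name="binNr")

# Treated as numbers, the bins sort in their natural order.
numeric_order = sorted(bin_nr)

# Treated as text labels, they sort lexicographically,
# so '10' comes before '2'.
label_order = sorted(bin_nr.astype(str))

print(numeric_order[:5])  # [0, 1, 2, 3, 4]
print(label_order[:5])    # ['0', '1', '10', '11', '12']
```

If the bars end up in an order like the second one, neighbouring bars no longer correspond to neighbouring grey levels, which would explain why the two pictures look so different.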