python pandas visualization jupyter noise

Is there a smart way to group insignificant data into "other" for a pie chart, in pandas or in the plotting library?

My data can contain a lot of insignificant edge cases and noise. I want a pie chart (using Bokeh or any other free, open source plotting library) that lets me view data like this:

type size
 S    1
 V    2
 T    200
 ...
 Z    3333

reduced to its core, with the insignificant noise (types that make up < 1% of the total size) lumped into a new "other" type.

1) Can pandas do this on its own? How? 2) Does any visualization library already have such a feature built in?

Solution

Consider the pandas Series a, which holds the count of each value:

import pandas as pd
import numpy as np
from string import ascii_uppercase

np.random.seed([3,1415])
types = np.random.permutation(list(ascii_uppercase))
r = np.arange(1, 27)
r = r / r.sum()
s = np.random.choice(types, 10000, p=r)

# pd.value_counts is deprecated in recent pandas; use the Series method instead
a = pd.Series(s).value_counts()

a.plot.pie(colormap='jet');

Now combine every group whose share of the total is below 3% into a single group called other (use .01 instead of .03 for the 1% cutoff from the question):

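# share of the total for each type
n = a / a.sum()

# True for the types that fall below the 3% cutoff
f = n < .03

# Series.append was removed in pandas 2.0; pd.concat does the same job
pd.concat([a[~f], pd.Series(a[f].sum(), ['other'])]).plot.pie(colormap='jet')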
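If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the function name lump_small and its parameters are made up for illustration (not a pandas API), and the sample data uses just the four rows that are visible in the question.

```python
import pandas as pd


def lump_small(counts, threshold=0.01, label="other"):
    """Collapse entries whose share of the total is below `threshold`
    into a single entry named `label` (hypothetical helper, not part of pandas)."""
    share = counts / counts.sum()
    small = share < threshold
    if not small.any():
        # nothing falls below the cutoff, so return the data unchanged
        return counts.copy()
    other = pd.Series({label: counts[small].sum()})
    return pd.concat([counts[~small], other])


# only the rows shown in the question; the elided rows are left out
sizes = pd.Series({"S": 1, "V": 2, "T": 200, "Z": 3333})

reduced = lump_small(sizes, threshold=0.01)
print(reduced)
```

With these four rows, S and V each make up well under 1% of the total, so they end up in other while T and Z keep their own slices. Calling reduced.plot.pie() afterwards should then draw the simplified chart, assuming matplotlib is installed.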