Dynamically create bins from Pandas column such that No. of observations or count should be 1% of total count

I have pandas DataFrame of numeric columns. Currently, I'm working on one numeric column.

Total rows 34149
Mean 4112195398.901
Min -1687500000
Median 44531800
Max 56956420100000

This data does not contain any zeros(0)

I want to create integer bins as follows

 1. -1 for all negative values 
 2. Bin width 1000000 upto 20000000
 3. After 20000000, dynamically size bins to contain 1% of the total count or NOBS(No. of Observations) 
i.e. if NOBS is going below 1% of total NOBS the dynamically adjust the bin width such that it will be approx 1% of total NOBS.

I'm able to create integer bins upto the second condition mentioned above. Stuck at last condition.

Here is my code

df_ = df['Col1'].dropna()
df_1 = pd.Series(np.where(df_<0, -1, pd.Series(df_/1000000).astype(int)))

df_2 = pd.DataFrame(df_1, columns=['Bins'])
df_bins = df_2.value_counts(sort=False).reset_index(name='NOBS')
df_bins['Percentage'] = (df_bins['NOBS']/df_bins['NOBS'].sum())*100

The output of df_bins is

    Bins NOBS Percentage
   -1    195   0.943
    0   2155  10.420
    1   1594   7.707
    2   1209   5.846
    3    970   4.690
    .
    .
    .
    .
    .
    .
    .

The total of Percentage column will be 100% always. All bins will be integers.

Solution

What you expect is not clear but I think you can use pd.cut:

import pandas as pd
import numpy as np

# Reproducible example
rng = np.random.default_rng(42)
a = rng.integers(-1687500000, 0, 5000)
b = rng.integers(0, 20000000, 25000)
c = rng.integers(20000000, 56956420100000, 4149)
df = pd.DataFrame({'Col1': np.concatenate([a, b, c])})

# Create bins
m = df['Col1'] > 20000000
q = np.linspace(0, 1, 100 * sum(m) // len(df))
bins = [-np.inf] + np.arange(0, 20000000+1, 1000000).tolist()
bins += df.loc[m, 'Col1'].quantile(q).tolist()[1:]

out = (pd.cut(df['Col1'].dropna(), bins, labels=False).sub(1)
         .value_counts().rename('NOBS')
         .sort_index().rename_axis('Bins').reset_index()
         .assign(Percentage=lambda x: x['NOBS'] / len(df) * 100))

Output:

>>> out
    Bins  NOBS  Percentage
0     -1  5000   14.641717
1      0  1274    3.730710
2      1  1169    3.423233
3      2  1287    3.768778
4      3  1241    3.634074
5      4  1296    3.795133
6      5  1246    3.648716
7      6  1288    3.771706
8      7  1201    3.516940
9      8  1289    3.774635
10     9  1174    3.437875
11    10  1273    3.727781
12    11  1246    3.648716
13    12  1277    3.739495
14    13  1292    3.783420
15    14  1259    3.686784
16    15  1276    3.736566
17    16  1200    3.514012
18    17  1242    3.637003
19    18  1204    3.525725
20    19  1266    3.707283
21    20   378    1.106914
22    21   377    1.103985
23    22   377    1.103985
24    23   377    1.103985
25    24   377    1.103985
26    25   377    1.103985
27    26   377    1.103985
28    27   377    1.103985
29    28   377    1.103985
30    29   377    1.103985
31    30   378    1.106914