Tags: python, pandas, group-by, apply, randomized-algorithm

Block randomization using Python?


I have the following input table (df):

ColumnA ColumnB Blocks
A 12 1
B 32 1
C 44 1
D 76 2
E 99 2
F 123 2
G 65 2
H 87 3
I 76 3
J 231 3
k 80 4
l 55 4
m 27 5
n 67 5
o 34 5
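
For anyone who wants to reproduce this, the table above could be built as a DataFrame roughly like this. It is a minimal sketch: the values are copied from the table, the name `df` matches the rest of the post, and the integer dtypes are an assumption.

```python
import pandas as pd

# Rebuild the input table from the question; values are copied from the post.
df = pd.DataFrame({
    "ColumnA": list("ABCDEFGHIJklmno"),
    "ColumnB": [12, 32, 44, 76, 99, 123, 65, 87, 76, 231, 80, 55, 27, 67, 34],
    "Blocks": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5],
})
```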

I would like to perform block randomization: pick one row from each block (one row from each of blocks 1, 2, 3, 4 and 5) and collect the picks into a separate table.

The output should look something like the following:

ColumnA ColumnB Blocks Groups
B 32 1 A1
E 99 2 A1
I 76 3 A1
l 55 4 A1
m 27 5 A1
A 12 1 A2
F 123 2 A2
k 80 3 A2
m 27 4 A2
n 67 5 A2
C 44 1 A3
H 87 2 A3
J 231 3 A3
n 67 4 A3
o 34 5 A4
D 76 1 A4
G 65 2 A4

The rows should be randomly selected so that each group contains every block (evenly distributed).

What have I tried so far?


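    # Note: `n` and `idx` are not defined in the snippet as posted.
    # Shuffle the rows within each block (the column is named 'Blocks').
    df = df.groupby('Blocks').apply(lambda x: x.sample(frac=1, random_state=1234)).reset_index(drop=True)
    treatment_groups = [f"A{i}" for i in range(1, n+1)]
    df['Groups'] = (df.index // n).map(dict(zip(idx, treatment_groups)))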

This doesn't randomize according to the blocks column. How do I do that?


Solution

Define a generator function that, for each of the n groups, draws one random row from every block:

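    import pandas as pd

    def random_samples(n):
        # Build n groups; each group gets one randomly chosen row per block.
        for i in range(1, n+1):
            for _, g in df.groupby('Blocks'):
                # Pick one random row from this block and tag it with the group label.
                yield g.sample(n=1).assign(Groups=f'A{i}')

    sampled = pd.concat(random_samples(4), ignore_index=True)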
    

    >>> sampled
    
       ColumnA  ColumnB  Blocks Groups
    0        A       12       1     A1
    1        D       76       2     A1
    2        I       76       3     A1
    3        k       80       4     A1
    4        n       67       5     A1
    5        C       44       1     A2
    6        G       65       2     A2
    7        J      231       3     A2
    8        l       55       4     A2
    9        m       27       5     A2
    10       B       32       1     A3
    11       G       65       2     A3
    12       H       87       3     A3
    13       l       55       4     A3
    14       m       27       5     A3
    15       B       32       1     A4
    16       F      123       2     A4
    17       I       76       3     A4
    18       l       55       4     A4
    19       m       27       5     A4
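
Note that the generator above samples each block independently for every group, so the same row can land in more than one group (as `l` and `m` do in the output). If each row should be used at most once, one alternative is to shuffle the rows and number them within each block using `cumcount`. The following is a minimal sketch of that idea, not part of the original answer; the function name `block_randomize`, its `seed` parameter, the helper column `_order`, and the small example frame are all assumptions made for illustration.

```python
import pandas as pd

def block_randomize(df, block_col="Blocks", seed=None):
    """Shuffle rows, then give the k-th row of every block the label A{k}."""
    # Shuffle all rows once; the order within each block is then random.
    shuffled = df.sample(frac=1, random_state=seed)
    # cumcount() numbers rows 0, 1, 2, ... within each block, in shuffled order.
    position = shuffled.groupby(block_col).cumcount() + 1
    out = shuffled.assign(Groups="A" + position.astype(str), _order=position)
    # Sort numerically by group, then by block, and drop the helper column.
    out = out.sort_values(["_order", block_col]).drop(columns="_order")
    return out.reset_index(drop=True)

# Hypothetical usage on a subset of the question's table.
example = pd.DataFrame({
    "ColumnA": list("ABCDEF"),
    "ColumnB": [12, 32, 44, 76, 99, 123],
    "Blocks": [1, 1, 1, 2, 2, 2],
})
result = block_randomize(example, seed=1234)
```

With blocks of unequal size, the later groups come out incomplete. In the question's table, block 4 has only two rows, so groups A3 and A4 would have no row from block 4 under this approach.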