Search code examples
pythonexcelpandasxlsx

How to copy certain similar rows of data from excel file and save into another excel file?


I would like to ask how to copy certain similar rows of data from an excel file and save into another excel file according to specific column name? I would like to find out the similar word according to column no1 which is 'name' (there is 3 columns in the input file which is name, number and number), then copy the whole row into another excel file. Next, I also want the amount of row is the same by adding 'zzzz | 0 | 0 ' on the missing row.

Eventually, there will be 3 different excel files. All has same amount of row, for this case, each of the files have 4 rows.

input (excel file):

name | number | number 
aaaa | 123456 | 223456
aaaa | 123456 | 221456
bbbb | 123356 | 228456
bbbb | 123556 | 229456
cccc | 123756 | 200456
cccc | 122256 | 220756
cccc | 121156 | 226656

expected output (excel file no1):

name | number | number 
aaaa | 123456 | 223456
aaaa | 123456 | 221456
zzzz | 0      | 0

expected output (excel file no2):

name | number | number 
bbbb | 123356 | 228456
bbbb | 123556 | 229456
zzzz | 0      | 0

expected output (excel file no3):

name | number | number 
cccc | 123756 | 200456
cccc | 122256 | 220756
cccc | 121156 | 226656

Any help will be appreciated, thanks!


Solution

  • You can check with goupby + reindex

    g=df.groupby('name')
    for x , y in g : 
    
        y=y.reset_index(drop=True).reindex(range(g.size().max())).fillna({'name':'zzzz'}).fillna(0)
        y.to_csv(x+'.csv')
        print(y)
    
       name    number  number.1
    0  aaaa  123456.0  223456.0
    1  aaaa  123456.0  221456.0
    2  zzzz       0.0       0.0
       name    number  number.1
    0  bbbb  123356.0  228456.0
    1  bbbb  123556.0  229456.0
    2  zzzz       0.0       0.0
       name  number  number.1
    0  cccc  123756    200456
    1  cccc  122256    220756
    2  cccc  121156    226656