python pandas dataframe data-analysis

Analyzing DataFrames with a specific substring in rows


I've got a Pandas DataFrame similar to this one:

Date        Name        Value
2018-02-11  AP1-C4-we2  223
2018-04-22  AP1-C4-dej  44
2018-04-22  AP1-C4-dej  443
2018-05-02  AP4-C2-oe0  992
2018-05-02  AP1-C6-we2  29
2018-05-03  AP4-B5-iiu  58
2018-05-03  AP4-B5-ffw  12

How can I sum the values of names that start with the same substring (the first two parts of the name) on the same date? The result should look like this:

Date        Name    Value  
2018-02-11  AP1-C4  223 
2018-04-22  AP1-C4  487
2018-05-02  AP4-C2  992
2018-05-02  AP1-C6  29
2018-05-03  AP4-B5  70

I don't know all of the values that appear in 'Name' (there are many more of them).


Solution

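  • You can group by Date and by the part of Name before the last '-' (obtained with str.rsplit), then sum Value:

    df.Value.groupby([df.Date, df.Name.str.rsplit('-', n=1).str[0]]).sum().sort_values().reset_index()

    Note that sort_values() orders the result by the summed Value; drop it if you want the rows ordered by Date and Name instead.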
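The same idea can be unpacked into separate steps. This is only a minimal sketch: it rebuilds the sample data from the question by hand, the names `prefix` and `result` are illustrative, and `sort=False` is an assumption added here (it is not part of the answer) so that the groups stay in order of first appearance, as in the desired output.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["2018-02-11", "2018-04-22", "2018-04-22", "2018-05-02",
             "2018-05-02", "2018-05-03", "2018-05-03"],
    "Name": ["AP1-C4-we2", "AP1-C4-dej", "AP1-C4-dej", "AP4-C2-oe0",
             "AP1-C6-we2", "AP4-B5-iiu", "AP4-B5-ffw"],
    "Value": [223, 44, 443, 992, 29, 58, 12],
})

# Split once from the right and keep the left part,
# e.g. "AP1-C4-we2" -> "AP1-C4". No list of known names is needed.
prefix = df["Name"].str.rsplit("-", n=1).str[0]

# Group by date and prefix, then sum the values in each group.
# sort=False (an assumption, not in the answer above) keeps the groups
# in the order they first appear instead of sorting the keys.
result = (
    df["Value"]
    .groupby([df["Date"], prefix], sort=False)
    .sum()
    .reset_index()
)

print(result)
```

Because `prefix` is derived from the `Name` column, the grouped column keeps the name `Name`, so `result` has the columns Date, Name, and Value.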