Search code examples
pythondatasetranking

Python order dataset weather


I have a dataset history of temperature/weather by city like this:

{"city": "Barcelona", "date": "2016-10-16", "temperature": "13", "weather": "cloudy"}
{"city": "Berlin", "date": "2016-10-16", "temperature": "-1", "weather": "sunny"}
{"city": "Pekin", "date": "2016-10-16", "temperature": "19", "weather": "cloudy"}
{"city": "Paris", "date": "2016-10-16", "temperature": "-8", "weather": "sunny"}

And I would like to create a top 5, order by best average temperature.In this result I would like to know the number of day by weather type(sunny cloudy rainy)

Example:

Rank - City -      Average Temperature - Cloudy days - Sunny days - Rainy Days
1 -    Barcelona -           20 -           93 -        298 -       29 

How can doing this in Python?

Thanks

Matt


Solution

  • I believe you need pandas:


    import pandas as pd
    
    import pandas as pd
    
    df = pd.read_json('a.json', lines=True)
    print (df)
            city       date  temperature weather
    0  Barcelona 2016-10-16           13  cloudy
    1     Berlin 2016-10-16           -1   sunny
    2      Pekin 2016-10-16           19  cloudy
    3      Paris 2016-10-16           -8   sunny
    
    s = df.groupby(['city'])['temperature'].mean().nlargest(5)
    print (s)
    city
    Pekin        19
    Barcelona    13
    Berlin       -1
    Paris        -8
    Name: temperature, dtype: int64
    

    df2 = (df[df['city'].isin(s.index)]
                   .groupby(['city', 'weather'])['temperature']
                   .size()
                   .unstack(fill_value=0)
                   .add_suffix(' days')
                   .reindex(s.index)
                   .reset_index()
                   .rename_axis(None, axis=1))
    
    df2.insert(1, 'temp avg', df2['city'].map(s))
    df2.insert(0, 'rank', range(1, len(df2) + 1))
    print (df2)
       rank       city  temp avg  cloudy days  sunny days
    0     1      Pekin        19            1           0
    1     2  Barcelona        13            1           0
    2     3     Berlin        -1            0           1
    3     4      Paris        -8            0           1