Search code examples
pythoncsvopencsv

Is it possible to find the name of something in a csv given 2 parameters? Python


Say I had a csv file with 3 columns, 'name', 'price' and 'color'.

How could I go about getting the variable of the name of say, the most expensive blue item, the most expensive red, and the most expensive yellow?

Would really appreciate any help :)


Solution

  • Our plan is to find the class we want (say, "blue" items) and, then, find the most expensive (the maximum in the price column).

    Let's define an example DataFrame:

    import pandas as pd
    
    df = pd.DataFrame({
            'name': [a for a in "abcdef"],
            'price': [1.5, 3.8, 1.4, 5.9, 3.5, 1.9],
            'color': ['blue', 'red', 'yellow', 'blue', 'red', 'yellow']
        }).set_index('name')
    

    And here our DataFrame:

          price   color
    name               
    a       1.5    blue
    b       3.8     red
    c       1.4  yellow
    d       5.9    blue
    e       3.5     red
    f       1.9  yellow
    

    To do the first part (find the items of a specific color), we can use Pandas' query. So the following will select blue items and save to blue_items.

    blue_items = df[df.color == "blue"] # selects the df's slice in which df.color is equals to "blue".
    

    Then we can get the index of the maximum price (as I've defined name to be the index column, it'll return the name):

    blue_items["price"].idxmax()
    

    The complete code (now considering you're importing a CSV file):

    import pandas as pd
    
    df = pd.read_csv("filename.csv", index_col="name")
    
    most_exp_blue = df[df.color == "blue"]["price"].idxmax()  # the most expensive blue
    
    most_exp_red = df[df.color == "red"]["price"].idxmax()  # the most expensive red
    
    most_exp_yellow = df[df.color == "yellow"]["price"].idxmax()  # the most expensive yellow