Search code examples
pythonarrayspandasnumpylookup-tables

How to update numpy array based on pandas DataFrame


I have a numpy array with thousands of rows and columns, and I'm wondering how to update each value based on the values in a pandas DataFrame.

For example, let's say my array contains a list of years (here's an incredibly small sample just to give you the basic idea):

[[2020, 2015, 2017],
 [2015, 2016, 2016],
 [2019, 2018, 2020]]

I want to change each value in the array to "Lat" based on the "Year". So if my pandas dataframe looks like this:

Year Lat Lon
2020 37.2 103.45
2019 46.1 107.82
2018 35.2 101.45
2017 38.6 110.62
2016 29.1 112.73
2015 33.8 120.92

Then the output array should look like:

[[37.2, 33.8, 38.6],
 [33.8, 29.1, 29.1],
 [46.1, 35.2, 37.2]]

If my dataset were truly this small, it wouldn't be a problem, but considering I have millions of values in the array and thousands of values in the DataFrame, I'm a little overwhelmed on how to go about this efficiently.

Update:

Perhaps my question might be a bit more complicated than I anticipated. Rather than matching up the years, I'm matching up GPS time, so the numbers don't match up as nicely. Is there a way to take a number in the array and match it up to the closest value in the DataFrame column? In reality, my array would look more like this:

[[2019.99, 2015.2, 2017.1],
 [2015.33, 2016.01, 2015.87],
 [2019.2, 2018.3, 2020.00]]

Solution

  • Maybe setting Year as index and using at (or loc) would help

    # Data
    arr = np.array([[2020, 2015, 2017], [2015, 2016, 2016], [2019, 2018, 2020]])
    df = pd.DataFrame({'Year': {0: 2020, 1: 2019, 2: 2018, 3: 2017, 4: 2016, 5: 2015},
                       'Lat': {0: 37.2, 1: 46.1, 2: 35.2, 3: 38.6, 4: 29.1, 5: 33.8},
                       'Lon': {0: 103.45, 1: 107.82, 2: 101.45, 3: 110.62, 4: 112.73, 5: 120.92}})
    
    df = df.set_index("Year")
    np.array([df.loc[years, "Lat"] for years in arr])
    # array([[37.2, 33.8, 38.6],
    #        [33.8, 29.1, 29.1],
    #        [46.1, 35.2, 37.2]])