python pandas dataframe open-source

What does the nrows argument for from_records() do in pandas?


I am trying to learn how to submit a pull request to an open-source project, so I chose issue #23455 from pandas-dev. It is a simple documentation error. However, I realized that I have no idea what nrows actually does in from_records.

I tried

import pandas as pd

sales = [('Jones LLC', 150, 200, 50),
         ('Alpha Co', 200, 210, 90),
         ('Blue Inc', 140, 215, 95)]
labels = ['account', 'Jan', 'Feb', 'Mar']
df = pd.DataFrame.from_records(sales, columns=labels)

which yields

     account  Jan  Feb  Mar
0  Jones LLC  150  200   50
1   Alpha Co  200  210   90
2   Blue Inc  140  215   95

as the output. However, as I understand it, if I do the following:

df = pd.DataFrame.from_records(sales, columns=labels, nrows=1)

I should only have one row in the df. Instead, my output is the same as the df above.

Can someone help me with this? Thank you.


Solution

  • nrows is the number of records (rows) to read from the start of the data. If you look at the source code, it currently only takes effect when the data is an iterator; for a list, as in your example, it is ignored. There may be a reason it is limited to iterators, but I don't know what it is.

    To see nrows in use, convert the sales data to an iterator, e.g.:

    sales = iter([('Jones LLC', 150, 200, 50), ('Alpha Co', 200, 210, 90), ('Blue Inc', 140, 215, 95)])
    
    df = pd.DataFrame.from_records(sales, nrows=2)
    
               0    1    2   3
    0  Jones LLC  150  200  50
    1   Alpha Co  200  210  90
    
    sales = iter([('Jones LLC', 150, 200, 50), ('Alpha Co', 200, 210, 90), ('Blue Inc', 140, 215, 95)])
    
    df = pd.DataFrame.from_records(sales, nrows=3)
    
               0    1    2   3
    0  Jones LLC  150  200  50
    1   Alpha Co  200  210  90
    2   Blue Inc  140  215  95
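
If you want the list from the question to give a single row, here is a minimal sketch of two ways that should work. It assumes nrows behaves as described above (applied only to iterators), and the names df_iter, df_slice and df_islice are just for illustration. The first option wraps the list in iter(); the others truncate the data before pandas sees it, so they do not rely on nrows at all.

```python
import itertools

import pandas as pd

sales = [('Jones LLC', 150, 200, 50),
         ('Alpha Co', 200, 210, 90),
         ('Blue Inc', 140, 215, 95)]
labels = ['account', 'Jan', 'Feb', 'Mar']

# Option 1: pass an iterator, so nrows limits how many records are consumed.
df_iter = pd.DataFrame.from_records(iter(sales), columns=labels, nrows=1)

# Option 2: slice the list yourself; no nrows needed.
df_slice = pd.DataFrame.from_records(sales[:1], columns=labels)

# Option 3: for an arbitrary iterable, cut it down with itertools.islice
# and materialize the result as a list first.
first_rows = list(itertools.islice(sales, 1))
df_islice = pd.DataFrame.from_records(first_rows, columns=labels)

print(df_iter)
```

Slicing is probably the simplest choice when the data is already a list; nrows looks more useful when the records come from a generator that you do not want to consume in full.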