python pandas dataframe series

How do I insert a one-column Series into a single DataFrame column in Python?


I took out and copied a column from a dataframe. Easy. I modified it and now I need to put it back in but I don't know how. I have tried countless methods and none of them work. Any help greatly appreciated.

Here's the code: [code]

for col in ["Shares__Basic_"]:
    tmp_col = data[col]
    count = 0
    index_no = data.columns.get_loc(col)
    while 1:
        result = sm.tsa.stattools.adfuller(tmp_col, autolag='AIC')
        pvalue = result[1]
        if pvalue > 0.01:
            tmp_col = tmp_col.diff()
            count = count + 1
            tmp_col = tmp_col.drop(tmp_col.index[0])
            print(col+" diffed")
        elif pvalue < 0.01:
            break
    while count > 0:
        tmp_col = pd.concat([pd.Series([float("nan")]), tmp_col])
        count = count - 1
    del data[col]
    data.insert(index_no, col, value=tmp_col)

[/code]


Solution

  • To overwrite the existing column with the modified one, assign it back under the same name:

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})  # dummy dataset
    print(df)
    
    #>>    A  B
    #>> 0  1  4
    #>> 1  2  5
    #>> 2  3  6
    
    modified_column = df['A']**2
    
    #Adding it back over the existing columns
    df['A'] = modified_column
    print(df)
    
    #>>    A  B
    #>> 0  1  4
    #>> 1  4  5
    #>> 2  9  6
    

    If you would rather add it as an additional column, assign it to a new name instead (the output below is for the original df, before A was overwritten):

    #Adding it back as a new column
    df['New_A'] = modified_column
    print(df)
    
    #>>    A  B  New_A
    #>> 0  1  4      1
    #>> 1  2  5      4
    #>> 2  3  6      9
    

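    EDIT: ValueError: cannot reindex from a duplicate axis usually occurs when the index contains duplicate values. You may be corrupting modified_column's index accidentally: in your code, every pd.concat([pd.Series([float("nan")]), tmp_col]) prepends a row labelled 0, so padding more than once repeats that label. As long as the Series has the same length as the dataframe, you can reset it to the original dataframe's index: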

    modified_column.index = df.index
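
To make the duplicate-index point concrete, here is a minimal sketch of what I think is happening in your loop. The `data` frame and its values are made up for illustration, and the adfuller test is replaced by a fixed two rounds of differencing (an assumption, just to trigger the problem without statsmodels):

```python
import pandas as pd

# Hypothetical stand-in for the asker's `data`; the values are invented.
data = pd.DataFrame({"Shares__Basic_": [10.0, 12.0, 15.0, 19.0, 24.0],
                     "Other": [1, 2, 3, 4, 5]})
col = "Shares__Basic_"

# Difference twice and drop the leading row each time, as in the question.
tmp_col = data[col]
for _ in range(2):
    tmp_col = tmp_col.diff()
    tmp_col = tmp_col.drop(tmp_col.index[0])

# Padding with pd.Series([nan]) adds a row labelled 0 on every pass,
# so two passes leave the label 0 in the index twice.
padded = tmp_col
for _ in range(2):
    padded = pd.concat([pd.Series([float("nan")]), padded])
has_duplicates = padded.index.has_duplicates

# Fix 1: the padded Series has the right length, so reuse the frame's index.
padded.index = data.index
fixed = data.copy()
fixed[col] = padded

# Fix 2: skip the padding entirely; assignment aligns on index labels
# and leaves NaN in the rows that were dropped.
aligned = data.copy()
aligned[col] = tmp_col
```

Both `fixed` and `aligned` end up with NaN in the first two rows of the column and the differenced values below them. Assigning to an existing column name also keeps the column in its original position, so the `del data[col]` / `data.insert(...)` pair should not be needed.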