Tags: python, python-3.x, pandas, dataframe, plotly

Python Pandas DataFrame: ValueError: 2 columns passed, passed data had 3 columns


I am trying to plot a line of best fit through some LIDAR data using plotly and a pandas DataFrame. However, when I create the DataFrame I get the error ValueError: 2 columns passed, passed data had 3 columns. All I am doing is reading the LIDAR data from the .csv file and plotting it. The only thing I can think of is that it is reading the [] as data points, but I don't see why it would do that. Any help deciphering this would be appreciated.

Python Code:

import numpy as np
import matplotlib.pyplot as plt
from math import sin, cos, radians
import multiprocessing as mp
import pandas as pd
import plotly.express as px
import csv

new_master=[]
def grab_plot():
    with open('lidar03.csv', 'r') as f:
        reader = csv.reader(f)
        for row in reader:
            temp_list = []
            new_x = float(row[0])
            new_y = float(row[1])
            temp_list.append([new_x, new_y])
            new_master.append(row)
            if len(new_master) > 1000:
                df = pd.DataFrame(new_master, columns=['x', 'y'])
                fig = px.scatter(df, x='x', y="y", trendline="lowess")
                fig.show()
            else:
                print("err")


grab_plot()

lidar03.csv (just some example data; the real file is 280k lines):

-241.72250217077044,-399.5738128860572
-227.90134287289055,-396.9836777711814
-215.29533284661807,-396.0094470520418
-206.42379816517118,-402.9538162755934
-202.48907022573056,-417.7633767327136
-194.58213975978188,-473.043381611565
-139.37896911133979,-1391.7884413478437
-105.58002562367821,-1395.01034339868
-9.548104225177978,-1400.4674518551674
22.257610379315068,-1407.0739715026366
53.92438894226407,-1411.7204832321459
86.536304790659,-1414.8560767629965
119.66441166265868,-1416.7051531922336
151.09708809931834,-1418.9780371689715
185.17026611362976,-1424.51543429596
219.19089203051948,-1429.2905068077885
253.3941639959759,-1430.2264323011166
286.9286362218502,-1430.2529567233444

Solution

  • The problem was that I was appending row (the raw list of strings from csv.reader) instead of calling new_master.append([new_x, new_y]). Changing that fixed the problem. Full code:

    import csv
    import pandas as pd
    import plotly.express as px

    new_master = []

    def grab_plot():
        with open('lidar03.csv', 'r') as f:
            reader = csv.reader(f)
            for row in reader:
                # Keep only the first two fields, converted to floats
                new_x = float(row[0])
                new_y = float(row[1])
                new_master.append([new_x, new_y])
                # Once more than 1000 points are collected, plot them
                # (note: this re-plots on every subsequent row)
                if len(new_master) > 1000:
                    df = pd.DataFrame(new_master, columns=['x', 'y'])
                    fig = px.scatter(df, x='x', y="y", trendline="lowess")
                    fig.show()

    grab_plot()
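
To see why appending row triggers the error while appending [new_x, new_y] does not, here is a minimal sketch that needs neither the real file nor plotly. The SAMPLE text and the load_points helper are hypothetical and not part of the original code. The extra third field on the second line is only an assumption about what the real 280k-line file might contain, since the excerpt shown in the question has two fields per line.

```python
import csv
import io

import pandas as pd

# Hypothetical sample: the second line carries an extra third field.
# This is an assumption about the real file, not data taken from it.
SAMPLE = "1.0,2.0\n3.0,4.0,9.9\n5.0,6.0\n"


def load_points(text):
    """Return a list of [x, y] float pairs, ignoring any extra fields."""
    points = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or short lines
        points.append([float(row[0]), float(row[1])])
    return points


# Passing the raw rows (as the question's code did) fails, because
# one row has three fields while only two column names are given.
raw_rows = list(csv.reader(io.StringIO(SAMPLE)))
try:
    pd.DataFrame(raw_rows, columns=["x", "y"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Passing only the two converted values per row builds the frame.
df = pd.DataFrame(load_points(SAMPLE), columns=["x", "y"])
```

Building the list first and creating the DataFrame once after the loop, as done here, also avoids reconstructing and re-plotting the frame on every row. The resulting df can be passed to px.scatter in the same way as in the code above.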