
Data visualization of CSV file with dash


I am new to Python. The tutorial at https://realpython.com/python-dash provides code for plotting line graphs from a CSV file with Dash.

I ran the code below, but received an error.

import dash
import dash_core_components as dcc
import dash_html_components as html
import pandas as pd

data = pd.read_csv("avocado.csv")
data = data.query("type == 'conventional' and region == 'Albany'")
data["Date"] = pd.to_datetime(data["Date"], format="%Y-%m-%d")
data.sort_values("Date", inplace=True)

app = dash.Dash(__name__)

app.layout = html.Div(
    children=[
        html.H1(children="Avocado Analytics",),
        html.P(
            children="Analyze the behavior of avocado prices"
            " and the number of avocados sold in the US"
            " between 2015 and 2018",
        ),
        dcc.Graph(
            figure={
                "data": [
                    {
                        "x": data["Date"],
                        "y": data["AveragePrice"],
                        "type": "lines",
                    },
                ],
                "layout": {"title": "Average Price of Avocados"},
            },
        ),
        dcc.Graph(
            figure={
                "data": [
                    {
                        "x": data["Date"],
                        "y": data["Total Volume"],
                        "type": "lines",
                    },
                ],
                "layout": {"title": "Avocados Sold"},
            },
        ),
    ]
)

if __name__ == "__main__":
    app.run_server(debug=True)

Traceback (most recent call last):
  File "/Users/halcyon/Documents/Python/Dashboard - Avocado prices/app.py", line 8, in <module>
    data["Date"] == pd.to_datetime(data["Date"], format="%Y-%m-%d")
  File "/Users/halcyon/Documents/Python/Dashboard - Avocado prices/venv/lib/python3.9/site-packages/pandas/core/ops/common.py", line 64, in new_method
    return method(self, other)
  File "/Users/halcyon/Documents/Python/Dashboard - Avocado prices/venv/lib/python3.9/site-packages/pandas/core/ops/__init__.py", line 529, in wrapper
    res_values = comparison_op(lvalues, rvalues, op)
  File "/Users/halcyon/Documents/Python/Dashboard - Avocado prices/venv/lib/python3.9/site-packages/pandas/core/ops/array_ops.py", line 247, in comparison_op
    res_values = comp_method_OBJECT_ARRAY(op, lvalues, rvalues)
  File "/Users/halcyon/Documents/Python/Dashboard - Avocado prices/venv/lib/python3.9/site-packages/pandas/core/ops/array_ops.py", line 57, in comp_method_OBJECT_ARRAY
    result = libops.scalar_compare(x.ravel(), y, op)
  File "pandas/_libs/ops.pyx", line 84, in pandas._libs.ops.scalar_compare
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I copied and pasted the code from the tutorial exactly as shown, but could not reproduce its result. I searched for the error and tried to work through the traceback, but could not understand it.


Solution
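
The traceback in the question points at line 8 of app.py, where the statement reads data["Date"] == pd.to_datetime(...). The double == compares the column with the parsed dates instead of assigning them, and in the asker's pandas version that comparison raised the ValueError. The listing as posted already uses a single =. A minimal sketch of the assignment form, using a hypothetical three-row frame with made-up dates rather than the real avocado.csv:

```python
import pandas as pd

# Hypothetical stand-in for the avocado data (dates are made up).
df = pd.DataFrame({"Date": ["2015-01-18", "2015-01-04", "2015-01-11"]})

# Assignment (single "=") replaces the string column with parsed datetimes.
# Writing "==" here would only compare the two sides and store nothing.
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")
df = df.sort_values("Date")

print(df["Date"].dtype)  # a datetime64 dtype instead of object
```

After the assignment the column holds datetimes, so sorting and plotting by date behave as the tutorial expects.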

    • I didn't see that this had already been fixed in the comments. A couple of small changes make it reproducible:
      1. Load the data from GitHub dynamically rather than hoping it is on the file system.
      2. Use JupyterDash, which works out of the box with plotly 5.x.y.
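
Change 1 works because pd.read_csv accepts any file-like object, so text that is already in memory can be wrapped in io.StringIO and parsed without touching the file system. A minimal sketch of that step, where csv_text is a made-up sample standing in for the downloaded response body (the real file has more columns and rows):

```python
import io
import pandas as pd

# Made-up sample standing in for the text of the HTTP response.
csv_text = (
    "Date,AveragePrice,type,region\n"
    "2015-01-11,1.24,conventional,Albany\n"
    "2015-01-04,1.22,conventional,Albany\n"
    "2015-01-04,1.79,organic,Albany\n"
)

# io.StringIO turns the string into a file-like object that read_csv can parse.
data = pd.read_csv(io.StringIO(csv_text))

# Same filter as in the question: keep only conventional avocados in Albany.
data = data.query("type == 'conventional' and region == 'Albany'")
print(len(data))
```

pd.read_csv also accepts a URL string directly, so the requests step is a matter of preference.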
    import dash_core_components as dcc
    import dash_html_components as html
    from jupyter_dash import JupyterDash
    import pandas as pd
    import requests
    import io
    
    # data = pd.read_csv("avocado.csv")
    data = pd.read_csv(io.StringIO(requests.get("https://raw.githubusercontent.com/chainhaus/pythoncourse/master/avocado.csv").text))
    data = data.query("type == 'conventional' and region == 'Albany'")
    data["Date"] = pd.to_datetime(data["Date"], format="%Y-%m-%d")
    data.sort_values("Date", inplace=True)
    
    app = JupyterDash(__name__)
    # app = dash.Dash(__name__)