python python-3.x mojolang

Create Pandas Dataframe in Mojo


I am trying to create a pandas DataFrame in Mojo from a list of data. I followed the examples for importing and using NumPy, but declaring a DataFrame only produces errors. How can I fix this so that the DataFrame is created? Here is what I have:

from python import Python

let pd = Python.import_module("pandas")

var data = [[1, 2, 3],[2, 3, 4],[4, 5, 6]]
var df = pd.DataFrame(data, columns=['cola', 'colb', 'colc'])

And here is the error I receive:

error: Expression [13]:19:41: keyword arguments are not supported yet
    var df = pd.DataFrame(data, columns=['cola', 'colb', 'colc'])
                                        ^

expression failed to parse (no further compiler diagnostics)

Observation:

Declaring the DataFrame without the column names gives even more errors:

error: Expression [14]:7:1: no viable expansions found
fn __lldb_expr__14(inout __mojo_repl_arg: __mojo_repl_context__):
^

Expression [14]:9:28:   call expansion failed - no concrete specializations
    __mojo_repl_expr_impl__(__mojo_repl_arg, __get_address_as_lvalue(__mojo_repl_arg.`___lldb_expr_failed`.load().address), __get_address_as_lvalue(__mojo_repl_arg.`pd`.load().address))
                           ^

Expression [14]:13:1:     no viable expansions found
def __mojo_repl_expr_impl__(inout __mojo_repl_arg: __mojo_repl_context__, inout `___lldb_expr_failed`: __mlir_type.`!kgen.declref<@"$builtin"::@"$bool"::@Bool>`, inout `pd`: __mlir_type.`!kgen.declref<@"$python"::@"$object"::@PythonObject>`) -> None:
^

Expression [14]:22:26:       call expansion failed - no concrete specializations
  __mojo_repl_expr_body__()
                         ^

Expression [14]:15:3:         no viable expansions found
  def __mojo_repl_expr_body__() -> None:
  ^

Expression [14]:19:27:           call expansion failed - no concrete specializations
    var df = pd.DataFrame(data)
                          ^

expression failed to parse (no further compiler diagnostics)

Solution

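  • As the first error says, keyword arguments are not supported yet, so pass the index and the column names positionally, and build the data as a NumPy array through the Python interop: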

    from python import Python
    
    fn main() raises:
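        # Import the Python modules through Mojo's Python interop.
        let pd = Python.import_module("pandas")
        let np = Python.import_module("numpy")
        # Build the 3x3 data as a NumPy array; .T transposes it.
        let data = np.array([1, 2, 3, 2, 3, 4, 4, 5, 6]).reshape(3, 3).T
        # Pass index and columns positionally, since keyword arguments are not supported yet.
        let df = pd.DataFrame(data, np.arange(3), ['cola', 'colb', 'colc'])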
        print(df)
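
To check what the positional call does, here is a minimal sketch in plain Python (not Mojo), assuming pandas and NumPy are installed. The variable names are only illustrative. It compares the positional form used above with the usual keyword form, and shows what the transpose changes relative to the nested list in the question.

```python
import numpy as np
import pandas as pd

# Same values as in the answer, before and after the transpose.
flat = np.array([1, 2, 3, 2, 3, 4, 4, 5, 6])
rows = flat.reshape(3, 3)   # rows are [1, 2, 3], [2, 3, 4], [4, 5, 6], as in the question
cols = rows.T               # transposed: those triples become columns

names = ["cola", "colb", "colc"]

# Positional form, as in the Mojo answer: (data, index, columns).
df_positional = pd.DataFrame(cols, np.arange(3), names)

# Keyword form, as it is normally written in Python.
df_keyword = pd.DataFrame(cols, index=np.arange(3), columns=names)

# Without the transpose, matching the layout of the question's nested list.
df_question = pd.DataFrame(rows, np.arange(3), names)

print(df_positional.equals(df_keyword))   # True
print(df_positional["cola"].tolist())     # [1, 2, 3]
print(df_question["cola"].tolist())       # [1, 2, 4]
```

The second and third positional parameters of the DataFrame constructor are index and columns, so both forms build the same frame. Note that with .T the list [1, 2, 3] ends up as column cola rather than as the first row; if you want the row layout from the question, dropping the .T in the Mojo code should give it.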