python pandas export-to-csv python-unittest

Unittest mock pandas to_csv

mymodule.py

def write_df_to_csv(self, df, modified_fn):
    new_csv = self.path + "/" + modified_fn
    df.to_csv(new_csv, sep=";", encoding='utf-8', index=False)

test_mymodule.py

class TestMyModule(unittest.TestCase):
    def setUp(self):
        args = parse_args(["-f", "test1"])
        self.mm = MyModule(args)
        self.mm.path = "Random/path"

        self.test_df = pd.DataFrame(
            [
                ["bob", "a"],
                ["sue", "b"],
                ["sue", "c"],
                ["joe", "c"],
                ["bill", "d"],
                ["max", "b"],
            ],
            columns=["A", "B"],
        )

    def test_write_df_to_csv(self):
        to_csv_mock = mock.MagicMock()
        with mock.patch("project.mymodule.to_csv", to_csv_mock, create=True):
            self.mm.write_df_to_csv(self.test_df, "Stuff.csv")
        to_csv_mock.assert_called_with(self.mm.path + "/" + "Stuff.csv")

When I run this test, I get:

FileNotFoundError: [Errno 2] No such file or directory: 'Random/path/Stuff.csv'

I'm trying to mock the to_csv call in my method. My other tests run as expected, but I'm not sure where I'm going wrong with this one. Is my use of MagicMock correct, or am I overlooking something else?

Solution

You didn't provide a minimal, reproducible example, so I had to strip some things out to make this work. I suppose you can fill in the missing bits on your own.

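One problem was with mock.patch("project.mymodule.to_csv", ...), which tries to replace an attribute named to_csv on the module project.mymodule. No such attribute exists, because to_csv is a method of DataFrame, not a name defined in your module. The patch only "worked" because you passed create=True, but patching in a name that didn't exist before has no effect: nothing ever looks it up. The call df.to_csv(...) still resolves to the real pandas method, which tries to write to Random/path and raises the FileNotFoundError you saw.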
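To see this effect in isolation, here is a minimal sketch that needs neither pandas nor your project. The names fake_mymodule and FakeFrame are made up for illustration; they stand in for project.mymodule and DataFrame.

```python
import sys
import types
from unittest import mock


class FakeFrame:
    """Hypothetical stand-in for a DataFrame; records calls to the real method."""

    def __init__(self):
        self.real_calls = []

    def to_csv(self, path, **kwargs):
        self.real_calls.append(path)


def write_df_to_csv(df, file_name):
    df.to_csv(file_name, sep=";", encoding="utf-8", index=False)


# Hypothetical stand-in for project.mymodule, registered so mock.patch can import it.
fake_module = types.ModuleType("fake_mymodule")
fake_module.write_df_to_csv = write_df_to_csv
sys.modules["fake_mymodule"] = fake_module

frame = FakeFrame()
to_csv_mock = mock.MagicMock()
with mock.patch("fake_mymodule.to_csv", to_csv_mock, create=True):
    # The patch adds a brand-new module attribute. df.to_csv is looked up
    # on the object, so the real method still runs.
    fake_module.write_df_to_csv(frame, "Stuff.csv")

mock_was_called = to_csv_mock.called       # the mock was never touched
real_calls = frame.real_calls              # the real method did the work
attribute_removed = not hasattr(fake_module, "to_csv")  # patch cleans up on exit
```

The mock is never called and the real method runs, which matches what happened in your test.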

You could mock out the entire DataFrame class using mock.patch("pandas.DataFrame", ...). Note: the target string names the real module, pandas, not the alias pd, regardless of how (or even whether) you imported pandas in the current module.

But then your unit test will be asserting that to_csv was called on any DataFrame, not necessarily the one you passed in. By mocking just the to_csv method on the one DataFrame object that we are passing into write_df_to_csv, the test becomes a bit more comprehensive and also easier to understand. We can do this using mock.patch.object.

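Used as a context manager, mock.patch.object yields the mock (a MagicMock by default), on which we can subsequently make assertions. Because the mock is set as a plain attribute on the instance rather than bound as a method, the recorded call does not include self, so we don't pass it in the assertion.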
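A quick way to convince yourself of both points is the sketch below, assuming pandas is installed. The two small frames are made-up sample data: only the patched object is affected, and the recorded call carries no self.

```python
from unittest import mock

import pandas as pd

patched_df = pd.DataFrame([["bob", "a"]], columns=["A", "B"])
other_df = pd.DataFrame([["sue", "b"]], columns=["A", "B"])

with mock.patch.object(patched_df, "to_csv") as to_csv_mock:
    patched_df.to_csv("Stuff.csv", sep=";")    # hits the mock, writes nothing
    other_csv = other_df.to_csv(index=False)   # real method; no path, so it returns a string

# The recorded call contains only the explicit arguments, not self.
to_csv_mock.assert_called_once_with("Stuff.csv", sep=";")

# After the with block the real method is back on patched_df.
restored_csv = patched_df.to_csv(index=False)
```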

project/mymodule.py

def write_df_to_csv(df, file_name):
    df.to_csv(file_name, sep=";", encoding='utf-8', index=False)

project/test_mymodule.py

import unittest.mock as mock
import unittest

import pandas as pd

import project.mymodule as mm

class TestMyModule(unittest.TestCase):

    def test_write_df_to_csv(self):
        test_df = pd.DataFrame(...)
        with mock.patch.object(test_df, "to_csv") as to_csv_mock:
            mm.write_df_to_csv(test_df, "Stuff.csv")
            to_csv_mock.assert_called_with("Stuff.csv")

if __name__ == '__main__':
    unittest.main()

Output

The test now fails for the right reason: the mock is called, but the assertion omits the keyword arguments that write_df_to_csv actually passes.

$ python -m project.test_mymodule
F
======================================================================
FAIL: test_write_df_to_csv (__main__.TestMyModule)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/project/test_mymodule.py", line 25, in test_write_df_to_csv
    to_csv_mock.assert_called_with("Stuff.csv")
  File "/usr/lib/python3.8/unittest/mock.py", line 913, in assert_called_with
    raise AssertionError(_error_message()) from cause
AssertionError: expected call not found.
Expected: to_csv('Stuff.csv')
Actual: to_csv('Stuff.csv', sep=';', encoding='utf-8', index=False)

----------------------------------------------------------------------
Ran 1 test in 0.003s

FAILED (failures=1)
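One way to make the assertion pass is to spell out the keyword arguments shown in the Actual line above. This is a sketch, assuming pandas is installed; write_df_to_csv is inlined so the snippet runs on its own, and the two-row frame is made-up sample data.

```python
import unittest
from unittest import mock

import pandas as pd


def write_df_to_csv(df, file_name):
    # Same body as project/mymodule.py, inlined to keep the sketch self-contained.
    df.to_csv(file_name, sep=";", encoding="utf-8", index=False)


class TestWriteDfToCsv(unittest.TestCase):
    def test_write_df_to_csv(self):
        test_df = pd.DataFrame([["bob", "a"], ["sue", "b"]], columns=["A", "B"])
        with mock.patch.object(test_df, "to_csv") as to_csv_mock:
            write_df_to_csv(test_df, "Stuff.csv")
        # Assert on the full call, including the keyword arguments.
        to_csv_mock.assert_called_once_with(
            "Stuff.csv", sep=";", encoding="utf-8", index=False
        )


# Run the test case programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestWriteDfToCsv)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If you only care about the file name, you could instead inspect to_csv_mock.call_args and compare just the first positional argument, at the cost of a weaker test.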