Why does `eval` not work within a Python class function?


Say I have a Python class with a pandas DataFrame df as an attribute. I want to query df by running one or more predefined queries, using a method that takes one or more query handles as arguments:

import pandas as pd
import numpy as np

class doorn:
    def __init__(self):
        self.name = 'foo'
        self.df = pd.DataFrame(data={'A':np.arange(0, 10), 'B':np.arange(5, 15), 'C':np.arange(14, 24)}, index=[x for x in range(0, 10)])

    def query_df(self, *query):
        # query arguments must be formatted as 'q1', 'q2' etc.
        queries = [q for q in query]

        q1 = self.df.loc[self.df.A > 2].index
        q2 = self.df.loc[self.df.B < 13].index
        q3 = self.df.loc[self.df.C > 15].index

        sel_rows = set().union(*[eval(x, globals(), locals()) for x in queries])

        self.df = self.df.loc[sel_rows]

Now, it seems that eval cannot find the variables that the query strings refer to:

>>> foo = doorn()
>>> foo.query_df('q1', 'q2')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 17, in query_df
  File "<input>", line 17, in <listcomp>
  File "<string>", line 1, in <module>
NameError: name 'q1' is not defined

My guess is that q1, q2, q3 are not present in the list comprehension's namespace, or something like that; I haven't really wrapped my head around namespaces yet. I've tried solving this by passing globals() and locals() as additional arguments to eval, as suggested in the docs, but without success.

How can I solve this? Can I even refrain from using eval altogether?


Solution

  • I think this is because a list comprehension runs in its own scope (the <listcomp> frame in your traceback), so the locals() call inside the comprehension returns the comprehension's namespace rather than the function's, and that namespace does not contain 'q1'. You could make the queries global variables, but I would not recommend it. Moreover, using eval on strings that may come from user input is hazardous, as it can execute malicious code.
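
If you do want to keep eval, one possible workaround is to capture the function's namespace once, before the comprehension, and pass that mapping to eval explicitly. The sketch below uses plain sets instead of DataFrame indexes, and the names `pick` and `local_ns` are made up for illustration. The security concern about eval still applies.

```python
def pick(*names):
    q1 = {3, 4, 5}
    q2 = {0, 1, 2}
    q3 = {9}

    # Capture the function's namespace once, outside the comprehension.
    local_ns = locals()

    # eval looks names up in the mapping passed explicitly, so it no longer
    # matters which scope the comprehension itself runs in.
    return set().union(*[eval(name, globals(), local_ns) for name in names])


print(sorted(pick('q1', 'q2')))  # -> [0, 1, 2, 3, 4, 5]
```

The only change compared with the question's approach is where locals() is called: in the function body rather than inside the comprehension.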

    I suggest storing your predefined queries in a dictionary, as in this example:

    import numpy as np
    import pandas as pd

    class doorn:
        def __init__(self):
            self.name = 'foo'
            self.df = pd.DataFrame(
                data={'A': np.arange(0, 10), 'B': np.arange(5, 15), 'C': np.arange(14, 24)},
                index=list(range(0, 10)))

        def query_df(self, *query):
            # query arguments must be formatted as 'q1', 'q2' etc.
            possible_queries = {
                'q1': self.df.loc[self.df.A > 2].index,
                'q2': self.df.loc[self.df.B < 13].index,
                'q3': self.df.loc[self.df.C > 15].index,
            }

            # union of the row labels selected by each requested query
            sel_rows = set().union(*[possible_queries[x] for x in query])

            # recent pandas versions reject a set as an indexer, so pass a sorted list
            self.df = self.df.loc[sorted(sel_rows)]
    

    Hope this will help you.
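
As a variation on the dictionary idea, the queries could be stored as functions that build boolean masks, so that only the requested ones are evaluated and unknown names give a clear error. This is only a sketch: the module-level `QUERIES` registry and the standalone `query_df(df, *names)` function are hypothetical names chosen for the example, not part of the original class.

```python
import numpy as np
import pandas as pd

# Hypothetical registry: each name maps to a function that builds a boolean mask.
QUERIES = {
    'q1': lambda df: df.A > 2,
    'q2': lambda df: df.B < 13,
    'q3': lambda df: df.C > 15,
}


def query_df(df, *names):
    unknown = [n for n in names if n not in QUERIES]
    if unknown:
        raise ValueError(f"unknown query name(s): {unknown}")

    # Start from an all-False mask and OR in each requested query (row union).
    mask = pd.Series(False, index=df.index)
    for name in names:
        mask = mask | QUERIES[name](df)
    return df.loc[mask]


df = pd.DataFrame({'A': np.arange(0, 10), 'B': np.arange(5, 15), 'C': np.arange(14, 24)})
print(query_df(df, 'q1', 'q3').index.tolist())  # -> [2, 3, 4, 5, 6, 7, 8, 9]
```

Combining masks with `|` gives the same union of rows as the set-based version, while keeping the original row order and avoiding the conversion to a set.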