
Dataclasses - Basic method chaining


I am trying to create a class that has two methods:

  • Query data (as a generator)

  • Save as json

    @dataclass
    class Data_Query:
        hierarchic: str
        sku: bool
        pred_lenght: int
    
        def query(self, db):
           if (self.hierarchic == 'store' and self.sku == True):
               x = db.aggregate([{...}]);
               self.export_json(x) 
    
        def export_json(self, x, file):
            with open(f'/home/Documents/dataset/{file}', 'w') as fp:
                for i in x:
                    json.dump(i, fp)
                    fp.write('\n')
    

When I execute the query method, both methods are executed.

    data = Data_Query('store', True, 56)
    data.query(db)

What do I have to modify in order to call these methods separately?

The call I would like to be able to write:

    data.query(db).export_json('abc.json')

Solution

  • Instead of calling export_json directly from query, save the result in an instance attribute and return self to enable chaining. export_json then reads the saved result from the instance rather than taking it as an argument.

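    import json
    from dataclasses import dataclass

    @dataclass
    class Data_Query:
        hierarchic: str
        sku: bool
        pred_lenght: int
    
        def query(self, db):
            if self.hierarchic == 'store' and self.sku:
                # Keep the result on the instance so export_json can find it.
                self.x = db.aggregate([{...}])
                # self.export_json(x) is no longer called from here
            return self  # returning self is what allows chaining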
    
        def export_json(self, file):
            try:
                x = self.x
            except AttributeError:
                return
            
            with open(f'/home/Documents/dataset/{file}', 'w') as fp:
                for i in x:
                    json.dump(i, fp)
                    fp.write('\n')
            del self.x

    Now you can write data.query(db).export_json('abc.json'), and the JSON file will only be written if, in fact, a query takes place.
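
    To check that behaviour without a real database, here is a minimal runnable sketch of the same chaining pattern. StubCollection is a hypothetical stand-in for db, the empty pipeline replaces the elided one, and two details are assumptions made for the demo rather than part of the answer above: export_json takes a full path, and it returns True or False to report whether a file was written.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


class StubCollection:
    """Hypothetical stand-in for the real `db` object (not a driver class)."""

    def aggregate(self, pipeline):
        # Yield documents lazily, the way a database cursor would.
        yield from ({"store": 1, "units": 10}, {"store": 2, "units": 7})


@dataclass
class Data_Query:
    hierarchic: str
    sku: bool
    pred_lenght: int

    def query(self, db):
        if self.hierarchic == 'store' and self.sku:
            self.x = db.aggregate([])  # real pipeline is elided in the post
        return self  # returning self is what makes chaining possible

    def export_json(self, path):
        # Assumption: a full path is passed instead of a bare file name.
        x = getattr(self, 'x', None)
        if x is None:
            return False  # no query ran, so nothing is written
        with open(path, 'w') as fp:
            for doc in x:
                json.dump(doc, fp)
                fp.write('\n')
        del self.x  # the saved result is consumed by the export
        return True


out_dir = Path(tempfile.mkdtemp())
hit = out_dir / 'hit.json'
miss = out_dir / 'miss.json'

# The condition in query() holds here, so a file is written.
wrote_hit = Data_Query('store', True, 56).query(StubCollection()).export_json(hit)
# The condition fails here, so export_json finds nothing to write.
wrote_miss = Data_Query('region', True, 56).query(StubCollection()).export_json(miss)
```

    Note that the chained call never fails when the condition is not met; it silently skips the export, which is one reason the hidden state on self can be surprising.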

    However, this isn't the greatest design. Nothing about export_json is specific to your class; it should be a regular function that takes a result and a file name and that you call after you make a query, if the query returns any results. Something more like

    @dataclass
    class Data_Query:
        hierarchic: str
        sku: bool
        pred_lenght: int
    
        def query(self, db):
            if self.hierarchic == 'store' and self.sku:
                return db.aggregate([{...}])
            # Falls through and returns None when the condition is not met.
    
    # A plain function, so it takes no self parameter.
    def export_json(x, file):
        with open(f'/home/Documents/dataset/{file}', 'w') as fp:
            for i in x:
                json.dump(i, fp)
                fp.write('\n')
    
    result = data.query(db)
    if result is not None:
        export_json(result, 'abc.json')
    

    You might argue "Of course export_json is related to my class; it assumes that x is an iterable of objects, which is something defined by the query method." In that case, you might consider defining a QueryResult class and making export_json a method of that class. Then DataQuery.query returns an instance of QueryResult, and chaining feels a little less arbitrary: you are exporting the result, not the query.

    # By the way, I am assuming there is more to this class than a query
    # method; otherwise, there should probably just be a regular function
    # that takes the db, hierarchic, and sku as arguments.
    @dataclass
    class DataQuery:
        hierarchic: str
        sku: bool
        pred_length: int
    
        def query(self, db):
            result = None
            if self.hierarchic == 'store' and self.sku:
                result = db.aggregate(...)
            return QueryResult(result)
    
    
    class QueryResult:
        def __init__(self, result):
            self.result = result
    
        def export_json(self, file):
            if self.result is None:
                return
    
            with open(f'/home/Documents/dataset/{file}', 'w') as fp:
                for i in self.result:
                    json.dump(i, fp)
                    fp.write('\n')
    
    
    data.query(db).export_json('abc.json')
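
    For anyone who wants to try the QueryResult design end to end, the following is a self-contained sketch under a few stated assumptions: StubCollection is a hypothetical replacement for db, the aggregation pipeline is left empty because the original one is elided, export_json receives a full path, and it returns the number of documents written (an addition for the demo, not part of the design above).

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


class StubCollection:
    """Hypothetical stand-in for `db`; the real collection is not shown."""

    def aggregate(self, pipeline):
        return iter([{"sku": "A1", "qty": 3}, {"sku": "B2", "qty": 5}])


class QueryResult:
    def __init__(self, result):
        self.result = result

    def export_json(self, path):
        # Assumption: a full path is passed rather than a bare file name.
        if self.result is None:
            return 0  # nothing to export, and the file is left untouched
        count = 0
        with open(path, 'w') as fp:
            for doc in self.result:
                json.dump(doc, fp)  # one JSON document per line
                fp.write('\n')
                count += 1
        return count


@dataclass
class DataQuery:
    hierarchic: str
    sku: bool
    pred_length: int

    def query(self, db):
        result = None
        if self.hierarchic == 'store' and self.sku:
            result = db.aggregate([])  # real pipeline is elided in the post
        return QueryResult(result)


target = Path(tempfile.mkdtemp()) / 'abc.json'
written = DataQuery('store', True, 56).query(StubCollection()).export_json(target)
skipped = DataQuery('item', False, 56).query(StubCollection()).export_json(target)
```

    Because query always returns a QueryResult, the chained call is safe even when no query ran; the empty case is handled in one place, inside export_json.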