I am trying to create a class that has two methods:
Query data (as a generator)
Save as JSON
@dataclass
class Data_Query:
    hierarchic: str
    sku: bool
    pred_lenght: int

    def query(self, db):
        if (self.hierarchic == 'store' and self.sku == True):
            x = db.aggregate([{...}]);
            self.export_json(x)

    def export_json(self, x, file):
        with open(f'/home/Documents/dataset/{file}', 'w') as fp:
            for i in x:
                json.dump(i, fp)
                fp.write('\n')
When I execute the query method, both methods are executed.
data = Data_Query('store', True, 56)
data.query(db)
What do I have to modify in order to call these methods separately?
The usage I would like:
data.query(db).export_json('abc.json')
Instead of calling export_json directly from query, save the result in an instance attribute and return self to enable chaining. Then export_json looks for the saved query result on the instance, rather than taking it as an argument.
@dataclass
class Data_Query:
    hierarchic: str
    sku: bool
    pred_lenght: int

    def query(self, db):
        if (self.hierarchic == 'store' and self.sku == True):
            self.x = db.aggregate([{...}])
            # self.export_json(x)
        return self

    def export_json(self, file):
        try:
            x = self.x
        except AttributeError:
            return

        with open(f'/home/Documents/dataset/{file}', 'w') as fp:
            for i in x:
                json.dump(i, fp)
                fp.write('\n')
        del self.x
Now you can write data.query(db).export_json('abc.json'), and the JSON file will only be written if, in fact, a query takes place.
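To try the chaining behaviour without a real database, here is a minimal sketch. FakeDB is a hypothetical stand-in whose aggregate method just yields a few dicts (it is not part of any driver), and the output directory is passed in as an extra out_dir argument instead of being hard-coded; both are assumptions made purely for illustration.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


class FakeDB:
    """Hypothetical stand-in for the real collection; not a driver class."""

    def aggregate(self, pipeline):
        # Yield documents lazily, the way a cursor would.
        yield {"store": 1, "sales": 10}
        yield {"store": 2, "sales": 20}


@dataclass
class Data_Query:
    hierarchic: str
    sku: bool
    pred_lenght: int

    def query(self, db):
        if self.hierarchic == 'store' and self.sku:
            self.x = db.aggregate([])  # pipeline elided in this sketch
        return self  # returning self is what makes chaining possible

    def export_json(self, file, out_dir):
        try:
            x = self.x
        except AttributeError:
            return  # no query was run, so nothing is written
        with open(Path(out_dir) / file, 'w') as fp:
            for i in x:
                json.dump(i, fp)
                fp.write('\n')
        del self.x


out_dir = tempfile.mkdtemp()
# The condition holds here, so a file is written.
Data_Query('store', True, 56).query(FakeDB()).export_json('abc.json', out_dir)
# The condition fails here, so export_json returns without writing.
Data_Query('region', True, 56).query(FakeDB()).export_json('skipped.json', out_dir)

written = sorted(p.name for p in Path(out_dir).iterdir())
lines = (Path(out_dir) / 'abc.json').read_text().splitlines()
```

After this runs, written contains only abc.json, with one JSON document per line.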
However, this isn't the greatest design. Nothing about export_json is specific to your class; it should be a regular function that takes a result and a file name, and that you call after you make a query, if the query returns any results. Something more like:
@dataclass
class Data_Query:
    hierarchic: str
    sku: bool
    pred_lenght: int

    def query(self, db):
        if (self.hierarchic == 'store' and self.sku == True):
            return db.aggregate([{...}])

# A regular module-level function, not a method: no self parameter.
def export_json(x, file):
    with open(f'/home/Documents/dataset/{file}', 'w') as fp:
        for i in x:
            json.dump(i, fp)
            fp.write('\n')

result = data.query(db)
if result is not None:
    export_json(result, 'abc.json')
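Because export_json is a plain function here, it works on any iterable of JSON-serialisable objects, not just query results. A small sketch of that point, assuming a hypothetical out_dir parameter in place of the hard-coded directory and a hypothetical maybe_query helper standing in for data.query(db):

```python
import json
import tempfile
from pathlib import Path


def export_json(x, file, out_dir):
    # Write one JSON document per line.
    with open(Path(out_dir) / file, 'w') as fp:
        for i in x:
            json.dump(i, fp)
            fp.write('\n')


def maybe_query(run):
    # Hypothetical stand-in for data.query(db): an iterator or None.
    return iter([{"sku": "a"}, {"sku": "b"}]) if run else None


out_dir = tempfile.mkdtemp()
for run, name in [(True, 'abc.json'), (False, 'none.json')]:
    result = maybe_query(run)
    if result is not None:  # only export when the query produced something
        export_json(result, name, out_dir)

files = sorted(p.name for p in Path(out_dir).iterdir())
content = (Path(out_dir) / 'abc.json').read_text()
```

The caller decides whether to export, so the function itself needs no knowledge of how the result was produced.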
You might argue "Of course export_json is related to my class; it assumes that x is an iterable of objects, which is something defined by the query method." In that case, you might consider defining a QueryResult class and making export_json a method of that class. Then DataQuery.query returns an instance of QueryResult, and chaining feels a little less arbitrary: you are exporting the result, not the query.
# By the way, I am assuming there is more to this class than a query
# method; otherwise, there should probably just be a regular function
# that takes the db, hierarchic, and sku as arguments.
@dataclass
class DataQuery:
    hierarchic: str
    sku: bool
    pred_length: int

    def query(self, db):
        result = None
        if self.hierarchic == 'store' and self.sku:
            result = db.aggregate(...)
        return QueryResult(result)


class QueryResult:
    def __init__(self, result):
        self.result = result

    def export_json(self, file):
        # Nothing to write if the query did not run.
        if self.result is None:
            return
        with open(f'/home/Documents/dataset/{file}', 'w') as fp:
            for i in self.result:
                json.dump(i, fp)
                fp.write('\n')


data.query(db).export_json('abc.json')
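With this design the two steps can also be called separately, which is what the question asked for: keep the QueryResult and export it whenever you like. The sketch below is illustrative only. FakeDB is a hypothetical stub, export_json takes a full path rather than a file name, and it returns True or False to report whether anything was written; none of these details come from the original code.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


class QueryResult:
    def __init__(self, result):
        self.result = result

    def export_json(self, path):
        if self.result is None:
            return False  # nothing to export
        with open(path, 'w') as fp:
            for i in self.result:
                json.dump(i, fp)
                fp.write('\n')
        return True


@dataclass
class DataQuery:
    hierarchic: str
    sku: bool
    pred_length: int

    def query(self, db):
        result = None
        if self.hierarchic == 'store' and self.sku:
            result = db.aggregate([])  # pipeline elided in this sketch
        return QueryResult(result)


class FakeDB:
    """Hypothetical stub; returns an iterator like a cursor would."""

    def aggregate(self, pipeline):
        return iter([{"store": 1}, {"store": 2}])


out = Path(tempfile.mkdtemp())
db = FakeDB()

result = DataQuery('store', True, 56).query(db)   # step 1: query only
exported = result.export_json(out / 'abc.json')   # step 2: export later

empty = DataQuery('region', False, 56).query(db)  # condition not met
skipped = empty.export_json(out / 'empty.json')   # writes nothing
```

One caveat: if the underlying result is a one-shot iterator, as it is in this stub, it can only be exported once.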