Tags: perl, single-responsibility-principle

Single Responsibility Principle: Write data to file after running a query


I have to write the rows returned by an SQL query to a file.

# Run the SQL script.
my $dbh = get_dbh($source);
my $qry = $dbh->prepare($sql_data);
$qry->execute();

# Dump the data to file.
open(my $fh_write, ">", $filename) or die "Cannot open $filename: $!";
while (my @data = $qry->fetchrow_array())
{
  print {$fh_write} join("\t", @data) . "\n";
}
close($fh_write);

Clearly I am doing two things in one function:

  1. Running the sql query.
  2. Writing the data to file.

Is there a way to do this following the SRP?

There are lots of rows in the data, so returning the whole array of rows from a separate function might not be a good idea.


Solution

  • You could split it up into two different functions. One would query the database, and the other would write data to a file.

    sub run_query {
        my ( $sql, @args ) = @_;
    
        # if you truly want separation of concerns,
        # you need to connect $dbh somewhere else
    
        my $sth = $dbh->prepare($sql);
        $sth->execute(@args);
    
        # this creates an iterator
        return sub {
            return $sth->fetchrow_arrayref;
        };
    }
    

    This function takes an SQL query and some arguments (remember to use placeholders!) and runs the query. It returns a code reference that closes over $sth. Every time that reference is invoked, one row of results will be fetched. When the statement handle $sth is exhausted, it will return undef, which is passed through, and you're done iterating. That might seem like overkill, but stay with me for a moment.
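
    For example, here is a minimal usage sketch. It assumes $dbh has already been connected somewhere else (say via get_dbh() from the question), and the table and placeholder value are made up for illustration:

    my $next_row = run_query( 'SELECT id, name FROM customers WHERE active = ?', 1 );

    while ( my $row = $next_row->() ) {
        # $row is an array reference holding one result row
        print join( "\t", @{$row} ), "\n";
    }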

    Next, we make a function to write data to a file.

    sub write_to_file {
        my ( $filename, $iter ) = @_;
    
        open my $fh, '>', $filename or die $!;
    
        while ( my $data = $iter->() ) {
            print $fh join( "\t", @{$data} ), "\n";
        }
    
        return;
    }
    

    This takes a filename and an iterator, which is a code reference. It opens the file and then iterates until there is no more data left. Every row is written to the file as one line. We don't need an explicit close $fh because it's a lexical filehandle that will be closed implicitly once $fh goes out of scope at the end of the function.

    What you've done now is define an interface. Your write_to_file function's interface is that it takes a file name and an iterator that always returns an array reference of fields.
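
    As a toy illustration, any code reference that keeps returning array references and finally returns undef satisfies that interface, no database required (array_iter here is purely hypothetical):

    sub array_iter {
        my @rows = @_;                      # each element is an array reference
        return sub { return shift @rows };
    }

    # write_to_file neither knows nor cares where the rows came from:
    write_to_file( 'static.tsv', array_iter( [ 1, 'foo' ], [ 2, 'bar' ] ) );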

    Let's put this together.

    my $iter = run_query('SELECT * FROM orders');
    write_to_file( 'orders.csv', $iter );
    

    Two lines of code. One runs the query, the other one writes the data. Looks pretty separated to me.

    The good thing about this approach is that now you can also write other things to a file with the same code. The following code could, for example, talk to some API. The iterator it returns again gives us one row of results per invocation.

    sub api_query {
        my ($customer_id) = @_;
    
        my $api = API::Client->new;
        my $res = $api->get_orders($customer_id); # returns [ [...], [...], ... ], one array ref of fields per order
    
        my $i = 0;
        return sub {
            return if $i > $#{ $res };    # every row has been handed out
            return $res->[$i++];
        }
    }
    

    You could drop this into the above example instead of run_query() and it would work, because this function returns something that adheres to the same interface. You could just as well make a write_to_api or write_to_slack_bot function that has the same partial interface. One of the parameters would be the same kind of iterator. Now those are exchangeable too.
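
    For instance, a write_to_api could be sketched like this. API::Client is reused from the example above as a stand-in, and post_order is a hypothetical method, just to show the shape:

    sub write_to_api {
        my ( $client, $iter ) = @_;

        while ( my $data = $iter->() ) {
            $client->post_order( @{$data} );
        }

        return;
    }

    # Any of the iterators above would fit here:
    # write_to_api( API::Client->new, run_query('SELECT * FROM orders') );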


    Of course, this whole example is very contrived. In reality, it depends heavily on the size and complexity of your program.

    If it's a script that runs as a cronjob that does nothing but create this report once a day, you should not care about this separation of concerns. The pragmatic approach would likely be the better choice.

    Once you have a lot of those, you'd start caring a bit more. Then the approach above might be viable, but only if you really need that flexibility.

    Not every concept is always applicable, and not every concept always makes sense.


    Please keep in mind that there are tools better suited for these jobs. Instead of hand-rolling your own CSV (or tab-separated) file, you can use Text::CSV_XS. Or you could use an ORM like DBIx::Class and have ResultSet objects as your interface.
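
    For example, a sketch of the write_to_file variant built on Text::CSV_XS might look like this (untested; the attributes shown are just sensible defaults):

    use Text::CSV_XS;

    sub write_csv {
        my ( $filename, $iter ) = @_;

        my $csv = Text::CSV_XS->new( { binary => 1, eol => "\n" } )
            or die "Cannot use Text::CSV_XS: " . Text::CSV_XS->error_diag;

        open my $fh, '>', $filename or die $!;

        while ( my $data = $iter->() ) {
            $csv->print( $fh, $data );    # proper quoting and escaping for free
        }

        return;
    }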