python sql bash flat-file csv

python/bash SQL for tsv flatfiles (No sqlite)


Background:

sqlite is great for running SQL on data that has been loaded into a database, but in my line of work I often need to do selects, joins, and where clauses on files that aren't in a database and aren't necessarily worth the time to load and initialize into one. Also, sqlite's random-access characteristics often make operations that touch every row slower.

Question:

Is there a suite of SQL-type commands/functions (preferably python/bash) that doesn't need sqlite and works directly on raw tab-separated files? For instance, instead of selecting from tables, just select from files by column number.

Example:

select col1,col2,col3 from fileName.tsv where col1[int] < 3

Note: I realize a lot of this can be accomplished with awk, cut, bash-join, etc.; I was wondering whether there is something more SQLesque.


Solution

  • After googling "python equivalent of DBD::CSV", I found KirbyBase, which looks as though it'll fit the bill.

    Since I generally don't use Python, however, I've never tried it.

    Edited to add: Okay, after taking a glance at the documentation, the query commands aren't exactly SQL, but they're a lot more SQLesque than using awk.
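
For comparison, the example query from the question can be sketched in plain Python using only the standard-library csv module. This is not KirbyBase's API: the function name `select_tsv`, its parameters, and the sample rows are all made up for illustration, and the sketch assumes a header-less, tab-separated file whose first column holds integers.

```python
import csv
import io


def select_tsv(lines, columns, where=None):
    """Yield the chosen columns (1-based, like cut/awk) from rows that pass `where`.

    `lines` is any iterable of text lines, e.g. an open file object.
    `where` is an optional function that takes a row (list of strings)
    and returns True for rows to keep.
    """
    for row in csv.reader(lines, delimiter="\t"):
        if where is None or where(row):
            yield [row[i - 1] for i in columns]


# Stand-in for open("fileName.tsv", newline=""); the data is invented.
sample = io.StringIO("1\ta\tx\n5\tb\ty\n2\tc\tz\n")

# select col1,col2,col3 from fileName.tsv where col1[int] < 3
result = list(select_tsv(sample, [1, 2, 3], where=lambda r: int(r[0]) < 3))
print(result)
```

Because the rows are streamed one at a time, nothing is loaded or indexed up front, which suits the "touch every row once" case described in the question. Joins would need more than this, for example building a dict keyed on the join column of the smaller file.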