Search code examples
pythoncsvsqliteodo

Inserting 'NA' as a string into SQLite table using odo function on Python


Consider the following file (stock_prices.csv) in CSV format:

Symbol,Price
RY,96.61
NA,58.69
BNS,80.35

When using the odo function in Python to insert the CSV file into a SQLite database table, the NA ticker appears to be inserted as a None value.

from odo import odo, dshape

input_csv = 'stock_prices.csv'
output_sqlite = 'sqlite:///stocks.db::stock_prices'
ds = dshape('var * {Symbol: string, Price: float64}')

odo(input_csv, output_sqlite, dshape=ds)

Here is the code I used to query the SQLite database.

DB_PATH = 'stocks.db'
cn = sqlite3.connect(DB_PATH)
c = cn.cursor()
c.execute("SELECT * FROM stock_prices")
for row in c.fetchall():
    print(row)

Results were as follows:

('RY', 96.61)
(None, 58.69)
('BNS', 80.35)

While I could update each row whose Symbol is None with 'NA', I would rather insert the row correctly the first time.

Note: I am using odo function because for my actual project the files I will be inserting into the table are as large as several gigabytes and contain around 15-20 columns. odo seemed to me to be the quickest way for me to accomplish what I need to do in a short period of time.


Solution

  • One way to get around this is to read the file in using pandas and specifying na_filter=False.

    >>> import pandas as pd
    >>> df = pd.read_csv('stock_prices.csv',na_filter=False)
    >>> df
      Ticker  Price
    0     RY  96.61
    1     NA  58.69
    2    BNS  80.35
    

    And import:

    odo(df, output_sqlite, dshape=ds)
    

    Results:

    >>> for row in c.fetchall():
    ...     print(row)
    ...
    (u'RY', 96.61)
    (u'NA', 58.69)
    (u'BNS', 80.35)