Search code examples
pythonpandasoracle-databasecx-oracle

Set correct pandas.DataFrame datatypes from cx_Oracle default data types


I have some oracle table with next field datatype:

ID|              NOT NULL NUMBER              
VERSION|         NOT NULL NUMBER              
STAT_ACTUAL|     NOT NULL NUMBER              
REGION|          NOT NULL VARCHAR2(5 CHAR)    
PARENTID|                 NUMBER              
CITY|                 VARCHAR2(5 CHAR)    
...

and if I had trying to add them from cx_Oracle to pandas.DataFrame like that:

import pandas as pd
from sqlalchemy import *

conn = cx_Oracle.connect('datbs/datbs@host/serv')
cursorr = conn.cursor()
SQL  = "select * from table where region = 1"

df= pd.read_sql_query(SQL, con=conn)

datatypes looks like:

ID                          int64
VERSION                     int64
STAT_ACTUAL                 int64
REGION                     object
PARENTID                    int64
CITY                       object
...

So, all my oracle varchar2 (which looks in cx_Oracle as ('REGION', <class 'cx_Oracle.STRING'>)) are object in pandas.df!

I tried to convert them by this:

def OutConverter(value):
    if value is None:
        return ''
    return value

def VarToStr(cursor, name, defaultType, size, precision, scale):
    if defaultType in (cx_Oracle.STRING, cx_Oracle.OBJECT):
        return cursor.var(str, size, cursorr.arraysize, outconverter=OutConverter)

conn.outputtypehandler = VarToStr

but result hasn't been taken...df.dtypes show object type again

How to map oracle and cx_Oracle datatypes to pandas df?


Solution

  • I had used manual cast of uncorrectly columns like:

    newdf = df.astype({"REGION": "Int64", ..., "CITY": "Int32", ...})
    

    outupt:

    REGION                      int64
    CITY                        Int32