Search code examples
pythondatabaseblobpyodbcsybase

Data from Sybase image column truncated at 32 KiB when retrieved via pyodbc


I have PDF files stored as image datatype (large binary data as mentioned in the doc) in a sybase database table. I am trying to read one of those files from the db and write it to a file in a local folder using python pyodbc package like this example :

 driver = "FreeTDS"
 prt = 'port'
 db = 'db'
 passwd = 'passwd'
 usr = 'usr'
 serv = 'serv'
 conn = pyodbc.connect(driver=driver, server=serv, port=prt, uid=usr, pwd=passwd)
 sql_query = (
    "SELECT ARCH_DOC_DOC as file_content FROM table_name WHERE ARCH_DOC_ID = id"
 )
 cursor = conn.cursor()
 cursor.execute(sql_query)
 pdf_data = cursor.fetchone()[0]
 with open('my_test_file.pdf', 'wb') as f:
     f.write(pdf_data)

I am using TDS driver and running this code on Debian GNU/Linux 11 machine

Compile-time settings (established with the "configure" script)
                            Version: freetds v1.2.3
             freetds.conf directory: /etc/freetds
     MS db-lib source compatibility: no
        Sybase binary compatibility: yes
                      Thread safety: yes
                      iconv library: yes
                        TDS version: auto
                              iODBC: no
                           unixodbc: yes
              SSPI "trusted" logins: no
                           Kerberos: yes
                            OpenSSL: no
                             GnuTLS: yes
                               MARS: yes

The problem is that I am getting corrupt file in the end and after testing a couple of files I noticed that I am always getting a file size 33ko. For example, the original file size that I am using to test is 90ko in the db and the file I am getting is only 33ko. So I am wondering if the issue is in the database/driver config or if there is a limit in the size of data that I can read with pyodbc ? And how can I fix that ?


Solution

  • This is a reproducible issue, discussed here

    https://github.com/mkleehammer/pyodbc/issues/1226

    As a workaround, we can use JayDeBeApi and jTDS, like so:

    import jaydebeapi
    
    cnxn = jaydebeapi.connect(
        "net.sourceforge.jtds.jdbc.Driver",
        "jdbc:jtds:sybase://192.168.0.199:5000/mydb;useLOBs=false",
        ["sa", "myPassword"],
        "/home/gord/Downloads/jtds-1.3.1.jar"
        )
    crsr = cnxn.cursor()
    crsr.execute("SELECT ARCH_DOC_DOC FROM so76408133 WHERE ARCH_DOC_ID = 1")
    pdf_data = crsr.fetchone()[0]
    with open("test_pdf", "wb") as f:
        f.write(pdf_data)
    

    Note that this requires a Java Runtime Environment (JRE). On Ubuntu/Debian, it can be installed via

    sudo apt install default-jre