Setup:
I am using the following components:
...against an MSSQL Server 2014 instance.
The problem:
Suppose I have a table consisting of only 2 columns:
My SQLAlchemy Model looks like this:
from sqlalchemy.dialects.mssql.base import VARCHAR

class MyText(Base):
    __tablename__ = 'Text'
    id = Column(Integer, primary_key=True)
    my_text = Column(VARCHAR())
When I create a new text entry like the one below, and my text is longer than 255 characters, the string is silently truncated to 255.
my_text = 'REALLY LONG STRING THAT IS LONGER THAN 255 CHARACTERS. Assume this string is 558 bp long.'
print(len(my_text)) # Gives 558 bp.
new_text = MyText(my_text=my_text)
print(len(new_text.my_text)) # Gives 558 bp.
db_s.add(new_text)
print(len(new_text.my_text)) # Gives 558 bp.
db_s.commit()
print(len(new_text.my_text)) # Gives 255 bp now after commit.
At first, I thought the truncation was happening at write time to the DB. But I found that it actually happens at query time (read below).
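Until the root cause is found, a small pure-Python guard (a hypothetical helper of my own, not part of SQLAlchemy) can at least turn the silent truncation into a loud error right after the commit:

```python
def check_roundtrip(original, stored):
    """Raise if a value read back from the DB is shorter than what was written.

    Hypothetical helper: compare the in-memory string against the ORM
    attribute after commit, when SQLAlchemy refreshes it from the DB.
    """
    if len(stored) < len(original):
        raise ValueError(
            "Truncated on round-trip: wrote {0} chars, read back {1}".format(
                len(original), len(stored)))
```

Calling `check_roundtrip(my_text, new_text.my_text)` right after `db_s.commit()` would raise immediately instead of letting truncated data go unnoticed.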
Question:
1.) Why is this happening?
I think this has something to do with the drivers (e.g. unixODBC 2.3.4, FreeTDS 1.12), but I am not entirely sure where exactly the truncation occurs.
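For what it's worth, a classic cause of exactly-255 truncation is FreeTDS falling back to the ancient TDS 4.2 protocol, which caps VARCHAR values at 255 bytes. A freetds.conf sketch (file location and section names depend on the install):

```ini
# freetds.conf (often /etc/freetds/freetds.conf)
[global]
    # TDS 4.2 caps VARCHAR at 255 bytes; 7.0+ is needed for longer strings,
    # and 7.2+ for MSSQL 2005+ types such as VARCHAR(max)
    tds version = 7.2
    # default TEXTSIZE (in bytes) applied to new connections
    text size = 64512
```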
2.) More importantly, how do I fix this issue?
These issues are related but different:
MySQL VARCHAR index truncated to 255 during migration
The problem there is MySQL-specific; since I am not using MySQL, the solution provided does not fix my issue.
unixODBC/FreeTDS results truncated to 255 character
No solution is provided in that thread.
On that note, it should be noted that my strings are routinely longer than 6000 characters (I am working with long DNA sequences), so I would really appreciate any way to fix the truncation issue.
Update (2017-10-12):
I made some remarkable but equally puzzling discoveries since yesterday.
# Connecting via pyodbc directly, using just some helper functions
# to make things more convenient.
con_str = create_connection_string(DATABASE='test')
cur = make_connection_db(connection_str=con_str)
for row in cur.execute('SELECT Text.my_text, len(Text.my_text) FROM [test].[dbo].[Text]'):
    print(row)
    print(len(row[0]))
This gives me a string that is 558 characters long (see below).
('ATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATG',
558)
558
Now using SQLAlchemy, but still with a direct SQL statement:
# Using a SQLAlchemy connection with a direct SQL query.
eoi_engine = create_engine(
    "mssql+pyodbc://user:somepw@db:1234/test?driver=FreeTDS")
s_con = eoi_engine.connect()
s_res = s_con.execute('SELECT Text.my_text, len(Text.my_text) FROM [test].[dbo].[Text]')
for row in s_res:
    print(row)
    print(len(row[0]))
This gives me a string that the server claims is 558 characters long, but which in reality is only 255 characters long.
('ATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATGATG',
558)
255
And the last scenario, using the full SQLAlchemy stack:
# Using SQLAlchemy full-stack.
Session = sessionmaker()
s = Session(bind=s_con)
fs_res = s.query(MyText).all()
for row in fs_res:
    print(row)
    print(len(row.my_text))
This gives me a string that is only 255 characters long.
So, in summary:
I don't know why I only get this truncation when I query through SQLAlchemy. Does anyone know the reason why? How can I fix this behavior? (I know the workaround is to use direct SQL queries via pyodbc...)
In my case, it turned out that several errors were compounding. I am listing them for posterity, in case other people run into the same problem:
Even changing VARCHAR(8000) to VARCHAR() did not fix the issue for me at first. It turns out that MSSQL's VARCHAR(max) is highly non-standard. An obscure reference led me to discover that there is a separate VARCHAR type just for MSSQL. If you import VARCHAR from the following location:
from sqlalchemy.dialects.mssql.base import VARCHAR
...you can then declare your text field as VARCHAR(), which equals VARCHAR(max) in MSSQL speak.
http://docs.sqlalchemy.org/en/latest/dialects/mssql.html
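With that import in place, emitting the schema on the mssql dialect should produce DDL along these lines (a sketch; the exact output depends on your SQLAlchemy version):

```sql
CREATE TABLE [Text] (
    id INTEGER NOT NULL IDENTITY(1,1),
    my_text VARCHAR(max) NULL,
    PRIMARY KEY (id)
)
```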
This increased the truncation limit to 4096 characters but did not fully resolve the issue.
I tried to decipher the following answer here:
SQLAlchemy Truncating VARCHAR(MAX)
Unfortunately, changing the text size buffer does not remove the truncation limit of 4096. In my case, I also had to precede queries to the DB with a SET TEXTSIZE statement issued through SQLAlchemy:
db_s.execute('SET TEXTSIZE {0}'.format(SOME_BIG_NUMBER_LIKE_20000))
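Rather than remembering to issue SET TEXTSIZE before each query, the statement can be installed once per connection via SQLAlchemy's pool "connect" event. A sketch (the helper names are my own; 2147483647 bytes is MSSQL's documented upper bound for TEXTSIZE):

```python
def textsize_statement(n_bytes=2147483647):
    # 2147483647 (2 GiB - 1) is the maximum TEXTSIZE MSSQL accepts
    return 'SET TEXTSIZE {0}'.format(int(n_bytes))

def install_textsize_listener(engine, n_bytes=2147483647):
    """Run SET TEXTSIZE on every new DBAPI connection of the given engine."""
    from sqlalchemy import event  # imported lazily; sketch only

    @event.listens_for(engine, 'connect')
    def _set_textsize(dbapi_conn, connection_record):
        cursor = dbapi_conn.cursor()
        cursor.execute(textsize_statement(n_bytes))
        cursor.close()
```

After `install_textsize_listener(eoi_engine)`, every connection the engine hands out starts with the large TEXTSIZE, so no query needs the manual prefix.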