Search code examples
excelvbaactivexadoadodb

Using ADODB SQL in VBA, why are Strings truncated [to 255] only when I use grouping?


I’m using ADODB to query on Sheet1. If I fetch the data using SQL query on the sheet as below without grouping I’m getting all characters from comment.

However, if I use grouping my characters are truncated to 255.

Note – My first row contains 800 len of characters so drivers have identified the datatype correctly.


Here is my query output without grouping

Select  Product, Value, Comment, len(comment) from [sheet1$A1:T10000]

without grouping

With grouping

Select  Product, sum(value), Comment, len(comment) from [sheet1$A1:T10000] group by Product, Comment

with grouping


Solution

  • Thanks for posting this! During my 20+ years of database development using ADO recordsets I had never faced this issue until this week. Once I traced the truncation to the recordset I was really scratching my head. Couldn't figure how/why it was happening until I found your post and you got me focused on the GROUP BY. Sure enough, that was the cause (some kind of ADO bug I guess). I was able to work around it by putting correlated scalar sub-queries in the SELECT list, vice using JOIN and GROUP BY.

    To elaborate...

    At least 9 times out of 10 (in my experience) JOIN/GROUP BY syntax can be replaced with correlated scalar subquery syntax, with no appreciable loss of performance. That's fortunate in this case since there is apparently a bug with ADO recordset objects whereby GROUP BY syntax results in the truncation of text when the string length is greater than 255 characters.

    The first example below uses JOIN/GROUP BY. The second uses a correlated scalar subquery. Both would/should provide the same results. However, if any comment is greater than 255 characters these 2 queries will NOT return the same results if an ADODB recordset is involved.

    Note that in the second example the last column in the SELECT list is itself a full select statement. It's called a scalar subquery because it will only return 1 row / 1 column. If it returned multiple rows or columns an error would be thrown. It's also known as a correlated subquery because it references something that is immediately outside its scope (e.emp_number in this case).

    SELECT e.emp_number, e.emp_name, e.supv_comments, SUM(i.invoice_amt) As total_sales
    FROM employees e INNER JOIN invoices i ON e.emp_number = i.emp_number
    GROUP BY e.emp_number, e.emp_name, e.supv_comment
    
    SELECT e.emp_number, e.emp_name, e.supv_comments,
    (SELECT SUM(i.invoice_amt) FROM invoices i WHERE i.emp_number = e.emp_number) As total_sales
    FROM employees e