Search code examples
pythonsqlazurevectorazure-sql-database

Error: Explicit Conversion from Data Type ntext to vector Not Allowed in Azure SQL Database


I'm working on implementing hybrid search functionality using Azure SQL Database. I am trying to execute https://github.com/Azure-Samples/azure-sql-db-vector-search/blob/main/Hybrid-Search/hybrid_search.py and encountered the following error in my Python script:

pyodbc.DataError: ('22018', '[22018] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Explicit conversion from data type ntext to vector is not allowed. (529) (SQLExecDirectW); [22018] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Statement(s) could not be prepared. (8180)')

Following is the simple SQL Statement which is failing

INSERT INTO dbo.documents (id, content, embedding) VALUES (?, ?, CAST(? AS VECTOR(384)));

This error occurs when executing a SQL query that involves vector operations. It appears that the embeddings are being treated as ntext, and the current setup doesn't recognize the vector data type.

Environment Details:

  • Python version: 3.11.x
  • ODBC Driver: Microsoft ODBC Driver 18 for SQL Server
  • Azure SQL Database

Steps Taken:

  • Verified that the database schema defines the relevant column with the appropriate data type for storing vector embeddings.
  • Ensured that the Python environment is using the latest version (18) of the ODBC driver.
  • Confirmed that the SQL Server instance supports vector data types and operations.

Solution

  • enter image description here

    The data you are getting is ntext data type, cast that involves converting an ntext data type to another incompatible data type, like vector . That may be the reason to get above error. To resolve error, Cast the data into NVARCHAR(MAX) after that cast it to VECTOR(384). Use below command to insert the data:

    INSERT INTO dbo.documents (id, content, embedding)
    VALUES (?, ?, CAST(CAST(? as NVARCHAR(MAX)) AS VECTOR(384)));
    

    Then the code will execute successfully, without any error:

    enter image description here

    And the data will insert into the table successfully as shown below:

    enter image description here

    Here is the complete code for your reference;

    import os
    import pyodbc
    import logging
    import json
    from sentence_transformers import SentenceTransformer
    
    connection_string = (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=<serverName>.database.windows.net;"
        "DATABASE=<db>;"
        "UID=<userName>;"
        "PWD=<password>"
    )
    
    if __name__ == '__main__':
        print('Initializing sample...')
        print('Getting embeddings...')
        sentences = [
            'The dog is barking',
            'The cat is purring',
            'The bear is growling'
        ]
        model = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')  # returns a 384-dimensional vector
        embeddings = model.encode(sentences)
    
        # Establishing the connection using the hardcoded connection string
        conn = pyodbc.connect(connection_string)
    
        print('Cleaning up the database...')
        try:
            cursor = conn.cursor()
            cursor.execute("DELETE FROM dbo.documents;")
            cursor.commit()
        finally:
            cursor.close()
    
        print('Saving documents and embeddings in the database...')
        try:
            cursor = conn.cursor()
    
            for id, (content, embedding) in enumerate(zip(sentences, embeddings)):
                cursor.execute(f"""
                    INSERT INTO dbo.documents (id, content, embedding) 
                    VALUES (?, ?, CAST(cast(? as NVARCHAR(MAX)) AS VECTOR(384)));
                """,
                id,
                content, 
                json.dumps(embedding.tolist())
                )
    
            cursor.commit()
        finally:
            cursor.close()