Search code examples
sql-serverpyodbcfreetdsdjango-2.2sql-variant

Is there a way in Django to always CAST a specific Field to VARCHAR?


Background Situation Description:

I'm creating a Django2 application which use a Microsoft SQL Server as back-end database. This database has several tables (which I cannot manage or change them) containing SQL_VARIANT fields. This fields contains either an empty string or a number (They are used as Primary Key or as Foreign Key)

Problem:

On MSSQL when you make a JOIN using a SQL_VARIANT field and a VARCHAR field, it yields an row empty result, to solve this you have to explicitly convert the SQL_VARIANT field to VARCHAR using CAST([Table Name].[SQL_VARIANT Field Name] AS VARCHAR

Goal:

I want to find a way to do the CAST([Table Name].[SQL_VARIANT Field Name] AS VARCHAR every time Django makes a JOIN on the QuerySet or calls a SQL_VARIANT field on the QuerySet

How can I accomplish this?

Code as reference:

In my problem I'm using 3 tables from MSSQL Database (Vendor, Product and Product Ledger Entry)

Vendor Table

create table [Vendor]
(
    Id sql_variant not null primary key, 
    Name varchar(30) not null,
    Address varchar(30) not null,
    City varchar(30) not null,
    Contact varchar(30) not null,
    Type int not null,
    Category int not null,
)

Product Table

create table [Product]
(
    Id varchar(20) not null primary key,
    Description varchar(60) not null,
    Class varchar(10) not null,
    [Vendor Id] varchar(20) not null,
)

Product Ledger Entry table

create table [Product Ledger Entry]
(
    Id int not null primary key,
    [Product Id] varchar(20) not null,
    [Posting Date] datetime not null,
    [Entry Type] int not null,
    [Source Id] varchar(20) not null,
    [Document Id] varchar(20) not null,
    Quantity decimal(38,20) not null,
)

If I wanted to transverse all 3 tables, I would do ...

SELECT
       [Vendor].[Id],
       [Vendor].[Name],
       [Product].[Id],
       [Product Ledger Entry].Quantity
FROM [Vendor]
    INNER JOIN [Product] ON ([Vendor].[Id] = [Product].[Vendor Id])
    INNER JOIN [Product Ledger Entry] ON ([Product].[Id] = [Product Ledger Entry].[Product Id])

However this query doesn't yield any rows. Only by making a explicit cast over the SQL_VARIANT field, this query shows the excepted result.

SELECT
       [Vendor].[Id],
       [Vendor].[Name],
       [Product].[Id],
       [Product Ledger Entry].Quantity
FROM [Vendor]
    INNER JOIN [Product] ON (CAST([Vendor].[Id] AS VARCHAR(20)) = [Product].[Vendor Id])
    INNER JOIN [Product Ledger Entry] ON ([Product].[Id] = [Product Ledger Entry].[Product Id])

The same thing happends when you use Django, here are the models:

class Vendor(models.Model):
    id = models.CharField(db_column='Id', primary_key=True, , max_length=20)

    name = models.CharField(db_column='Name', max_length=30)

    # Address Description
    address = models.CharField(db_column='Address', max_length=30)

    # Province/City/Municipality
    city = models.CharField(db_column='City', max_length=30)

    class Meta:
        managed = False
        db_table = 'Vendor'

    def __str__(self):
        return f'{self.id} | {self.name}'
class Product(models.Model):
    id = models.CharField(db_column='No_', primary_key=True, max_length=20)

    description = models.CharField(db_column='Description', max_length=60)

    vendor_id = models.ForeignKey(
        'Vendor', on_delete=models.CASCADE,
        db_column='Vendor Id', to_field='id', related_name='products', db_index=False, blank=True
    )

    class Meta:
        managed = False
        db_table = 'Product'

    def __str__(self):
        return f'{self.id}'
class ProductLedgerEntry(models.Model):
    id = models.IntegerField(db_column='Id', primary_key=True)

    product_id = models.ForeignKey(
        'Product', on_delete=models.CASCADE,
        db_column='Product Id', to_field='id', related_name='ledger_entries', db_index=False
    )

    posting_date = models.DateTimeField(db_column='Posting Date')

    ENTRY_TYPE = [
        (0, 'Compra'),
        (1, 'Venta'),
        (2, 'Ajuste positivo'),
        (3, 'Ajuste negativo'),
        (4, 'Transferencia'),
        (5, 'Consumo'),
        (6, 'Salida desde fab.'),
    ]
    entry_type = models.IntegerField(
        db_column='Entry Type', choices=ENTRY_TYPE
    )

    quantity = models.DecimalField(db_column='Quantity', max_digits=38, decimal_places=20)

    class Meta:
        managed = False
        db_table = 'Product Ledger Entry'

    def __str__(self):
        return f'{self.product_id} | {self.variant_code} | {self.posting_date}'

As long as I don't make use of annotate(...), select_related(...) or any other Django API methond that makes JOIN under the hood, I can manage to get proper results. However, I want to use annotate.

Please help me


Solution

  • AFAIK, the only way to make this possible would be to fork the SQL Server backend you are using (presumably django-pyodbc-azure) to insert an override when it detects the SQL_VARIANT field.

    However, that feels like a pretty bad idea; as the book "Two Scoops of Django" says, avoid the temptation to create a FrankenDjango! Your underlying tables aren't created by Django; in my experience, it is best to use raw SQL for tables that aren't created and managed by Django. You can use a mix of ORM for tables managed or created by Django, and raw SQL for legacy tables created outside of Django.

    Using bound parameters through Django's execute() method still gives you SQL injection protection: https://docs.djangoproject.com/en/2.2/topics/db/sql/#executing-custom-sql-directly

    Good luck!