Search code examples
sqldatabase-designdatabase-normalization

Is it better to have int joins instead of string columns?


Let's say I have a User which has a status and the user's status can be 'active', 'suspended' or 'inactive'.

Now, when creating the database, I was wondering... would it be better to have a column with the string value (with an enum type, or rule applied) so it's easier to both query and know the current user status or are joins better and I should join in a UserStatuses table which contains the possible user statuses?

Assuming, of course statuses can not be created by the application user.

Edit: Some clarification

  1. I would NOT use string joins, it would be a int join to UserStatuses PK
  2. My primary concern is performance wise
  3. The possible status ARE STATIC and will NEVER change

Solution

  • On most systems it makes little or no difference to performance. Personally I'd use a short string for clarity and join that to a table with more detail as you suggest.

    create table intLookup
    (
    pk integer primary key,
    value varchar(20) not null
    )
    insert into intLookup (pk, value) values
    (1,'value 1'),
    (2,'value 2'),
    (3,'value 3'),
    (4,'value 4')
    
    create table stringLookup
    (
    pk varchar(4) primary key,
    value varchar(20) not null
    )
    
    insert into stringLookup (pk, value) values
    (1,'value 1'),
    (2,'value 2'),
    (3,'value 3'),
    (4,'value 4')
    
    
    
    create table masterData
    (
    stuff varchar(50),
    fkInt integer references intLookup(pk),
    fkString varchar(4)references stringLookup(pk)
    )
    create index i on masterData(fkInt)
    create index s on masterData(fkString)
    
    insert into masterData
    (stuff, fkInt, fkString)
    select COLUMN_NAME, (ORDINAL_POSITION %4)+1,(ORDINAL_POSITION %4)+1 from INFORMATION_SCHEMA.COLUMNS
    go 1000
    

    This results in 300K rows.

    select 
    *
    from masterData m inner join intLookup i on m.fkInt=i.pk
    
    select 
    *
    from masterData m inner join stringLookup s on m.fkString=s.pk
    

    On my system (SQL Server) - the query plans, I/O and CPU are identical - execution times are identical. - The lookup table is read and processed once (in either query)

    There is NO difference using an int or a string.