Search code examples
oracle-databasems-accesssql-like

How do I run a general SQL LIKE statement with specific exclusions?


I am writing a query in MS Access¹ towards an Oracle database, where I am trying to find everyone incarcerated in prisons in Norway during the census of 1875. To do this, I have to search for keywords in several different fields, as the ennumerators would describe their state several different ways, e.g.:

  • By describing their family connection as ‘Fange’, ‘Arrestant’ (prisoner) or the likes.
  • By listing them as temporarily present at a location named ‘Fængsel’ (prison) or the likes.
  • If at their family’s home, by listing them as absent and naming their assumed location as above.
  • By listing them as living in a separate building, named ‘Fængsel’ or the likes.
  • By listing their profession as ‘Fange’, ‘Arrestant’ (v.s.) or the likes.

One of the keywords I am looking for, is variants of ‘Bodsfængsel’ (a ‘prison of penitence’). Problems is, ‘Bod’ can also mean stall/small building. To make sure I include all variants of ‘Bodsfængsel’, I have written the following lines (only the relevant code snippet is shown here; a complete section is shown below):

Or BOSTNVN Like "*bod*"
    And
    (
        BOSTNVN not Like "*bode*" or
        BOSTNVN not Like "*dbod*" or
        BOSTNVN not Like "*bodg*"
    )

Problem is, when I run this code, I get 8513 hits, exactly the same number of lines as when I exclude the not like-lines.

Comments: The exclusions are to avoid hits on placenames such as ‘Toldbod’, ‘Boden’ or ‘Toldbodgade’ If I somehow could get it to output extra fields (e.g. Field0, Field1, Field2 et c.) to show where the hit was found, it would probably make it much easier to find the errors too. I hope I have provided all necessary information. Please don’t chop my head off!

¹ Access 2007 on Win 7 Enterprise.

Here is a complete snippet from the beginning of the query; if this is not necessary, please advise, and I will remove it:

SELECT
    KOMMNR, KRETSNR, BOSTNR, PERSNR,
    ⋮
PID  
FROM
    FOLKETELLINGER_PERSON_1875  
WHERE
    (
        (
            KOMMNR Not Like "11*" 
            And KOMMNR Not Like "12*" 
            ⋮
            And KOMMNR Not Like "17*"
        ) 
        AND (
            SEDVBO Like "*Fæng*" 
            Or SEDVBO Like "*fæng*" 
            ⋮
            Or SEDVBO Like "*arest*"
            And
            (
                SEDVBO not Like "*Bode*" or
                SEDVBO not Like "*dBod*" or
                SEDVBO not Like "*Bodg*"
            )
            Or SEDVBO Like "*bod*"
            And
            (
                SEDVBO not Like "*bode*" or
                SEDVBO not Like "*dbod*" or
                SEDVBO not Like "*bodg*"
            )
            Or SEDVBO Like "*Bot*"
            And
            (
                SEDVBO not Like "*Bote*" or
                SEDVBO not Like "*dBot*" or
                SEDVBO not Like "*Botg*"
            )
            Or SEDVBO Like "*bot*"
            And
            (
                SEDVBO not Like "*bote*" or
                SEDVBO not Like "*dbot*" or
                SEDVBO not Like "*botg*"
            )
        )
    ) 
    OR (
    ⋮
    ); 

Solution

  • First, SQL is not case sensitive, so you have redundant lines. E.g. these two do the same thing:

    Or SEDVBO Like "*Bot*"
    Or SEDVBO Like "*bot*"
    

    You're using or in your bottom level conditions where you should be using and. If you use or it will return all records because if any one of the inclusion criteria are triggered the record would be included even if it fails all others at that same level. For example, if a value is not Like '*Bode*' it will be included even if it is like "*dBod*".

    You also have your code split out in a more complex manner than necessary, if the code is meant to have the statements at the level they currently are. You can condense your SEDVBO block like this:

    SEDVBO Like "*arest*"
    Or SEDVBO Like "*bod*"
    Or SEDVBO Like "*Bot*"
        And
            (
                SEDVBO not Like "*Bode*" and 
                SEDVBO not Like "*dBod*" and
                SEDVBO not Like "*Bodg*" and
                SEDVBO not Like "*bote*" and
                SEDVBO not Like "*dbot*" and
                SEDVBO not Like "*botg*"
            )
    

    Or if you're trying to do what I think you might be, namely looking for unique combinations of Like and not Like you are missing parentheses around each or/and pairing. You'd add them like this:

    (SEDVBO Like "*arest*"
        And
        (
            SEDVBO not Like "*Bode*" and
            SEDVBO not Like "*dBod*" and
            SEDVBO not Like "*Bodg*"
        )
    )
    Or 
    (SEDVBO Like "*bod*"
        And
        (
            SEDVBO not Like "*bode*" and
            SEDVBO not Like "*dbod*" and
            SEDVBO not Like "*bodg*"
        )
    )
    Or 
    (SEDVBO Like "*bot*"
        And
        (
            SEDVBO not Like "*bote*" and
            SEDVBO not Like "*dbot*" and
            SEDVBO not Like "*botg*"
        )
    )
    

    Your first snippet would become:

    Or 
    (BOSTNVN Like "*bod*"
        And
        (
            BOSTNVN not Like "*bode*" and
            BOSTNVN not Like "*dbod*" and
            BOSTNVN not Like "*bodg*"
        )
    )