Search code examples
sqlstringfull-text-searchperiscope

Searching for specific words in a text string - SQL


I wondered if someone might be able to help me. I know that I am doing something wrong and over-complicating this but i'm not sure what.

I would like to pick out text from questions 205,227 and 278 when someone uses the words Autism, ASD, Aspergers etc... to answer the question.

What is happening is I am only getting single word phrases, rather than when someone writes the word in a string e.g. 'I have autism' will not appear in the output. Only when someone writes Autism. I have noticed it is case sensitive too and I wondered whether this can be coded so that I don't have to write all words in every combination of upper and lower case.

The main things I need are the specific questions (question.id) and to be able to search for the occurrence of specific words and phrases within the responses of these questions (texter_survey_response_value.value).

    select 
    texter_survey_response.response_id, question_id, conversation_id, 
    texter_survey_response_value.value as scrubbed_feedback
    from
    texter_survey_response
    join texter_survey_response_value on texter_survey_response.response_id = 
    texter_survey_response_value.response_id
    where
    texter_survey_response_value.question_id IN (205, 227, 278)
    and texter_survey_response_value.value = 'Autism'
    or texter_survey_response_value.value = 'ASD'
    or texter_survey_response_value.value = 'Autistic'
    or texter_survey_response_value.value = 'Aspergers'
    or texter_survey_response_value.value = 'Autism Spectrum'
    or texter_survey_response_value.value = 'Autistic Spectrum Disorder'
    or texter_survey_response_value.value = 'Autistic Spectrum'
    or texter_survey_response_value.value = 'Autism Spectrum Disorder'
    and texter_survey_response_value.value is not null

Solution

  • If you use =, it will only search for exact matches. You can achieve a contains-search with like by adding % to both sides of your search phrases. You can work around the case-sensitive nature of most like implementations by only working with upper or lower case.

    select 
      texter_survey_response.response_id, question_id, conversation_id, 
      texter_survey_response_value.value as scrubbed_feedback
    from
      texter_survey_response join texter_survey_response_value
        on texter_survey_response.response_id = texter_survey_response_value.response_id
    where
      texter_survey_response_value.question_id IN (205, 227, 278)
      and texter_survey_response_value.value is not null
      and (
        UPPER(texter_survey_response_value.value) like '%AUTISM%' or
        UPPER(texter_survey_response_value.value) like '%AUTISTIC%' or
        UPPER(texter_survey_response_value.value) like '%ASPERGERS%' or
        UPPER(texter_survey_response_value.value) like '%ASD%');
    

    You didn't write what database you use, so you might have to call a different procedure than UPPER.

    Hope this helps :)