Search code examples
databricksazure-databricks

Databricks Concatenation Issue


As community member called Saideep help with an coding issue I had here

I have attempted to modify the code to search for values with without www (after removing either http or https) and prepending it to a field value.

So, at the moment, my code does the following: enter image description here

You can see from the image, that code successfully removes https:// and http, but it fails to prepend the values with www in the field newwebsite_url when doesn't exists in the homepage_url field

For example movingahead.com should appear as www.movingahead.com in newwebsite_url

My code is as follows:

SELECT  tt.homepage_url
        ,concat(iff(left(v1.RightString, 4)='www.', null, 'www.')) as addwww , LEFT(v1.RightString,COALESCE(NULLIF(CHARINDEX('/',v1.RightString)-1,-1),150)) as newwebsite_url
   FROM basecrmcbreport.organizations tt
  inner join (select (SUBSTRING(homepage_url,CHARINDEX('//',homepage_url)+2,150)) from basecrmcbreport.organizations)v1(RightString) on tt.homepage_url  like concat('%',v1.RightString,'%') escape '|';

I know its a concatenation issue, but not where to fix it.

Any thoughts?


Solution

  • As you suspected, the issue here is with concatenation. When using concat in the query, we have just given the part where taking www. if it is absent and null if it is present.

    • But we are not concatenating this with the extracted part from URL (instead using it to create new column). To fix this, we have to place the above function value inside concat.

    • The following is the demonstration of the same using my sample data.

    enter image description here

    • Using the following updated query, I was able to achieve your requirement.
    %sql
    
    SELECT  tt.homepage_url
            ,concat(iff(left(v1.RightString, 4)='www.', '', 'www.'),LEFT(v1.RightString,COALESCE(NULLIF(CHARINDEX('/',replace(v1.RightString,'\\','/'))-1,-1),150)))  as newwebsite_url
       FROM demo tt
      inner join (select (SUBSTRING(homepage_url,CHARINDEX('//',homepage_url)+2,150)) from demo)v1(RightString) on tt.homepage_url  like concat('%',v1.RightString,'%') escape '|';
    

    enter image description here

    NOTE: I have also replaced \ with /. Also, I have used '' (empty string) instead of using null in concat.