Search code examples
sqlhivehiveql

Exclude Minimum 6 Digits and Replace Trailing digits In Hive


Can someone help me to write below logic in hive. I have one value in column in which I have digits with trailing 0's. I need to Replace all these 0's by 9,while replacing the 0's by 9 ,I need to also consider that before 9 minimun 6 digits should be there, else need to exclude some 0's so that before 9 can have at-least 6 digits.PFB some Scenarios.

  1. 1234506600000000000

Here we can see the number of digits before trailing 0's is 8 (12345066) so i Just need to remaining 0's by 9 and the output will be like. 1234506699999999999.

  1. 1234500000000000000

Here I have only 5 digits before trailing 0's so I need to consider 6th Position's 0 also a digit and need to exclude this while replacing the 0's by 9 so the output will be 1234509999999999999.

  1. 1000000000000000000

Here I have only 1 digit before trailing 0's ,so I need to exclude 5 extra 0's and need to replace remaining 0's by 9, so final output will be like 1000009999999999999.

 Input                        Output

1234506600000000000        1234506699999999999
1234500000000000000        1234509999999999999
1000000000000000000        1000009999999999999

Solution

  • If you want to modify leftjoin's technique from the other question we can tweak the Regex to match at least 6 digits including 0s

    with mytable as (
    select '1234560000000' as input union all
    select '123450000000' union all
    select '12340000000' union all
    select '1230000000'   
    )
    
    select lpad(concat(splitted[0], translate(splitted[1],'0','9')),13,0)
    from
    (
    select split(regexp_replace(input,'(\\d{6,}?)(0+)$','$1|$2'),'\\|') splitted
      from mytable
    )s
    

    If you want to go the replace/pad/replace route I proposed, you'd check the length of the number after it's rtrim'd and if it's less than 6, rpad it out to 6 with zeroes. Most implementations of rpad would chop the string off at 6 chars if it were longer than 6 - if they didn't it would be nice and simple to just call rpad after rtrim. It might be worth making your own rpad function that leaves strings longer than N alone, if hive's rpad performs a substring op