I currently have a challenge with Snowflake, where I have a PRICE column like the bellow, the goal is to "scale" this values but keep the original sum intact, like, I do not need to respect the proportions on the scale, but lowest value should continue to be the lowest and so on. Also I guess this can be a SQL problem.
I tried running a script using WIDTH_BUCKET but there is no option to maintain the total sum.
`SELECT
sale_date,
price,
WIDTH_BUCKET(price, 200000, 600000, 5) AS "SALES GROUP"
FROM home_sales
ORDER BY sale_date;`
`+------------+-----------+-------------+
| SALE_DATE | PRICE | SALES GROUP |
|------------+-----------+-------------|
| 2013-08-01 | 10 | 1 |
| 2014-02-01 | 20 | 2 |
| 2015-04-01 | 30 | 3 |
| 2016-04-01 | 10 | 1 |
| 2017-04-01 | 50 | 4 |
| 2018-04-01 | 60 | 5 |
+------------+-----------+-------------+`
The output(SALES GROUP column) I am really looking for is like this: By this example the total SUM of column price is 180.
`+------------+-----------+-------------+
| SALE_DATE | PRICE | SALES GROUP |
|------------+-----------+-------------|
| 2013-08-01 | 10 | 12 |
| 2014-02-01 | 20 | 24 |
| 2015-04-01 | 30 | 28 |
| 2016-04-01 | 10 | 12 |
| 2017-04-01 | 50 | 47 |
| 2018-04-01 | 60 | 57 |
+------------+-----------+-------------+`
If we sum the SALES GROUP column, it still has a total of 180. *OBS: I know this is not exactly scaling, but I am new to snowflake and did not find the most correct term.
How about scaling up the natural log to match the sum?
select *, (ln(price) * sum(price) over() / sum(ln(price)) over())::int as scaled_price
from t;