I would like to add a column with computed values to my MSSQL database, but I don't know how to create the SQL code
My data contains the columns PricePerUnit
and Instance_Type
I would like the new computed value column to show what percentage cheaper each Instance_Type is versus the most expensive of that same Instance_Type. For example the most expensive c5.12xlarge is on the first row (London) and therefore is 0% cheaper, but the same c5.12xlarge is Ireland is cheaper by 4.95%, and in Oregon that identical Instance_Type of c5.12xlarge is 15.84% cheaper. I would like the computed value column in SQL to show 0% and 4.95%. 15.84% and so on.
In Excel I would use the following
formula =(MAXIFS(A:A,B:B,B2)-A2)/MAXIFS(A:A,B:B,B2)
The database table is called AmazonEC2
Here is an image of it working in Excel. The first blue table is identical to the data in the SQL database, the black table represents what I want to achieve in SQL.
I don't think it would be good (it may not even be possible) to do this as a computed column.
From the comments it seems the data type of PricePerUnit is nvarchar(). If that is the case I must point out that is poor database design. I understand that may be beyond your control to change.
Anyway, I created an AmazonEC2 table as I think you may have it. Using your spreadsheet of data I created insert statements to populate that table with your data using the following formula.
=CONCATENATE("insert into AmazonEC2 (PricePerUnit, Instance_Type, Instance_Family, Location) values ('", A2, "', ", "'", B2, "', '", C2, "', '", D2, "')")
I built all of that into a dbfiddle so you can see it in action and so that other people here can manipulate the data and try different approaches.
Here is final SQL statement to retrieve your data and calculate the PercentCheaper at the time of retrieval. You could also create a view based on this SQL statement.
SELECT
x.PricePerUnit
, x.Instance_Type
, x.Instance_Family
, x.Location
, FORMAT((x.MaxPricePerUnit - convert(decimal(10, 2), PricePerUnit)) / x.MaxPricePerUnit, 'P') AS PercentCheaper
FROM (
SELECT
PricePerUnit
, Instance_Type
, Instance_Family
, Location
, MAX(convert(decimal(10, 2), PricePerUnit)) OVER (PARTITION BY Instance_Type) AS MaxPricePerUnit
FROM AmazonEC2
) x;
What we are doing here is getting the maximum PricePerUnit for each Instance_Type in the subquery which I have aliased as "x". Then I select from that and perform the calculation to find the PercentCheaper for each row.
Since it seems PricePerUnit is an nvarchar() column you need to convert it to a number in order to do the calculations. Note that you do not need to convert MaxPricePerUnit because the conversion happened before it was used as an input to the MAX() function resulting in an output with a decimal(10,2) data type.