I am starting to learn SQL Server, in the documentation found in msdn states like this
HAVING is typically used with a GROUP BY clause. When GROUP BY is not used, there is an implicit single, aggregated group.
This made me to think that we can use having without a groupBy clause, but when I am trying to make a query I am not able to use it.
I have a table like this
CREATE TABLE [dbo].[_abc]
(
[wage] [int] NULL
) ON [PRIMARY]
GO
INSERT INTO [dbo].[_abc] (wage)
VALUES (4), (8), (15), (30), (50)
GO
Now when I run this query, I get an error
select *
from [dbo].[_abc]
having sum(wage) > 5
Error:
The documentation is correct; i.e. you could run this statement:
select sum(wage) sum_of_all_wages
, count(1) count_of_all_records
from [dbo].[_abc]
having sum(wage) > 5
The reason your statement doesn't work is because of the select *
, which means select every columns' value. When there is no group by
, all records are aggregated; i.e. you only get 1 record in your result set which has to represent every record. As such, you can only* include values provided by applying aggregate functions to your columns; not the columns themselves.
* of course, you can also provide constants, so select 'x' constant, count(1) cnt from myTable
would work.
There aren't many use cases I can think of where you'd want to use having without a group by, but certainly it can be done as shown above.
NB: If you wanted all rows where the wage was greater than 5, you'd use the where
clause instead:
select *
from [dbo].[_abc]
where wage > 5
Equally, if you want the sum of all wages greater than 5 you can do this
select sum(wage) sum_of_wage_over_5
from [dbo].[_abc]
where wage > 5
Or if you wanted to compare the sum of wages over 5 with those under:
select case when wage > 5 then 1 else 0 end wage_over_five
, sum(wage) sum_of_wage
from [dbo].[_abc]
group by case when wage > 5 then 1 else 0 end
having
to use aggregate functions?No. You can run select sum(wage) from [dbo].[_abc]
. When an aggregate function is used without a group by
clause, it's as if you're grouping by a constant; i.e. select sum(wage) from [dbo].[_abc] group by 1
.
The documentation merely means that whilst normally you'd have a having
statement with a group by
statement, it's OK to exclude the group by
/ in such cases the having
statement, like the select
statement, will treat your query as if you'd specified group by 1
It's hard to think of many good use cases, since you're only getting one row back and the having
statement is a filter on that.
One use case could be that you write code to monitor your licenses for some software; if you have less users than per-user-licenses all's good / you don't want to see the result since you don't care. If you have more users you want to know about it. E.g.
declare @totalUserLicenses int = 100
select count(1) NumberOfActiveUsers
, @totalUserLicenses NumberOfLicenses
, count(1) - @totalUserLicenses NumberOfAdditionalLicensesToPurchase
from [dbo].[Users]
where enabled = 1
having count(1) > @totalUserLicenses
Yes and no. Having is a filter on your aggregated data. Select says what columns/information to bring back. As such you have to ask "what would the result look like?" i.e. Given we've had to effectively apply group by 1
to make use of the having
statement, how should SQL interpret select *
? Since your table only has one column this would translate to select wage
; but we have 5 rows, so 5 different values of wage
, and only 1 row in the result to show this.
I guess you could say "I want to return all rows if their sum is greater than 5; otherwise I don't want to return any rows". Were that your requirement it could be achieved a variety of ways; one of which would be:
select *
from [dbo].[_abc]
where exists
(
select 1
from [dbo].[_abc]
having sum(wage) > 5
)
However, we have to write the code to meet the requirement, rather than expect the code to understand our intent.
Another way to think about having
is as being a where
statement applied to a subquery. I.e. your original statement effectively reads:
select wage
from
(
select sum(wage) sum_of_wage
from [dbo].[_abc]
group by 1
) singleRowResult
where sum_of_wage > 5
That won't run because wage
is not available to the outer query; only sum_of_wage
is returned.