Channel: beyondrelational.com

Conditional aggregation - SUM Vs COUNT

You can use conditional aggregation such as SUM(CASE WHEN .. THEN 1 ELSE 0 END) to count the rows that match a particular condition. This type of expression is useful when you want to write a CROSS-TAB/PIVOT query. You can also use COUNT(CASE WHEN .. THEN 1 END). One of the developers told me that he used this type of query but the count did not seem to be correct. I asked him to show the code. His code had the following pattern:

COUNT(CASE WHEN .. THEN 1 ELSE 0 END)

I immediately pointed out to him that the above expression is equivalent to COUNT(*), because COUNT counts every value other than NULL. Here the CASE returns either 1 or 0, never NULL, so every row is counted. Either COUNT should be replaced with SUM, or the ELSE 0 should be dropped (or changed to ELSE NULL) so that non-matching rows yield NULL.
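
To make the difference concrete, here is a minimal sketch in T-SQL (SQL Server 2008 or later). The five inline rows and the alias t(gender) are made up for illustration and are not from the developer's code.

```sql
-- Hypothetical sample data: 2 rows with 'M' and 3 rows with 'F'
SELECT
    COUNT(*)                                     AS total_rows,     -- 5
    COUNT(CASE WHEN gender='M' THEN 1 ELSE 0 END) AS wrong_count,    -- 5: both 1 and 0 are non-NULL, so every row is counted
    SUM(CASE WHEN gender='M' THEN 1 ELSE 0 END)   AS male_count_sum, -- 2: only the 1s add up
    COUNT(CASE WHEN gender='M' THEN 1 END)        AS male_count_cnt  -- 2: non-matching rows become NULL and are skipped
FROM (VALUES ('M'), ('F'), ('M'), ('F'), ('F')) AS t(gender);
```

The second column shows the buggy pattern returning the same value as COUNT(*), while the last two columns give the intended count.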

But for these types of expressions I prefer to use SUM instead of COUNT. There are at least two reasons I know of:

1 SUM can be used to COUNT as well as to SUM values. See the examples below

SUM(CASE WHEN gender='M' THEN 1 ELSE 0 END) as male_count,
SUM(CASE WHEN year(trans_date)=2012 THEN trans_amount ELSE 0 END) as total_2012


The first expression COUNTs how many males are in the recordset and the second expression SUMs the values of the column named trans_amount for the year 2012.
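
As a rough sketch of how such expressions fit into a complete CROSS-TAB style query, the T-SQL below builds a small table variable and pivots totals by year for each gender. The table variable @trans, its column types, and the sample rows are assumptions made for this example only.

```sql
-- Hypothetical sample table; names and types are assumed for illustration
DECLARE @trans TABLE (
    gender       CHAR(1)       NOT NULL,
    trans_date   DATE          NOT NULL,
    trans_amount DECIMAL(10,2) NOT NULL
);

INSERT INTO @trans (gender, trans_date, trans_amount) VALUES
    ('M', '2011-07-20',  50.00),
    ('M', '2012-01-15', 100.00),
    ('F', '2012-03-02',  75.00),
    ('F', '2012-09-30',  25.00);

-- One output row per gender, one column per year
SELECT
    gender,
    SUM(CASE WHEN YEAR(trans_date)=2011 THEN 1 ELSE 0 END)            AS count_2011,
    SUM(CASE WHEN YEAR(trans_date)=2012 THEN 1 ELSE 0 END)            AS count_2012,
    SUM(CASE WHEN YEAR(trans_date)=2011 THEN trans_amount ELSE 0 END) AS total_2011,
    SUM(CASE WHEN YEAR(trans_date)=2012 THEN trans_amount ELSE 0 END) AS total_2012
FROM @trans
GROUP BY gender;
```

With this sample data the 'F' row shows 0 transactions for 2011 and 2 for 2012 (totalling 100.00), and the 'M' row shows 1 transaction in each year (50.00 and 100.00).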

COUNT can be used only for COUNTing as below

COUNT(CASE WHEN gender='M' THEN 1 END) as male_count


2 NULL warning

Using SUM will almost always avoid the NULL warning. In the first expression the derived values are always 1 or 0, so you will never get a NULL warning. In the second expression, if trans_amount is a NOT NULL column, you will never get a NULL warning; but if it is a NULLable column and some rows for the transaction year 2012 contain NULL, you will get one. The NULL warning is as below

Warning: Null value is eliminated by an aggregate or other SET operation.

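But using COUNT will most likely result in the NULL warning, because the CASE expression without ELSE returns NULL for every row that does not match. The warning is avoided only when every row has gender='M'.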
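
The following T-SQL sketch shows which expressions aggregate over NULLs, assuming the default ANSI_WARNINGS ON setting. The table variable @trans and its rows are hypothetical; trans_amount is deliberately declared NULLable and one 2012 row has a NULL amount. Wrapping the amount in COALESCE is one possible way to keep NULLs away from SUM; it is offered as an option, not as something the original text prescribes.

```sql
-- Hypothetical sample table with a NULLable amount column
DECLARE @trans TABLE (
    gender       CHAR(1)       NOT NULL,
    trans_date   DATE          NOT NULL,
    trans_amount DECIMAL(10,2) NULL
);

INSERT INTO @trans (gender, trans_date, trans_amount) VALUES
    ('M', '2012-01-15', 100.00),
    ('F', '2012-03-02', NULL),     -- NULL amount in 2012
    ('M', '2011-07-20',  50.00);

-- Every aggregated value here is non-NULL, so no NULL warning is expected
SELECT
    SUM(CASE WHEN gender='M' THEN 1 ELSE 0 END) AS male_count,
    SUM(CASE WHEN YEAR(trans_date)=2012 THEN COALESCE(trans_amount, 0) ELSE 0 END) AS total_2012
FROM @trans;

-- Both aggregates below receive a NULL from the 'F' row, so the NULL warning is expected
SELECT
    COUNT(CASE WHEN gender='M' THEN 1 END) AS male_count,   -- CASE yields NULL when gender<>'M'
    SUM(CASE WHEN YEAR(trans_date)=2012 THEN trans_amount ELSE 0 END) AS total_2012  -- NULL amount passed to SUM
FROM @trans;
```

Both statements return the same values for this data (male_count = 2, total_2012 = 100.00); only the second one passes NULLs to the aggregates.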

The NULL warning depends on the data. So use conditional aggregation wisely.

