Skip navigation links
Articles
Debunking GROUP BY Myths
Add to My Link Library +2 Vote Up -0Vote Down
There is a popular myth about the SQL GROUP BY clause. The myth holds that 'standard SQL' requires columns referenced in the SELECT list of a query to also appear in the GROUP BY clause, unless these columns appear exclusively in an aggregated expression. MySQL is often accused of violating this standard. In this article I will attempt to debunk this myth, and to provide a more balanced view regarding MySQL's treatment of GROUP BY at the same time. To do that, I will first demonstrate that MySQL can be instructed to only accept GROUP BY clauses that include all non-aggregated expressions referred to in the SELECT list, thus making MySQL's behaviour conform more to that of other well-known rdbms-products. Second, I will show that it is very important to clearly define which version of the SQL-standard is being referred to. The two most recent versions use a rather sophisticated way of defining the required relationships between expressions appearing in the GROUP BY clause and the SELECT list. Contrary to a popular belief, these standards do not literally require that all non-aggregated SELECT list columns appear in the GROUP BY clause. Third, I will use a simple yet realistic example to illustrate in an informal manner what I believe is the intent expressed in the more recent versions of the SQL standard. Hopefully I will be able to convince you why it may even be better to not blindly include all non-aggregated columns from the SELECT list in the GROUP BY clause.


Report this link: