Saturday, February 25, 2006

 

BUG: Combination of GROUP BY with HAVING Clause and LEFT OUTER JOIN with Derived Table With LEFT Function Produces Incorrect Result


Applies To : SQL Server 2000 RTM to SP4

This bug is a variant of KB 308458 which is fixed in SP2.

Symptoms

If all of the following conditions are true for a query, the query will return extra rows.

• You include GROUP BY and HAVING clauses.
• You use a LEFT OUTER JOIN.
• You incorporate derived tables.
• You use an aggregate.
• You use the LEFT function.

Cause
The optimizer inappropriately processes a cross join.

Steps to Reproduce

Execute the following query in Query Analyzer

Use Pubs
GO

SELECT T.pub_id, T.type, SUM(T.price) AS BasePrice
FROM Titles T
LEFT JOIN
(Select 1) X (pub_id)
ON X.pub_id = T.pub_id
WHERE LEFT(T.title,3) = 'The'
GROUP BY T.pub_id, T.type
--HAVING SUM(T.price) > 0


You will get the following results






pub_idtypeBasePrice
1389business19.9900
0877mod_cook 2.9900
0877UNDECIDEDNULL


Now try executing the above query by uncommenting the the HAVING caluse to exclude the NULL value.

Use Pubs
GO

SELECT T.pub_id, T.type, SUM(T.price) AS BasePrice
FROM Titles T
LEFT JOIN
(Select 1) X (pub_id)
ON X.pub_id = T.pub_id
WHERE LEFT(T.title,3) = 'The'
GROUP BY T.pub_id, T.type
HAVING SUM(T.price) > 0

you will get the following incorrect result.

pub_idtypeBasePrice

1389
business19.9900

0877
mod_cook19.9900

1389
business2.9900

0877
mod_cook2.9900

1389
businessNULL

0877
mod_cookNULL


If you look at the execution plan, you can see that the query performs an incorrect cross join.

|--Nested Loops(Inner Join)
..... |--Compute Scalar(DEFINE:([Expr1005]=If ([Expr1018]=0) then NULL else [Expr1019]))
..... |....|--Stream Aggregate(GROUP BY:([T].[type], [T].[pub_id]) DEFINE:([Expr1018]=COUNT_BIG([T].[price]), [Expr1019]=SUM([T].[price])))
......|........|--Sort(ORDER BY:([T].[type] ASC, [T].[pub_id] ASC))
......|..............|--Clustered Index Scan(OBJECT:([pubs].[dbo].[titles].[UPKCL_titleidind] AS [T]), WHERE:(substring([T].[title], 1, 3)='The'))
......|--Table Spool
...........|--Filter(WHERE:([Expr1002]>0.00))
................|--Compute Scalar(DEFINE:([Expr1002]=If ([Expr1020]=0) then NULL else [Expr1021]))
.....................|--Stream Aggregate(GROUP BY:([T].[type], [T].[pub_id]) DEFINE:([Expr1020]=COUNT_BIG([T].[price]), [Expr1021]=SUM([T].[price])))
..........................|--Sort(ORDER BY:([T].[type] ASC, [T].[pub_id] ASC))
...............................|--Clustered Index Scan(OBJECT:([pubs].[dbo].[titles].[UPKCL_titleidind] AS [T]), WHERE:(substring([T].[title], 1, 3)='The'))


Workaround
You can workaround the problem by just replacing the LEFT function with SUBSTRING function.

SELECT T.pub_id, T.type, SUM(T.price) AS BasePrice
FROM Titles T
LEFT JOIN
(Select 1) X (pub_id)
ON X.pub_id = T.pub_id
WHERE SUBSTRING(T.title,1,3) = 'The'
GROUP BY T.pub_id, T.type
HAVING SUM(T.price) > 0


Note
This bug is fixed in SQL server 2005





Comments: Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?