Thursday, April 21, 2011

Is a sequence of condition checks in WHERE clause predefined?

Example:

SELECT *
FROM [User]
WHERE [Value1] > 0
AND [Value2] / [Value1] > 3

I would want to avoid a division-by-zero situation. Can I be sure that the first condition be checked first and the second one will never get hit if the first one is false? I suppose the query planner is allowed to rearrange the sequence if it will yield more performance.

From stackoverflow
  • It seems you are looking for a conditional statement, an IF.
    http://msdn.microsoft.com/en-us/library/ms182717(SQL.90).aspx

  • Generally no. The optimiser works out the best way to execute the query.

    SQL does not produce explicit code: it's simply an instruction to the optimiser to return data in certain formats, orders, filtered etc.

    SELECT *
    FROM [User]
    WHERE [Value1] > 0
    AND [Value1] / [Value1] > 3
    
    --could be this
    SELECT *
    FROM
        (SELECT *
        FROM [User]
        WHERE [Value1] > 0
    ) foo
    WHERE [Value1] / [Value1] > 3
    
    --but to be safe
    SELECT *
    FROM
        (SELECT TOP 2000000000 *
        FROM [User]
        WHERE [Value1] > 0
        ORDER BY Value1
    ) foo
    WHERE [Value1] / [Value1] > 3
    
    User : Why is #2 not safe?
    gbn : The optimiser may collapse it into one WHERE clause. The 3rd form arbritrarily tells the optimiser to sort before applying the outer WHERE clause
  • SELECT  *
    FROM    [User]
    WHERE   CASE WHEN Value1 = 0 THEN 0 ELSE [Value2] / [Value1]  END > 3
    

    , or even better:

    SELECT  *
    FROM    [User]
    WHERE   [Value2] / NULLIF([Value1], 0) > 3
    

    Division by NULL is always safe.

    Note than if you have lots of [Value1] = 0 in your table and an index on [Value1], then the following query:

    SELECT  *
    FROM    [User]
    WHERE   [Value1] > 0
            AND [Value2] / NULLIF([Value1], 0) > 3
    

    , will use the index and just skip the wrong values.

    This, however, should not be relied upon and NULLIF should still be used. It's almost free, anyway.

  • Your original SQL code will work perfectly :)

    User : It actually works. But is it safe?
  • Well for starters...

    Value1 / Value1 == 1 //always...
    

    but assuming its a different column.

    SELECT top 20 * from [User]
    WHERE
    CASE WHEN Value2 > 0
        THEN convert(decimal(18,2), Value1) / convert(decimal(18,2), Value2) 
        ELSE 0
    END > 3
    
  • You can not count on the server processing your conditions in the listed order. Generally the optimizer will run them in whichever order it thinks is faster. Normally what you have listed should be safe, but I've seen edge cases where the optimizer gets confused.

    One good way to make the optimizer thing the [Value1] > 0 condition will be faster is to have an index on Value1 but not on Value2.

0 comments:

Post a Comment