Friday, May 6, 2011

How do I create an SQL query that groups by value ranges

I would like to create a sql query that reports the percentage of results in a particular range. for instance

20% of the values between 10 to 20

40% of the values between 20 to 32.5

Server - MSSQL

From stackoverflow
  • GROUP BY CASE 
             WHEN VALUE >= 10 AND VALUE <= 20 THEN '20%'
             WHEN VALUE > 20 AND VALUE <= 32.5 THEN '40%' ELSE '0' END
    

    You need to cover all possible values, hence the ELSE 0. You'll probably want to do something a little different there, but it should give you a start.


    Based on Joel Gauvreau's comment:

    SUM(CASE WHEN VALUE >=10 AND VALUE <= 20 THEN 1.0 ELSE 0.0 END) / COUNT(*),
    SUM(CASE WHEN VALUE > 20 AND VALUE <= 32.5 THEN 1.0 ELSE 0.0 END) / COUNT(*)
    

    Or at the end of the query use the COMPUTE statement.

    Joel Gauvreau : I Think he meant the distribution of the values by range. 200 records where value is in the range between 10 and 20 would results what percentage of the total number of records...
    Joel Coehoorn : Ah, thanks. Updated my answer.
    Andomar : +1 best idea, but needs a cast from integer to float, and an extra ) for the sum on the second line
  • This will get you the count per range, you can easily determine the percentage from there:

    declare @ranges table (beginInclusive float, endExclusive float)
    insert @ranges (beginInclusive, endExclusive)
        select 10, 20
        union all select 20, 32.5
    
    select
        r.beginInclusive,
        r.endExclusive,
        count(*)
    from t join @ranges on t.RangedValue >= r.beginInclusive and t.RangedValue < r.endExclusive
    group by 
        r.beginInclusive,
        r.endExclusive
    
  • I would usually use a subquery and get rangecounts and join in the total to get the percentage. Something like:

    SELECT 
      RangeCount/CNT as Percentage,
      Range
    FROM 
    (
    SELECT
      Count(*) AS CNT
    FROM
      SomeTable
    ) AS Total
    LEFT JOIN 
    (
    SELECT
      CASE Val <= 10 then
           '0 up to 10'
      ELSE 
           CASE when Val <= 20
             '11 to 20'
           ELSE 
            '> 20'
           END
        END
      END AS Range,
      COUNT(*) AS RangeCount
    FROM 
       SomeTable
    GROUP BY
       Range
    ) AS RangeTotals
    
  • SELECT B.Description, Total = COUNT(*) / CONVERT(money, (SELECT COUNT(*) FROM Target T2))
    FROM Target T
    JOIN (
        SELECT  Description = '0 to 10', LBound = 0, UBound = 10 
        UNION ALL 
        SELECT Description = '10 to 20', LBound = 10, UBound = 20
    ) B ON T.Value >= LBound AND T.Value < B.UBound
    GROUP BY B.Description
    
  • Declare @1 as int
    Declare @2 as int
    Declare @TotalRows as int
    
    set @1 = (Select COUNT(id) FROM dbo.Table_1 WHERE id >= 10 and id <= 20)
    set @2 = (Select COUNT(id) FROM dbo.Table_1 WHERE id > 20 AND id <= 32.5);
    set @TotalRows = (Select Count(id) from dbo.Table_1);
    
    SELECT CAST(((@1 * 100)/@TotalRows) as nvarchar(32)) + '%', CAST(((@2 * 100)/@TotalRows) as nvarchar(32)) + '%';
    

    Little complicated, but that does work... i suppose...

    dbo.Table_1 only has 1 column, 'id' and it is of type int.

  • If it's something that you will be doing regularly, then you can create a table with the ranges and what constitutes those ranges. If not, you can set them up in a table variable or temporary table and join to that. It's basically JohnOpincar's solution, but with a table instead of a subquery.

    Also, in your example you list "10 to 20" and "20 to 32.5". Where is a row counted if it's exactly 20? You should probably make sure that your requirements are clear on that point before you deliver the final solution.

  • Joel's answer seems the best way to me. Posting to explain the query, and because the answer has an integer division sum/count which will return 1 or 0 instead of a percentage.

    For the 20 -> 32.5 range:

    select CAST(SUM(
        CASE WHEN 20 < field AND field <= 32.5 THEN 1 ELSE 0 END
        ) as float) / COUNT(*) * 100.0
    from table
    

    The case returns 1 when the value is in range. Because there's no group by clause, the sum adds the result of the case for every row in the table. Convert to float, divide by the number of rows =count(*), and you get the percentage.

    You can also write it like:

    select SUM(
        CASE WHEN 20 < field AND field <= 32.5 THEN 1.0 ELSE 0.0 END
        ) / COUNT(*) * 100
    from table
    

    Here the CASE will result in a float 1.0 instead of the integer 1.

0 comments:

Post a Comment