
I am a long-time user of SQL, but something that I never quite got to grips with was the use of the grouping sets, Cube, and rollup functionality that modern SQL database systems introduced a number of years ago.
Part of the reason for that was probably because of the type of databases I was working with and querying. They were mostly OLTP database systems and I just didn’t ever seem to need to use those operators in my day-to-day work.
Another part of me didn’t fully understand how they worked. I found the documentation around them confusing, so I eventually put them to one side on the metaphorical "nice to know but don’t need ’em " pile.
So, what’s changed? Well, nothing actually, but on a personal note, I just hate it when the systems I work with introduce functionality that I don’t fully understand or use.
I often think back to a similar situation when windowing functions first became part of SQL. I embraced them straight away, used them – a lot – , became reasonably proficient in them and was able to do things with them that would have been very difficult if not impossible with regular SQL.
So, it might have taken me over a decade to take another look at these Grouping operators closely but I wanted to get a real feel about what Grouping Sets, Rollup, and Cube do and why and how you use them.
I have always found that when it comes to SQL the best way to learn new skills is to set up a little data set sample, run queries against it and see what’s spat out at the end. That’s what we’ll do here.
But first, a little preamble and explanation of Grouping Sets, Rollup, and Cube.
GROUPING SETS, ROLLUP, and CUBE
The first thing to note is that these operators are extensions of the regular GROUP BY clause in Sql that we all know and love. They allow you to generate multiple aggregate groupings in a single query. They’re useful when you need to perform complex aggregations and/or generate subtotals or grand totals across different, multiple combinations of columns.
They are all closely related to each other and, in fact, the Rollup and Cube operators are just shorthand ways of expressing two common uses of general Grouping Sets syntax.
GROUPING SETS
syntax is used to define one or more groupings of data within a single query. It is useful when you want to aggregate data across multiple columns independently within the same query.
ROLLUP
is used to create subtotals that add up to a grand total. It’s a way of adding hierarchical grouping to the GROUP BY
clause. This means you get not only the grouped data but also aggregates at multiple levels. It’s particularly useful for generating reports that require sub and grand totalling.
CUBE
generates all possible combinations of groupings for the specified columns. It is useful for data analysis where you need to evaluate all dimensions of the data. Use it when you need a full set of aggregation to analyze the data across multiple columns simultaneously.
Setting up a test environment
Many of you will have access to a database, maybe through work or something as simple as a SQLite installation on your local system. You can follow along with my examples on either.
If you don’t have access to a DB, there are free online options that you can use too. For that, my go-to choice is Oracle’s online Live SQL website. This gives you access to a virtual Oracle DB right in your browser where you can create tables and run queries as if the DB was local to you.
You need an account to use it but it’s free to register. Access it by visiting this link:
You should see a page like this.

Once you log in/register, click on the Start Coding
button and you should get to this page.

As you can see, the interface is pretty simple. Just two boxes. One to input your SQL commands and one to show the output of those SQL commands.
Sample table creation & population
So that we can "see the woods for the trees", the data set is deliberately simple. Just 5 records in total. Trust me, this will help later on.
CREATE TABLE sales ( sales_product VARCHAR2(50), sales_quantity INT, sales_region VARCHAR2(50) );
INSERT INTO sales (sales_product, sales_region, sales_quantity) VALUES ('Laptop', 'North', 100);
INSERT INTO sales (sales_product, sales_region, sales_quantity) VALUES ('Desktop', 'South', 60);
INSERT INTO sales (sales_product, sales_region, sales_quantity) VALUES ('Laptop', 'South', 90);
INSERT INTO sales (sales_product, sales_region, sales_quantity) VALUES ('Desktop', 'East', 70);
INSERT INTO sales (sales_product, sales_region, sales_quantity) VALUES ('Tablet', 'West', 30);
SELECT * FROM sales;
SALES_PRODUCT SALES_QUANTITY SALES_REGION
-----------------------------------------
Laptop 100 North
Desktop 60 South
Laptop 90 South
Desktop 70 East
Tablet 30 West
Using Grouping sets
One of the main uses of grouping sets is when you want to see separate aggregations of two or more of your columns simultaneously in one query.
For example, let’s say we wanted to return the SUM of our sales_quantity figures by both sales_product and sales_region. Normally this would be a union all of two separate queries, each grouping by an individual column.
Here’s how we combine those using grouping sets.
SELECT sales_product, sales_region, SUM(sales_quantity) AS total_sales
FROM sales
GROUP BY GROUPING SETS ((sales_product), (sales_region));
This gives the following result set.
SALES_PRODUCT SALES_REGION TOTAL_SALES
----------------------------------------
- North 100
- East 70
- South 150
- West 30
Tablet - 30
Desktop - 130
Laptop - 190
Rollup
Let’s see what happens when we replace the grouping sets with rollup.
SELECT sales_product, sales_region, SUM(sales_quantity) AS total_sales
FROM sales
GROUP BY ROLLUP(sales_product, sales_region);
SALES_PRODUCT SALES_REGION TOTAL_SALES
----------------------------------------
Laptop North 100
Laptop South 90
Laptop - 190
Tablet West 30
Tablet - 30
Desktop East 70
Desktop South 60
Desktop - 130
- - 350
Rollup retrieves records that include the grand total, in addition to subtotals for combinations of all but the right-most column of the specified grouping columns. In other words, for this example, we assume the sales_region is fixed (null) and aggregate over each unique sales_product. Notice how the 150 total_sales figure for the South region in the grouping sets query has been split up into two entries here. One with 60 for a desktop and one with 90 for a laptop.
If you were to re-write the above query without using rollup, it would look something like this:
-- Sales by product and region
SELECT sales_product AS product, sales_region AS region, SUM(sales_quantity) AS total_sales
FROM sales
GROUP BY sales_product, sales_region
UNION ALL
-- Sales by product only (region is NULL)
SELECT sales_product AS product, NULL AS region, SUM(sales_quantity) AS total_sales
FROM sales
GROUP BY sales_product
UNION ALL
-- Grand total (both product and region are NULL)
SELECT NULL AS product, NULL AS region, SUM(sales_quantity) AS total_sales
FROM sales;
Cube
Now replace rollup with cube.
SELECT sales_product, sales_region, SUM(sales_quantity) AS total_sales
FROM sales
GROUP BY CUBE (sales_product, sales_region);
SALES_PRODUCT SALES_REGION TOTAL_SALES
----------------------------------------
- - 350
- East 70
- West 30
- North 100
- South 150
Laptop - 190
Laptop North 100
Laptop South 90
Tablet - 30
Tablet West 30
Desktop - 130
Desktop East 70
Desktop South 60
Cube is like a concatenation of the unique records of the grouping sets and rollup queries. It retrieves records for ALL possible combinations of the columns specified in the CUBE clause in addition to a grand total record.
To reinforce that there is nothing magical about these operators, here is the equivalent to the above CUBE query just using regular GROUP BY and UNION ALL clauses.
This shows that, deep down , they’re just a short-hand – albeit more performant – replacement for regular SQL queries.
SELECT sales_product, sales_region, SUM(sales_quantity) AS total_sales
FROM sales
GROUP BY sales_product, sales_region
UNION ALL
SELECT sales_product, NULL AS sales_region, SUM(sales_quantity) AS total_sales
FROM sales
GROUP BY sales_product
UNION ALL
SELECT NULL AS sales_product, sales_region, SUM(sales_quantity) AS total_sales
FROM sales
GROUP BY sales_region
UNION ALL
SELECT NULL AS sales_product, NULL AS sales_region, SUM(sales_quantity) AS total_sales
FROM sales;
Summary
I hope this has given you a bit of insight into how grouping sets, rollup, and cube work and should be used. They might not be something you use every day if you’re principally working with an OLTP type database system but I think they are invaluable if you regularly use an OLAP system or need to incorporate SQL aggregate data output into complex reports.
Sure, you can do everything that these grouping operators can do by using regular GROUP BY and UNION ALL syntax, but it will be more complex, less performant and more error prone if you do.
_OK, that’s all for me just now. I hope you found this article useful. If you did, please check out my profile page at this link. From there, you can see my other published stories and subscribe to get notified when I post new content._
I know times are tough and wallets constrained, but if you got real value from this article, please consider buying me a wee dram.
If you liked this content, I think you’ll find these articles interesting too.