From the course: SQL for Non-Programmers

Getting started with aggregate functions - SQL Tutorial

From the course: SQL for Non-Programmers

Getting started with aggregate functions

In many cases, we do not want to extract data row by row from a data table. Sometimes we need a summary of our data, for example, taking the sum or average of values. Aggregate functions in SQL enable you to perform calculations on groups of rows or an entire table, without having to retrieve all of the data. These functions can help you analyze large data sets with ease, simplifying the process of data analysis and allowing you to quickly summarize your data to uncover meaningful insights. The most commonly used aggregate functions are count, sum, average, min, and max. Aggregate functions follow a certain syntax. The aggregate function should be placed before the column name of the column that we want to perform the function on, and the column name should be enclosed in parentheses. It's actually quite intuitive. Let's try it out together with some examples. Starting with the count function. Count is used to count the number of rows in a table, or the number of rows that meet a certain condition. To see this in action, let's move back over to SLS and continue to explore the Two tTees Olive Oil sales data. Our management team is back with new request, and I have a feeling aggregate functions can help us answer their questions. Let's start by typing in our SQL editor. SELECT COUNT, open parentheses, asterisks, close parentheses, FROM oliveoil.sales. And let's execute this. This returns the number 5,130. Since we did not specify any column name within the COUNT function, and instead, use the asterisk symbol to return all rows and columns, this is confirming that there are 5,130 total rows in our sales table. Each row in our table is an order, so we now know there were 5,130 unique orders made. Our management team would like to know how many orders have been made for gift baskets. Using the same query, let's go to the next section and type, SELECT COUNT * FROM oliveoil.sales, we can add a WHERE filter to select only where product category equals gift basket. So I'll put that on the next line. WHERE product category equals gift basket. When we execute the query, we see that 792 orders were made for gift baskets. We can also specify a particular column to count the values within it. Management has asked us how many customers we have. This seems easy enough. Let's type into the next section. SELECT COUNT customer ID FROM oliveoil.sales. And let's execute this. This returns 5,130, the same number of rows in our data table. That seems odd because I know that some customers are making multiple orders. Keep in mind that count is used to count the number of rows. Every row in our table has a customer ID, many of them appearing multiple times throughout because the same customer often orders more than once. What management wants to know is the unique count of customers, so we can place distinct within the parentheses and before the column name. And if we execute this, now the output makes more sense, 2,417 unique customers. We are ready to report back to management. You will notice, though, that the name of the column in our output is quite long. It says, count distinct customer ID, and it's not the most intuitive for a column name. We can use the as keyword to rename a column or even a table using an alias. This is incredibly helpful in making our output more readable. As should be placed after your column name, followed by an alias, the new name that you want to give to the column. Just be sure not to use spaces in your alias, and follow the best practice of using lowercase letters with words separated by an underscore. Using the same query as before, we can add, AS customer count right after our count function. And if we execute this, it will return the same results as before, 2,417, but with the column renamed to customer count, which is more readable. The as keyword doesn't have to be used only with aggregate functions. It can be used to rename any column with or without a function being used. For example, we can rename customer ID as client ID. Let's type into the next section. SELECT customer ID as client ID FROM oliveoil.sales. When we execute this, it returns the same values as if we were retrieving customer ID. The only difference is the name of the column in our output now appears as client ID. The as keyword is great for you to know so you can produce user friendly output. This is incredibly helpful as your data queries become more complex. The use of count and the as keyword are a great place to start with aggregate functions, but we can use them to do so much more. I've just gotten a word that management would like a summary report that shows the total sales and the average order value of our olive oil business. Let's see how we can use aggregate functions to solve this.

Contents