[SQL Basic] How many movies have the same rating?— SQL GROUP BY/HAVING
#11. Marketing/Data Analysis for Beginners
In the last couple of tutorials, we’ve been dealing with SQL queries that handle numeric data. Now we’re going to learn how to group together that data in a column and how to organize it.
⭐️ We will start a more condensed, short glossary from here onward!
#Glossary:
📌Lesson 3: SELECT / FROM / WHERE
📌Lesson 4: ORDER BY
📌Lesson 5: AND / OR / IN
📌Lesson 6: LIKE
📌Lesson 7: AS /DISTINCT / IS NULL
📌Lesson 9: COUNT / SUM
📌Lesson 10: AVG / MAX / MIN
🆕 🌟
📌GROUP BY: group results by a specified column
📌HAVING: narrowing down/specifying the location further
#Table of Contents
Query 1. Find how many customers are active and organize them by the originating store (#GROUP BY)
SELECT store_id, count(active) FROM customer WHERE active=1 GROUP BY store_id;
Query 2. Find the quantity of movies for every rating and rename the new column to ‘number’. Sort the results in descending order and filter out anything less than 180. (#GROUP BY #HAVING)
SELECT rating, COUNT(rating) AS 'number' FROM film GROUP BY rating HAVING number>180 ORDER BY number DESC;
#GROUP BY
Q1. Find how many customers are active and organize them by the originating store
GROUP BY (column name you want to group by)
Let’s check out the customer table first. Run this query:
SELECT * FROM `sakila`.`customer`;
Here we can see the customer ID, store ID, the customer’s full name, email, and address ID. Let’s focus on the store ID and active columns for now. If we want to know how many customers are currently active at our imaginary film store, we can try running this query:
SELECT COUNT(active) FROM customer;
But this doesn’t give us very clear information. We still don’t know how many are active, since this query only shows how many customers have information in the ‘active’ column. It doesn’t specify whether they are active (specified with a 1) or not active (specified with a 0). So let’s filter this information by adding WHERE active=1
!
SELECT COUNT(active) FROM customer WHERE active=1;
Okay, now we know we have 584 active customers. But there is one more factor we don’t know yet… which store has how many active customers? We need to plug in the store ID into our query.. but how?
By using the GROUP BY query, we can get a clear and organized look at this information. First, we need to add store_id to our SELECT
query in order to view that column. Next, we want to specify with the GROUP BY
query how we want the results to be organized. Here’s how to plug it in:
SELECT store_id, COUNT(active) FROM customer WHERE active=1 GROUP BY store_id;
Now we can clearly see that store #1 has 318 active customers,
and store #2 has 266 active customers!
#HAVING
Q2. Find the quantity of movies for every rating and rename the new column to ‘number’. Sort the results in descending order and filter out anything less than 180.
GROUP BY (column name you want to group by) having (specified)
Going back to the film table, let’s check some movie ratings.
To view the ratings column, we need to specify it under the SELECT
query. We also want to count how many ratings there are, so let’s add the COUNT
query. Let’s also use what he just learned and throw in the GROUP BY
query :
SELECT rating, COUNT(rating) FROM film GROUP BY rating;
Here are the 5 distinct movie ratings and how many we have of each. The title of the count column isn’t as pretty and neat as rating, so let’s fix that with the AS
query. Just add it after the COUNT
query and you’ll get this:
SELECT rating, COUNT(rating) AS'number' FROM film GROUP BY rating;
Let’s say we want to only see movies that we have MORE than 180 copies of — how can we filter that information? We’ve tried a similar query before with the WHERE
query in lesson 5, but this time we’re going to use the HAVING
query.
SELECT rating, COUNT(rating) AS'number' FROM film GROUP BY rating HAVING number>180 ORDER BY number DESC;
🔑 Why do we use HAVING instead of WHERE?
In SQL, the WHERE
query is processed before the GROUP BY
query. That means the information wouldn’t be divided by rating when the WHERE
query searches. But the HAVING
query gets processed after GROUP BY
, so it can be used to constrain results by the group we specified.
Just remember, the WHERE
query applies to all rows in the result set.
The HAVING
query is applied to the groups created by a GROUP BY
query.
🔑 Comparing DISTINCT and GROUP BY
Try running both these queries in SQLGate. What’s the result?
SELECT DISTINCT replacement_cost FROM film;
SELECT replacement_cost FROM film GROUP BY replacement_cost;
As you can see, the two queries are very similar. They will give you the same results in this case. The only difference is the order.
⭐️But for future use, know that GROUP BY
allows you to use aggregate functions ( AVG
, MAX
, MIN
, SUM
, and COUNT
) whereas DISTINCT
is used to remove duplicates.
#Practice Time
Your challenge today is to find the number of movies in the film table. You must group them by price, and rename the column as ‘number’. Ready? Go!
🔑 The only information we want here is the price, so plug in rental_rate into your SELECT
query, and use COUNT
to find the number you want. Do you remember what query we need to rename the new column created by COUNT
?
👍Take a moment to try it on your own.
🔑 Use GROUP BY
! We don’t need DISTINCT
since we’re working with COUNT
.
👍Take a moment to try it on your own.
Alright, here’s the answer:
SELECT rental_rate, COUNT(rental_rate) AS number FROM film GROUP BY rental_rate;
Good job! Here’s a table of all our movies separated by price. Need some visuals? Follow our video tutorial linked below:
# Tutorial Video
# [SQL Basic Series] — Marketing/Data Analysis for Beginners
1. What is SQL? — Database Introduction
2. Downloading SQLGate and Connecting to a Database
3. What kind of data is in the ‘film’ table? — SELECT/FROM/WHERE
4. How to sort film lists by price (rental rate)? — ORDER BY
5. Can I rent a 3 hour movie for $0.99? — AND/OR/IN
6. How can I find a movie that’s thrilling? — LIKE
7. How to rename columns and see unique values? — AS/DISTINCT/IS NULL
8. Let’s see Sample Database tables!
9. What is the total length of all the films added together? — COUNT/SUM
10. What is the average film length? — AVG/MIN/MAX
11. How many movies have the same rating? — GROUP BY/HAVING
12. Let’s designate a value based on the rental duration. — CASE
13. Let’s find out the movie genres by joining tables! — INNER JOIN
14. Let’s check movie rentals by the inventory list! — LEFT/RIGHT JOIN
15. Let’s find customers that have rented movies priced $9.99! — Subquery
16. How to work with Date Functions in SQL — DATE_FORMAT, DATEDIFF
17. How to work with String Functions in SQL — CONCAT, LENGTH, SUBSTR
18. Let’s use some Numeric SQL functions — TRUNCATE, ROUND, MOD
19. Let’s learn some advanced CASE queries!
20. Let’s learn some advanced JOIN queries!