Problem Statement
Table: Users
Each row of this table contains the ID and the name of one user.
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| user_name | varchar |
+---------------+---------+
user_id is the primary key for this table.
Table: Engagement
Each row of this table records the daily engagement score of a user.
+---------------+------+
| Column Name | Type |
+---------------+------+
| user_id | int |
| engagement | int |
| date | date |
+---------------+------+
(user_id, date) is the primary key for this table.
Write a solution to find the engagement level of each user for February 2020.
The engagement levels are defined as:
- Low if the average engagement is less than 20,
- Medium if the average engagement is between 20 and 60 (inclusive), and
- High if the average engagement is greater than 60.
Return the result table in any order.
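The boundary cases in the definition above are easy to get wrong, so here is a minimal sketch of the same rule as a plain function. The name `engagement_level` is hypothetical and not part of the problem. It only illustrates that 20 and 60 both fall in the Medium band.

```python
def engagement_level(avg_engagement: float) -> str:
    """Map an average engagement score to Low / Medium / High."""
    if avg_engagement < 20:
        return "Low"
    if avg_engagement <= 60:  # reached only when avg_engagement >= 20
        return "Medium"
    return "High"


# Averages taken from the example data, plus the two boundary values.
for avg in (17.5, 20, 35, 60, 67.5):
    print(avg, engagement_level(avg))
```

Because the checks run in order, the second branch does not need an explicit lower bound.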
Example
Input:
Users table:
+---------+-----------+
| user_id | user_name |
+---------+-----------+
| 1 | Alice |
| 2 | Bob |
| 3 | Charlie |
| 4 | David |
| 5 | Eve |
+---------+-----------+
Engagement table:
+---------+------------+------------+
| user_id | engagement | date |
+---------+------------+------------+
| 1 | 10 | 2020-02-01 |
| 1 | 25 | 2020-02-02 |
| 2 | 30 | 2020-02-01 |
| 2 | 40 | 2020-02-03 |
| 3 | 65 | 2020-02-05 |
| 3 | 70 | 2020-02-06 |
| 4 | 23 | 2020-02-07 |
| 4 | 24 | 2020-02-08 |
| 5 | 55 | 2020-02-09 |
| 5 | 60 | 2020-02-10 |
+---------+------------+------------+
Output:
+-----------+----------------+
| user_name | engagement_level |
+-----------+----------------+
| Alice | Low |
| Bob | Medium |
| Charlie | High |
| David | Medium |
| Eve | Medium |
+-----------+----------------+
Solution
To determine the engagement level of each user for February 2020, we need to calculate the average engagement score for each user during that month and categorize them based on the defined thresholds. The process involves joining the Users and Engagement tables, filtering the relevant dates, computing the average engagement, and assigning the appropriate engagement level.
- Join Users and Engagement Tables: Combine user information with their corresponding engagement records.
- Filter Engagement Records for February 2020: Select only the engagement data that falls within February 2020.
- Calculate Average Engagement: Compute the average engagement score for each user during the specified period.
- Assign Engagement Levels: Categorize users as Low, Medium, or High based on their average engagement scores.
SQL Query
SELECT
user_name,
CASE
WHEN AVG(engagement) < 20 THEN 'Low'
WHEN AVG(engagement) <= 60 THEN 'Medium'
ELSE 'High'
END AS engagement_level
FROM Users
JOIN Engagement ON Users.user_id = Engagement.user_id
AND YEAR(Engagement.date) = 2020 AND MONTH(Engagement.date) = 2
GROUP BY user_name;
Step-by-Step Approach
Step 1: Join Users and Engagement Tables for February 2020
Combine user information with their engagement records that occurred in February 2020.
SQL Query:
SELECT
Users.user_name,
Engagement.engagement
FROM Users
JOIN Engagement ON Users.user_id = Engagement.user_id
AND YEAR(Engagement.date) = 2020
AND MONTH(Engagement.date) = 2;
Explanation:
- SELECT Users.user_name, Engagement.engagement: Retrieves the user’s name and their engagement score.
- FROM Users JOIN Engagement ON ...: Performs an inner join between the Users and Engagement tables based on user_id. Because the join is an inner join, users with no matching engagement rows do not appear in the result.
- AND YEAR(Engagement.date) = 2020 AND MONTH(Engagement.date) = 2: Filters the engagement records to include only those from February 2020.
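As a side note on the date filter: YEAR() and MONTH() are not available in every SQL dialect. A half-open date range expresses the same condition more portably, and comparing the bare column may let the engine use an index on date, although that depends on the database and schema. The sketch below is an assumption-laden illustration, not the solution's method. It runs on SQLite through Python's sqlite3 module, stores dates as ISO text, and adds one hypothetical March row to show that it is excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE Engagement (
        user_id INTEGER, engagement INTEGER, date TEXT,
        PRIMARY KEY (user_id, date)
    );
    INSERT INTO Users VALUES (1, 'Alice'), (2, 'Bob');
    -- The last row is hypothetical: it falls in March and must be filtered out.
    INSERT INTO Engagement VALUES
        (1, 10, '2020-02-01'),
        (1, 25, '2020-02-02'),
        (2, 30, '2020-02-01'),
        (2, 99, '2020-03-01');
""")

# Half-open range: from 1 February (inclusive) to 1 March (exclusive).
rows = conn.execute("""
    SELECT Users.user_name, Engagement.engagement
    FROM Users
    JOIN Engagement ON Users.user_id = Engagement.user_id
    WHERE Engagement.date >= '2020-02-01'
      AND Engagement.date <  '2020-03-01'
    ORDER BY Users.user_name, Engagement.date
""").fetchall()

print(rows)
```

The exclusive upper bound avoids having to know how many days February has in a given year.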
Output After Step 1:
+-----------+------------+
| user_name | engagement |
+-----------+------------+
| Alice | 10 |
| Alice | 25 |
| Bob | 30 |
| Bob | 40 |
| Charlie | 65 |
| Charlie | 70 |
| David | 23 |
| David | 24 |
| Eve | 55 |
| Eve | 60 |
+-----------+------------+
Step 2: Calculate Average Engagement for Each User
Compute the average engagement score for each user during February 2020.
SQL Query:
SELECT
user_name,
AVG(engagement) AS avg_engagement
FROM Users
JOIN Engagement ON Users.user_id = Engagement.user_id
AND YEAR(Engagement.date) = 2020
AND MONTH(Engagement.date) = 2
GROUP BY user_name;
Explanation:
- SELECT user_name, AVG(engagement) AS avg_engagement: Selects each user’s name and calculates their average engagement score.
- GROUP BY user_name: Groups the rows by user_name so that the average is computed separately for each user.
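One caveat about grouping by user_name: only user_id is declared as a key, so two different users could in principle share a name, and their rows would then be averaged together. The example data has no such collision, so the query is correct for it. The sketch below uses hypothetical data, with SQLite through Python's sqlite3 module as a stand-in engine, to show how adding user_id to the GROUP BY keeps such users apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE Engagement (
        user_id INTEGER, engagement INTEGER, date TEXT,
        PRIMARY KEY (user_id, date)
    );
    -- Hypothetical data: users 1 and 2 are both named 'Alice'.
    INSERT INTO Users VALUES (1, 'Alice'), (2, 'Alice');
    INSERT INTO Engagement VALUES
        (1, 10, '2020-02-01'),
        (2, 90, '2020-02-01');
""")

FEBRUARY = "e.date >= '2020-02-01' AND e.date < '2020-03-01'"

# Grouping by name alone merges the two users into one row.
by_name = conn.execute(f"""
    SELECT u.user_name, AVG(e.engagement)
    FROM Users u JOIN Engagement e ON u.user_id = e.user_id
    WHERE {FEBRUARY}
    GROUP BY u.user_name
""").fetchall()

# Grouping by the key as well keeps one row per user.
by_id = conn.execute(f"""
    SELECT u.user_name, AVG(e.engagement)
    FROM Users u JOIN Engagement e ON u.user_id = e.user_id
    WHERE {FEBRUARY}
    GROUP BY u.user_id, u.user_name
    ORDER BY u.user_id
""").fetchall()

print(by_name)
print(by_id)
```

Whether this matters depends on whether names are unique in the real data, which the problem statement does not say.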
Output After Step 2:
+-----------+----------------+
| user_name | avg_engagement |
+-----------+----------------+
| Alice | 17.50 |
| Bob | 35.00 |
| Charlie | 67.50 |
| David | 23.50 |
| Eve | 57.50 |
+-----------+----------------+
Step 3: Assign Engagement Levels Based on Average Scores
Categorize each user as Low, Medium, or High based on their average engagement scores.
SQL Query:
SELECT
user_name,
CASE
WHEN AVG(engagement) < 20 THEN 'Low'
WHEN AVG(engagement) <= 60 THEN 'Medium'
ELSE 'High'
END AS engagement_level
FROM Users
JOIN Engagement ON Users.user_id = Engagement.user_id
AND YEAR(Engagement.date) = 2020
AND MONTH(Engagement.date) = 2
GROUP BY user_name;
Explanation:
- CASE WHEN ... THEN ...: Assigns engagement levels based on the calculated average engagement. The branches are evaluated in order, so the second branch is reached only when the average is at least 20:
  - < 20: Low
  - <= 60: Medium
  - > 60: High
Final Output:
+-----------+-------------------+
| user_name | engagement_level |
+-----------+-------------------+
| Alice | Low |
| Bob | Medium |
| Charlie | High |
| David | Medium |
| Eve | Medium |
+-----------+-------------------+
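To check the final query against the example data without a MySQL server, a minimal sketch is shown below. It uses SQLite through Python's sqlite3 module, which is an assumption of this sketch rather than part of the solution. SQLite has no YEAR() or MONTH(), so the filter is written with strftime('%Y-%m', ...) instead. The rest of the query follows the one above, with an ORDER BY added only to make the comparison deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE Engagement (
        user_id INTEGER, engagement INTEGER, date TEXT,
        PRIMARY KEY (user_id, date)
    );
    INSERT INTO Users VALUES
        (1, 'Alice'), (2, 'Bob'), (3, 'Charlie'), (4, 'David'), (5, 'Eve');
    INSERT INTO Engagement VALUES
        (1, 10, '2020-02-01'), (1, 25, '2020-02-02'),
        (2, 30, '2020-02-01'), (2, 40, '2020-02-03'),
        (3, 65, '2020-02-05'), (3, 70, '2020-02-06'),
        (4, 23, '2020-02-07'), (4, 24, '2020-02-08'),
        (5, 55, '2020-02-09'), (5, 60, '2020-02-10');
""")

result = conn.execute("""
    SELECT
        user_name,
        CASE
            WHEN AVG(engagement) < 20 THEN 'Low'
            WHEN AVG(engagement) <= 60 THEN 'Medium'
            ELSE 'High'
        END AS engagement_level
    FROM Users
    JOIN Engagement ON Users.user_id = Engagement.user_id
        AND strftime('%Y-%m', Engagement.date) = '2020-02'  -- SQLite stand-in for YEAR()/MONTH()
    GROUP BY user_name
    ORDER BY user_name
""").fetchall()

for user_name, level in result:
    print(user_name, level)
```

The rows it returns can be compared directly with the Final Output table.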