Problem
Table: Events
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| business_id | int |
| event_type | varchar |
| occurrences | int |
+---------------+---------+
(business_id, event_type) is the primary key (combination of columns with unique values) of this table.
Each row in the table logs the info that an event of some type occurred at some business for a number of times.Problem Definition
The average activity for a particular event_type is the average occurrences across all companies that have this event.
An active business is a business that has more than one event_type such that their occurrences is strictly greater than the average activity for that event.
Write a solution to find all active businesses.
Example
Image scaled to 70%
Output
Image scaled to 45%
Try It YourSelf
Database Exercise
Database Schema:
-- Database schema would be rendered hereExercise Script:
-- Exercise script would be rendered hereAvailable actions: Execute
Solution
In tackling this challenge, our strategy involves two key steps using MySQL. First, within a subquery, we calculate the average occurrences for each event type by employing the AVG function over a window partitioned by event type. Subsequently, in the main query, we filter records to select only those where the occurrences surpass the calculated average.
The final step includes grouping the results by business_id and applying additional filters to ensure inclusion only for businesses with occurrences consistently exceeding the average across multiple events.
SELECT business_id
FROM (SELECT *,
Avg(occurrences)
OVER(
partition BY event_type) AS avgo
FROM Events) x
WHERE occurrences > avgo
GROUP BY business_id
HAVING Count(business_id) > 1 Let’s break down the query step by step:
Step 1: Subquery (Alias x)
The subquery calculates the average occurrences for each event_type using the window function Avg(occurrences) OVER (PARTITION BY event_type) AS avgo. This calculates the average occurrences for each row based on the event_type.
SELECT *,
Avg(occurrences) OVER (PARTITION BY event_type) AS avgo
FROM events;Output After Step 1:
+-------------+------------+-------------+------+
| business_id | event_type | occurrences | avgo |
+-------------+------------+-------------+------+
| 1 | reviews | 7 | 5.0 |
| 3 | reviews | 3 | 5.0 |
| 1 | ads | 11 | 8.0 |
| 2 | ads | 7 | 8.0 |
| 3 | ads | 6 | 8.0 |
| 1 | page views | 3 | 7.5 |
| 2 | page views | 12 | 7.5 |
+-------------+------------+-------------+------+Step 2: Main Query
The main query filters rows where occurrences is greater than the calculated average (avgo) and groups the result by business_id. It also applies a HAVING clause to select only those business_id with a count greater than 1.
SELECT business_id
FROM (subquery) x
WHERE occurrences > avgo
GROUP BY business_id
HAVING count(business_id) > 1;Final Output:
+-------------+
| business_id |
+-------------+
| 1 |
+-------------+