Problem
Table: Matches
+-------------+------+
| Column Name | Type |
+-------------+------+
| player_id | int |
| match_day | date |
| result | enum |
+-------------+------+
(player_id, match_day) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the ID of a player, the day of the match they played, and the result of that match.
The result column is an ENUM (category) type of ('Win', 'Draw', 'Lose').Problem Definition
The winning streak of a player is the number of consecutive wins uninterrupted by draws or losses.
Write a solution to count the longest winning streak for each player.
Return the result table in any order.
Example
Image scaled to 60%
Output
Image scaled to 35%
Try It Yourself
Database Exercise
Database Schema:
-- Database schema would be rendered hereExercise Script:
-- Exercise script would be rendered hereAvailable actions: Execute
Solution
To determine the longest winning streak for each player, we need to analyze the Matches table to identify consecutive wins without any interruptions by draws or losses. Below is a comprehensive breakdown of the approach, step-by-step execution, and detailed explanations for each line of the SQL query.
Approach Overview
-
Transform Match Results:
- Convert the categorical
resultinto numerical indicators to facilitate streak calculations.
- Convert the categorical
-
Calculate Cumulative Non-Win Counts (
se):- For each player, compute a running total of non-win results to identify streak segments.
-
Determine Streak Lengths:
- Group matches based on the cumulative non-win counts to isolate winning streaks.
-
Compute Longest Streak per Player:
- For each player, identify the maximum streak length from the grouped data.
-
Present the Final Results:
- Display each player’s ID alongside their longest winning streak.
SQL Query
SELECT player_id,
Max(streak) AS longest_streak
FROM (SELECT player_id,
se,
CASE
WHEN se = 0 THEN Count(se)
ELSE Count(se) - 1
END AS streak
FROM (SELECT player_id,
match_day,
Sum(r)
OVER (
partition BY player_id
ORDER BY match_day) AS se
FROM (SELECT player_id,
match_day,
CASE
WHEN result = 'win' THEN 0
ELSE 1
END r
FROM Matches) a)b
GROUP BY player_id,
se) c
GROUP BY player_id;Step-by-Step Approach
Step 1: Transform Match Results into Numerical Indicators
Convert the categorical result into numerical values where a win is represented by 0 and any other outcome (Draw or Lose) is represented by 1. This transformation simplifies the calculation of winning streaks.
SQL Snippet:
SELECT player_id,
match_day,
CASE
WHEN result = 'win' THEN 0
ELSE 1
END r
FROM Matches;Explanation:
-
SELECT player_id, match_day,- Selects the
player_idandmatch_daycolumns from theMatchestable.
- Selects the
-
CASE- Initiates a conditional statement to transform the
resultcolumn.
- Initiates a conditional statement to transform the
-
WHEN result = 'win' THEN 0- Assigns a value of
0if theresultis'win'.
- Assigns a value of
-
ELSE 1- Assigns a value of
1for any other result ('Draw'or'Lose').
- Assigns a value of
-
END r- Concludes the
CASEstatement and aliases the resulting value asr.
- Concludes the
-
FROM Matches;- Specifies the
Matchestable as the data source.
- Specifies the
Intermediate Output After Step 1:
+-----------+------------+---+
| player_id | match_day | r |
+-----------+------------+---+
| 1 | 2022-01-17 | 0 |
| 1 | 2022-01-18 | 0 |
| 1 | 2022-01-25 | 0 |
| 1 | 2022-01-31 | 1 |
| 1 | 2022-02-08 | 0 |
| 2 | 2022-02-06 | 1 |
| 2 | 2022-02-08 | 1 |
| 3 | 2022-03-30 | 0 |
+-----------+------------+---+Step 2: Calculate Cumulative Non-Win Counts (se)
For each player, compute a running total of non-win results (r) ordered by match_day. This cumulative sum (se) helps in identifying streak segments by indicating interruptions in winning streaks.
SQL Snippet:
SELECT player_id,
match_day,
Sum(r)
OVER (
partition BY player_id
ORDER BY match_day) AS se
FROM (SELECT player_id,
match_day,
CASE
WHEN result = 'win' THEN 0
ELSE 1
END r
FROM Matches) a;Explanation:
-
SELECT player_id, match_day,- Selects the
player_idandmatch_daycolumns from the subquerya.
- Selects the
-
Sum(r)- Calculates the cumulative sum of
rvalues.
- Calculates the cumulative sum of
-
OVER (- Introduces the window function to define how the cumulative sum is calculated.
-
partition BY player_id- Divides the data into partitions based on
player_id. This means the cumulative sum resets for each player.
- Divides the data into partitions based on
-
ORDER BY match_day)- Orders the data within each partition by
match_dayto ensure the cumulative sum is calculated chronologically.
- Orders the data within each partition by
-
AS se- Aliases the resulting cumulative sum as
se.
- Aliases the resulting cumulative sum as
-
FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a;- Specifies the subquery
awhich transforms theresultinto numerical indicators as done in Step 1.
- Specifies the subquery
Intermediate Output After Step 2 (se):
+-----------+------------+----+
| player_id | match_day | se |
+-----------+------------+----+
| 1 | 2022-01-17 | 0 |
| 1 | 2022-01-18 | 0 |
| 1 | 2022-01-25 | 0 |
| 1 | 2022-01-31 | 1 |
| 1 | 2022-02-08 | 1 |
| 2 | 2022-02-06 | 1 |
| 2 | 2022-02-08 | 2 |
| 3 | 2022-03-30 | 0 |
+-----------+------------+----+Step 3: Determine Streak Lengths
Identify streak segments by grouping consecutive matches based on the cumulative non-win counts (se). Calculate the length of each streak.
SQL Snippet:
SELECT player_id,
se,
CASE
WHEN se = 0 THEN Count(se)
ELSE Count(se) - 1
END AS streak
FROM (SELECT player_id,
match_day,
Sum(r)
OVER (
partition BY player_id
ORDER BY match_day) AS se
FROM (SELECT player_id,
match_day,
CASE
WHEN result = 'win' THEN 0
ELSE 1
END r
FROM Matches) a)b
GROUP BY player_id,
se;Explanation:
-
SELECT player_id, se,- Selects the
player_idand cumulative non-win countsefrom the subqueryb.
- Selects the
-
CASE- Initiates a conditional statement to calculate the streak length based on
se.
- Initiates a conditional statement to calculate the streak length based on
-
WHEN se = 0 THEN Count(se)- If
se = 0, it means the player has not had any non-wins up to that point, indicating an uninterrupted streak of wins. The streak length is equal to the count of such entries.
- If
-
ELSE Count(se) - 1- If
se > 0, it indicates that the streak has been interrupted by at least one non-win. The streak length is calculated as the count minus one to exclude the match that broke the streak.
- If
-
END AS streak- Concludes the
CASEstatement and aliases the result asstreak.
- Concludes the
-
FROM (...) b- Specifies that the data is being selected from the subquery
b, which containsplayer_id,match_day, andse.
- Specifies that the data is being selected from the subquery
-
GROUP BY player_id, se;- Groups the data by
player_idandseto aggregate the streak lengths accordingly.
- Groups the data by
Intermediate Output After Step 3:
+-----------+----+--------+
| player_id | se | streak |
+-----------+----+--------+
| 1 | 0 | 3 |
| 1 | 1 | 0 |
| 2 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 0 | 1 |
+-----------+----+--------+Step 4: Compute Longest Streak per Player
For each player, identify the maximum streak length from the previously calculated streaks.
SQL Snippet:
SELECT player_id,
Max(streak) AS longest_streak
FROM (SELECT player_id,
se,
CASE
WHEN se = 0 THEN Count(se)
ELSE Count(se) - 1
END AS streak
FROM (SELECT player_id,
match_day,
Sum(r)
OVER (
partition BY player_id
ORDER BY match_day) AS se
FROM (SELECT player_id,
match_day,
CASE
WHEN result = 'win' THEN 0
ELSE 1
END r
FROM Matches) a)b
GROUP BY player_id,
se) c
GROUP BY player_id;Explanation:
-
SELECT player_id, Max(streak) AS longest_streak- Selects the
player_idand calculates the maximum streak value for each player, aliasing it aslongest_streak.
- Selects the
-
FROM (...) c- Specifies that the data is being selected from the subquery
c, which containsplayer_id,se, andstreak.
- Specifies that the data is being selected from the subquery
-
GROUP BY player_id;- Groups the data by
player_idto perform the aggregation (Max(streak)) for each player.
- Groups the data by
Final Output:
+-----------+----------------+
| player_id | longest_streak |
+-----------+----------------+
| 1 | 3 |
| 2 | 0 |
| 3 | 1 |
+-----------+----------------+