Problem
Table: Activity
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id, event_date) is the primary key (combination of columns with unique values) of this table.
This table shows the activity of players of some games.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on someday using some device.Problem Definition
Write a solution to report the device that is first logged in for each player.
Example
Image scaled to 55%
Output
Image scaled to 50%
Try It Yourself
Database Exercise
Database Schema:
-- Database schema would be rendered hereExercise Script:
-- Exercise script would be rendered hereAvailable actions: Execute
Solution
To identify the first device each player logged in with, we can leverage SQL’s window functions, specifically FIRST_VALUE(). This function allows us to retrieve the first occurrence of a value within a specified window partition. By partitioning the data by player_id and ordering it by event_date, we can effectively determine the initial device used by each player.
- Partition Data by Player: Group the records based on
player_idto analyze each player’s activity individually. - Order Events Chronologically: Within each player group, order the records by
event_dateto identify the sequence of device usage. - Retrieve the First Device: Use the
FIRST_VALUE()window function to extract the device ID from the earliest event for each player. - Eliminate Duplicate Entries: Apply
DISTINCTto ensure each player appears only once in the final result.
SQL Query
SELECT DISTINCT player_id,
FIRST_VALUE(device_id) OVER (
PARTITION BY player_id
ORDER BY event_date
) AS device_id
FROM Activity;Step-by-Step Approach
Step 1: Partition Data by Player and Order by Event Date
Group the activity records by player_id and order each group chronologically based on event_date to prepare for identifying the first device used.
SQL Query:
SELECT player_id,
device_id,
event_date,
FIRST_VALUE(device_id) OVER (
PARTITION BY player_id
ORDER BY event_date
) AS first_device
FROM Activity;Explanation:
-
SELECT player_id, device_id, event_date, FIRST_VALUE(device_id) OVER (...) AS first_device:- Retrieves each player’s ID, the device used, the date of the event, and the first device used by the player.
-
FIRST_VALUE(device_id) OVER (PARTITION BY player_id ORDER BY event_date) AS first_device:- The
FIRST_VALUE()function scans each partition (grouped byplayer_id) and retrieves thedevice_idfrom the earliestevent_date.
- The
-
FROM Activity:- Indicates that the data is sourced from the
Activitytable.
- Indicates that the data is sourced from the
Output After Step 1:
Based on the example input, the intermediate result would be:
+-----------+-----------+-------------+-------------+
| player_id | device_id | event_date | first_device|
+-----------+-----------+-------------+-------------+
| 1 | 2 | 2016-03-01 | 2 |
| 1 | 2 | 2016-05-02 | 2 |
| 2 | 3 | 2016-03-01 | 3 |
| 3 | 1 | 2016-03-01 | 1 |
| 3 | 4 | 2018-07-03 | 1 |
+-----------+-----------+-------------+-------------+Step 2: Select Distinct Player IDs with Their First Device
Extract each player’s ID along with their first device, ensuring that each player appears only once in the final result.
SQL Query:
SELECT DISTINCT player_id,
FIRST_VALUE(device_id) OVER (
PARTITION BY player_id
ORDER BY event_date
) AS device_id
FROM Activity;Explanation:
SELECT DISTINCT player_id, FIRST_VALUE(device_id) OVER (...) AS device_id:- Selects each unique
player_idand associates it with the firstdevice_iddetermined by the window function.
- Selects each unique
Final Output:
+-----------+-----------+
| player_id | device_id |
+-----------+-----------+
| 1 | 2 |
| 2 | 3 |
| 3 | 1 |
+-----------+-----------+