Problem
Table: Purchases
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| purchase_date | date    |
| customer_id   | int     |
| product       | varchar |
+---------------+---------+
There is no primary key for this table. It may contain duplicates.
Each row of this table contains the purchase date, customer ID, and product name of one product purchased by a customer.
Problem Definition
Write a solution to find for each customer the total number of distinct products purchased and a list of distinct products bought.
The products bought by each customer should be sorted lexicographically.
Return the result table ordered by customer_id.
Example
Input:
Purchases table:
+---------------+-------------+------------+
| purchase_date | customer_id | product    |
+---------------+-------------+------------+
| 2020-05-01    | 1           | Milk       |
| 2020-05-01    | 2           | Eggs       |
| 2020-06-01    | 2           | Milk       |
| 2020-06-01    | 3           | Bread      |
| 2020-06-02    | 1           | Water      |
| 2020-07-01    | 2           | Eggs       |
| 2020-07-01    | 3           | Milk       |
+---------------+-------------+------------+
Output:
+-------------+--------------+------------+
| customer_id | num_products | products   |
+-------------+--------------+------------+
| 1           | 2            | Milk,Water |
| 2           | 2            | Eggs,Milk  |
| 3           | 2            | Bread,Milk |
+-------------+--------------+------------+
Solution
The solution groups the rows of the Purchases table by customer_id and computes two aggregates per customer: the number of distinct products purchased and a sorted, comma-separated list of those products.
COUNT(DISTINCT product) counts each product once per customer, even when the same purchase appears on several rows, and fills the “num_products” column. GROUP_CONCAT, a MySQL aggregate function, joins the distinct product names of each customer into one string; the ORDER BY clause inside GROUP_CONCAT sorts the names lexicographically before they are joined.
The GROUP BY clause ensures that both aggregates are evaluated once per unique customer_id. Finally, the outer ORDER BY clause sorts the result table by customer_id in ascending order, as the problem statement specifies.
SELECT
    customer_id,
    COUNT(DISTINCT product) AS num_products,
    GROUP_CONCAT(DISTINCT product ORDER BY product ASC SEPARATOR ',') AS products
FROM
    Purchases
GROUP BY
    customer_id
ORDER BY
    customer_id ASC;
Let’s break down the query into sub-steps:
Step 1: Count the Number of Distinct Products Purchased by Each Customer
We use COUNT(DISTINCT product) together with GROUP BY customer_id to count the number of distinct products purchased by each customer.
SELECT
    customer_id,
    COUNT(DISTINCT product) AS num_products
FROM
    Purchases
GROUP BY
    customer_id;
Output After Step 1:
+-------------+--------------+
| customer_id | num_products |
+-------------+--------------+
| 1           | 2            |
| 2           | 2            |
| 3           | 2            |
+-------------+--------------+
Step 2: Concatenate Distinct Sorted Product Names for Each Customer
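We use GROUP_CONCAT with DISTINCT to join each customer's product names, each name appearing once, sorted lexicographically and separated by commas. The query keeps the COUNT(DISTINCT product) expression from Step 1, because num_products is an aggregate and not a column of the Purchases table.
SELECT
    customer_id,
    COUNT(DISTINCT product) AS num_products,
    GROUP_CONCAT(DISTINCT product ORDER BY product ASC SEPARATOR ',') AS products
FROM
    Purchases
GROUP BY
    customer_id;
Output After Step 2: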
+-------------+--------------+------------+
| customer_id | num_products | products   |
+-------------+--------------+------------+
| 1           | 2            | Milk,Water |
| 2           | 2            | Eggs,Milk  |
| 3           | 2            | Bread,Milk |
+-------------+--------------+------------+
Step 3: Order the Result
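We append an ORDER BY clause to the end of the Step 2 query, replacing its terminating semicolon, so that the result table is sorted by customer_id.
ORDER BY customer_id ASC;
Final Output: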
+-------------+--------------+------------+
| customer_id | num_products | products   |
+-------------+--------------+------------+
| 1           | 2            | Milk,Water |
| 2           | 2            | Eggs,Milk  |
| 3           | 2            | Bread,Milk |
+-------------+--------------+------------+
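As a cross-check on the final output, the same grouping logic can be sketched outside the database. The Python snippet below is only an illustration, not part of the SQL solution: the `purchases` list and the `summarize` function are hypothetical names introduced here, and Python's default string ordering (by code point) is assumed to match the lexicographic order the problem asks for, which may differ from a case-insensitive MySQL collation.

```python
from collections import defaultdict

# Sample rows copied from the example input: (purchase_date, customer_id, product)
purchases = [
    ("2020-05-01", 1, "Milk"),
    ("2020-05-01", 2, "Eggs"),
    ("2020-06-01", 2, "Milk"),
    ("2020-06-01", 3, "Bread"),
    ("2020-06-02", 1, "Water"),
    ("2020-07-01", 2, "Eggs"),
    ("2020-07-01", 3, "Milk"),
]


def summarize(rows):
    """Return (customer_id, num_products, products) tuples ordered by customer_id."""
    # A set per customer plays the role of DISTINCT: duplicates are dropped.
    products_by_customer = defaultdict(set)
    for _purchase_date, customer_id, product in rows:
        products_by_customer[customer_id].add(product)

    # sorted(...) on the dict items mirrors ORDER BY customer_id;
    # sorted(products) mirrors the ORDER BY inside GROUP_CONCAT.
    return [
        (customer_id, len(products), ",".join(sorted(products)))
        for customer_id, products in sorted(products_by_customer.items())
    ]


result = summarize(purchases)
for row in result:
    print(row)
```

Customer 2 bought Eggs twice, yet the set keeps a single entry, which is why the count stays at 2 and the list reads Eggs,Milk, just as with COUNT(DISTINCT product) and GROUP_CONCAT(DISTINCT ...).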