Problem
Table: Accounts
+----------------+------+
| Column Name | Type |
+----------------+------+
| account_id | int |
| max_income | int |
+----------------+------+
account_id is the column with unique values for this table.
Each row contains information about the maximum monthly income for one bank account.Table: Transactions
+----------------+----------+
| Column Name | Type |
+----------------+----------+
| transaction_id | int |
| account_id | int |
| type | ENUM |
| amount | int |
| day | datetime |
+----------------+----------+
transaction_id is the column with unique values for this table.
Each row contains information about one transaction.
type is ENUM (category) type of ('Creditor','Debtor') where 'Creditor' means the user deposited money into their account and 'Debtor' means the user withdrew money from their account.
amount is the amount of money deposited/withdrawn during the transaction.Problem Definition
A bank account is suspicious if the total income exceeds the max_income for this account for two or more consecutive months. The total income of an account in some month is the sum of all its deposits in that month (i.e., transactions of the type 'Creditor').
Write a solution to report the IDs of all suspicious bank accounts.
Return the result table in any order.
Example
Output
Image scaled to 30%
Try It Yourself
Database Exercise
Database Schema:
-- Database schema would be rendered hereExercise Script:
-- Exercise script would be rendered hereAvailable actions: Execute
Solution
To identify suspicious bank accounts based on their transaction histories, we need to analyze the Accounts and Transactions tables. A bank account is deemed suspicious if it meets below condition:
- Total Income Exceeds
max_incomefor Two or More Consecutive Months:- Total Income for a month is the sum of all deposits (
'Creditor'transactions) made into the account during that month. - The Total Income must exceed the account’s
max_incomefor two or more consecutive months.
- Total Income for a month is the sum of all deposits (
Approach Overview
- Calculate Monthly Total Income for Each Account: Aggregate
'Creditor'transactions byaccount_idand month to determine the Total Income for each account per month. - Identify Months Where Total Income Exceeds
max_income: Compare the Total Income against themax_incomefor each account and mark the months where the income exceeds the threshold. - Determine Consecutive Months of Exceeding Income: Analyze the marked months to find accounts that have two or more consecutive months where their Total Income exceeded
max_income. - Retrieve Suspicious Account IDs: Select the
account_ids of accounts that meet the criteria.
SQL Query
WITH incomes AS (
SELECT
a.account_id,
DATE_FORMAT(day, '%Y-%m-01') AS trans_month,
SUM(amount) AS total_income,
b.max_income
FROM
Transactions a
INNER JOIN
Accounts b ON a.account_id = b.account_id
WHERE
type = 'Creditor'
GROUP BY
a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income
HAVING
IF(SUM(amount) > b.max_income, 1, 0) = 1
),
consec_income AS (
SELECT
account_id,
TIMESTAMPDIFF(
MONTH,
LAG(trans_month, 1) OVER (
PARTITION BY account_id
ORDER BY trans_month
),
trans_month
) AS prev_month
FROM
incomes
)
SELECT DISTINCT
account_id
FROM
consec_income
WHERE
prev_month = 1;Step-by-Step Approach
Step 1: Calculate Monthly Total Income for Each Account (incomes CTE)
Compute the Total Income for each account for every month by summing up all 'Creditor' transactions.
SQL Query:
WITH incomes AS (
SELECT
a.account_id,
DATE_FORMAT(day, '%Y-%m-01') AS trans_month,
SUM(amount) AS total_income,
b.max_income
FROM
Transactions a
INNER JOIN
Accounts b ON a.account_id = b.account_id
WHERE
type = 'Creditor'
GROUP BY
a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income
HAVING
IF(SUM(amount) > b.max_income, 1, 0) = 1
)
SELECT *
FROM incomes;Explanation:
-
DATE_FORMAT(day, '%Y-%m-01') AS trans_month:- Extracts the year and month from the
daycolumn to group transactions by the first day of each month.
- Extracts the year and month from the
-
SUM(amount) AS total_income:- Calculates the total deposited amount (
'Creditor'transactions) for each account per month.
- Calculates the total deposited amount (
-
JOIN Accounts b ON a.account_id = b.account_id:- Associates each transaction with its corresponding account to access the
max_income.
- Associates each transaction with its corresponding account to access the
-
WHERE type = 'Creditor':- Filters transactions to include only deposits.
-
GROUP BY a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income:- Groups the data by
account_idand month to aggregate the deposits.
- Groups the data by
-
HAVING IF(SUM(amount) > b.max_income, 1, 0) = 1:- Retains only those groups where the Total Income exceeds the
max_income.
- Retains only those groups where the Total Income exceeds the
Output After Step 1:
+------------+------------+--------------+------------+
| account_id | trans_month| total_income | max_income |
+------------+------------+--------------+------------+
| 3 | 2021-05-01 | 0 | 21000 |
| 3 | 2021-06-01 | 298000 | 21000 |
| 3 | 2021-07-01 | 64900 | 21000 |
| 4 | 2021-05-01 | 49300 | 10400 |
| 4 | 2021-06-01 | 10400 | 10400 |
| 4 | 2021-07-01 | 56300 | 10400 |
+------------+------------+--------------+------------+Step 2: Determine Consecutive Months of Exceeding Income (consec_income CTE)
Identify accounts that have two or more consecutive months where their Total Income exceeded max_income.
SQL Query:
WITH incomes AS (
SELECT
a.account_id,
DATE_FORMAT(day, '%Y-%m-01') AS trans_month,
SUM(amount) AS total_income,
b.max_income
FROM
Transactions a
INNER JOIN
Accounts b ON a.account_id = b.account_id
WHERE
type = 'Creditor'
GROUP BY
a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income
HAVING
IF(SUM(amount) > b.max_income, 1, 0) = 1
),
consec_income AS (
SELECT
account_id,
TIMESTAMPDIFF(
MONTH,
LAG(trans_month, 1) OVER (
PARTITION BY account_id
ORDER BY trans_month
),
trans_month
) AS prev_month
FROM
incomes
)
SELECT *
FROM consec_income;Explanation:
-
LAG(trans_month, 1) OVER (PARTITION BY account_id ORDER BY trans_month):- Retrieves the previous month’s
trans_monthfor eachaccount_id.
- Retrieves the previous month’s
-
TIMESTAMPDIFF(MONTH, previous_month, current_month) AS prev_month:- Calculates the difference in months between the current month and the previous month.
- A result of
1indicates that the current month is exactly one month after the previous month, signifying consecutive months.
-
consec_incomeCTE:- Contains each
account_idand the difference in months (prev_month) between consecutive entries where income exceededmax_income.
- Contains each
Output After Step 2:
+------------+------------+
| account_id | prev_month |
+------------+------------+
| 3 | NULL |
| 3 | 1 |
| 4 | NULL |
| 4 | 2 |
+------------+------------+Explanation of Output:
-
Account 3:
- First Entry (2021-06-01): No previous month →
prev_month= NULL. - Second Entry (2021-07-01): Previous month = 2021-06-01 →
prev_month= 1 (Consecutive).
- First Entry (2021-06-01): No previous month →
-
Account 4:
- First Entry (2021-05-01): No previous month →
prev_month= NULL. - Second Entry (2021-07-01): Previous month = 2021-05-01 →
prev_month= 2 (Not consecutive).
- First Entry (2021-05-01): No previous month →
Step 3: Retrieve Suspicious Account IDs
Select the account_ids of accounts that have two or more consecutive months where their Total Income exceeded max_income.
SQL Query:
WITH incomes AS (
SELECT
a.account_id,
DATE_FORMAT(day, '%Y-%m-01') AS trans_month,
SUM(amount) AS total_income,
b.max_income
FROM
Transactions a
INNER JOIN
Accounts b ON a.account_id = b.account_id
WHERE
type = 'Creditor'
GROUP BY
a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income
HAVING
IF(SUM(amount) > b.max_income, 1, 0) = 1
),
consec_income AS (
SELECT
account_id,
TIMESTAMPDIFF(
MONTH,
LAG(trans_month, 1) OVER (
PARTITION BY account_id
ORDER BY trans_month
),
trans_month
) AS prev_month
FROM
incomes
)
SELECT DISTINCT
account_id
FROM
consec_income
WHERE
prev_month = 1;Explanation:
-
WHERE prev_month = 1:- Filters the
consec_incomeCTE to include only those records where the current month is exactly one month after the previous month, indicating consecutive months.
- Filters the
-
SELECT DISTINCT account_id:- Ensures that each suspicious
account_idappears only once in the final output, even if there are multiple pairs of consecutive months.
- Ensures that each suspicious
Output After Step 3:
+------------+
| account_id |
+------------+
| 3 |
+------------+