SQL Join 3 Tables and Calculate Total and Percentage
Introduction
In this article, we will explore how to perform a SQL join on three tables and calculate total sales and error rates for a specific date. We will use sample data and provide a step-by-step guide on how to write the query.
Background
To understand this tutorial, it’s essential to have a basic understanding of SQL and table joins. A table join is used to combine rows from two or more tables based on a related column between them. In this example, we will be joining three tables: tbl1 (sales), tbl2 (errors), and junction, which links each sale to the errors recorded for it.
Table Schema
Let’s take a look at the schema for each table:
Tbl1 (Sales)
| Column Name | Data Type |
|---|---|
| id | int |
| date | date |
| store_id | varchar(255) |
| sold_count | int |
Junction
| Column Name | Data Type |
|---|---|
| tb1_id | int |
| tb2_id | int |
Tbl2 (Errors)
| Column Name | Data Type |
|---|---|
| id | int |
| error_id | int |
| error_type | varchar(255) |
Sample Data
To demonstrate the query, let’s create sample data for each table:
-- Create tables
CREATE TABLE tbl1 (
    id INT PRIMARY KEY,
    date DATE,
    store_id VARCHAR(255),
    sold_count INT
);
CREATE TABLE tbl2 (
    id INT PRIMARY KEY,
    error_id INT,
    error_type VARCHAR(255)
);
CREATE TABLE junction (
    tb1_id INT,
    tb2_id INT,
    FOREIGN KEY (tb1_id) REFERENCES tbl1(id),
    FOREIGN KEY (tb2_id) REFERENCES tbl2(id)
);
-- Insert data
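-- date is a DATE column, so the values carry no time of day;
-- this keeps the comparison date = '2000-01-01' reliable across databases
INSERT INTO tbl1 (id, date, store_id, sold_count)
VALUES
    (1, '2000-01-01', 'store1', 30),
    (2, '2000-01-02', 'store1', 20),
    (3, '2000-01-01', 'store2', 40),
    (4, '2000-01-01', 'store1', 50);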
INSERT INTO tbl2 (id, error_id, error_type)
VALUES
    (1, 1, 'error_type_A'),
    (2, 2, 'error_type_A'),
    (3, 3, 'error_type_B'),
    (4, 4, 'error_type_B'),
    (5, 5, 'error_type_B');
INSERT INTO junction (tb1_id, tb2_id)
VALUES
    (1, 1),
    (1, 2),
    (2, 3),
    (3, 4),
    (4, 5);
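Before aggregating, it can help to look at the raw rows the three-table join produces. The query below is a minimal sketch against the sample tables above; the column alias sale_id is only illustrative and not part of the schema.

```sql
-- One row per (sale, error) pair: a sale linked to several errors
-- appears once for each of them.
SELECT tbl1.id AS sale_id,
       tbl1.store_id,
       tbl1.sold_count,
       tbl2.error_type
FROM tbl1
JOIN junction j ON tbl1.id = j.tb1_id
JOIN tbl2 ON j.tb2_id = tbl2.id
ORDER BY tbl1.id, tbl2.id;
```

With the sample data, sale 1 is linked to two errors, so it appears twice. Summing sold_count directly over these rows would count its 30 units twice, which is why the aggregation has to be done per sale first.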
SQL Join and Calculation
A first version of the query is a good starting point. However, it has some issues that we need to address.
-- Query with sample data
-- Inner query: one row per sale. max(sold_count) undoes the row duplication
-- caused by the join, and count(*) counts that sale's errors.
SELECT sum(sold) sold,
    sum(errors) errors,
    1.0 * sum(errors) / sum(sold) error_rate -- 1.0 * avoids integer division
FROM (
    SELECT max(sold_count) sold,
        count(*) errors
    FROM tbl1
    JOIN junction j ON tbl1.id = j.tb1_id
    JOIN tbl2 ON j.tb2_id = tbl2.id
    WHERE tbl1.date = '2000-01-01'
    GROUP BY tbl1.id
) a;
The query is using an inner join to combine rows from tbl1, junction, and tbl2 based on the tb1_id and tb2_id. However, this will only include rows where there are matches in all three tables. We want to include all rows from tbl1 that match the date, regardless of whether there is a match in tbl2.
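To fix this, we can use a left join instead. With a left join, a sale without errors still produces one row, with NULLs in the junction and tbl2 columns. We therefore count tbl2.id rather than *: count(tbl2.id) ignores NULLs and returns 0 for such a sale, whereas count(*) would return 1.
-- Query with sample data using left join
SELECT sum(sold) sold,
    sum(errors) errors,
    1.0 * sum(errors) / sum(sold) error_rate -- 1.0 * avoids integer division
FROM (
    SELECT max(sold_count) sold,
        count(tbl2.id) errors -- 0 for a sale with no errors
    FROM tbl1
    LEFT JOIN junction j ON tbl1.id = j.tb1_id
    LEFT JOIN tbl2 ON j.tb2_id = tbl2.id
    WHERE tbl1.date = '2000-01-01'
    GROUP BY tbl1.id
) a;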
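To check whether the choice between an inner and a left join actually changes the result for a given data set, one option is to list the sales that have no row in junction. This is a small sketch using only the tables defined above.

```sql
-- Sales with no linked error: the rows an inner join would drop
-- and a left join keeps.
SELECT tbl1.id, tbl1.date, tbl1.store_id, tbl1.sold_count
FROM tbl1
LEFT JOIN junction j ON tbl1.id = j.tb1_id
WHERE j.tb1_id IS NULL;
```

With the sample data this returns no rows, because every sale has at least one error. The inner and left join versions therefore agree on this data and would only differ once a sale without errors is added.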
A join-then-aggregate query has to rely on max(sold_count) to undo the duplication that the join introduces when a sale has several errors. A cleaner alternative is to count the errors per sale in a subquery first and then left join that result to tbl1.
-- Query with sample data using subquery
SELECT sum(s.sold_count) sold,
    sum(coalesce(e.errors, 0)) errors,
    1.0 * sum(coalesce(e.errors, 0)) / sum(s.sold_count) error_rate
FROM tbl1 s
LEFT JOIN (
    -- one row per sale that has at least one error
    SELECT j.tb1_id, count(*) errors
    FROM junction j
    JOIN tbl2 ON j.tb2_id = tbl2.id
    GROUP BY j.tb1_id
) e ON e.tb1_id = s.id
WHERE s.date = '2000-01-01';
This query includes every row from tbl1 that matches the date. Each sale appears exactly once, so sold_count can be summed directly, and coalesce turns the missing error count of a sale without errors into 0.
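Since the goal includes a percentage, one way to report the error rate as a percentage, broken down by store, is sketched below. It counts errors per sale in a subquery and left joins the result to tbl1. The aliases s, e, and error_pct are illustrative, and the rounding to two decimals is an assumption about the desired output.

```sql
SELECT s.store_id,
       sum(s.sold_count) AS sold,
       sum(coalesce(e.errors, 0)) AS errors,
       -- 100.0 forces non-integer division; nullif guards against a zero total
       round(100.0 * sum(coalesce(e.errors, 0))
             / nullif(sum(s.sold_count), 0), 2) AS error_pct
FROM tbl1 s
LEFT JOIN (
    -- errors per sale, computed once so sales are never duplicated
    SELECT tb1_id, count(*) AS errors
    FROM junction
    GROUP BY tb1_id
) e ON e.tb1_id = s.id
WHERE s.date = '2000-01-01'
GROUP BY s.store_id
ORDER BY s.store_id;
```

Dropping store_id from the SELECT list together with the GROUP BY and ORDER BY clauses gives a single overall percentage for the date.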
Conclusion
In this article, we explored how to perform a SQL join on three tables and calculate total sales and error rates for a specific date. We used sample data and built the query step by step: an inner join first, then a left join so that sales without errors are still counted, and finally a version that uses a subquery.
Last modified on 2024-04-23