[SQL]: Joins (One to Many)
SQL
09/29/2019
Join is useful when working with multiple tables.
Real world data is complicated and interrelated.
Relationship Basics
- One to One Relationship
- One to Many Relationship -- common.
Example: students <--> notes. Students have many notes on different classes but those notes belong to one student - Many to Many Relationship -- also common.
Example: students <--> classes. Students take many classes, and within those classes there are many students.
One to Many Relationship (1:Many)
Example: Customers & Orders. Customers can have many orders, but those orders belong to each customer
Bad Example
CREATE TABLE customersAndOrders ( customer_fname VARCHAR(30), customer_lname VARCHAR(30), email VARCHAR(60), purchase_date DATE, price DECIMAL(7,2));
+----------------+----------------+-------------------+---------------+--------+| customer_fname | customer_lname | email | purchase_date | price |+----------------+----------------+-------------------+---------------+--------+| Smith | Kent | [email protected] | 2018-03-16 | 106.11 || Smith | Kent | [email protected] | 2019-01-11 | 32.76 || Sam | Park | [email protected] | 2014-07-14 | 423.03 || Sam | Park | [email protected] | 2015-02-16 | 13.16 || Dan | Lee | [email protected] | NULL | NULL || Grey | George | [email protected] | NULL | NULL |+----------------+----------------+-------------------+---------------+--------+
This does work, but it's a bad design--it has a multiple duplicate information.
FOREIGN KEY
Let's keep our data separate using FOREIGN KEY
SQL file to create and insert into customers and orders tables is available from
customers.orders_tables.txt Documentation on FOREIGN KEY
CREATE TABLE customers ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, fname VARCHAR(30), lname VARCHAR(30), email VARCHAR(60));-- MySQL versionCREATE TABLE orders ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, order_date DATE, price DECIMAL(8,2), customer_id INT, FOREIGN KEY(customer_id) REFERENCES customers(id));-- FOREIGN KEY assignment in SQL Server / Oracle / MS Access (different RDMS)--customer_id INT FOREIGN KEY REFERENCES customers(id)
+-------+-------------+------+-----+---------+----------------+| Field | Type | Null | Key | Default | Extra |+-------+-------------+------+-----+---------+----------------+| id | int(11) | NO | PRI | NULL | auto_increment || fname | varchar(30) | YES | | NULL | || lname | varchar(30) | YES | | NULL | || email | varchar(60) | YES | | NULL | |+-------+-------------+------+-----+---------+----------------++-------------+--------------+------+-----+---------+----------------+| Field | Type | Null | Key | Default | Extra |+-------------+--------------+------+-----+---------+----------------+| id | int(11) | NO | PRI | NULL | auto_increment || order_date | date | YES | | NULL | || price | decimal(8,2) | YES | | NULL | || customer_id | int(11) | YES | MUL | NULL | |+-------------+--------------+------+-----+---------+----------------+
Note that orders have a FOREIGN KEY customerid referencing **_id in customers**. FOREIN KEY is used to reference an external table.
customer_id is a naming convention to denote that it's referencing id in customer table
If the foreign key doesn't exist in customers, it will throw an error because there must be a customer that has the order
INSERT INTO orders(order_date, price, customer_id)VALUES (DATE(NOW()), 30, 99);
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`online`.`orders`, CONSTRAINT `orders_ibfk_1` FOREIGN KEY (`customer_id`) REFERENCES `customers` (`id`))
INSERT INTO customers(fname, lname, email)
INSERT INTO orders(order_date, price, customer_id)VALUES ('2018-03-16', 106.11, 1);
+----+-------+-------+--------------+| id | fname | lname | email |+----+-------+-------+--------------+| 1 | Smith | Kent | [email protected] |+----+------------+--------+-------------+| id | order_date | price | customer_id |+----+------------+--------+-------------+| 1 | 2018-03-16 | 106.11 | 1 |+----+------------+--------+-------------+
You can't drop the table that has data that is being referenced. You can remove orders table first then customers table or both at once.
DROP TABLE orders, customers; -- orders should come first otherwise customers table won't be removed
Add more data
INSERT INTO customers(fname, lname, email)
INSERT INTO orders(order_date, price, customer_id)VALUES ('2019-01-11', 32.76, 1),('2014-07-14', 423.03, 2),('2015-02-16', 13.16, 2)('2019-05-27', 207.88, 5);
+----+-------+--------+--------------------+| id | fname | lname | email |+----+-------+--------+--------------------+| 1 | Smith | Kent | [email protected] || 2 | Sam | Park | [email protected] || 3 | Dan | Lee | [email protected] || 4 | Grey | George | [email protected] || 5 | Joy | Grey | [email protected] |+----+-------+--------+--------------------++----+------------+--------+-------------+| id | order_date | price | customer_id |+----+------------+--------+-------------+| 1 | 2018-03-16 | 106.11 | 1 || 2 | 2019-01-11 | 32.76 | 1 || 3 | 2014-07-14 | 423.03 | 2 || 4 | 2015-02-16 | 13.16 | 2 || 5 | 2019-05-27 | 207.88 | 5 |+----+------------+--------+-------------+
Adding FOREIGN KEY
If you want to add a foreign key, you can add with the following:
ALTER TABLE ordersADD FOREIGN KEY (customer_id) REFERENCES customers(id);-- Adding FOREIGN KEY with specified constraintALTER TABLE ordersADD CONSTRAINT FK_CustomerOrderFOREIGN KEY (customer_id) REFERENCES customers(id);
DROP a FOREIGN KEY Constraint
ALTER TABLE ordersDROP FOREIGN KEY FK_CustomerOrder;
If you didn't specify the CONSTRAINT when creating a FOREIGN KEY, MySQL generates automatically internally. To access auto-generated name of CONSTRAINT, use the following
SHOW CREATE TABLE orders;
+--------+--------------+| Table | Create Table |+--------+--------------+| orders | CREATE TABLE `orders` ( `id` int(11) NOT NULL AUTO_INCREMENT, `order_date` date DEFAULT NULL, `price` decimal(8,2) DEFAULT NULL, `customer_id` int(11) DEFAULT NULL, PRIMARY KEY (`id`), KEY `customer_id` (`customer_id`), CONSTRAINT `orders_ibfk_1` FOREIGN KEY (`customer_id`) REFERENCES `customers` (`id`)) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1 |
Now you can remove foreign constraint key with
ALTER TABLE orders drop FOREIGN KEY orders_ibfk_1;
ON DELETE CASCADE
When customers2 is deleted which has a corresponding data in orders2, delete orders2 as well
CREATE TABLE customers2 ( id INT PRIMARY KEY AUTO_INCREMENT);CREATE TABLE orders2 ( customer_id INT, FOREIGN KEY(customer_id) REFERENCES customers2(id) ON DELETE CASCADE);INSERT INTO customers2 VALUES (1);INSERT INTO orders2 VALUES(1);
DELETE FROM customers2 WHERE id=1;
SELECT * FROM orders2;
$ SELECT * FROM orders2;Empty set (0.00 sec)
Deletion on customers2 also removed data in orders2
Combining two tables
What if you want information combined together from both tables?
For example,
SELECT id FROM customers WHERE lname="Park";SELECT * FROM orders WHERE customer_id=2;
+----+| id |+----+| 2 |+----++----+------------+--------+-------------+| id | order_date | price | customer_id |+----+------------+--------+-------------+| 3 | 2014-07-14 | 423.03 | 2 || 4 | 2015-02-16 | 13.16 | 2 |+----+------------+--------+-------------+
Using Sub-query
SELECT * FROM orders WHERE customer_id = ( SELECT id FROM customers WHERE lname = "Park" );
+----+------------+--------+-------------+| id | order_date | price | customer_id |+----+------------+--------+-------------+| 3 | 2014-07-14 | 423.03 | 2 || 4 | 2015-02-16 | 13.16 | 2 |+----+------------+--------+-------------+
The sub-query..
SELECT id FROM customers WHERE lname = "Park"
is equal to 2 in this case. It yields the same results as previous query. Also note that the query just has one semi-colon at the end.
This isn't very ideal nor useful.
Cross Join
Use JOIN to conjoin two tables--stick them together. This however results in a messy table.
SELECT * FROM customers, orders;
+----+-------+--------+--------------------+----+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+----+-------+--------+--------------------+----+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 2 | Sam | Park | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 3 | Dan | Lee | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 4 | Grey | George | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 5 | Joy | Grey | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 3 | Dan | Lee | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 4 | Grey | George | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 5 | Joy | Grey | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 1 | Smith | Kent | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 3 | Dan | Lee | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 4 | Grey | George | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 5 | Joy | Grey | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 1 | Smith | Kent | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 3 | Dan | Lee | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 4 | Grey | George | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 1 | Smith | Kent | [email protected] | 5 | 2019-05-27 | 207.88 | 5 || 2 | Sam | Park | [email protected] | 5 | 2019-05-27 | 207.88 | 5 || 3 | Dan | Lee | [email protected] | 5 | 2019-05-27 | 207.88 | 5 || 4 | Grey | George | [email protected] | 5 | 2019-05-27 | 207.88 | 5 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |+----+-------+--------+--------------------+----+------------+--------+-------------+
You can see a multiple duplicate information--takes every customer and conjoins every order like a multiplication. Yet, we can start to boil it down.
Inner Join (Implicit)
Inner Join selects records from table A and B where the join condition is met
SELECT * FROM customers, ordersWHERE customers.id = orders.customer_id;
+----+-------+-------+--------------------+----+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+----+-------+-------+--------------------+----+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |+----+-------+-------+--------------------+----+------------+--------+-------------+
customers.id is in format of $tableName.$col to avoid ambiguity. i.e. Both orders and students have id columns.
You don't necessarily have to put the table name like orders.customer_id since customer_id only exists in orders table, but this is a convention to follow
Selecting certain columns
SELECT fname, lname, order_date, priceFROM customers, ordersWHERE customers.id = orders.customer_id;
+-------+-------+------------+--------+| fname | lname | order_date | price |+-------+-------+------------+--------+| Smith | Kent | 2018-03-16 | 106.11 || Smith | Kent | 2019-01-11 | 32.76 || Sam | Park | 2014-07-14 | 423.03 || Sam | Park | 2015-02-16 | 13.16 || Joy | Grey | 2019-05-27 | 207.88 |+-------+-------+------------+--------+
Inner Join (Explicit)
More conventional way of joining--more cleaner and easily understood. Preferred way.
SELECT fname, lname, order_date, priceFROM customersJOIN orders ON customers.id = orders.customer_id;
+-------+-------+------------+--------+| fname | lname | order_date | price |+-------+-------+------------+--------+| Smith | Kent | 2018-03-16 | 106.11 || Smith | Kent | 2019-01-11 | 32.76 || Sam | Park | 2014-07-14 | 423.03 || Sam | Park | 2015-02-16 | 13.16 || Joy | Grey | 2019-05-27 | 207.88 |+-------+-------+------------+--------+
Creating the union table where customers.id = orders.customer_id
Yields the same result as above implicit inner join.
It's optional to explicitly write
INNER JOIN orders ON ..
Different order
SELECT * FROM ordersJOIN customers ON customers.id = orders.id;
+----+------------+--------+-------------+----+-------+--------+--------------------+| id | order_date | price | customer_id | id | fname | lname | email |+----+------------+--------+-------------+----+-------+--------+--------------------+| 1 | 2018-03-16 | 106.11 | 1 | 1 | Smith | Kent | [email protected] || 2 | 2019-01-11 | 32.76 | 1 | 2 | Sam | Park | [email protected] || 3 | 2014-07-14 | 423.03 | 2 | 3 | Dan | Lee | [email protected] || 4 | 2015-02-16 | 13.16 | 2 | 4 | Grey | George | [email protected] || 5 | 2019-05-27 | 207.88 | 5 | 5 | Joy | Grey | [email protected] |+----+------------+--------+-------------+----+-------+--------+--------------------+
Shows the data in different order of columns unless you specify columns after SELECT
More example
SELECT fname, lname, order_date, priceFROM ordersJOIN customers ON orders.customer_id = customers.idORDER BY orders.order_date;
+-------+-------+------------+--------+| fname | lname | order_date | price |+-------+-------+------------+--------+| Sam | Park | 2014-07-14 | 423.03 || Sam | Park | 2015-02-16 | 13.16 || Smith | Kent | 2018-03-16 | 106.11 || Smith | Kent | 2019-01-11 | 32.76 || Joy | Grey | 2019-05-27 | 207.88 |+-------+-------+------------+--------+
Display ordered by order date
SELECT fname, lname, SUM(price) AS total_spentFROM customers JOIN orders ONcustomers.id = orders.customer_idGROUP BY customers.idORDER BY total_spent DESC;
+-------+-------+-------------+| fname | lname | total_spent |+-------+-------+-------------+| Sam | Park | 436.19 || Joy | Grey | 207.88 || Smith | Kent | 138.87 |+-------+-------+-------------+
Shows total amount spent on order(s) in descending order
Arbitrary Join -- Avoid doing it
SELECT * FROM customersJOIN orders ON customers.id = orders.id;
+----+-------+--------+--------------------+----+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+----+-------+--------+--------------------+----+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 2 | Sam | Park | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 3 | Dan | Lee | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 4 | Grey | George | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |+----+-------+--------+--------------------+----+------------+--------+-------------+
Joining on arbitrary conditions is allowed, but this is meaningless.
LEFT Join
Selects all data from A along with many matching data in B.
SELECT * FROM customersLEFT JOIN orders ON customers.id = orders.customer_id;
+----+-------+--------+--------------------+------+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+----+-------+--------+--------------------+------+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 || 3 | Dan | Lee | [email protected] | NULL | NULL | NULL | NULL || 4 | Grey | George | [email protected] | NULL | NULL | NULL | NULL |+----+-------+--------+--------------------+------+------------+--------+-------------+
Information of customers is displayed on left and orders on right. The information that does not match is marked as NULL.
SELECT fname, lname, order_date, priceFROM customersLEFT JOIN orders ON customers.id = orders.customer_id;
+-------+--------+------------+--------+| fname | lname | order_date | price |+-------+--------+------------+--------+| Smith | Kent | 2018-03-16 | 106.11 || Smith | Kent | 2019-01-11 | 32.76 || Sam | Park | 2014-07-14 | 423.03 || Sam | Park | 2015-02-16 | 13.16 || Joy | Grey | 2019-05-27 | 207.88 || Dan | Lee | NULL | NULL || Grey | George | NULL | NULL |+-------+--------+------------+--------+
LEFT Join can be useful in this case; for example if you want to tabulate information including who did not place any order.
SELECT fname, lname, SUM(price) AS total_spentFROM customersLEFT JOIN orders ON customers.id = orders.customer_idGROUP BY customers.idORDER BY total_spent DESC;
+-------+--------+-------------+| fname | lname | total_spent |+-------+--------+-------------+| Sam | Park | 436.19 || Joy | Grey | 207.88 || Smith | Kent | 138.87 || Dan | Lee | NULL || Grey | George | NULL |+-------+--------+-------------+
IFNULL
You can add conditional statement for NULL values
SELECT fname, lname,IFNULL(SUM(price), 0) AS total_spentFROM customersLEFT JOIN orders ON customers.id = orders.customer_idGROUP BY customers.idORDER BY total_spent DESC;
+-------+--------+-------------+| fname | lname | total_spent |+-------+--------+-------------+| Sam | Park | 436.19 || Joy | Grey | 207.88 || Smith | Kent | 138.87 || Dan | Lee | 0.00 || Grey | George | 0.00 |+-------+--------+-------------+
INNER vs LEFT Join
Both join data from two tables, but..
- Inner Join selects all data from A and B where the join condition is met (exact overlap)
- Left Join selects all data from A along with matching records in B
Right Join
Similar to Left Join, Right Join selects all data from B along with matching record in A.
INNER vs LEFT vs RIGHT Join
SELECT * FROM customersINNER JOIN orders ON customers.id = orders.customer_id;
SELECT * FROM customersLEFT JOIN orders ON customers.id = orders.customer_id;
SELECT * FROM customersRIGHT JOIN orders ON customers.id = orders.customer_id;
+----+-------+-------+--------------------+----+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+----+-------+-------+--------------------+----+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |+----+-------+-------+--------------------+----+------------+--------+-------------++----+-------+--------+--------------------+------+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+----+-------+--------+--------------------+------+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 || 3 | Dan | Lee | [email protected] | NULL | NULL | NULL | NULL || 4 | Grey | George | [email protected] | NULL | NULL | NULL | NULL |+----+-------+--------+--------------------+------+------------+--------+-------------++------+-------+-------+--------------------+----+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+------+-------+-------+--------------------+----+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |+------+-------+-------+--------------------+----+------------+--------+-------------+
Results of Inner Join and Right Join are the same because there is no order that does not match with customers.
Left vs Right Join
SELECT * FROM customersLEFT JOIN orders ON customers.id = orders.customer_id;
SELECT * FROM ordersRIGHT JOIN customers ON customers.id = orders.customer_id;
+----+-------+--------+--------------------+------+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+----+-------+--------+--------------------+------+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 || 3 | Dan | Lee | [email protected] | NULL | NULL | NULL | NULL || 4 | Grey | George | [email protected] | NULL | NULL | NULL | NULL |+----+-------+--------+--------------------+------+------------+--------+-------------++------+------------+--------+-------------+----+-------+--------+--------------------+| id | order_date | price | customer_id | id | fname | lname | email |+------+------------+--------+-------------+----+-------+--------+--------------------+| 1 | 2018-03-16 | 106.11 | 1 | 1 | Smith | Kent | [email protected] || 2 | 2019-01-11 | 32.76 | 1 | 1 | Smith | Kent | [email protected] || 3 | 2014-07-14 | 423.03 | 2 | 2 | Sam | Park | [email protected] || 4 | 2015-02-16 | 13.16 | 2 | 2 | Sam | Park | [email protected] || 5 | 2019-05-27 | 207.88 | 5 | 5 | Joy | Grey | [email protected] || NULL | NULL | NULL | NULL | 3 | Dan | Lee | [email protected] || NULL | NULL | NULL | NULL | 4 | Grey | George | [email protected] |+------+------------+--------+-------------+----+-------+--------+--------------------+
Left and Right Join achieve the exact same goal. It shows in a different order, but it's trivial. If you specify the column orders, it will result in the same.
Some RDMS don't even support RIGHT JOIN
SELECT fname, lname, price FROM customersLEFT JOIN ordersON customers.id = orders.customer_id;
SELECT fname, lname, price FROM ordersRIGHT JOIN customersON customers.id = orders.customer_id;
+-------+--------+--------+| fname | lname | price |+-------+--------+--------+| Smith | Kent | 106.11 || Smith | Kent | 32.76 || Sam | Park | 423.03 || Sam | Park | 13.16 || Joy | Grey | 207.88 || Dan | Lee | NULL || Grey | George | NULL |+-------+--------+--------+
Tweak the data to distinguish with Inner Join
Let's modify data in left table, a.k.a. table A or orders table because then JOIN on customers will have NULL in it.
-- DROP the FOREIGN KEYALTER TABLE ordersDROP FOREIGN KEY orders_ibfk_1; -- retrieved from SHOW CREATE TABLE orders;
-- INSERT data into ordersINSERT INTO orders(order_date, price, customer_id)VALUES ('2000-02-02', 3000, 99), (CURDATE(), 276, 45);
+----+------------+---------+-------------+| id | order_date | price | customer_id |+----+------------+---------+-------------+| 1 | 2018-03-16 | 106.11 | 1 || 2 | 2019-01-11 | 32.76 | 1 || 3 | 2014-07-14 | 423.03 | 2 || 4 | 2015-02-16 | 13.16 | 2 || 5 | 2019-05-27 | 207.88 | 5 || 8 | 2000-02-02 | 3000.00 | 99 || 9 | 2019-09-30 | 276.00 | 45 |+----+------------+---------+-------------+
SELECT * FROM customers JOIN orders ON customers.id = orders.customer_id;
+----+-------+-------+--------------------+----+------------+--------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+----+-------+-------+--------------------+----+------------+--------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |+----+-------+-------+--------------------+----+------------+--------+-------------+
There is no change in Left Join because the added rows in orders does not have matching data with customers
SELECT * FROM customersRIGHT JOIN orders ON customers.id = orders.customer_id;
+------+-------+-------+--------------------+----+------------+---------+-------------+| id | fname | lname | email | id | order_date | price | customer_id |+------+-------+-------+--------------------+----+------------+---------+-------------+| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 || 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 || 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 || 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 || 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 || NULL | NULL | NULL | NULL | 8 | 2000-02-02 | 3000.00 | 99 || NULL | NULL | NULL | NULL | 9 | 2019-09-30 | 276.00 | 45 |+------+-------+-------+--------------------+----+------------+---------+-------------+
However, RIGHT JOIN now has NULL values because there is no matching data for newly added date in customers
Adding conditions
SELECT IFNULL(fname, "MISSING") AS fname, IFNULL(lname, "MISSING") AS lname, SUM(price) AS total_spentFROM customersRIGHT JOIN orders ON customers.id = orders.customer_idGROUP BY customer_id;
+---------+---------+-------------+| fname | lname | total_spent |+---------+---------+-------------+| Smith | Kent | 138.87 || Sam | Park | 436.19 || Joy | Grey | 207.88 || MISSING | MISSING | 276.00 || MISSING | MISSING | 3000.00 |+---------+---------+-------------+
Challenge
Solution is available at the end of page
- Create a table with the following information
students: id, first name
notes: title, content, student_id(FOREIGN KEY)
After creating the table, insert the following data:
INSERT INTO students(fname) VALUES('David'), ('Foster'), ('Jason'), ('Sam'), ('Ham');
INSERT INTO notes(title, pages, student_id)VALUES ("algebra note", 16, 1), ("This is cs566 note", 555, 1), ("I like calculus", 7, 2), ("Literature course note from yesterday", 2, 2), ("notes on SQL", 75, 4);
- Print this
+--------+---------------------------------------+-------+| fname | title | pages |+--------+---------------------------------------+-------+| David | This is cs566 note | 555 || Sam | notes on SQL | 75 || David | algebra note | 16 || Foster | I like calculus | 7 || Foster | Literature course note from yesterday | 2 |+--------+---------------------------------------+-------+
- Print this
+--------+---------------------------------------+-------+| fname | title | pages |+--------+---------------------------------------+-------+| David | algebra note | 16 || David | This is cs566 note | 555 || Foster | I like calculus | 7 || Foster | Literature course note from yesterday | 2 || Jason | NULL | NULL || Sam | notes on SQL | 75 || Ham | NULL | NULL |+--------+---------------------------------------+-------+
- Print this
+--------+---------------------------------------+-------+| fname | title | pages |+--------+---------------------------------------+-------+| David | algebra note | 16 || David | This is cs566 note | 555 || Foster | I like calculus | 7 || Foster | Literature course note from yesterday | 2 || Jason | MISSING | 0 || Sam | notes on SQL | 75 || Ham | MISSING | 0 |+--------+---------------------------------------+-------+
- Print this
+--------+----------+| fname | avg |+--------+----------+| David | 285.5000 || Sam | 75.0000 || Foster | 4.5000 || Jason | 0.0000 || Ham | 0.0000 |+--------+----------+
- Print this. Long notes if AVG(pages) > 70
+--------+----------+--------------+| fname | avg | notes_status |+--------+----------+--------------+| David | 285.5000 | Long notes || Sam | 75.0000 | Long notes || Foster | 4.5000 | Short notes || Jason | 0.0000 | Short notes || Ham | 0.0000 | Short notes |+--------+----------+--------------+
Solution
- Creating table
CREATE TABLE students ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, fname VARCHAR(50));CREATE TABLE notes ( title VARCHAR(100), pages INT, student_id INT, FOREIGN KEY(student_id) REFERENCES students(id) ON DELETE CASCADE);
- INNER JOIN
SELECT fname, title, pagesFROM students JOIN notes ON students.id = notes.student_idORDER BY pages DESC;-- ORSELECT fname, title, pagesFROM students RIGHT JOIN notes ON students.id = notes.student_idORDER BY pages DESC;
- LEFT JOIN
SELECT fname, title, pagesFROM studentsLEFT JOIN notes ON students.id = notes.student_id;
- IFNULL
SELECT fname, IFNULL(title, 'MISSING') AS title, IFNULL(pages, 0) AS pagesFROM studentsLEFT JOIN notes ON students.id = notes.student_id;
- GROUP BY
SELECT fname, IFNULL(AVG(pages), 0) AS avgFROM studentsLEFT JOIN notes ON students.id = notes.student_idGROUP BY students.idORDER BY avg DESC;
- CASE
SELECT fname, IFNULL(AVG(pages), 0) AS avg, CASE WHEN AVG(pages) IS NULL THEN 'Short notes' -- Not necessary but a good practice WHEN AVG(pages) > 70 THEN "Long notes" ELSE "Short notes" END AS "notes_status"FROM studentsLEFT JOIN notes ON students.id = notes.student_idGROUP BY students.idORDER BY avg DESC;
Note that in CASE, you can't use refer to "avg".
Use IS NULL for condition with WHEN