[SQL]: Joins (One to Many)

SQL

09/29/2019


Join is useful when working with multiple tables.

Real world data is complicated and interrelated.

Relationship Basics

  1. One to One Relationship
  2. One to Many Relationship -- common.
    Example: students <--> notes. Students have many notes on different classes but those notes belong to one student
  3. Many to Many Relationship -- also common.
    Example: students <--> classes. Students take many classes, and within those classes there are many students.

One to Many Relationship (1:Many)

Example: Customers & Orders. Customers can have many orders, but those orders belong to each customer

Bad Example

SQL
CREATE TABLE customersAndOrders (
customer_fname VARCHAR(30),
customer_lname VARCHAR(30),
email VARCHAR(60),
purchase_date DATE,
price DECIMAL(7,2)
);
TEXT
+----------------+----------------+-------------------+---------------+--------+
| customer_fname | customer_lname | email | purchase_date | price |
+----------------+----------------+-------------------+---------------+--------+
| Smith | Kent | [email protected] | 2018-03-16 | 106.11 |
| Smith | Kent | [email protected] | 2019-01-11 | 32.76 |
| Sam | Park | [email protected] | 2014-07-14 | 423.03 |
| Sam | Park | [email protected] | 2015-02-16 | 13.16 |
| Dan | Lee | [email protected] | NULL | NULL |
| Grey | George | [email protected] | NULL | NULL |
+----------------+----------------+-------------------+---------------+--------+

This does work, but it's a bad design--it has a multiple duplicate information.

FOREIGN KEY

Let's keep our data separate using FOREIGN KEY
SQL file to create and insert into customers and orders tables is available from

customers.orders_tables.sql
customers.orders_tables.txt Documentation on FOREIGN KEY
SQL
CREATE TABLE customers (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
fname VARCHAR(30),
lname VARCHAR(30),
email VARCHAR(60)
);
-- MySQL version
CREATE TABLE orders (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
order_date DATE,
price DECIMAL(8,2),
customer_id INT,
FOREIGN KEY(customer_id) REFERENCES customers(id)
);
-- FOREIGN KEY assignment in SQL Server / Oracle / MS Access (different RDMS)
--customer_id INT FOREIGN KEY REFERENCES customers(id)
TEXT
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| fname | varchar(30) | YES | | NULL | |
| lname | varchar(30) | YES | | NULL | |
| email | varchar(60) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| order_date | date | YES | | NULL | |
| price | decimal(8,2) | YES | | NULL | |
| customer_id | int(11) | YES | MUL | NULL | |
+-------------+--------------+------+-----+---------+----------------+

Note that orders have a FOREIGN KEY customerid referencing **_id in customers**. FOREIN KEY is used to reference an external table.

customer_id is a naming convention to denote that it's referencing id in customer table

If the foreign key doesn't exist in customers, it will throw an error because there must be a customer that has the order

SQL
INSERT INTO orders(order_date, price, customer_id)
VALUES (DATE(NOW()), 30, 99);
TEXT
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint
fails (`online`.`orders`, CONSTRAINT `orders_ibfk_1` FOREIGN KEY (`customer_i
d`) REFERENCES `customers` (`id`))
SQL
INSERT INTO customers(fname, lname, email)
VALUES ('Smith', 'Kent', '[email protected]');
INSERT INTO orders(order_date, price, customer_id)
VALUES ('2018-03-16', 106.11, 1);
TEXT
+----+-------+-------+--------------+
| id | fname | lname | email |
+----+-------+-------+--------------+
| 1 | Smith | Kent | [email protected] |
+----+------------+--------+-------------+
| id | order_date | price | customer_id |
+----+------------+--------+-------------+
| 1 | 2018-03-16 | 106.11 | 1 |
+----+------------+--------+-------------+

You can't drop the table that has data that is being referenced. You can remove orders table first then customers table or both at once.

SQL
DROP TABLE orders, customers; -- orders should come first otherwise customers table won't be removed

Add more data

SQL
INSERT INTO customers(fname, lname, email)
VALUES ('Sam', 'Park', '[email protected]'),
('Dan', 'Lee', '[email protected]'),
('Grey', 'George', '[email protected]'),
('Joy', "Grey", '[email protected]');
INSERT INTO orders(order_date, price, customer_id)
VALUES ('2019-01-11', 32.76, 1),
('2014-07-14', 423.03, 2),
('2015-02-16', 13.16, 2)
('2019-05-27', 207.88, 5);
TEXT
+----+-------+--------+--------------------+
| id | fname | lname | email |
+----+-------+--------+--------------------+
| 1 | Smith | Kent | [email protected] |
| 2 | Sam | Park | [email protected] |
| 3 | Dan | Lee | [email protected] |
| 4 | Grey | George | [email protected] |
| 5 | Joy | Grey | [email protected] |
+----+-------+--------+--------------------+
+----+------------+--------+-------------+
| id | order_date | price | customer_id |
+----+------------+--------+-------------+
| 1 | 2018-03-16 | 106.11 | 1 |
| 2 | 2019-01-11 | 32.76 | 1 |
| 3 | 2014-07-14 | 423.03 | 2 |
| 4 | 2015-02-16 | 13.16 | 2 |
| 5 | 2019-05-27 | 207.88 | 5 |
+----+------------+--------+-------------+

Adding FOREIGN KEY

If you want to add a foreign key, you can add with the following:

SQL
ALTER TABLE orders
ADD FOREIGN KEY (customer_id) REFERENCES customers(id);
-- Adding FOREIGN KEY with specified constraint
ALTER TABLE orders
ADD CONSTRAINT FK_CustomerOrder
FOREIGN KEY (customer_id) REFERENCES customers(id);

DROP a FOREIGN KEY Constraint

SQL
ALTER TABLE orders
DROP FOREIGN KEY FK_CustomerOrder;

If you didn't specify the CONSTRAINT when creating a FOREIGN KEY, MySQL generates automatically internally. To access auto-generated name of CONSTRAINT, use the following

SQL
SHOW CREATE TABLE orders;
TEXT
+--------+--------------+
| Table | Create Table |
+--------+--------------+
| orders | CREATE TABLE `orders` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`order_date` date DEFAULT NULL,
`price` decimal(8,2) DEFAULT NULL,
`customer_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `customer_id` (`customer_id`),
CONSTRAINT `orders_ibfk_1` FOREIGN KEY (`customer_id`) REFERENCES `customers` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1 |

Now you can remove foreign constraint key with

SQL
ALTER TABLE orders drop FOREIGN KEY orders_ibfk_1;

ON DELETE CASCADE

When customers2 is deleted which has a corresponding data in orders2, delete orders2 as well

SQL
CREATE TABLE customers2 (
id INT PRIMARY KEY AUTO_INCREMENT
);
CREATE TABLE orders2 (
customer_id INT,
FOREIGN KEY(customer_id) REFERENCES customers2(id)
ON DELETE CASCADE
);
INSERT INTO customers2 VALUES (1);
INSERT INTO orders2 VALUES(1);
DELETE FROM customers2 WHERE id=1;
SELECT * FROM orders2;
TEXT
$ SELECT * FROM orders2;
Empty set (0.00 sec)

Deletion on customers2 also removed data in orders2

Combining two tables

What if you want information combined together from both tables?
For example,

SQL
SELECT id FROM customers WHERE lname="Park";
SELECT * FROM orders WHERE customer_id=2;
TEXT
+----+
| id |
+----+
| 2 |
+----+
+----+------------+--------+-------------+
| id | order_date | price | customer_id |
+----+------------+--------+-------------+
| 3 | 2014-07-14 | 423.03 | 2 |
| 4 | 2015-02-16 | 13.16 | 2 |
+----+------------+--------+-------------+

Using Sub-query

SQL
SELECT * FROM orders WHERE customer_id =
(
SELECT id FROM customers
WHERE lname = "Park"
);
TEXT
+----+------------+--------+-------------+
| id | order_date | price | customer_id |
+----+------------+--------+-------------+
| 3 | 2014-07-14 | 423.03 | 2 |
| 4 | 2015-02-16 | 13.16 | 2 |
+----+------------+--------+-------------+

The sub-query..

SQL
SELECT id FROM customers
WHERE lname = "Park"

is equal to 2 in this case. It yields the same results as previous query. Also note that the query just has one semi-colon at the end.

This isn't very ideal nor useful.

Cross Join

Use JOIN to conjoin two tables--stick them together. This however results in a messy table.

SQL
SELECT * FROM customers, orders;
TEXT
+----+-------+--------+--------------------+----+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+----+-------+--------+--------------------+----+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 2 | Sam | Park | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 3 | Dan | Lee | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 4 | Grey | George | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 5 | Joy | Grey | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 3 | Dan | Lee | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 4 | Grey | George | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 5 | Joy | Grey | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 1 | Smith | Kent | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 3 | Dan | Lee | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 4 | Grey | George | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 5 | Joy | Grey | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 1 | Smith | Kent | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 3 | Dan | Lee | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 4 | Grey | George | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 1 | Smith | Kent | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
| 2 | Sam | Park | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
| 3 | Dan | Lee | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
| 4 | Grey | George | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
+----+-------+--------+--------------------+----+------------+--------+-------------+

You can see a multiple duplicate information--takes every customer and conjoins every order like a multiplication. Yet, we can start to boil it down.

Inner Join (Implicit)

Inner Join selects records from table A and B where the join condition is met

SQL
SELECT * FROM customers, orders
WHERE customers.id = orders.customer_id;
TEXT
+----+-------+-------+--------------------+----+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+----+-------+-------+--------------------+----+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
+----+-------+-------+--------------------+----+------------+--------+-------------+

customers.id is in format of $tableName.$col to avoid ambiguity. i.e. Both orders and students have id columns.

You don't necessarily have to put the table name like orders.customer_id since customer_id only exists in orders table, but this is a convention to follow

Selecting certain columns

SQL
SELECT fname, lname, order_date, price
FROM customers, orders
WHERE customers.id = orders.customer_id;
TEXT
+-------+-------+------------+--------+
| fname | lname | order_date | price |
+-------+-------+------------+--------+
| Smith | Kent | 2018-03-16 | 106.11 |
| Smith | Kent | 2019-01-11 | 32.76 |
| Sam | Park | 2014-07-14 | 423.03 |
| Sam | Park | 2015-02-16 | 13.16 |
| Joy | Grey | 2019-05-27 | 207.88 |
+-------+-------+------------+--------+

Inner Join (Explicit)

More conventional way of joining--more cleaner and easily understood. Preferred way.

SQL
SELECT fname, lname, order_date, price
FROM customers
JOIN orders
ON customers.id = orders.customer_id;
TEXT
+-------+-------+------------+--------+
| fname | lname | order_date | price |
+-------+-------+------------+--------+
| Smith | Kent | 2018-03-16 | 106.11 |
| Smith | Kent | 2019-01-11 | 32.76 |
| Sam | Park | 2014-07-14 | 423.03 |
| Sam | Park | 2015-02-16 | 13.16 |
| Joy | Grey | 2019-05-27 | 207.88 |
+-------+-------+------------+--------+

Creating the union table where customers.id = orders.customer_id

Yields the same result as above implicit inner join.

It's optional to explicitly write

SQL
INNER JOIN orders ON ..

Different order

SQL
SELECT * FROM orders
JOIN customers ON customers.id = orders.id;
TEXT
+----+------------+--------+-------------+----+-------+--------+--------------------+
| id | order_date | price | customer_id | id | fname | lname | email |
+----+------------+--------+-------------+----+-------+--------+--------------------+
| 1 | 2018-03-16 | 106.11 | 1 | 1 | Smith | Kent | [email protected] |
| 2 | 2019-01-11 | 32.76 | 1 | 2 | Sam | Park | [email protected] |
| 3 | 2014-07-14 | 423.03 | 2 | 3 | Dan | Lee | [email protected] |
| 4 | 2015-02-16 | 13.16 | 2 | 4 | Grey | George | [email protected] |
| 5 | 2019-05-27 | 207.88 | 5 | 5 | Joy | Grey | [email protected] |
+----+------------+--------+-------------+----+-------+--------+--------------------+

Shows the data in different order of columns unless you specify columns after SELECT

More example

SQL
SELECT fname, lname, order_date, price
FROM orders
JOIN customers ON orders.customer_id = customers.id
ORDER BY orders.order_date;
TEXT
+-------+-------+------------+--------+
| fname | lname | order_date | price |
+-------+-------+------------+--------+
| Sam | Park | 2014-07-14 | 423.03 |
| Sam | Park | 2015-02-16 | 13.16 |
| Smith | Kent | 2018-03-16 | 106.11 |
| Smith | Kent | 2019-01-11 | 32.76 |
| Joy | Grey | 2019-05-27 | 207.88 |
+-------+-------+------------+--------+

Display ordered by order date

SQL
SELECT fname, lname, SUM(price) AS total_spent
FROM customers JOIN orders ON
customers.id = orders.customer_id
GROUP BY customers.id
ORDER BY total_spent DESC;
TEXT
+-------+-------+-------------+
| fname | lname | total_spent |
+-------+-------+-------------+
| Sam | Park | 436.19 |
| Joy | Grey | 207.88 |
| Smith | Kent | 138.87 |
+-------+-------+-------------+

Shows total amount spent on order(s) in descending order

Arbitrary Join -- Avoid doing it

SQL
SELECT * FROM customers
JOIN orders ON customers.id = orders.id;
TEXT
+----+-------+--------+--------------------+----+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+----+-------+--------+--------------------+----+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 2 | Sam | Park | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 3 | Dan | Lee | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 4 | Grey | George | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
+----+-------+--------+--------------------+----+------------+--------+-------------+

Joining on arbitrary conditions is allowed, but this is meaningless.

LEFT Join

Selects all data from A along with many matching data in B.

SQL
SELECT * FROM customers
LEFT JOIN orders ON customers.id = orders.customer_id;
TEXT
+----+-------+--------+--------------------+------+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+----+-------+--------+--------------------+------+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
| 3 | Dan | Lee | [email protected] | NULL | NULL | NULL | NULL |
| 4 | Grey | George | [email protected] | NULL | NULL | NULL | NULL |
+----+-------+--------+--------------------+------+------------+--------+-------------+

Information of customers is displayed on left and orders on right. The information that does not match is marked as NULL.

SQL
SELECT fname, lname, order_date, price
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id;
TEXT
+-------+--------+------------+--------+
| fname | lname | order_date | price |
+-------+--------+------------+--------+
| Smith | Kent | 2018-03-16 | 106.11 |
| Smith | Kent | 2019-01-11 | 32.76 |
| Sam | Park | 2014-07-14 | 423.03 |
| Sam | Park | 2015-02-16 | 13.16 |
| Joy | Grey | 2019-05-27 | 207.88 |
| Dan | Lee | NULL | NULL |
| Grey | George | NULL | NULL |
+-------+--------+------------+--------+

LEFT Join can be useful in this case; for example if you want to tabulate information including who did not place any order.

SQL
SELECT fname, lname, SUM(price) AS total_spent
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id
GROUP BY customers.id
ORDER BY total_spent DESC;
TEXT
+-------+--------+-------------+
| fname | lname | total_spent |
+-------+--------+-------------+
| Sam | Park | 436.19 |
| Joy | Grey | 207.88 |
| Smith | Kent | 138.87 |
| Dan | Lee | NULL |
| Grey | George | NULL |
+-------+--------+-------------+

IFNULL

You can add conditional statement for NULL values

SQL
SELECT fname, lname,
IFNULL(SUM(price), 0) AS total_spent
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id
GROUP BY customers.id
ORDER BY total_spent DESC;
TEXT
+-------+--------+-------------+
| fname | lname | total_spent |
+-------+--------+-------------+
| Sam | Park | 436.19 |
| Joy | Grey | 207.88 |
| Smith | Kent | 138.87 |
| Dan | Lee | 0.00 |
| Grey | George | 0.00 |
+-------+--------+-------------+

INNER vs LEFT Join

Both join data from two tables, but..

  1. Inner Join selects all data from A and B where the join condition is met (exact overlap)
  2. Left Join selects all data from A along with matching records in B

Right Join

Similar to Left Join, Right Join selects all data from B along with matching record in A.

INNER vs LEFT vs RIGHT Join

SQL
SELECT * FROM customers
INNER JOIN orders
ON customers.id = orders.customer_id;
SELECT * FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id;
SELECT * FROM customers
RIGHT JOIN orders
ON customers.id = orders.customer_id;
TEXT
+----+-------+-------+--------------------+----+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+----+-------+-------+--------------------+----+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
+----+-------+-------+--------------------+----+------------+--------+-------------+
+----+-------+--------+--------------------+------+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+----+-------+--------+--------------------+------+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
| 3 | Dan | Lee | [email protected] | NULL | NULL | NULL | NULL |
| 4 | Grey | George | [email protected] | NULL | NULL | NULL | NULL |
+----+-------+--------+--------------------+------+------------+--------+-------------+
+------+-------+-------+--------------------+----+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+------+-------+-------+--------------------+----+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
+------+-------+-------+--------------------+----+------------+--------+-------------+

Results of Inner Join and Right Join are the same because there is no order that does not match with customers.

Left vs Right Join

SQL
SELECT * FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id;
SELECT * FROM orders
RIGHT JOIN customers
ON customers.id = orders.customer_id;
TEXT
+----+-------+--------+--------------------+------+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+----+-------+--------+--------------------+------+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
| 3 | Dan | Lee | [email protected] | NULL | NULL | NULL | NULL |
| 4 | Grey | George | [email protected] | NULL | NULL | NULL | NULL |
+----+-------+--------+--------------------+------+------------+--------+-------------+
+------+------------+--------+-------------+----+-------+--------+--------------------+
| id | order_date | price | customer_id | id | fname | lname | email |
+------+------------+--------+-------------+----+-------+--------+--------------------+
| 1 | 2018-03-16 | 106.11 | 1 | 1 | Smith | Kent | [email protected] |
| 2 | 2019-01-11 | 32.76 | 1 | 1 | Smith | Kent | [email protected] |
| 3 | 2014-07-14 | 423.03 | 2 | 2 | Sam | Park | [email protected] |
| 4 | 2015-02-16 | 13.16 | 2 | 2 | Sam | Park | [email protected] |
| 5 | 2019-05-27 | 207.88 | 5 | 5 | Joy | Grey | [email protected] |
| NULL | NULL | NULL | NULL | 3 | Dan | Lee | [email protected] |
| NULL | NULL | NULL | NULL | 4 | Grey | George | [email protected] |
+------+------------+--------+-------------+----+-------+--------+--------------------+

Left and Right Join achieve the exact same goal. It shows in a different order, but it's trivial. If you specify the column orders, it will result in the same.

Some RDMS don't even support RIGHT JOIN

SQL
SELECT fname, lname, price FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id;
SELECT fname, lname, price FROM orders
RIGHT JOIN customers
ON customers.id = orders.customer_id;
TEXT
+-------+--------+--------+
| fname | lname | price |
+-------+--------+--------+
| Smith | Kent | 106.11 |
| Smith | Kent | 32.76 |
| Sam | Park | 423.03 |
| Sam | Park | 13.16 |
| Joy | Grey | 207.88 |
| Dan | Lee | NULL |
| Grey | George | NULL |
+-------+--------+--------+

Tweak the data to distinguish with Inner Join

Let's modify data in left table, a.k.a. table A or orders table because then JOIN on customers will have NULL in it.

SQL
-- DROP the FOREIGN KEY
ALTER TABLE orders
DROP FOREIGN KEY orders_ibfk_1; -- retrieved from SHOW CREATE TABLE orders;
-- INSERT data into orders
INSERT INTO orders(order_date, price, customer_id)
VALUES ('2000-02-02', 3000, 99),
(CURDATE(), 276, 45);
TEXT
+----+------------+---------+-------------+
| id | order_date | price | customer_id |
+----+------------+---------+-------------+
| 1 | 2018-03-16 | 106.11 | 1 |
| 2 | 2019-01-11 | 32.76 | 1 |
| 3 | 2014-07-14 | 423.03 | 2 |
| 4 | 2015-02-16 | 13.16 | 2 |
| 5 | 2019-05-27 | 207.88 | 5 |
| 8 | 2000-02-02 | 3000.00 | 99 |
| 9 | 2019-09-30 | 276.00 | 45 |
+----+------------+---------+-------------+
SQL
SELECT * FROM customers JOIN orders
ON customers.id = orders.customer_id;
TEXT
+----+-------+-------+--------------------+----+------------+--------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+----+-------+-------+--------------------+----+------------+--------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
+----+-------+-------+--------------------+----+------------+--------+-------------+

There is no change in Left Join because the added rows in orders does not have matching data with customers

SQL
SELECT * FROM customers
RIGHT JOIN orders
ON customers.id = orders.customer_id;
TEXT
+------+-------+-------+--------------------+----+------------+---------+-------------+
| id | fname | lname | email | id | order_date | price | customer_id |
+------+-------+-------+--------------------+----+------------+---------+-------------+
| 1 | Smith | Kent | [email protected] | 1 | 2018-03-16 | 106.11 | 1 |
| 1 | Smith | Kent | [email protected] | 2 | 2019-01-11 | 32.76 | 1 |
| 2 | Sam | Park | [email protected] | 3 | 2014-07-14 | 423.03 | 2 |
| 2 | Sam | Park | [email protected] | 4 | 2015-02-16 | 13.16 | 2 |
| 5 | Joy | Grey | [email protected] | 5 | 2019-05-27 | 207.88 | 5 |
| NULL | NULL | NULL | NULL | 8 | 2000-02-02 | 3000.00 | 99 |
| NULL | NULL | NULL | NULL | 9 | 2019-09-30 | 276.00 | 45 |
+------+-------+-------+--------------------+----+------------+---------+-------------+

However, RIGHT JOIN now has NULL values because there is no matching data for newly added date in customers

Adding conditions

SQL
SELECT IFNULL(fname, "MISSING") AS fname,
IFNULL(lname, "MISSING") AS lname,
SUM(price) AS total_spent
FROM customers
RIGHT JOIN orders ON customers.id = orders.customer_id
GROUP BY customer_id;
TEXT
+---------+---------+-------------+
| fname | lname | total_spent |
+---------+---------+-------------+
| Smith | Kent | 138.87 |
| Sam | Park | 436.19 |
| Joy | Grey | 207.88 |
| MISSING | MISSING | 276.00 |
| MISSING | MISSING | 3000.00 |
+---------+---------+-------------+

Challenge

Solution is available at the end of page

  1. Create a table with the following information
    students: id, first name
    notes: title, content, student_id(FOREIGN KEY)

After creating the table, insert the following data:

SQL
INSERT INTO students(fname) VALUES
('David'), ('Foster'), ('Jason'), ('Sam'), ('Ham');
INSERT INTO notes(title, pages, student_id)
VALUES
("algebra note", 16, 1),
("This is cs566 note", 555, 1),
("I like calculus", 7, 2),
("Literature course note from yesterday", 2, 2),
("notes on SQL", 75, 4);
  1. Print this
TEXT
+--------+---------------------------------------+-------+
| fname | title | pages |
+--------+---------------------------------------+-------+
| David | This is cs566 note | 555 |
| Sam | notes on SQL | 75 |
| David | algebra note | 16 |
| Foster | I like calculus | 7 |
| Foster | Literature course note from yesterday | 2 |
+--------+---------------------------------------+-------+
  1. Print this
TEXT
+--------+---------------------------------------+-------+
| fname | title | pages |
+--------+---------------------------------------+-------+
| David | algebra note | 16 |
| David | This is cs566 note | 555 |
| Foster | I like calculus | 7 |
| Foster | Literature course note from yesterday | 2 |
| Jason | NULL | NULL |
| Sam | notes on SQL | 75 |
| Ham | NULL | NULL |
+--------+---------------------------------------+-------+
  1. Print this
TEXT
+--------+---------------------------------------+-------+
| fname | title | pages |
+--------+---------------------------------------+-------+
| David | algebra note | 16 |
| David | This is cs566 note | 555 |
| Foster | I like calculus | 7 |
| Foster | Literature course note from yesterday | 2 |
| Jason | MISSING | 0 |
| Sam | notes on SQL | 75 |
| Ham | MISSING | 0 |
+--------+---------------------------------------+-------+
  1. Print this
TEXT
+--------+----------+
| fname | avg |
+--------+----------+
| David | 285.5000 |
| Sam | 75.0000 |
| Foster | 4.5000 |
| Jason | 0.0000 |
| Ham | 0.0000 |
+--------+----------+
  1. Print this. Long notes if AVG(pages) > 70
TEXT
+--------+----------+--------------+
| fname | avg | notes_status |
+--------+----------+--------------+
| David | 285.5000 | Long notes |
| Sam | 75.0000 | Long notes |
| Foster | 4.5000 | Short notes |
| Jason | 0.0000 | Short notes |
| Ham | 0.0000 | Short notes |
+--------+----------+--------------+












Solution

  1. Creating table
SQL
CREATE TABLE students (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
fname VARCHAR(50)
);
CREATE TABLE notes (
title VARCHAR(100),
pages INT,
student_id INT,
FOREIGN KEY(student_id) REFERENCES students(id)
ON DELETE CASCADE
);
  1. INNER JOIN
SQL
SELECT fname, title, pages
FROM students JOIN notes
ON students.id = notes.student_id
ORDER BY pages DESC;
-- OR
SELECT fname, title, pages
FROM students RIGHT JOIN notes
ON students.id = notes.student_id
ORDER BY pages DESC;
  1. LEFT JOIN
SQL
SELECT fname, title, pages
FROM students
LEFT JOIN notes
ON students.id = notes.student_id;
  1. IFNULL
SQL
SELECT fname,
IFNULL(title, 'MISSING') AS title,
IFNULL(pages, 0) AS pages
FROM students
LEFT JOIN notes
ON students.id = notes.student_id;
  1. GROUP BY
SQL
SELECT fname,
IFNULL(AVG(pages), 0) AS avg
FROM students
LEFT JOIN notes
ON students.id = notes.student_id
GROUP BY students.id
ORDER BY avg DESC;
  1. CASE
SQL
SELECT fname,
IFNULL(AVG(pages), 0) AS avg,
CASE
WHEN AVG(pages) IS NULL THEN 'Short notes' -- Not necessary but a good practice
WHEN AVG(pages) > 70 THEN "Long notes"
ELSE "Short notes"
END AS "notes_status"
FROM students
LEFT JOIN notes
ON students.id = notes.student_id
GROUP BY students.id
ORDER BY avg DESC;

Note that in CASE, you can't use refer to "avg".

Use IS NULL for condition with WHEN


WRITTEN BY

Keeping a record