[SQL]: Joins (Many to Many)
SQL
09/30/2019
Many to Many Relationship
Many to Many relationship happens when a multiple records in a table are associated with a multiple records in another table.
Examples: students <-> classes, books <-> authors, blog posts <-> tags
We need an extra table(join table) that contains foreign key from both tables. For example, enrollments table below:
students <-> enrollments <-> classes
- students(id, fname, lname)
- enrollments(id, enrollment_dt, student_id, class_id)
- classes(id, title, description, class_num)
CREATE TABLE students ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, fname VARCHAR(50), lname VARCHAR(50));CREATE TABLE classes ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, title VARCHAR(50), description VARCHAR(200), class_num INT);-- need to be created at lastCREATE TABLE enrollments ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, enrollment_date DATE, student_id INT, class_id INT, FOREIGN KEY (student_id) REFERENCES students(id), FOREIGN KEY (class_id) REFERENCES classes(id));
Prepare Data (INSERT)
Access SQL/txt file to create table and insert data
enrollments.txt
Random data for **_enrollments_** was created with SqlGenerator.java
Inserted data
students:+----+--------+---------+| id | fname | lname |+----+--------+---------+| 1 | Henry | Lee || 2 | Ann | Ortiz || 3 | Andre | Gross || 4 | Tracy | Ortiz || 5 | Victor | Blair || 6 | Martha | Stevens || 7 | Pat | Bush |+----+--------+---------+
classes:+----+--------------------------------+----------------------------------------------------------------+-----------+| id | title | description | class_num |+----+--------------------------------+----------------------------------------------------------------+-----------+| 1 | Botany | Learn about plants | 101 || 2 | Intro to Computer Science | Find out how to code | 302 || 3 | Biology | We teach you the study of life | 105 || 4 | Accounting | How to document financial transactions | 307 || 5 | Intro to Music | Learn about variety types of music | 113 || 6 | Pharmacology | Study of drug | 553 || 7 | Nursing | This course provides provision of professional nursing care | 321 || 8 | European Literature | Read modern wirtten works from Europe | 223 || 9 | Drawing | Course in basic drawing skills | 206 || 10 | Photography | Fundamentals in technique with emphasis on individual projects | 601 || 11 | Languages and Cultures of Asia | Comparative perspective on Asian languages | 333 || 12 | Critical Reading and Writng | Introduction to literary analysis | 401 || 13 | Intro to Buddihism | General survey of Buddihist worldview and lifestyle | 109 || 14 | Western Civilization | Foundation of Western civilization | 3 |+----+--------------------------------+----------------------------------------------------------------+-----------+
enrollments:+----+-----------------+------------+----------+| 1 | 2017-09-26 | 3 | 14 || 2 | 2016-09-02 | 6 | 4 || 3 | 2015-02-07 | 3 | 7 || 4 | 2016-05-08 | 7 | 3 || 5 | 2013-12-28 | 6 | 11 || 6 | 2018-10-16 | 1 | 2 || 7 | 2015-02-21 | 6 | 2 || 8 | 2014-07-19 | 4 | 3 || 9 | 2016-11-27 | 7 | 1 || 10 | 2017-09-09 | 6 | 6 || 11 | 2016-04-27 | 3 | 6 || 12 | 2015-10-05 | 7 | 12 || 13 | 2017-02-27 | 6 | 8 || 14 | 2012-06-19 | 4 | 7 || 15 | 2014-09-05 | 4 | 4 || 16 | 2017-12-18 | 2 | 13 || 17 | 2013-08-13 | 2 | 3 || 18 | 2019-01-28 | 4 | 14 || 19 | 2015-08-08 | 5 | 5 || 20 | 2012-01-14 | 6 | 6 || 21 | 2016-05-12 | 2 | 9 || 22 | 2017-03-27 | 1 | 6 || 23 | 2013-12-17 | 4 | 13 || 24 | 2016-10-06 | 1 | 2 || 25 | 2016-12-05 | 3 | 14 || 26 | 2014-09-17 | 2 | 2 || 27 | 2013-10-16 | 5 | 12 || 28 | 2018-12-23 | 5 | 8 || 29 | 2015-02-14 | 1 | 6 || 30 | 2012-06-14 | 1 | 13 || 31 | 2016-11-22 | 4 | 2 || 32 | 2013-02-14 | 5 | 11 || 33 | 2012-06-22 | 2 | 4 || 34 | 2019-01-20 | 6 | 12 || 35 | 2019-07-09 | 4 | 6 || 36 | 2016-04-25 | 1 | 6 || 37 | 2018-02-10 | 4 | 6 || 38 | 2018-09-13 | 5 | 4 || 39 | 2012-08-26 | 3 | 11 || 40 | 2015-10-26 | 6 | 5 || 41 | 2012-04-16 | 4 | 12 || 42 | 2018-01-12 | 5 | 11 || 43 | 2012-05-05 | 5 | 1 |+----+-----------------+------------+----------+
In enrollments table, there is class with class_id=10 that doesn't assign any student (Photography).
Join classes with enrollments
INNER JOIN
SELECT title, enrollment_date FROM classesJOIN enrollments ON classes.id = enrollments.class_id;
+--------------------------------+-----------------+| Botany | 2016-11-27 || Botany | 2012-05-05 || Intro to Computer Science | 2018-10-16 || Intro to Computer Science | 2015-02-21 || Intro to Computer Science | 2016-10-06 || Intro to Computer Science | 2014-09-17 || Intro to Computer Science | 2016-11-22 || Biology | 2016-05-08 || Biology | 2014-07-19 || Biology | 2013-08-13 || Accounting | 2016-09-02 || Accounting | 2014-09-05 || Accounting | 2012-06-22 || Accounting | 2018-09-13 || Intro to Music | 2015-08-08 || Intro to Music | 2015-10-26 || Pharmacology | 2017-09-09 || Pharmacology | 2016-04-27 || Pharmacology | 2012-01-14 || Pharmacology | 2017-03-27 || Pharmacology | 2015-02-14 || Pharmacology | 2019-07-09 || Pharmacology | 2016-04-25 || Pharmacology | 2018-02-10 || Nursing | 2015-02-07 || Nursing | 2012-06-19 || European Literature | 2017-02-27 || European Literature | 2018-12-23 || Drawing | 2016-05-12 || Languages and Cultures of Asia | 2013-12-28 || Languages and Cultures of Asia | 2013-02-14 || Languages and Cultures of Asia | 2012-08-26 || Languages and Cultures of Asia | 2018-01-12 || Critical Reading and Writing | 2015-10-05 || Critical Reading and Writing | 2013-10-16 || Critical Reading and Writing | 2019-01-20 || Critical Reading and Writing | 2012-04-16 || Intro to Buddhism | 2017-12-18 || Intro to Buddhism | 2013-12-17 || Intro to Buddhism | 2012-06-14 || Western Civilization | 2017-09-26 || Western Civilization | 2019-01-28 || Western Civilization | 2016-12-05 |+--------------------------------+-----------------+
Displays all classes with enrollment_date
GROUP BY & Inner Join
SELECT title, COUNT(enrollment_date) AS students_assigned FROM classesJOIN enrollments ON classes.id = enrollments.class_idGROUP BY classes.idORDER BY students_assigned;
+--------------------------------+-------------------+| title | students_assigned |+--------------------------------+-------------------+| Drawing | 1 || Nursing | 2 || European Literature | 2 || Botany | 2 || Intro to Music | 2 || Intro to Buddhism | 3 || Western Civilization | 3 || Biology | 3 || Critical Reading and Writing | 4 || Accounting | 4 || Languages and Cultures of Asia | 4 || Intro to Computer Science | 5 || Pharmacology | 8 |+--------------------------------+-------------------+13 rows in set (0.00 sec)
Shows classes with # students assigned in class
GROUP BY & Left Join
Use Left Join to include class that doesn't have any student assigned.
SELECT title, COUNT(enrollment_date) AS students_assigned FROM classesLEFT JOIN enrollments ON classes.id = enrollments.class_idGROUP BY classes.idORDER BY students_assigned;
+--------------------------------+-------------------+| title | students_assigned |+--------------------------------+-------------------+| Photography | 0 || Drawing | 1 || Nursing | 2 || European Literature | 2 || Botany | 2 || Intro to Music | 2 || Western Civilization | 3 || Intro to Buddhism | 3 || Biology | 3 || Languages and Cultures of Asia | 4 || Accounting | 4 || Critical Reading and Writing | 4 || Intro to Computer Science | 5 || Pharmacology | 8 |+--------------------------------+-------------------+
- We now have Photography class that doesn't have any student.
- Try not to use GROUP BY title. Using GROUP BY classes.id is a better practice.
- For Photography, COUNT(enrollment_date) returns 0 not NULL
- In contrast, MIN, MAX, AVG will return NULL
WHERE & HAVING
Show class(es) without any student assigned
With WHERE clause
SELECT title, enrollment_dateFROM classes LEFT JOIN enrollments ON classes.id = enrollments.class_idWHERE enrollment_date IS NULL;
HAVING
Show class(es) that has 2 students
SELECT title, COUNT(enrollment_date)FROM classes LEFT JOIN enrollments ON classes.id = enrollments.class_idGROUP BY classes.idHAVING COUNT(enrollment_date) = 2;
+-------------+| title |+-------------+| Photography |+-------------+
An aggregate may not appear in the WHERE unless it is in a sub-query contained in HAVING. HAVING is useful in combination with GROUP BY.
LEFT JOIN is needed because we need all information from classes table which doesn't have a matching information with enrollments.
Join students with enrollment
INNER JOIN
SELECT fname, lname,COUNT(enrollment_date) AS "numClasses"FROM students JOIN enrollments ONstudents.id = enrollments.student_idGROUP BY students.id;
+--------+---------+------------+| fname | lname | numClasses |+--------+---------+------------+| Henry | Lee | 6 || Ann | Ortiz | 5 || Andre | Gross | 5 || Tracy | Ortiz | 9 || Victor | Blair | 7 || Martha | Stevens | 8 || Pat | Bush | 3 |+--------+---------+------------+
Same result for LEFT JOIN and RIGHT JOIN.
JOIN ALL 3 tables
SELECT title, class_num, CONCAT(fname, ' ', lname) AS studentFROM students
JOIN enrollments ON students.id = enrollments.student_idJOIN classes ON classes.id = enrollments.class_id
ORDER BY title;
+--------------------------------+-----------+----------------+| title | class_num | student |+--------------------------------+-----------+----------------+| Accounting | 307 | Martha Stevens || Accounting | 307 | Victor Blair || Accounting | 307 | Ann Ortiz || Accounting | 307 | Tracy Ortiz || Biology | 105 | Tracy Ortiz || Biology | 105 | Pat Bush || Biology | 105 | Ann Ortiz || Botany | 101 | Pat Bush || Botany | 101 | Victor Blair || Critical Reading and Writing | 401 | Martha Stevens || Critical Reading and Writing | 401 | Tracy Ortiz || Critical Reading and Writing | 401 | Victor Blair || Critical Reading and Writing | 401 | Pat Bush || Drawing | 206 | Ann Ortiz || European Literature | 223 | Martha Stevens || European Literature | 223 | Victor Blair || Intro to Buddhism | 109 | Henry Lee || Intro to Buddhism | 109 | Tracy Ortiz || Intro to Buddhism | 109 | Ann Ortiz || Intro to Computer Science | 302 | Ann Ortiz || Intro to Computer Science | 302 | Henry Lee || Intro to Computer Science | 302 | Tracy Ortiz || Intro to Computer Science | 302 | Henry Lee || Intro to Computer Science | 302 | Martha Stevens || Intro to Music | 113 | Victor Blair || Intro to Music | 113 | Martha Stevens || Languages and Cultures of Asia | 333 | Victor Blair || Languages and Cultures of Asia | 333 | Martha Stevens || Languages and Cultures of Asia | 333 | Victor Blair || Languages and Cultures of Asia | 333 | Andre Gross || Nursing | 321 | Andre Gross || Nursing | 321 | Tracy Ortiz || Pharmacology | 553 | Henry Lee || Pharmacology | 553 | Tracy Ortiz || Pharmacology | 553 | Henry Lee || Pharmacology | 553 | Martha Stevens || Pharmacology | 553 | Henry Lee || Pharmacology | 553 | Andre Gross || Pharmacology | 553 | Martha Stevens || Pharmacology | 553 | Tracy Ortiz || Western Civilization | 3 | Andre Gross || Western Civilization | 3 | Andre Gross || Western Civilization | 3 | Tracy Ortiz |+--------------------------------+-----------+----------------+43 rows in set (0.00 sec)
Display classes along with students.
Remove duplicates
Note there's a data error where one student is enrolled in class a multiple times. To remove duplicate students, GROUP BY title and students.id together.
SELECT title, CONCAT(fname, ' ', lname) AS studentFROM students
JOIN enrollments ON students.id = enrollments.student_idJOIN classes ON classes.id = enrollments.class_idGROUP BY title, students.id;
+--------------------------------+----------------+| title | student |+--------------------------------+----------------+| Accounting | Ann Ortiz || Accounting | Tracy Ortiz || Accounting | Victor Blair || Accounting | Martha Stevens || Biology | Ann Ortiz || Biology | Tracy Ortiz || Biology | Pat Bush || Botany | Victor Blair || Botany | Pat Bush || Critical Reading and Writing | Tracy Ortiz || Critical Reading and Writing | Victor Blair || Critical Reading and Writing | Martha Stevens || Critical Reading and Writing | Pat Bush || Drawing | Ann Ortiz || European Literature | Victor Blair || European Literature | Martha Stevens || Intro to Buddhism | Henry Lee || Intro to Buddhism | Ann Ortiz || Intro to Buddhism | Tracy Ortiz || Intro to Computer Science | Henry Lee || Intro to Computer Science | Ann Ortiz || Intro to Computer Science | Tracy Ortiz || Intro to Computer Science | Martha Stevens || Intro to Music | Victor Blair || Intro to Music | Martha Stevens || Languages and Cultures of Asia | Andre Gross || Languages and Cultures of Asia | Victor Blair || Languages and Cultures of Asia | Martha Stevens || Nursing | Andre Gross || Nursing | Tracy Ortiz || Pharmacology | Henry Lee || Pharmacology | Andre Gross || Pharmacology | Tracy Ortiz || Pharmacology | Martha Stevens || Western Civilization | Andre Gross || Western Civilization | Tracy Ortiz |+--------------------------------+----------------+36 rows in set (0.00 sec)
Show error data
Show the error data. i.e. student in class a multiple times
SELECT title, CONCAT(fname, ' ', lname) AS student,COUNT(students.id) AS duplicate_cntFROM studentsJOIN enrollments ON students.id = enrollments.student_idJOIN classes ON classes.id = enrollments.class_idGROUP BY title, students.idHAVING COUNT(students.id) > 1;
+--------------------------------+----------------+---------------+| title | student | duplicate_cnt |+--------------------------------+----------------+---------------+| Intro to Computer Science | Henry Lee | 2 || Languages and Cultures of Asia | Victor Blair | 2 || Pharmacology | Henry Lee | 3 || Pharmacology | Tracy Ortiz | 2 || Pharmacology | Martha Stevens | 2 || Western Civilization | Andre Gross | 2 |+--------------------------------+----------------+---------------+
Henry Lee shows up in Intro to Computer Science twice, VIctor Blair in Languages and Cultures of Asia and so on.
UNIQUE KEY(id, id2)
Setting tuple of student_id and class_id as UNIQUE (or PRIMARY) KEY can prevent above data error.
CREATE TABLE students2 ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, fname VARCHAR(50), lname VARCHAR(50));CREATE TABLE classes2 ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, title VARCHAR(50), description VARCHAR(200), class_num INT);-- need to be created at lastCREATE TABLE enrollments2 ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, enrollment_date DATE, student_id INT, class_id INT, FOREIGN KEY (student_id) REFERENCES students(id), FOREIGN KEY (class_id) REFERENCES classes(id), UNIQUE KEY (student_id, class_id));
INSERT INTO students2(fname, lname)VALUES("hey", "man");
INSERT INTO classes2 (title, description, class_num)VALUES ("Botany", "study of plants", 207);
INSERT INTOenrollments2(enrollment_date, student_id, class_id)VALUES (CURDATE(), 1, 1);
-- Triggers errorINSERT INTOenrollments2(enrollment_date, student_id, class_id)VALUES (CURDATE(), 1, 1);
ERROR 1062 (23000): Duplicate entry '1-1' for key 'PRIMARY'
UNIQUE KEY(students_id, class_id)
This line of code prevents classes from having duplicate student or students to have a duplicate courses. You can also use PRIMARY KEY, but in a table only one primary key is allowed.
UNIQUE VS PRIMARY KEY
Primary Key
- There can be only one primary key in a table
- Some DBMS doesn't allow NULL
- Primary key is a unique identifier
Unique Key
- There can be more than one unique key
- Unique can have NULL values
- It can be candidate key
- Unique key can be NULL. i.e. multiple rows can have NULL values thus not necessarily "unique"