[SQL]: Aggregate Functions
SQL
09/26/2019
COUNT
SELECT COUNT(*) FROM students;
- Output:
+----------+| COUNT(*) |+----------+| 15 |+----------+
SELECT COUNT(fname), COUNT( DISTINCT lname ) AS uniqueLnames, COUNT( DISTINCT lname, fname ) AS uniqueFullnamesFROM students;
- Output:
+--------------+--------------+-----------------+| COUNT(fname) | uniqueLnames | uniqueFullnames |+--------------+--------------+-----------------+| 15 | 12 | 14 |+--------------+--------------+-----------------+
SELECT COUNT(lname) - COUNT( DISTINCT(lname)) AS duplicateLnamesFROM students;
- Output
+-----------------+| duplicateLnames |+-----------------+| 3 |+-----------------+
SELECT COUNT(*) FROM studentsWHERE major LIKE "%science%";
- Show how many majors contain "science"
- Output:
+----------+| COUNT(*) |+----------+| 3 |+----------+
GROUP BY
GROUP BY summarizes identical data into single rows. For example, consider an imaginary query and output below. GROUP BY by default can order the list in ascending order, depending on SQL mode.
SELECT major, lname FROM students_temp;
+---------------------+-------+| major | lname |+---------------------+-------+| Computer engineering| Smith || Computer science | Min || Economics | Min || Accounting | Smith || Communication Art | Kim |+---------------------+-------+
Result of GROUP BY lname
SELECT major, lname FROM students_tempGROUP BY lname;
+---------------------+-------+| major | lname |+---------------------+-------+| Computer engineering| Smith || Computer science | Min || Communication Art | Kim |+---------------------+-------+
What happens behind the scene
Though it isn't shown in above output, SQL groups by lname behind the scene
| Computer engineering| Smith || Accounting | Smith |
| Computer science | Min || Economics | Min |
| Communication Art | Kim |
Since the data is grouped we can do things like how many majors each person has
SELECT lname, COUNT(*) FROM students_tempGROUP BY lname;
+-------+----------+| lname | COUNT(*) |+-------+----------+| Smith | 2 || Min | 2 || Kim | 1 |+-------+----------+
Group BY fullname
SELECT fname, lname, COUNT(*) FROM studentsGROUP BY fname, lname;
+----------+----------+----------+| fname | lname | COUNT(*) |+----------+----------+----------+| Angelina | Randall | 1 || Brad | Johnson | 1 || Brent | Lang | 1 || Dan | Joyce | 1 || Dave | Smith | 1 || Diego | Miller | 1 || Ellis | Min | 2 || Jerimiah | Tucker | 1 || John | Johnson | 1 || Lamont | Marks | 1 || Lena | Shah | 1 || Mary | Fletcher | 1 || Sam | Smith | 1 || Tenzin | Pema | 1 |+----------+----------+----------+
GROUP BY with CONCAT
SELECT CONCAT(major, ' has ', COUNT(*), ' student(s)')FROM studentsGROUP BY major;
+-------------------------------------------------+| CONCAT(major, ' has ', COUNT(*), ' student(s)') |+-------------------------------------------------+| Accounting has 1 student(s) || Art has 1 student(s) || Biochemistry has 1 student(s) || Biology has 2 student(s) || Business has 1 student(s) || Chemistry has 1 student(s) || Computer Science has 2 student(s) || Economics has 1 student(s) || Food Science has 1 student(s) || Japanese 10% has 1 student(s) || Linguistics has 1 student(s) || no_major has 1 student(s) || Physics has 1 student(s) |+-------------------------------------------------+
MIN / MAX
SELECT MIN(credit)FROM students;
+-------------+| MIN(credit) |+-------------+| 6 |+-------------+
Shows the lowest credit student has
What if you want to show the name with MIN credit?
- Method 1 (WHERE)
SELECT CONCAT(fname, ' ', lname) AS name, MIN(credit)FROM studentsWHERE credit = (SELECT MIN(credit) FROM students);
- Method 2 (ORDER BY)
SELECT CONCAT(fname, ' ', lname) AS name, creditFROM studentsORDER BY credit LIMIT 1;
- WRONG
SELECT fname, MIN(credit)FROM students;
- 1st solution is inefficient because it's running two queires
- Method 2 is preferred
- Correct output:
+--------------+--------+| name | credit |+--------------+--------+| Diego Miller | 6 |+--------------+--------+
- Incorrect output (3.)
+-------+-------------+| fname | MIN(credit) |+-------+-------------+| Ellis | 6 |+-------+-------------+
- This creates a wrong output because MIN(credit) executes separate from fname
MIN/MAX & GROUP BY
Ex) Show the student's full name along with their lower credit amount
SELECT CONCAT(fname,' ', lname) as fullname, MIN(credit)FROM studentsGROUP BY fname, lname;
+------------------+-------------+| fullname | MIN(credit) |+------------------+-------------+| ... | ... || Ellis Min | 13 || ... | ... |+------------------+-------------+
We have two Ellis Min in our data. The above query selects the minimum credit 13, instead of 5555. The other students aren't duplicate so they don't matter much for this problem
In other words, Ellis Min behind the scene is grouped by like below
| Ellis Min | 5555 || Ellis Min | 13 |
Then MIN(credit) will pick the lowest amount between two
SUM
SELECT SUM(credit)FROM students;
+-------------+| SUM(credit) |+-------------+| 8427 |+-------------+
Adds all credits of all students
SUM & GROUP BY
Ex) Sum all credits each student have
SELECT CONCAT(fname, ' ', lname) as name, SUM(credit)FROM studentsGROUP BY fname, lname;
+------------------+-------------+| name | SUM(credit) |+------------------+-------------+| ... | ... || Ellis Min | 5568 || ... | ... |+------------------+-------------+
Ellis Min now has 5568 credits (5555 + 13)
AVG
SELECT AVG(age)FROM students;
+----------+| avg(age) |+----------+| 21.6000 |+----------+
Find the average of students' age
ROUND
Rounds a number to a specified number of decimal places
SELECT ROUND(AVG(age))FROM students;
+-----------------+| ROUND(AVG(age)) |+-----------------+| 22 |+-----------------+
No rounding.
SELECT ROUND(AVG(age), 2)FROM students;
Round to 2 decimal place.
SELECT ROUND(AVG(age), -1)FROM students;
+---------------------+| ROUND(AVG(age), -1) |+---------------------+| 20 |+---------------------+
Round to -1 decimal places.
AVG & GROUP BY
Try find out average of age by the same major & round it up to 1 decimal
SELECT major, ROUND(AVG(age), 1) AS "avg age by major"FROM studentsGROUP BY major;
+------------------+------------------+| major | avg age by major |+------------------+------------------+| Accounting | 22.0 || Art | 26.0 || Biochemistry | 18.0 || Biology | 22.0 || Business | 20.0 || Chemistry | 23.0 || Computer Science | 21.5 || Economics | 18.0 || Food Science | 31.0 || Japanese 10% | 16.0 || Linguistics | 19.0 || no_major | 27.0 || Physics | 17.0 |+------------------+------------------+
Duplicate majors are grouped together and shows the relevant average age