Database: SELECT: Subqueries
The results of a SELECT command can be used in other commands,
typically a SELECT, INSERT, or CREATE VIEW statement.
Subqueries are also used in the UNION, EXCEPT, and INTERSECT operations.
The inner SELECT is called a subquery or subselect.
Subqueries produce: tables, columns, rows, or single values
Every SELECT produces a table - this is one of the things that makes SQL so powerful.
When using a SELECT inside another statement, be aware of the shape of the resulting table:
- table with potentially many rows and columns.
- column of potentially many values (often used with IN).
- row which can be used to insert in another table, or used in a SQL-92
multiple-value comparison (not implemented in all database systems).
- single value which can be used in a comparison, expression, etc.
All of these are possibilities, but only some of these are appropriate in some contexts.
Subqueries that produce single values
When a single value is produced by a subquery (typically from a summary function),
the result is commonly used in a WHERE clause in an arithmetic comparison.
Subqueries that produce a column
If a subquery produces a column of values, the IN
operator is often used to see
if a value is in that set of values.
The subquery can also be preceded by a comparison operator followed by
ANY
or ALL
, and the comparison is true only if
it succeeds for any/all of the subquery values.
Subqueries that produce rows
The EXISTS
test for a subquery returns true if there were any
rows in the subquery result. Can be preceded by the NOT
operator.
EXISTS
is only used with subqueries.
Restrictions on subqueries
ORDER BY
is not allowed because there is no use of subqueries that
makes use of order, and it would
just add an inefficiency.
Subqueries in WHERE clause
Subqueries in the WHERE clause of SELECT statement are common.
- They may reference a column in the enclosing table (a so-called outer reference).
- A column of values produced by a subquery can be used with IN.
Example - Better than average students
The following lists all students who have a better than average GPA.
The calculation of the average GPA is done with an inner SELECT statement
which produces a single value, the average. This result can be used
anywhere a single value can be used, in this case in a comparison in
the WHERE clause.
SELECT stu_gpa, stu_lname, stu_fname
FROM Students
WHERE stu_gpa > (SELECT AVG(stu_gpa) FROM Students);
ORDER BY stu_gpa DESC;
Example - Find all classmates of a student
Problem: Find all students who are studying with Mickey Mouse.
Let's first solve the problem of which classes Mickey Mouse is taking.
SELECT crs_id
FROM Students, Enrollment, Sections
WHERE Students.stu_id = Enrollment.stu_id
AND Enrollment.sec_id = Sections.sec_id
AND stu_fname="Mickey" AND stu_lname="Mouse";
The above SELECT produces a single column of results, and
a single column or values which can be used with IN.
SELECT DISTINCT stu_id, stu_lname, stu_fname
FROM Students, Enrollment, Sections
WHERE Students.stu_id = Enrollment.stu_id
AND Enrollment.sec_id = Sections.sec_id
AND crs_id IN (SELECT crs_id
FROM Students, Enrollment, Sections
WHERE Students.stu_id = Enrollment.stu_id
AND Enrollment.sec_id = Sections.sec_id
AND stu_fname="Mickey" AND stu_lname="Mouse");
DISTINCT
is used so that students who are in
more than one class with Mickey will only appear once.
The only glitch in this result is that Mickey Mouse will show up
as a classmate of himself. This can be fixed by adding another
condition to the outer WHERE clause that eliminated him from the
result.