ADVANCED QUERYING INTERMEDIATE lesson-09

Subqueries & Nested Queries

🕑 30 min 📚 Lesson 9 of 15 ✅ 5 quiz questions
A subquery is a SELECT statement nested inside another SQL statement. They can appear in the WHERE clause, FROM clause, or even the SELECT list — letting you answer complex questions in a single query without temporary tables.

What Is a Subquery?

A subquery (also called an inner query or nested query) is a complete SELECT statement enclosed in parentheses and used inside another SQL statement. The outer query receives the result of the inner query and uses it as a value or a set of values.

Subqueries can appear in three positions:

  • WHERE clause — filter rows based on the subquery result
  • FROM clause — treat the subquery as a derived (temporary) table
  • SELECT list — compute a value for each row (correlated subquery)

Subquery in WHERE

The most common use — filter rows using a value or set produced by the inner query.

SQL
-- Students older than the average age
SELECT name, age FROM students
WHERE age > (SELECT AVG(age) FROM students);

-- Students enrolled in 'Python Basics'
SELECT name FROM students
WHERE id IN (
  SELECT student_id FROM enrollments
  WHERE course_id = (SELECT id FROM courses WHERE course_name = 'Python Basics')
);

-- Students NOT enrolled in any course
SELECT name FROM students
WHERE id NOT IN (SELECT DISTINCT student_id FROM enrollments);

The innermost subquery returns a single value (id); the middle subquery returns a list of student_id values; the outer query uses IN to filter against that list.

Subquery in FROM — Derived Table

When you put a subquery in the FROM clause it becomes a derived table — a temporary result set you can query just like a real table. You must give it an alias.

SQL
SELECT avg_data.grade, avg_data.avg_age
FROM (
  SELECT grade, AVG(age) AS avg_age
  FROM students
  GROUP BY grade
) AS avg_data
WHERE avg_data.avg_age > 16;

The inner query computes average ages per grade; the outer query filters only grades where the average exceeds 16. This is impossible without a subquery (or CTE) because WHERE runs before aggregation.

Correlated Subquery

A correlated subquery references a column from the outer query. It executes once per row of the outer query — making it powerful but potentially slower on large datasets.

SQL
-- For each student, count their enrollments
SELECT s.name,
  (SELECT COUNT(*) FROM enrollments e WHERE e.student_id = s.id) AS course_count
FROM students s;

Notice s.id — the inner subquery uses a column from the outer students alias s. For each student row, the database re-runs the inner query with that student's ID.

EXISTS vs IN

Both EXISTS and IN check whether rows satisfy a condition, but they work differently:

SQL
-- EXISTS: true if subquery returns any rows
SELECT name FROM students s
WHERE EXISTS (
  SELECT 1 FROM enrollments e WHERE e.student_id = s.id
);

-- IN: check against a list
SELECT name FROM students WHERE grade IN ('A', 'A+');
-- EXISTS is faster when subquery result is large
💡
Performance Tip: Prefer EXISTS
Use EXISTS instead of IN when checking large subquery results — it stops at the first match and doesn't need to build the full result set in memory.
🏻 Knowledge Check — Subqueries
1. What does a subquery in a WHERE clause return?
Correct! A WHERE subquery returns a scalar value (for =) or a list (for IN/NOT IN) used to filter the outer query.
2. What makes a subquery "correlated"?
Correct! A correlated subquery references a column from the outer query, so it re-executes for each outer row.
3. What is a "derived table"?
Correct! A derived table is a subquery placed in the FROM clause, treated as a temporary named result set.
4. What issue can arise with NOT IN when the subquery contains NULLs?
Correct! NULL comparisons always return UNKNOWN in SQL, so NOT IN with NULLs in the subquery can return zero rows unexpectedly. Use NOT EXISTS instead.
5. EXISTS stops processing when…
Correct! EXISTS returns TRUE as soon as it finds the first matching row — it doesn't need to scan the rest of the subquery result.