SQL Subqueries: Databases for Developers #11


In SQL, everything is a table. Even the
results of a select. So you can base queries on the result of other queries.
It’s time to enter the world of subqueries When writing queries you put
your tables in the from clause. But because select statements return a table
you can replace a physical table with a query. This is known as an inline view. This enables you to break up your query into smaller parts then stitch them back
together. This can make writing complex queries a
lot easier. For example say you’ve got a table full of bricks and a book of models you can make with these bricks. The book tells you how many bricks you
need of each color and shape to make each model. You want to find out which
designs you’ve got enough bricks for. In order to do this there’s three things
you need to do. First you need to count how many bricks you’ve got of each color and shape. Then compare this for the number of bricks of each type each model needs. Finally you need to ensure you have all the different colors and shapes
needed for each design. (To do this) First you need to group the bricks by color and shape so
you can count them. And then at the end you need a group by model so you can check you’ve got all the different types needed to make them. Trying to do this in standard SQL with joins and group by is tricky at best. Using inline views makes it a lot easier. To do this start by grouping your bricks by color and shape. Then write another query around this to see which colors and shapes you
have enough of for which models. Then stick this in yet another inline view to ensure you’ve got all the different colors and shapes needed for each design. At this point you may be thinking This doesn’t look any easier. To understand the SQL you’ve got to work from the inside out which is tricky at best. Particularly if the author got their indentation wrong! Luckily there’s a
better way. Using the with clause, also known as common table expressions or subquery factoring, you can name subqueries You can then access these just like a
normal table. This goes above your select, enabling you
to build up your sub-queries with meaningful names as you go along. So you can break down the previous query into three separate units. First you count the
bricks of each type. Then join this to the bricks needed for each model. And finish by checking which designs you have all the bricks needed of each type. Not only is this easier to write and understand later, it’s also easier to
debug. To validate an inline view it’s hard to figure out which parts of your
query you need to run. Using with, just change your final select at the bottom
to the name of the subquery you want to test So subqueries enables you to
replace tables with a subquery but you can also join to subqueries in your
where clause. The classic example for doing this is when you want to find any
rows matching a list of values. Say you want to find all the bricks that are
yellow, green, and blue. Now you can use IN and manually write out these values. But
it’s easy to make a mistake when doing this particularly if you want to include
all the different shades of these colors. And with colors having thousands of
tints you can run into another problem. In Oracle Database IN has a hard upper
limit of a thousand values. Luckily you can overcome both of these problems by storing colors in another table. Now instead of writing out the colors you
can replace the in list with a query on this table. Unlike the handwritten list the table can have as many values as you want. Or you could replace in with exists. This gives you all the bricks where colors
query returns at least one row. This leads to an important difference between IN and EXISTS. Using IN you can query the colors table and you’ll only get the
matching bricks without having to join them. But with EXISTS as long as there’s
at least one row in the colors table you’ll get all the bricks of every color! To fix it and only get the bricks the colors in a colors table you need to
join the tables inside the sub-query When your sub-query joins to a table
from the parent query like this it’s known as a correlated sub-query. If it doesn’t it’s uncorrelated. This join is easy to overlook so check your subqueries carefully to avoid fetching unwanted data. Another trap you can fall
into here is selecting a column from the wrong table. For example let’s say the
colors column in the bricks table is called color and in the colors table
it’s called color name. If you select color in the sub-query
it’s not obvious that you’re just checking this column equals itself. Which of course is true giving you all the bricks. To avoid this make sure you
qualify all columns with their table. This can be tedious if they have long
names. Luckily you can save yourself some typing by using aliases. These go after
the table name and before any references to the columns. So to ensure you’ll
always access the right table in your SQL I strongly recommend that you
always use table aliases. But as long as you get the joins right IN and EXISTS return the same results. You can also negate these so to find all the rows in
one table that aren’t in another by sticking NOT in front of them. With one important gotcha. If any other rows that the subquery returns are null, even just one, then when using NOT IN the whole expression is unknown. So the database doesn’t return it. So if nulls are possible you must use NOT EXISTS. We’ve looked at subqueries in the from and where clauses. But there’s one more type
of subquery we need to cover: Scalar subqueries These are like normal subqueries but they have a couple of special properties. They can return at most one row and one column. This enables you to use them in most places you have
a column, such as your select clause. Due to features of Oracle Database such as scalar subquery factoring this can make them faster than a join in some
cases. So far we’ve only talked about select But you can use sub queries in
insert, update, and delete too! For example: The values cause of insert. The set
clause of update and the where clause of for update and delete. And, if you’re feeling really adventurous, you can update or delete a subquery too! But in
order to do this the rows you’re changing must be key preserved. That is
they appear just once in your result set with appropriate constraints to
guarantee this. So we’ve seen subqueries are a powerful technique has a lot of
applications. And, as we’ll see in future episodes, not only do they make your
SQL easier to read, write, and debug they can be critical to getting the
right answer. Thanks for watching I really hope you enjoyed this, but more importantly, I hope you learned something. If you’ve got any questions about subqueries then just put them in the comments. And stay tuned for more SQL magic!

Danny Hutson

Leave a Reply

Your email address will not be published. Required fields are marked *