Both techniques are going to be tested during the exam.
IN and EXISTS are not the same. And they work absolutely differently.
IN picks the list of matching values.
EXISTS returns the boolean values like true or false. This is what makes them differ. IN picks, EXISTS returns. In turn, EXISTS returns the boolean value only (TRUE or FALSE), while IN can work with a variety of data types.
Let’s consider the IN condition used in combination with a subquery where multiple columns are returned by subquery and the correspondent number of columns is being expected by the WHERE clause
- As we can see the WHERE clause has two columns to be compared which are surrounded by brackets – department_id and salary. So, we have just defined which columns we are going to be looking for. In fact, the number of columns may be bigger, not only two. It can be equal to the maximum number of columns in the table being compared.
- From other side, we have a subquery which also returns exactly two columns after evaluating its own conditions in its own WHERE clause. The important remark here – the number of columns in this subquery must match the number of columns in the parent WHERE clause, otherwise you will get an error.
- And we have IN operator in between.
In order to understand how this query will work, let’s first see what WHERE clause in parent query will return, should we use those two columns as a projection:
As we can see 107 rows were returned since it was a simple projection. As we have 107 rows in employees table, all of them were returned through parent WHERE clause, and now we have to select some of them by means of IN operator. In this regard, let’s consider what subquery will return:
As we can see, the subquery has returned just five rows. These five rows are definitely present in a bigger set of data returned by parent query. Now, the IN operator will have to be engaged. As a result, only five rows will be matched and returned by parent query. You can imagine the IN operator as INTERSECT in this case. It will return only those rows that are present in both sets of data.
BE CAREFUL WITH NULLS, THEY ARE NOT WORKING WITH IN
You must be careful with columns containing NULLs. If you use them in the above-described query, they won’t return any results at all. As we know NULLs can’t be compared even to each other. When we have to compare NULL with another NULL we get an undefined result. Oracle can’t say for sure that one NULL is equal to another NULL, so the entire comparison operation will be invalidated and won’t return anything. Let’s introduce commission_pct column into this query as a third column. commission_pct column contains a lot of NULL values unlike department_id and salary columns in subquery.
As we can see above commission_pct column results with all five NULLs for this subquery, therefore this dataset containing five rows will be compared with the corresponding dataset produced by WHERE clause from the parent query:
Again, this parent query above will have 107 rows to be intersected with five rows produced by the subquery. But this time the match won’t be found because Oracle cannot compare NULL to NULL. This comparison will always result in UNKNOWN, UNDEFINED, therefore zero rows are returned ( see below)
Ok, we have finished consideration of IN. Let’s now consider EXISTS operator and how will it differ from IN.
The idea of EXISTS operator is very simple, it returns TRUE or FALSE.
- It returns TRUE if at least one row was returned by the subquery. In that case, EXISTS operator will trigger the execution of the parent query and parent query may return some data.
- It returns FALSE if no rows were returned by the subquery. In that case, EXISTS operators won’t trigger the execution of the parent query and no rows will be returned respectively by parent query as well.
Let’s modify our query to emphasize EXISTS usage:
As we can see, the query has returned 107 rows. Why? Why not just 5 as one may have expected? Because it acts as a boolean function, unlike the previous case where IN was acting as INTERSECT between two datasets. In case with EXISTS, we don’t even have two datasets to be compared to each other. We simply evaluate is it TRUE or FALSE. If the subquery returns TRUE, then EXISTS allow the parent query to be executed. And the parent query was SELECT * FROM EMPLOYEES, so all 107 rows were returned, therefore.
The Oracle documentation notes that:
“If the selective predicate is in the subquery, then use IN. If the selective predicate is in the parent query, then use EXISTS.” Also, the IN clause can’t compare anything with NULL values, but the EXISTS clause can compare everything with NULLs.
Same is true about NOT IN vs NOT EXISTS