Two students in my database class came to me yesterday with questions about a nested query they had written. The query looked fine, but when executed it returned no rows. After considerable investigation I realized that the subquery it was being evaluated against was returning not only the obvious values but also NULL.
It turns out NULL gets very special treatment in Oracle. NULL is treated as unknown. Basically it cannot be compared with anything, because you can’t compare against something you can’t measure. To try to make this clear I offer the following examples:
Note: dual is a small built-in table used frequently in testing. It holds a single row with a single column (DUMMY, containing 'X'), so it is handy for returning calculations, text, test conditions, etc.
Here’s a basic select and condition that will always succeed:
SELECT 'true' FROM dual WHERE 1 = 1;
'TRU
----
true
I think we can agree that, at least in the reality where we care to evaluate database queries, 1 = 1, so this query returns the rows selected. In this case we have selected only one row, containing the string ‘true’.
Now let’s take a look at the unusual behavior of NULL. First, here’s a query that should return no rows:
SELECT 'true' FROM dual WHERE 1 = NULL;
no rows selected
This makes sense because NULL does not equal 1, but now let’s look at another form of this statement:
SELECT 'true' FROM dual WHERE 1 != NULL;
no rows selected
Logically we think that 1 is different from NULL, so this should have returned ‘true’, but Oracle has a different idea. Oracle evaluates this by asking “Does 1 not equal an unknown value?” This makes as much sense to Oracle as asking “Does 3.17 equal a tree?” or “Is my birthday red?” So no matter what makes sense to us, Oracle evaluates this condition as neither TRUE nor FALSE but UNKNOWN, and a WHERE clause returns only the rows for which the condition is TRUE.
We can take this one step further by executing the following query:
SELECT 'true' FROM dual WHERE NULL = NULL;
no rows selected
This illustrates that Oracle will not treat NULL as equal even to another NULL. It starts to make sense once you consider that you would not say one unknown is, or is not, equal to another unknown; therefore, NULL cannot be said to be either equal or not equal to NULL.
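One way to see all three outcomes side by side is a CASE expression, which falls through to its ELSE branch when none of its WHEN conditions is TRUE. This is only a small sketch against dual; the labels 'equal', 'not equal' and 'unknown' are my own and not anything Oracle prints by itself.

```sql
-- Neither comparison with NULL is TRUE, so both WHEN branches are skipped
-- and the ELSE branch supplies the result.
SELECT CASE
         WHEN 1 = NULL  THEN 'equal'
         WHEN 1 != NULL THEN 'not equal'
         ELSE 'unknown'
       END AS result
FROM dual;
```

The query returns the single value 'unknown', which matches the two "no rows selected" results above: neither condition was ever TRUE.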
Now let’s take a look at an IN condition.
SELECT 'true' FROM dual WHERE 5 NOT IN (1, 2, 3);
'TRU
----
true
This NOT IN condition evaluates to TRUE because 5 is not in the set (1, 2, 3). Now let’s look at a slight variation.
SELECT 'true' FROM dual WHERE 5 NOT IN (1, 2, 3, NULL);
no rows selected
While 5 does not explicitly appear in the set, we do not know what NULL is. Oracle has to compare 5 against every member of the set, and the comparison with NULL is UNKNOWN, so the NOT IN condition as a whole can never be TRUE and no rows are returned.
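It may help to write the condition out by hand. NOT IN behaves like a chain of != comparisons joined with AND, while IN behaves like a chain of = comparisons joined with OR. The two queries below are a sketch of that equivalence using the same literals as above.

```sql
-- 5 NOT IN (1, 2, 3, NULL) written as the AND chain it is equivalent to.
-- The last comparison is UNKNOWN, so the whole condition is never TRUE
-- and no rows are selected.
SELECT 'true' FROM dual
WHERE 5 != 1 AND 5 != 2 AND 5 != 3 AND 5 != NULL;

-- Plain IN is an OR chain: 5 = 5 is TRUE, and TRUE OR UNKNOWN is TRUE,
-- so a NULL in the list does not stop this query from returning a row.
SELECT 'true' FROM dual
WHERE 5 IN (1, 2, 5, NULL);
```

This is why a stray NULL is harmless in an IN list but wipes out the result of a NOT IN.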
So we can see that NULL must be handled as a special case. To handle this we must use IS NULL or IS NOT NULL. If we want to test whether a value is NULL we could use the following:
SELECT 'true' FROM dual WHERE NULL IS NULL;
'TRU
----
true
Here we see that, while NULL = NULL is never TRUE, NULL IS NULL works just fine. Now let’s consider this in the context of a subquery.
SELECT first_name, last_name FROM faculty
WHERE id NOT IN (SELECT instructor_id FROM class);
This query will return the expected rows only if the subquery SELECT instructor_id FROM class does not return any NULL values. If any entry in the class table has NULL in the instructor_id column, the WHERE condition can never be TRUE and no rows will be returned.
To make this statement more reliable (we may not plan to have any NULL values now, but some may creep in later) we can add a condition to the subquery.
SELECT first_name, last_name FROM faculty
WHERE id NOT IN (
SELECT instructor_id FROM class
WHERE instructor_id IS NOT NULL);
Now the result set from the subquery will never contain NULL and the condition will be properly evaluated.
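A common alternative is to rewrite the NOT IN as a correlated NOT EXISTS. The sketch below assumes the same faculty and class tables as above; the aliases f and c are mine. A class row whose instructor_id is NULL simply never satisfies the join condition, so it cannot knock out the whole result.

```sql
-- Return faculty members for whom no class row names them as instructor.
-- c.instructor_id = f.id is UNKNOWN for NULL instructor_id values, so those
-- rows are never matched by the subquery and do not affect NOT EXISTS.
SELECT f.first_name, f.last_name
FROM faculty f
WHERE NOT EXISTS (
  SELECT 1
  FROM class c
  WHERE c.instructor_id = f.id
);
```

Whether this or the IS NOT NULL filter reads better is largely a matter of taste; both avoid the NULL trap.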
Another crazy twist to this is with dates. If a date is NULL then you cannot compare it to an actual date value. So if you say something like
SELECT * FROM my_table a
WHERE a.date1 <> a.date2;
and date1 is NULL while date2 has a value, you would think that row would be returned, but it is not.
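If you want rows where date1 is NULL to count as not equal, you can substitute a stand-in value with NVL before comparing. The stand-in has to be a date itself, and it must be one that never appears in your real data (here I assume 1 January 1900 is safe):
SELECT * FROM my_table a
WHERE NVL(a.date1, DATE '1900-01-01') <> a.date2;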
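Another way to write a NULL-aware “not equal” is to spell out the NULL cases with IS NULL, which avoids having to pick any substitute value. The sketch below is self-contained: the inline view t and its single sample row (a NULL date1 and an arbitrary date2) are made up purely for illustration.

```sql
-- One made-up row: date1 is NULL, date2 has a value.
WITH t AS (
  SELECT CAST(NULL AS DATE) AS date1,
         DATE '2020-01-01'  AS date2
  FROM dual
)
SELECT 'different' AS result
FROM t
WHERE date1 <> date2                              -- both present and unequal
   OR (date1 IS NULL AND date2 IS NOT NULL)       -- only date1 missing
   OR (date1 IS NOT NULL AND date2 IS NULL);      -- only date2 missing
```

For the sample row the first comparison is UNKNOWN but the second branch is TRUE, so the row is returned. Rows where both dates are NULL are treated as equal and left out.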