NULL is often interpreted by SQL to mean an unknown value. If you think about it that way, you will see why arithmetic and comparisons involving NULL can give unexpected results.
If one or both of the operands of the arithmetic operators (+, -, *, /) is NULL, the result is NULL. If you think of NULL as standing for an unknown value, then it makes sense that adding to an unknown value will result in an unknown value (ie, NULL).
Unlike most programming languages, SQL uses three valued logic - true, false, and unknown (NULL).
If NULL is used in a comparison (< <= = <> >= >), the result is unknown (represented by NULL) - not true, not false. Again it makes sense that comparing against an unknown value can't be evaluated as true or false -- the comparison result is unknown. However, this does cause a lot of surprise in one case - when you think you're comparing exactly for NULL. For example, to find all flights that had no copilot, you might (naievely) write this.
-- BAD BAD BAD - No records will be selected even if copilot field is NULL..
SELECT flight_id
FROM Flights
WHERE flight_copilot = NULL;
This will select records with a WHERE condition that is true. But using a
NULL in a comparison can not result in true (or false). The solution is to
use the special IS NULL
or IS NOT NULL
comparison.
-- Good, records with NULL in copilot field will be selected.
SELECT flight_id
FROM Flights
WHERE flight_copilot IS NULL;
NULL commonly has several interpretations. For example, is the flight_copilot
field NULL because the data wasn't recorded (missing data), there was no copilot,
or the field was NULLed when the copilot was removed from the database, or maybe this is
an aircraft that doesn't have room for a copilot so it means that it's inapplicable?
Creating special values in for these cases like these can help clarify
what is meant by NULL. Some database experts even recommend never using
NULL.
Another solution to the multiple possible meanings of NULL is to use an extra column which tells what that NULL means - missing data, not applicable, etc.
If a field can be NULL, it's necessary for the DBMS to keep extra information
about each row for that column. Usually it is just one extra bit, but it has to
be tested each time. Adding a NOT NULL
constraint to a field
may actually be faster. Disclaimer: I've never seen benchmarks to quantify this difference,
so it may be more theoretical than practical.