for more details. The attached file contains a repro.
In summary the expressions that we generate to compensate for database null comparisons contain terms that are sometimes unnecessary. This leads to different issues:
- Perf regressions because the database is unable to leverage indexes when the predicates are complex
- Large stacks (and potentially stack overflows) because query trees contain additional expressions
We should look at removing the unnecessary terms we add to discriminate between NULL and false when we know that the expression if final (e.g. that it will not be composed over with a negation), e.g., in the following expression:
((x = y) AND NOT (x IS NULL OR y IS NULL)) OR (x IS NULL AND y IS NULL)
The part 'AND NOT (x IS NULL OR y IS NULL)' is only there to prevent the result of the whole expression from being NULL when x is NULL or y is NULL but both are not simultaneously NULL. When we know that this expression won't be negated then and is only combined
with other ORs or ANDs to form the predicate then we can get rid of that term because NULL and false will have the same effects on the results of the query, e.g. rows won't be returned for either NULL or false.