@ekoifman ekoifman commented Aug 1, 2025

What changes were proposed in this pull request?

Add an Optimizer rule that looks for a Left Outer Join followed by an IsNull(rhs column) filter and rewrites it as a Left Anti Join where possible.

In the simplest case, it expects a query of the form select L.* from L LOJ R on l1 = r1 where r1 is null,
or select L.* from L LOJ (select s1 as r1, s2 as r2 from S) R on l1 = r1 where r2 is null.
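As a rough illustration of why the first query shape can be treated as an anti join, the sketch below runs both forms against a tiny in-memory SQLite database. This is not Spark and does not exercise the new rule; SQLite is only a stand-in for the SQL semantics, the table contents are made up, and the anti join is spelled as NOT EXISTS because SQLite has no LEFT ANTI JOIN syntax.

```python
import sqlite3

# Hypothetical sample data; tables L and R mirror the names used in the description.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE L (l1 INTEGER, l2 TEXT);
    CREATE TABLE R (r1 INTEGER, r2 TEXT);
    INSERT INTO L VALUES (1, 'a'), (2, 'b'), (3, 'c'), (NULL, 'd');
    INSERT INTO R VALUES (1, 'x'), (1, 'y'), (3, 'z'), (NULL, 'w');
""")

# "Synthetic anti join": left outer join followed by IS NULL on the join key.
outer_then_filter = conn.execute("""
    SELECT L.* FROM L LEFT OUTER JOIN R ON l1 = r1
    WHERE r1 IS NULL
    ORDER BY l2
""").fetchall()

# Anti-join form, written with NOT EXISTS as a stand-in for LEFT ANTI JOIN.
anti_join = conn.execute("""
    SELECT L.* FROM L
    WHERE NOT EXISTS (SELECT 1 FROM R WHERE l1 = r1)
    ORDER BY l2
""").fetchall()

print(outer_then_filter)
print(anti_join)
```

Both queries keep only the L rows that have no match in R, including the row whose l1 is NULL, since a NULL key never satisfies l1 = r1.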

Why are the changes needed?

This rewrite may provide a performance improvement.
The implementation is not comprehensive but is meant to address the common cases seen in
legacy systems that submit queries with a "synthetic anti join" expressed as a Left Outer Join
followed by an IsNull filter.

Does this PR introduce any user-facing change?

It may change the query plan to use a Left Anti Join instead of the original Left Outer Join.

How was this patch tested?

Existing and newly added unit tests.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Aug 1, 2025
@abhiips07

What will happen when the query is something like this?

select L.* from L LOJ R on l1 = r1 where r1 is null and r2 = 1;

Will this also be converted into an anti join?

@ekoifman
Contributor Author

ekoifman commented Aug 1, 2025

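It will not. In this query, the LOJ would first be converted to an Inner Join because of the r2 = 1 predicate: it can never be true for the null-extended rows of an unmatched L row, so those rows are filtered out anyway.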
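The reasoning in this reply can be checked on a small example. The sketch below again uses an in-memory SQLite database as a stand-in for the SQL semantics (it says nothing about Spark's actual plans), with made-up data and a hypothetical helper named rows. It shows that once r2 = 1 is in the filter, the left outer join returns the same rows as an inner join, and that this differs from what an anti join would return.

```python
import sqlite3

# Hypothetical sample data: L row 2 has no match in R.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE L (l1 INTEGER);
    CREATE TABLE R (r1 INTEGER, r2 INTEGER);
    INSERT INTO L VALUES (1), (2), (3);
    INSERT INTO R VALUES (1, 1), (3, 2);
""")

def rows(join_type, where):
    """Run the query shape from the question with a given join type and filter."""
    sql = f"SELECT L.* FROM L {join_type} JOIN R ON l1 = r1 WHERE {where} ORDER BY l1"
    return conn.execute(sql).fetchall()

# r2 = 1 is never true on null-extended rows, so outer and inner joins agree.
outer_r2 = rows("LEFT OUTER", "r2 = 1")
inner_r2 = rows("INNER", "r2 = 1")

# The full filter from the question: still identical for outer and inner joins.
outer_full = rows("LEFT OUTER", "r1 IS NULL AND r2 = 1")
inner_full = rows("INNER", "r1 IS NULL AND r2 = 1")

# An anti join (spelled as NOT EXISTS) would return the unmatched L row instead.
anti = conn.execute(
    "SELECT L.* FROM L WHERE NOT EXISTS (SELECT 1 FROM R WHERE l1 = r1) ORDER BY l1"
).fetchall()

print(outer_r2, inner_r2)
print(outer_full, inner_full)
print(anti)
```

With this data the full filter yields no rows at all, because matched rows have a non-null r1 and unmatched rows have a null r2, whereas the anti join yields the unmatched L row. Rewriting this query as an anti join would therefore change its result.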

@ekoifman

@wangyum @cloud-fan could one of you take a look, please?
