Skip to content

Conversation

@losipiuk
Copy link
Member

@losipiuk losipiuk commented Dec 7, 2025

Description

Allow empty pages without blocks. It is valid for connectors/table
functions to produce empty pages with empty block list interleaved with
pages with data. Empty pages are filtered out by Driver.

Additional context and related issues

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)


public static void validateOutputPageTypes(SourcePage page, List<Type> expectedTypes, Supplier<String> debugContextSupplier)
{
if (page.getPositionCount() == 0 && page.getChannelCount() == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would drop && page.getChannelCount() == 0
It is valid to create an empty Page as new Page(0, new Block[N]), but that would throw a nullptr exception in the current code.
We should probably tighten up the Page constructor to disallow that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is valid to create an empty Page as new Page(0, new Block[N]), but that would throw a nullptr exception in the current code.

how would that be a valid page?
The blocks are not nullable in the page

We should probably tighten up the Page constructor to disallow that.

it doesn't have to be a runtime check (might be redundant in hot path)
the fact there is no such check should not be construed as implying that null blocks are OK

public static void validateOutputPageTypes(SourcePage page, List<Type> expectedTypes, Supplier<String> debugContextSupplier)
{
if (page.getPositionCount() == 0 && page.getChannelCount() == 0) {
// allow empty pages without blocks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says what we do, but doesn't say why. Please update & update PR description accordingly.

Also, it's not a universal truth that empty page can be any shape. For example

verify(typeEncoders.length == sourcePageChannels.length, "Source page channels and type encoders must have the same length");
try {
generator.writeStartArray();
for (Page page : pages) {
Block[] blocks = new Block[sourcePageChannels.length];
for (int i = 0; i < sourcePageChannels.length; i++) {
blocks[i] = page.getBlock(sourcePageChannels[i]);

This may not break because empty pages are filtered somewhere earlier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says what we do, but doesn't say why. Please update & update PR description accordingly.

That is a good point. I do not know any concrete place in code which does produce such pages. It is based on hint from @raunaqmorarka that it is valid scenario, so I sent this PR defensively.
@raunaqmorarka do you maybe have specific code pointers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know any concrete place in code which does produce such pages.

example

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

Allow empty pages without blocks. It is valid for connectors/table
functions to produce empty pages with empty block list interleaved with
pages with data. Empty pages are filtered out by Driver.
@losipiuk losipiuk force-pushed the lukaszos/relax-validation-for-empty-pages-7e2694 branch from 6fe187a to 7850ace Compare December 8, 2025 13:52
public static void validateOutputPageTypes(SourcePage page, List<Type> expectedTypes, Supplier<String> debugContextSupplier)
{
if (page.getPositionCount() == 0 && page.getChannelCount() == 0) {
// Allow empty pages without blocks. It is valid for connectors/table functions to produce empty pages with empty block list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per

if (page != null && page.getPositionCount() != 0) {

drop page.getChannelCount() == 0 condition and update the comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood from the discussion that such pages are really broken and not created? Are we creating such anywhere. Or just drop just in case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't know of any code creating empty pages with zero position count and non-zero, but incorrect, channel count

Or just drop just in case?

this. since the Driver condition is what it and same is in Scan Filter operator

if (page.getPositionCount() == 0) {
return WorkProcessor.of();

{
if (page.getPositionCount() == 0 && page.getChannelCount() == 0) {
// Allow empty pages without blocks. It is valid for connectors/table functions to produce empty pages with empty block list
// interleaved with pages with data. Example of such behavior can be found in TableChangesFunctionProcessor (this is not exactly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// interleaved with pages with data. Example of such behavior can be found in TableChangesFunctionProcessor (this is not exactly
// interleaved with pages with data. Example of such behavior can be found in Delta's TableChangesFunctionProcessor (this is not exactly

if (page.getPositionCount() == 0 && page.getChannelCount() == 0) {
// Allow empty pages without blocks. It is valid for connectors/table functions to produce empty pages with empty block list
// interleaved with pages with data. Example of such behavior can be found in TableChangesFunctionProcessor (this is not exactly
// related to SourcePage, but illustrates the contract, which is the same for Pages and SourcePages).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that for a SourcePage, the check may be triggering a load of lazy data.
If we had applied the check outside of ScanFilterAndProjectOperator, this wouldn't be a problem (and we wouldn't see SourcePage), but now maybe it is?

@losipiuk
Copy link
Member Author

losipiuk commented Dec 8, 2025

Incorporated into #27582

@losipiuk losipiuk closed this Dec 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants