@guluo2016
Member

Purpose

Linked issue: close #6767

Tests

API and Format

Documentation

@guluo2016
Member Author

@yuzelin @Zouxxyy @JingsongLi Could you review this PR when you have time? Thanks!
If I missed anything, please let me know.


IntColumnVector timeVec =
        new IntColumnVector() {
            final int[] values = new int[] {0, 1000, 2000, 3000, 4000};

Current situation: Similar index mix-ups (using i to read source data, or using row to check the Arrow vector) can easily recur across multiple field writers and lead to hidden bugs.

Recommendation: Define and strictly distinguish between sourceIndex (e.g., startIndex + i) and targetIndex (i) in the implementation.

Within the loop body of each writer, use sourceIndex to read data from the ColumnVector, and use targetIndex to write data to the ArrowVector; use sourceIndex to check for null values.

Example:

// Assume startIndex, batchRows, columnVector, and timeMilliVector are known.
for (int i = 0; i < batchRows; i++) {
    int sourceIndex = startIndex + i; // index to read from columnVector
    int targetIndex = i;              // position to write in the Arrow vector
    if (columnVector.isNullAt(sourceIndex)) {
        timeMilliVector.setNull(targetIndex);
    } else {
        int value = ((IntColumnVector) columnVector).getInt(sourceIndex);
        timeMilliVector.setSafe(targetIndex, value);
    }
}
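
To make the effect of the two indices visible without an Arrow dependency, here is a minimal, self-contained sketch that uses plain `Integer[]` arrays in place of the ColumnVector and the Arrow vector. The class name `IndexMappingSketch` and both methods are hypothetical and exist only for illustration; they are not part of ArrowFieldWriters. The second method reproduces the kind of mistake described in the linked issue, where startIndex is ignored.

```java
import java.util.Arrays;

public class IndexMappingSketch {

    /**
     * Copies batchRows values, starting at startIndex, from a source column
     * into a new target batch. A null source slot stays null in the target.
     */
    static Integer[] copyBatch(Integer[] source, int startIndex, int batchRows) {
        Integer[] target = new Integer[batchRows];
        for (int i = 0; i < batchRows; i++) {
            int sourceIndex = startIndex + i; // read position in the source column
            int targetIndex = i;              // write position in the output batch
            // Both the null check and the value read go through sourceIndex.
            target[targetIndex] = source[sourceIndex];
        }
        return target;
    }

    /** Buggy variant for comparison: ignores startIndex and always reads from row 0. */
    static Integer[] copyBatchIgnoringStart(Integer[] source, int startIndex, int batchRows) {
        Integer[] target = new Integer[batchRows];
        for (int i = 0; i < batchRows; i++) {
            target[i] = source[i]; // wrong: i is the target index, not the source index
        }
        return target;
    }

    public static void main(String[] args) {
        Integer[] source = {0, 1000, null, 3000, 4000};
        // Correct mapping reads rows 2..4.
        System.out.println(Arrays.toString(copyBatch(source, 2, 3)));
        // Buggy mapping reads rows 0..2 regardless of startIndex.
        System.out.println(Arrays.toString(copyBatchIgnoringStart(source, 2, 3)));
    }
}
```

With startIndex = 2 and batchRows = 3, the correct variant prints `[null, 3000, 4000]`, while the variant that ignores startIndex prints `[0, 1000, null]`. A test that only ever uses startIndex = 0 cannot tell the two apart, which is why a non-zero start offset is worth covering.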

Member Author

I didn't modify ArrowFieldWriters.java, to keep it consistent with the rest of the code.
I've updated the test case to improve readability. Thanks for the review.

Development

Successfully merging this pull request may close these issues.

[Bug] ArrowFieldWriters.TimeWriter ignores startIndex and reads incorrect rows
