Skip to content

Conversation

@OussamaSaoudi
Copy link
Collaborator

@OussamaSaoudi OussamaSaoudi commented Nov 19, 2025

🥞 Stacked PR

Use this link to review incremental changes.


What changes are proposed in this pull request?

This PR introduces serialization and deserialization for the ScanLogReplayProcessor.

This PR adds serialization and deserialization for LogReplayProcessor.

It also removes the StaticInsert transform since:

  1. It was unused
  2. It complicates serialization of Transforms

How was this change tested?

Round trip test is performed with the LogReplayProcessor. Invalid internal state is expected to result in an error.

@OussamaSaoudi OussamaSaoudi changed the title serde first draft feat: Distributed Log Replay serialization/deserialization Nov 19, 2025
@github-actions github-actions bot added the breaking-change Change that require a major version bump label Nov 24, 2025
@OussamaSaoudi OussamaSaoudi force-pushed the stack/dlr_serde branch 6 times, most recently from 0da2ae4 to 274c7bf Compare November 26, 2025 19:13
@codecov
Copy link

codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 90.69401% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.85%. Comparing base (6972292) to head (eec7cbd).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
kernel/src/log_reader/checkpoint_manifest.rs 85.25% 8 Missing and 15 partials ⚠️
kernel/src/distributed/sequential_phase.rs 87.86% 10 Missing and 11 partials ⚠️
kernel/src/log_reader/commit.rs 87.75% 1 Missing and 5 partials ⚠️
kernel/src/scan/log_replay.rs 97.81% 3 Missing and 2 partials ⚠️
kernel/src/utils.rs 85.71% 0 Missing and 3 partials ⚠️
kernel/src/log_segment.rs 50.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1503      +/-   ##
==========================================
+ Coverage   84.55%   84.85%   +0.29%     
==========================================
  Files         123      129       +6     
  Lines       33366    34826    +1460     
  Branches    33366    34826    +1460     
==========================================
+ Hits        28213    29550    +1337     
- Misses       3768     3830      +62     
- Partials     1385     1446      +61     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

predicate_schema,
column_mapping_mode: self.state_info.column_mapping_mode,
};
let internal_state_blob = serde_json::to_vec(&internal_state)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we don't care too much about compatibility across binary versions here (if we do, then we might want to include a version byte, so readers can understand if the serialization format has changed).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm good callout. I'll put that as a warning in the docs.

// Deserialize internal state from json
let internal_state: InternalScanState = serde_json::from_slice(&state.internal_state_blob)
.map_err(|e| Error::generic(format!("Failed to deserialize internal state: {}", e)))?;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for safety, should we check that there aren't any unexpected keys?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Added a test for that :)

@OussamaSaoudi OussamaSaoudi mentioned this pull request Dec 5, 2025
obj["new_field"] = serde_json::json!("my_new_value");
let invalid_blob = obj.to_string();

let internal_res: Result<InternalScanState, _> = serde_json::from_str(&invalid_blob);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix.


#[test]
fn deserialize_internal_state_with_extry_fields_fails() {
// Test that missing predicate_schema when predicate exists is detected
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy-pasta?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change Change that require a major version bump

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants