-
Notifications
You must be signed in to change notification settings - Fork 559
(tree) Add versioning to shared tree's summary #25876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
(tree) Add versioning to shared tree's summary #25876
Conversation
|
The PR description currently says:
Since 2.73.0 has not yet been released, it seems fine to write the metadata when when minVersionForCollab is greater or equal to version 2.73.0, so I suggest updating the PR description. |
packages/dds/tree/src/feature-libraries/detachedFieldIndexSummarizer.ts
Outdated
Show resolved
Hide resolved
|
If I understand correctly, this PR ensures that future changes to the summary tree structures (so long as they update the version numbers written to the relevant metadata summary blobs) will make FF clients >= v2_73 fail fast when they do not know that version. This doesn't solve the issue of FF clients < v2_73, which is why the PR description suggests that future changes be made in a way that breaks them too. My question is: why not proactively make such a change (either in this PR or as an immediate follow-up)? |
packages/dds/tree/src/feature-libraries/detachedFieldIndexSummarizer.ts
Outdated
Show resolved
Hide resolved
| /** | ||
| * The storage key for the blob containing metadata for the detached field index's summary. | ||
| */ | ||
| export const detachedFieldIndexMetadataKey = ".metadata"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This starting with a "." is interesting.
If this is indented to mean something, maybe establish a pattern that "." keys are part of the top level summary not the summary items or something, it should be documented somewhere central, and this should link to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I used "." here is because that's the convention the runtime follows for blobs that are not actual data. For example, the metadata, alias, blobs added by the container runtime. The .attributes blob added by channel context's to a DDS's summary, etc.
However, there is no documentation for this and no real reason to do it. It's just a conventional that is followed. Not sure what a good place to add this would be. Maybe to ISummaryTree?
| * The type for the metadata in the detached field index's summary. | ||
| * Using type definition instead of interface to make this compatible with JsonCompatible. | ||
| */ | ||
| // eslint-disable-next-line @typescript-eslint/consistent-type-definitions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not follow our linter's recommendations? Did you read the documentation for that rule and decide why its suggestion should apply to our codebase, but not to this specific instance? If so, please include that reason as a comment here. If not, go read it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I'm pretty sure I know what it recommends and I agree with it here for a better developer experience, but I should have to know that from memory to review this: the linter is supposed to help tell people things when relevant.
Note 2: its way easier to get to that rule documentation from in the IDE when looking at the warning (click the link in the warning IIRC) than for a reviewer here. That's another reason to explain why this suppression is ok despite our choice to enable this lint inline here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the documentation and agree that having consistent types is better for readability. I did add the reason for using a type here in the doc comments above. Do you think that's not sufficient info or maybe you missed it?
packages/dds/tree/src/feature-libraries/detachedFieldIndexSummarizer.ts
Outdated
Show resolved
Hide resolved
packages/dds/tree/src/feature-libraries/detachedFieldIndexSummarizer.ts
Outdated
Show resolved
Hide resolved
| services, | ||
| parse, | ||
| ); | ||
| assert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure an "assert" is the right error type for this. When encountering a document too new for the current code, that indicates a usage error by the application team (either bad rollout schedule, bad min version for collab) or a decision to simply not support this client.
In none of those cases is this error a bug in the fluid framework code, which is what asserts indicate.
If there was some upstream checking which was supposed to ensure such a case would never make it to this point, then an assert here would be valid. If that is the case, can you add a comment explaining what check would catch this and give a usage effort before we get here? If not, replace this with a usage error with useful customer facing details (maybe we should have a shared utility for throwing such forwards compat errors?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to a UsageError. Although, not sure how to incorporate information about minVersionForCollab, etc. in the error.
packages/dds/tree/src/feature-libraries/forest-summary/forestSummarizer.ts
Outdated
Show resolved
Hide resolved
| // Load the incremental summary builder so that it can download any incremental chunks in the | ||
| // snapshot. | ||
| await this.incrementalSummaryBuilder.load(services, readAndParseBlob); | ||
| await this.incrementalSummaryBuilder.load(services, async (id: string) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are a lot of "id"s in this code. Could you use a more specific name or type? Just looking at this code I don't even know what this async function is (can't see the parameter name for it from the call site).
It seems a bit odd to be adding this async function as a parameter here in this PR: is this part of the versioning logic somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the id to be called chunkPath. Also, updated the incrementalSummaryBuilder.load to take an parameter object to make it easier to see the names of parameter.
I updated this code because I added a utility in this PR to read a blob from snapshot and parse it. It is not related to versioning but just removes duplicate code.
packages/dds/tree/src/feature-libraries/schema-index/schemaSummarizer.ts
Outdated
Show resolved
Hide resolved
That is a good point. We could make a breaking change in v1. Let me create a work item to do that as a follow up. |
AB#53973 |
| } | ||
|
|
||
| const supportedReadVersions = new Set<DetachedFieldIndexSummaryVersion>([ | ||
| DetachedFieldIndexSummaryVersion.v1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a bit confusing not to have an entry to represent the current version: supportedReadVersions.has(minVersionToDetachedFieldIndexSummaryVersion("2.72.0")) returns false despite the fact that such a format is supported.
In the message codec this was handled by having MessageFormatVersion.undefined = 0. I'm not sure if there are better patterns, but it seems better than nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see that VersionedSummarizer treats undefined as a special value. Having a DetachedFieldIndexSummaryVersion.vUndefined = 0 will mess that up...
Maybe the VersionedSummarizer ctor could take an optional parameter that represents a version number (DetachedFieldIndexSummaryVersion.vUndefined) for which no version information should be written/read in the summary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a vPreVersioning entry which is 0. VersionedSummarizer will write metadata blob for versions that are greater than this. Please let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem better, but why is vPreVersioning not included in supportedReadVersions? My comment about supportedReadVersions.has(minVersionToDetachedFieldIndexSummaryVersion("2.72.0")) returning false still holds, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the logic a bit.
Regarding supportedReadVersions.has(minVersionToDetachedFieldIndexSummaryVersion("2.72.0")) returning false. I was thinking that vPreConditioning is not a supported write version because metadata blob won't be written for this version. It exists to represent summaries written in the time period where this code is added but the minVersionForCollab is < 2.73.0. So, we don't write the metadata blob.
We could change this logic and write the metadata blob with version 0. It won't break anything and the logic becomes a lot simpler. The only slightly weird side effect is that older client will just ignore the metadata blob. But maybe that's a fine trade off.
packages/dds/tree/src/test/shared-tree-core/versionedSummarizer.spec.ts
Outdated
Show resolved
Hide resolved
| * Base class from which all {@link Summarizable}s derive. | ||
| * It handles versioning of summaries - writing version metadata to summaries | ||
| * and checking version compatibility when loading. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are going to have both the Summarizable interface, and this class, then there is the option of having a an implementation of Summarizable that does not use this class. This flexibility might be useful (for example if something already extending so other class also want to implement Summarizable), but if we want to allow using it we should not say "Base class from which all {@link Summarizable}s derive." and instead something like "Utility for implementing {@link Summarizable}s classes with versioning"
If instead we do actually want to enforce that all Summarizables use versioning via these utilities, I see two approaches:
- Replace Summarizable interface with this class (For example just rename this class to it to Summarizable and delete the interface)
- Modify the interface to consist of the abstract methods and constructor parameters as properties, and move the code from here to the user of Summarizable objects (sharedTreeCore)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the comment to say "Utility for implementing {https://github.com/link Summarizable}s classes with versioning".
I don't think we need to force all summarizables to use this utility - only ones that need to be versioned. At least we should start with this flexibility. If we do decide that all summarizables must have versioning, we can update the pattern.
cbfad80 to
8311a9d
Compare
| supportedVersions: Set<number>; | ||
| /** | ||
| * The default format version to use if the summary doesn't have metadata blob. | ||
| * This is true for summaries that were written before versioning was added for summaries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * This is true for summaries that were written before versioning was added for summaries. | |
| * This is used for summaries that were written before versioning was added for summaries. |
| } | ||
| if (!this.supportedVersions.has(version)) { | ||
| throw new UsageError( | ||
| `Cannot read version ${version} of shared tree summary. Upgrade to a supported version.`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| `Cannot read version ${version} of shared tree summary. Upgrade to a supported version.`, | |
| `Cannot read version ${version} of shared tree summary.`, |
I'm not sure we should suggest upgrading as a remedy because this error could happen when newer clients who have dropped support for an old format end up encountering that format. In such a situation, the solution is not to upgrade the client, but to downgrade it.
| // Delete metadata blob from the summary | ||
| Reflect.deleteProperty(summary.summary.tree, summarizablesMetadataKey); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this only makes sense because the v1 summary format happens to be the same as the v2 format that is generated above.
This test will break when change the default summary being written to some v3 that does not happen to be the same as v1.
This is bad because this test is checking the ability to load v1 summaries, and the test will fail despite the fact that the ability to load v1 summaries might still work fine.
A better pattern would be to keep a JSON object or JSON string representing a summary in the old format as a constant (instead of generating one at runtime). This will also guard against the risk that the generation code for v1/v2 summaries might change.
I think the overall pattern we want to promote is as follows:
For each format that we support reading:
- keep a copy of a summary written in that format
- write a test that ensures that reading/loading a summary from that format succeeds
For each format that we support writing:
- write a snapshot test that ensures writing a summary to that format generates the same thing as it always has
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write a snapshot test that ensures writing a summary to that format generates the same thing as it always has
Maybe that last bit is already sufficiently covered by the existing snapshot tests
Description
The summaries written by Shared Tree and its summarizables are not versioned. Any changes to the summary format can lead to failures in clients that do not understand this new format. For example, the incremental summarization feature changes the forest's summary format by adding more nodes to its summary tree which older clients won't be able to read. Making this change behind a new version makes it safer as we can fail fast on seeing unrecognizable versions.
This change adds versioning by adding a metadata blob to the summary tree of the shared tree and its summarizables. This metadata blob will contain the format version of the summary. Every time the format of the summary changes, a new version should be added so that clients that don't understand this format will fail.
Here's how versioning will be added:
AB#53723