-
Notifications
You must be signed in to change notification settings - Fork 140
CBG-3690 don't re-read document for resync #7892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request optimizes the document resync operation by eliminating unnecessary document re-reads from the bucket. The key changes improve performance by passing the document data directly from DCP feed events to the resync function, and add support for creating HLV (Hybrid Logical Vector) metadata during resync operations.
Key Changes
- Modified
resyncDocumentto accept a*sgbucket.BucketDocumentparameter instead of re-reading from the bucket - Added HLV creation logic to the resync path for documents that don't have HLV
- Removed the non-xattr code pathway from resync operations
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| db/database.go | Updated resyncDocument signature to accept bucket document directly; added HLV update logic during resync |
| db/background_mgr_resync_dcp.go | Modified DCP callback to convert feed events to bucket documents and pass to resync |
| db/crud.go | Extended HLV update logic to handle Resync case alongside Import |
| db/document.go | Added bucketDocumentFromFeed and getBucketDocument helper functions; removed unused UnmarshalDocumentFromFeed |
| db/document_test.go | Added test for getBucketDocument function |
| db/database_test.go | Enhanced resync tests to verify HLV creation and metadata-only updates |
| base/constants.go | Added VirtualExpiry constant for fetching document expiry |
db/database.go
Outdated
| } | ||
| // Update MetadataOnlyUpdate based on previous Cas, MetadataOnlyUpdate | ||
| doc.MetadataOnlyUpdate = computeMetadataOnlyUpdate(doc.Cas, doc.RevSeqNo, doc.MetadataOnlyUpdate) | ||
| mouMatch := false // after writing this document, mou.cas != doc.cas since otherwise shouldUpdate would be false |
Copilot
AI
Nov 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment explains that mouMatch should be false, but doesn't clearly explain why shouldUpdate being false would make mou.cas == doc.cas. Consider clarifying that when shouldUpdate is false, no resync occurs and the function returns early via ErrUpdateCancel, so at this point shouldUpdate must be true.
| mouMatch := false // after writing this document, mou.cas != doc.cas since otherwise shouldUpdate would be false | |
| // At this point, shouldUpdate must be true, because if it were false, we would have already returned early via ErrUpdateCancel. | |
| // Therefore, mou.cas != doc.cas, and mouMatch should be false. | |
| mouMatch := false |
| } | ||
| return updatedDoc, err | ||
| _, rawSyncXattr, rawVvXattr, rawMouXattr, rawGlobalXattr, err := updatedDoc.MarshalWithXattrs() | ||
| updatedDoc := sgbucket.UpdatedDoc{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: This declaration here shadows the declaration on line 1827
db/database.go
Outdated
| // Update MetadataOnlyUpdate based on previous Cas, MetadataOnlyUpdate | ||
| doc.MetadataOnlyUpdate = computeMetadataOnlyUpdate(doc.Cas, doc.RevSeqNo, doc.MetadataOnlyUpdate) | ||
| mouMatch := false // after writing this document, mou.cas != doc.cas since otherwise shouldUpdate would be false | ||
| doc, err = db.updateHLV(ctx, doc, Resync, mouMatch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to just pass in false here given mouMatch is always false?
db/database.go
Outdated
| } | ||
| // Update MetadataOnlyUpdate based on previous Cas, MetadataOnlyUpdate | ||
| doc.MetadataOnlyUpdate = computeMetadataOnlyUpdate(doc.Cas, doc.RevSeqNo, doc.MetadataOnlyUpdate) | ||
| mouMatch := false // after writing this document, mou.cas != doc.cas since otherwise shouldUpdate would be false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this comment confusing as after this update has been completed mou.cas should be == to doc cas right?
|
To consider: Is the latest set of changes better than https://github.com/couchbase/sync_gateway/pull/7892/files/eb809bf36ded144a008de3741c629a11a999d827 ? Does it add more risk to the ticket to make it not worthwhile? The idea behind the latest set of changes is to avoid having to set The refactor exposes something that is tricky which is that I thought about if the right thing to do is to add |
CBG-3690 don't re-read document for resync
Pre-review checklist
fmt.Print,log.Print, ...)base.UD(docID),base.MD(dbName))docs/apiIntegration Tests
GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/171/ (unrelated flake in TestRemoveIndexesUseViewsTrueAndFalse)