-
Notifications
You must be signed in to change notification settings - Fork 866
storage/filesystem: Add support for reverse index generation #1731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9a4daa2 to
735c791
Compare
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
This is an initial commit with golangci-lint. Given how much the project violates both formatters and linters, those will have to be enabled gradually. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
The generation of the reverse index will enable further performance optimisations around the fetching of objects by offset. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
| packChecksum plumbing.ObjectID | ||
| out chan<- uint32 | ||
|
|
||
| m sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the intention of using mutex to protect against misuse but it complicates and it's inherently single use.
How about having standalone Decode function and unexporting Decoder such that each Decode call would initialize a new decoder where there is no need for mutex, and free of mutation bugs?
Happy to make the change if you agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree that the overall API in the plumbing/format encoders/decoders can be improved as currently they are a bit awkward.
If you can, please propose a PR with the changes you pointed out for revfile and ping me for a review. If the API changes make sense, potentially we could expand that across the other encoders/decoders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I submitted #1750
Let me know what you think.
| if errors.Is(err, io.EOF) { | ||
| return nil, err | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to handle this case separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice spot. This is missing wrapping ErrMalformedRevFile, as that would be the only reason for an EOF to happen here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I handled this in my PR and also see the style change there for errors.Is vs ==.
| reader *bufio.Reader | ||
| hasher crypto.Hash | ||
| hash hash.Hash | ||
| nextFn stateFn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped
| } | ||
|
|
||
| _, err = d.reader.Peek(1) | ||
| if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to distinguish EOF and any other I/O errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that would be a nice improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I handled this case and validated via a test.
related to go-git#1731 * Make encode and decode a function and unexport encoder/decoder to hide state so that there is less opportunity to misuse concurrently, and drop mutex and recover as a result. * Handle non EOF errors in decode trailing check. * Add context in object decode for EOF. * Drop unused stateFn in encoder/decoder. Keeping explicit check for io.EOF because of convention golang/go#39155 Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
related to go-git#1731 * Make encode and decode a function and unexport encoder/decoder to hide state so that there is less opportunity to misuse concurrently, and drop mutex and recover as a result. * Handle non EOF errors in decode trailing check. * Add context in object decode for EOF. * Drop unused stateFn in encoder/decoder. * Accept io.Reader since it's more idiomatic. For decode, internally, we create buffered reader if already not buffered but not doing it for writing since flushing is required and it's generally controlled by caller (file, buffer, etc). Keeping explicit check for io.EOF because of convention golang/go#39155 Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
related to go-git#1731 * Make encode and decode a function and unexport encoder/decoder to hide state so that there is less opportunity to misuse concurrently, and drop mutex and recover as a result. * Handle non EOF errors in decode trailing check. * Add context in object decode for EOF. * Drop unused stateFn in encoder/decoder. * Accept io.Reader since it's more idiomatic. For decode, internally, we create buffered reader if already not buffered but not doing it for writing since flushing is required and it's generally controlled by caller (file, buffer, etc). Keeping explicit check for io.EOF because of convention golang/go#39155 Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
The generation of the reverse index will enable further performance optimisations around the fetching of objects by offset.
This lays the foundation to tackle the performance issues for large repositories (e.g. #1601).