storage/filesystem: Add support for reverse index generation #1731

pjbgf · 2025-11-20T22:38:24Z

The generation of the reverse index will enable further performance optimisations around the fetching of objects by offset.

This lays the foundation to tackle the performance issues for large repositories (e.g. #1601).

onee-only · 2025-11-21T06:55:13Z

For future reference, 6d806e8 is related to #1501.

ferhatelmas · 2025-11-21T21:38:03Z

For future reference, 6d806e8 is related to #1501.

It seems these changes are unrelated to this PR, which is why a comment is needed. Why not add them separately first?

Edit: With gitignore, renovabot and contributing changes.

ferhatelmas · 2025-11-24T22:36:10Z

plumbing/format/revfile/decoder.go

+	packChecksum plumbing.ObjectID
+	out          chan<- uint32
+
+	m sync.Mutex


I understand the intention of using a mutex to protect against misuse, but it complicates the code, and the decoder is inherently single-use.

How about having a standalone Decode function and unexporting Decoder, so that each Decode call initialises a new decoder? That removes the need for the mutex and avoids mutation bugs.

Happy to make the change if you agree

Completely agree that the overall API in the plumbing/format encoders/decoders can be improved as currently they are a bit awkward.

If you can, please propose a PR with the changes you pointed out for revfile and ping me for a review. If the API changes make sense, potentially we could expand that across the other encoders/decoders.

I submitted #1750
Let me know what you think.

ferhatelmas · 2025-11-24T22:36:57Z

plumbing/format/revfile/decoder.go

+		if errors.Is(err, io.EOF) {
+			return nil, err
+		}


Why do we need to handle this case separately?

Nice spot. This is missing a wrap with ErrMalformedRevFile, as a malformed file would be the only reason for an EOF to happen here.

I handled this in my PR. Also see the style change there for errors.Is vs ==.

ferhatelmas · 2025-11-24T22:37:26Z

plumbing/format/revfile/decoder.go

+	reader  *bufio.Reader
+	hasher  crypto.Hash
+	hash    hash.Hash
+	nextFn  stateFn


This seems to be unused.

ferhatelmas · 2025-11-24T22:38:46Z

plumbing/format/revfile/decoder.go

+	}
+
+	_, err = d.reader.Peek(1)
+	if err == nil {


Don't we need to distinguish between EOF and other I/O errors?

Yep, that would be a nice improvement.

I handled this case and validated via a test.

related to go-git#1731

* Make encode and decode a function and unexport encoder/decoder to hide state so that there is less opportunity to misuse concurrently, and drop mutex and recover as a result.
* Handle non EOF errors in decode trailing check.
* Add context in object decode for EOF.
* Drop unused stateFn in encoder/decoder.

Keeping explicit check for io.EOF because of convention golang/go#39155

Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>

related to go-git#1731

* Make encode and decode a function and unexport encoder/decoder to hide state so that there is less opportunity to misuse concurrently, and drop mutex and recover as a result.
* Handle non EOF errors in decode trailing check.
* Add context in object decode for EOF.
* Drop unused stateFn in encoder/decoder.
* Accept io.Reader since it's more idiomatic. For decode, internally, we create buffered reader if already not buffered but not doing it for writing since flushing is required and it's generally controlled by caller (file, buffer, etc).

Keeping explicit check for io.EOF because of convention golang/go#39155

Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>

pjbgf force-pushed the perf branch 2 times, most recently from 9a4daa2 to 735c791 on November 20, 2025 23:03

pjbgf mentioned this pull request Nov 20, 2025

prevent concurrent map read and write panic #1729

Merged

pjbgf force-pushed the perf branch from 735c791 to 8ac5000 on November 20, 2025 23:17

pjbgf mentioned this pull request Nov 20, 2025

Support for SHA256 #706

Open

16 tasks

pjbgf mentioned this pull request Nov 21, 2025

Suggestion to add golangci-lint to this project #1501

Closed

pjbgf added 8 commits November 23, 2025 13:18

plumbing: format, Add revfile decoder

7787c3f

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

plumbing: format, Add SHA256 support for genOffsetHash

3e40e17

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

storage: filesystem/dotgit, Add support to accessing rev file

91c14f2

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

plumbing: format/revfile, Decouple revfile Decoder from idxfile

d891444

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

build: Add golangci-lint

b5a763b

This is an initial commit with golangci-lint. Given how much the project violates both formatters and linters, those will have to be enabled gradually.

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

plumbing: format/revfile, Add rev Encoder

f95ddfc

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

storage: filesystem, Generate the reverse index during clone

c05cb54

The generation of the reverse index will enable further performance optimisations around the fetching of objects by offset.

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

build: Add renovate configuration

174681f

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

pjbgf force-pushed the perf branch from 8ac5000 to 174681f on November 23, 2025 13:21

pjbgf merged commit 36fa819 into go-git:main Nov 23, 2025
23 of 24 checks passed

pjbgf deleted the perf branch November 23, 2025 16:21

ferhatelmas reviewed Nov 24, 2025

ferhatelmas mentioned this pull request Nov 25, 2025

refactor(plumbing): improve revfile encode/decode #1750

Merged

onee-only mentioned this pull request Nov 30, 2025

Bisected regression runs genOffsetHash leading to wasteful loop when processing ofs-delta #1451

Open

pjbgf mentioned this pull request Dec 2, 2025

Add reverse indexes for packfiles go-git/go-git-fixtures#59

Merged
