Skip to content

Conversation

@pull
Copy link

@pull pull bot commented Dec 23, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

KarthikNayak and others added 26 commits November 21, 2025 08:40
The `do_fetch()` function contains the core of the `git-fetch(1)` logic.
Part of this is to fetch and store references. This is done by

  1. Creating a reference transaction (non-atomic mode uses batched
     updates).
  2. Adding individual reference updates to the transaction.
  3. Committing the transaction.
  4. When using batched updates, handling the rejected updates.

The following commit, will fix a bug wherein fetching tags with
conflicts was causing other reference updates to fail. Fixing this
requires utilizing this logic in different regions of the function.

In preparation of the follow up commit, extract the committing and
rejection handling logic into a separate function called
`commit_ref_transaction()`.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git submodule add" tries to find if a submodule with the same name
already exists at a different path, by looking up an entry in the
.gitmodules file.  If the entry in the file is incomplete, e.g.,
when the submodule.<name>.something variable is defined but there is
no definition of submodule.<name>.path variable, it accesses the
missing .path member of the submodule structure and triggers a
segfault.

A brief audit was done to make sure that the code does not assume
members other than those that are absolutely certain to exist: a
submodule obtained by submodule_from_name() should have .name
member, while a submodule obtained by submodule_from_path() should
also have .path as well as .name member, and we cannot assume
anything else.  Luckily, the module_add() codepath was the only
problematic one.  It is fairly recent code that comes from 1fa06ce
(submodule: prevent overwriting .gitmodules on path reuse,
2025-07-24).

A helper used by update_submodule() seems to assume that its call to
submodule_from_path() always yields a submodule object without a
failure, which seems to rely on the caller making sure it is the
case.  Leave an assert() with a NEEDSWORK comment there for future
developers to make sure the assumption actually holds.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
The list of supported completions in the header of the file was
mostly written a long time ago when Shawn added the initial version
of this script in 2006.  The list explicitly states that we complete
"common --long-options", which implies that we do not complete
not-so-common ones and single letter options (this text dates back
to May 2007).

Update the description to explicitly state that single-letter
options are not completed.  Also, document that arguments to options
are completed, even for single-letter options (e.g., "git -c <TAB>"
offers configuration variables).

The reason why we do not complete single-letter options is because
it does not seem to help all that much to learn that the command
takes -c, -d, -e options when "git foo -<TAB>" offers these three,
unlike long options that is easier to guess what they are about.

Because this rationale is primarily for our developers, let's leave
it out of the completion script itself, whose messages are entirely
for end-users.  Our developers can run "git blame" to find this
commit as needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit 0e358de (fetch: use batched reference updates, 2025-05-19)
updated the 'git-fetch(1)' command to use batched updates. This batches
updates to gain performance improvements. When fetching references, each
update is added to the transaction. Finally, when committing, individual
updates are allowed to fail with reason, while the transaction itself
succeeds.

One scenario which was missed here, was fetching tags. When fetching
conflicting tags, the `fetch_and_consume_refs()` function returns '1',
which skipped committing the transaction and directly jumped to the
cleanup section. This mean that no updates were applied. This also
extends to backfilling tags which is done when fetching specific
refspecs which contains tags in their history.

Fix this by committing the transaction when we have an error code and
not using an atomic transaction. This ensures other references are
applied even when some updates fail.

The cleanup section is reached with `retcode` set in several scenarios:

   - `truncate_fetch_head()`, `open_fetch_head()` and `prune_refs()` set
     `retcode` before the transaction is created, so no commit is
     attempted.

   - `fetch_and_consume_refs()` and `backfill_tags()` are the primary
     cases this fix targets, both setting a positive `retcode` to
     trigger the committing of the transaction.

This simplifies error handling and ensures future modifications to
`do_fetch()` don't need special handling for batched updates.

Add tests to check for this regression. While here, add a missing
cleanup from previous test.

Reported-by: David Bohman <debohman@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression introduced with batched updates in 0e358de (fetch:
use batched reference updates, 2025-05-19) when fetching references. In
the `do_fetch()` function, we jump to cleanup if committing the
transaction fails, regardless of whether using batched or atomic
updates. This skips three subsequent operations:

  - Update 'FETCH_HEAD' as part of `commit_fetch_head()`.

  - Add upstream tracking information via `set_upstream()`.

  - Setting remote 'HEAD' values when `do_set_head` is true.

For atomic updates, this is expected behavior. For batched updates,
we want to continue with these operations even if some refs fail to
update.

Skipping `commit_fetch_head()` isn't actually a regression because
'FETCH_HEAD' is already updated via `append_fetch_head()` when not
using '--atomic'. However, we add a test to validate this behavior.

Skipping the other two operations (upstream tracking and remote HEAD)
is a regression. Fix this by only jumping to cleanup when using
'--atomic', allowing batched updates to continue with post-fetch
operations. Add tests to prevent future regressions.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
…zero-array

* tc/last-modified-active-paths-optimization:
  last-modified: fix use of uninitialized memory
Introduce a new macro MEMZERO_ARRAY() that zeroes the memory allocated
by ALLOC_ARRAY() and friends. And add coccinelle rule to enforce the use
of this macro.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit a new coccinelle rule is added. But neiter
`make coccicheck` nor `meson compile coccicheck` did detect a case in
builtin/last-modified.c.

This case involves the field `scratch` in `struct last_modified`. This
field is of type `struct bitmap` and that struct has a member
`eword_t *words`. Both are defined in `ewah/ewok.h`. Now, while
builtin/last-modified.c does include that header (with the subdir in the
#include directive), it seems coccinelle does not process it. So it's
unaware of the type of `words` in the bitmap, and it doesn't recognize
the rule from previous commit that uses:

    type T;
    T *ptr;

Fix coccicheck by passing all possible include paths inside the Git
project so spatch(1) can find the headers and can determine the types.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A repo may have config options set by 'scalar clone' or 'scalar
register' and then updated by 'scalar reconfigure'. It can be helpful to
point out which of those options were set by the latest scalar
recommendations.

Add "# set by scalar" to the end of each config option to assist users
in identifying why these config options were set in their repo. Use a new
helper method to simplify the two callsites.

Co-authored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index.skipHash config option has been set to 'false' by Scalar since
4933152 (scalar: enable path-walk during push via config, 2025-05-16)
but that commit message is trying to communicate the exact opposite:
that the 'true' value is what we want instead. This means that we've
been disabling this performance benefit for Scalar repos
unintentionally.

Fix this issue before we add justification for the config options set in
this list.

Oddly, enabling index.skipHash causes a test issue during 'test_commit'
in one of the Scalar tests when GIT_TEST_SPLIT_INDEX is enabled (as
caught by the linux-test-vars build). I'm fixing the test by disabling
the environment variable, but the issue should be resolved in a series
focused on the split index.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These config values were added in the original Scalar contribution,
d0feac4 (scalar: 'register' sets recommended config and starts
maintenance, 2021-12-03), but were never fully checked for validity in
the upstream Git project. At the time, Scalar was only intended for the
contrib/ directory so did not have as rigorous of an investigation.

Each config option has its own justification for removal:

* core.preloadIndex: This value is true by default, now. Removing this
  causes some changes required to the tests that checked this config
  value. Use gui.gcwarning=false instead.

* core.fscache: This config does not exist in the core Git project, but
  is instead a config option for a Git for Windows feature.

* core.multiPackIndex: This config value is now enabled by default, so
  does not need to be called out specifically. It was originally
  included to make sure the background maintenance that created
  multi-pack-indexes would result in the expected performance
  improvements.

* credential.validate: This option is not something specific to Git but
  instead an older version of Git Credential Manager for Windows. That
  software was replaced several years ago by the cross-platform Git
  Credential Manger so this option is no longer needed to help users who
  were on that older software.

* pack.useSparse=true: This value is now Git's default as of de3a864
  (config: set pack.useSparse=true by default, 2020-03-20) so we don't
  need it set by Scalar.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The config values set by Scalar went through an audit in the previous
changes, so now reorganize the settings and simplify their purpose.

First, alphabetize the config options, except put the platform-specific
options at the end. This groups two Windows-specific settings and only
one non-Windows setting.

Also, this removes the 'overwrite_on_reconfigure' setting for many of
these options. That setting made nearly all of these options "required"
for scalar enlistments, restricting use for users. Instead, now nearly
all options have removed this setting.

However, there is one setting that still has this, which is
index.skipHash, which was previously being set to _false_ when we
actually prefer the value of true. Keep the overwrite here to help
Scalar users upgrade to the new version. We may remove that overwrite in
the future once we belive that most of the users who have the false
value have upgraded to a version that overwrites that to 'true'.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tc/memzero-array:
  contrib/coccinelle: pass include paths to spatch(1)
  git-compat-util: introduce MEMZERO_ARRAY() macro
  last-modified: fix use of uninitialized memory
Telling the user "you got some error messages" without showing what
the errors are is almost useless in CI environment, as the errors
cannot be examined without downloading build artifacts.

Arrange it to spew out the output when it fails.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Existing code in files that have been fairly stable trigger the
"make coccicheck" suggestions due to the new check.

Rewrite them to use MEMZERO_ARRAY()

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add user-facing documentation that justifies the values being set by
'scalar clone', 'scalar register', and 'scalar reconfigure'.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
…les-r-find-copies-fix

* rs/diff-index-find-copies-harder-optim:
  diff-index: don't queue unchanged filepairs with diff_change()
Copy detection cannot work when comparing the index to the working tree
because Git ignores files that it is not explicitly told to track.  It
should work in the other direction, though, i.e. for a reverse diff of
the deletion of a copy from the index.

d1f2d7e (Make run_diff_index() use unpack_trees(), not read_tree(),
2008-01-19) broke it with a seemingly stray change to run_diff_files().

We didn't notice because there's no test for that.  But even if we had
one, it might have gone unnoticed because the breakage only happens
with index preloading, which requires at least 1000 entries (more than
most test repos have) and is racy because it runs in parallel with the
actual command.

Fix copy detection by queuing up-to-date and skip-worktree entries using
diff_same().

While at it, use diff_same() also for queuing unchanged files not
flagged as up-to-date, i.e. clean submodules and entries where
preloading was not done at all or not quickly enough.  It uses less
memory than diff_change() and doesn't unnecessarily set the diff flag
has_changes.

Add two tests to cover running both without and with preloading.  The
first one passes reliably with the original code.  The second one
enables preloading and thus is racy.  It has a good chance to pass even
without the fix, but fails within seconds when running the test script
with --stress.  With the fix it runs fine for several minutes, until
my patience runs out.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation updates.

* ds/doc-scalar-config:
  scalar: document config settings
  scalar: alphabetize and simplify config
  scalar: remove stale config values
  scalar: use index.skipHash=true for performance
  scalar: annotate config file with "set by scalar"
"git submodule add" to add a submodule under <name> segfaulted,
when a submodule.<name>.something is already in .gitmodules file
without defining where its submodule.<name>.path is, which has been
corrected.

* jc/submodule-add:
  submodule add: sanity check existing .gitmodules
In-code comment update to clarify that single-letter options are
outside of the scope of command line completion script.

* jc/completion-no-single-letter-options:
  completion: clarify support for short options and arguments
MEMZERO_ARRAY() helper is introduced to avoid clearing only the
first N bytes of an N-element array whose elements are larger than
a byte.

* tc/memzero-array:
  contrib/coccinelle: pass include paths to spatch(1)
  git-compat-util: introduce MEMZERO_ARRAY() macro
Further application of MEMZERO_ARRAY() macro to the rest of the
code base.

* jc/memzero-array:
  cocci: use MEMZERO_ARRAY() a bit more
  coccicheck: emit the contents of cocci patch
"git diff-files -R --find-copies-harder" has been taught to use
the potential copy sources from the index correctly.

* rs/diff-files-r-find-copies-fix:
  diff-files: fix copy detection
"git fetch" that involves fetching tags, when a tag being fetched
needs to overwrite existing one, failed to fetch other tags, which
has been corrected.

* kn/fix-fetch-backfill-tag-with-batched-ref-updates:
  fetch: fix failed batched updates skipping operations
  fetch: fix non-conflicting tags not being committed
  fetch: extract out reference committing logic
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@pull pull bot locked and limited conversation to collaborators Dec 23, 2025
@pull pull bot added the ⤵️ pull label Dec 23, 2025
@pull pull bot merged commit 66ce5f8 into turkdevops:master Dec 23, 2025
2 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants