Skip to content

Conversation

@ChuheLin
Copy link

@ChuheLin ChuheLin commented Dec 1, 2025

Summary

This PR fixes a critical crash (WinError 206 / WinError 1921) in shutil.copytree on Windows when encountering directory junctions that point to a parent directory (recursive cycles).

The Problem

On Windows, shutil.copytree contains specific logic to force traversal into Directory Junctions by treating them as standard directories (is_symlink is forced to False).

However, unlike os.walk or shutil.rmtree (after recent fixes), copytree lacked a cycle detection mechanism for these traversed junctions. This caused infinite recursion until the path exceeded the OS limit, resulting in an unhandled OSError (Crash) rather than a Python RecursionError.

The Fix

I implemented cycle detection using file system identity (st_dev, st_ino), similar to how deepcopy handles recursion:

  1. Added a hidden _seen set parameter to copytree and its helper _copytree.
  2. Before entering a directory, the code now checks if (st.st_dev, st.st_ino) is already in _seen.
  3. If a cycle is detected, it raises a shutil.Error("Infinite recursion detected") immediately, preventing the crash.

Verification

  1. Regression Test: Added test_copytree_recursive_junction to Lib/test/test_shutil.py.
    • Status: PASSED on local Windows build.
  2. Manual Verification: Verified with a reproduction script that previously caused a WinError 206 crash.
    • Before fix: Script crashed after creating a path > 4000 chars.
    • After fix: Script correctly raises shutil.Error: Infinite recursion detected.

@ChuheLin ChuheLin requested a review from giampaolo as a code owner December 1, 2025 17:12
@bedevere-app
Copy link

bedevere-app bot commented Dec 1, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@ChuheLin ChuheLin changed the title gh-142155: Fix infinite recursion in shutil.copytree on Windows junct… gh-142155: Fix infinite recursion in shutil.copytree on Windows junctions Dec 1, 2025

def copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2,
ignore_dangling_symlinks=False, dirs_exist_ok=False):
ignore_dangling_symlinks=False, dirs_exist_ok=False, _seen=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, but I think adding a private parameter to a public function is a little odd. I see we added the _seen here because _copytree calls the public copytree, so we need to add it here too.

I wonder if we could just change _copytree to call itself recursively instead of adding this private parameter. This would also avoid triggering the audit recursively (should we do that?).

If we should trigger the audit recursively, maybe we could create another private function for the recursive logic and let _copytree call it, with a public wrapper that keeps the original function signature (without adding the _seen).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. It is better to have a private helper for that.

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to have a recursive guard, use a frozenset() as a default and update the frozenset every time. We do this in typing.py I think or maybe elsewhere (search for recursive_guard in Lib)

Lib/shutil.py Outdated
Comment on lines 553 to 560
if _seen is None:
_seen = set()
src_st = os.stat(src)
src_id = (src_st.st_dev, src_st.st_ino)
if src_id in _seen:
raise Error([(src, dst, "Infinite recursion detected")])
_seen = _seen.copy()
_seen.add(src_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is costly. We do a copy every time.

Comment on lines +1124 to +1126
except Exception as e:
# Skip if mklink is not available or fails for any reason
self.skipTest(f"Failed to create junction: {e}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a good idea. Can't we create a junction from Python directly? (cc @barneygale).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not addressed.

Comment on lines 1110 to 1112
src_dir = os.path.join(base_dir, "Source")
junction_dir = os.path.join(src_dir, "Junction")
os.makedirs(junction_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use lower-case letters or is it not possible for junctions?

@bedevere-app
Copy link

bedevere-app bot commented Dec 3, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@ChuheLin ChuheLin requested a review from picnixz December 3, 2025 18:45
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Address all changes before requesting a new review please, either by adding a comment if I was not clear or by changing the code.

Comment on lines +553 to +554
if _seen is None:
_seen = frozenset()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand why we do this. Why not just using a default argument? (it is no more mutable so it is fine)

# Track visited directories to detect cycles (e.g., Windows junctions)
if _seen is None:
_seen = frozenset()
src_st = os.stat(src)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call can fail and (can be) expensive. But I assume that we need it right?


def copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2,
ignore_dangling_symlinks=False, dirs_exist_ok=False):
ignore_dangling_symlinks=False, dirs_exist_ok=False, _seen=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. It is better to have a private helper for that.

os.makedirs(junction_dir)

# Create a junction pointing to its parent, creating a cycle
junction_target = os.path.dirname(junction_dir) # Points to Source
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
junction_target = os.path.dirname(junction_dir) # Points to Source
junction_target = os.path.dirname(junction_dir)

self.skipTest(f"Failed to create junction: {e}")

# Create destination directory
dst_dir = os.path.join(base_dir, "Dest")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dest -> dest

Comment on lines +1124 to +1126
except Exception as e:
# Skip if mklink is not available or fails for any reason
self.skipTest(f"Failed to create junction: {e}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not addressed.

@ChuheLin
Copy link
Author

ChuheLin commented Dec 3, 2025

Address all changes before requesting a new review please, either by adding a comment if I was not clear or by changing the code.

Thanks for the detailed review! It's getting late in my time zone, so I'm going to get some rest now. I will go through all your comments and update the code first thing tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants