@youknowone
Member

@youknowone youknowone commented Dec 8, 2025

Summary by CodeRabbit

  • Bug Fixes

    • Improved file system directory scanning performance on Windows through optimized metadata handling.
  • Refactor

    • Updated lstat() function signature to accept file paths only instead of file paths or file descriptors.

@coderabbitai
Contributor

coderabbitai bot commented Dec 8, 2025

Walkthrough

Changes to crates/vm/src/stdlib/os.rs optimize Windows directory iteration by pre-caching lstat metadata in DirEntry objects using OnceCell. The public lstat function signature changed from accepting OsPathOrFd<'_> to OsPath, with implementation forwarding to stat with FollowSymlinks(false). Platform-conditional cache initialization handles Windows and non-Windows paths separately.

Changes

Cohort: Windows lstat pre-caching in directory iteration
File(s): crates/vm/src/stdlib/os.rs
Summary: Introduced Windows-specific pre-caching of lstat data during scandir iteration via win32_xstat metadata, storing result in OnceCell per entry. DirEntry construction now uses locally computed pathval and assigns pre-cached lstat cell. Non-Windows platforms initialize lstat as empty OnceCell. Public lstat function signature changed from OsPathOrFd<'_> to OsPath, with call forwarding to stat(..., FollowSymlinks(false), vm).
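
The platform-conditional cache initialization summarized above could look roughly like the following. This is a minimal sketch, not the code from the PR: `Entry`, `initial_lstat_cell`, and `scan` are hypothetical names, and `std::fs::symlink_metadata` stands in for `win32_xstat`, whose signature is not shown in this thread.

```rust
use std::cell::OnceCell;
use std::fs::Metadata;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for DirEntry; not RustPython's actual type.
pub struct Entry {
    pub path: PathBuf,
    /// lstat result, filled at most once.
    pub lstat: OnceCell<Metadata>,
}

/// Build the lstat cell for one entry.
/// On Windows the cell is filled eagerly; elsewhere it starts empty.
fn initial_lstat_cell(path: &Path) -> OnceCell<Metadata> {
    let cell = OnceCell::new();
    if cfg!(windows) {
        // Assumption: symlink_metadata stands in for win32_xstat here.
        // A failed lookup is ignored and simply leaves the cell empty.
        if let Ok(meta) = std::fs::symlink_metadata(path) {
            let _ = cell.set(meta);
        }
    }
    cell
}

/// Iterate a directory, computing the path once and reusing it
/// for both the pre-cache step and the entry itself.
pub fn scan(dir: &Path) -> std::io::Result<Vec<Entry>> {
    let mut out = Vec::new();
    for item in std::fs::read_dir(dir)? {
        let path = item?.path();
        let lstat = initial_lstat_cell(&path);
        out.push(Entry { path, lstat });
    }
    Ok(out)
}
```

With this shape, an entry produced on a non-Windows platform carries an empty cell, so the first lstat request on it still has to query the file system.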

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas requiring extra attention:

  • Public API signature change for lstat(): verify all call sites are compatible with the new OsPath-only parameter
  • Windows-specific pre-caching logic: ensure win32_xstat integration correctly populates OnceCell and handles edge cases (removed files, permission errors)
  • Platform-conditional initialization: confirm non-Windows codepaths maintain original OnceCell behavior and performance characteristics
  • DirEntry construction: validate that pre-cached vs. empty lstat assignments don't introduce cache inconsistencies or memory leaks

Poem

🐰 Hops through Windows paths with glee,
Caching stat before it's free,
OnceCell guards the metadata's keep,
While scandir's cache runs swift and deep!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title directly describes the main changes: fixing scandir/lstat functionality for Windows platform.
  • Docstring coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between abfd148 and 207a456.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_os.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • crates/vm/src/stdlib/os.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/os.rs
🧠 Learnings (3)
📓 Common learnings
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:

  • crates/vm/src/stdlib/os.rs
🧬 Code graph analysis (1)
crates/vm/src/stdlib/os.rs (1)
crates/vm/src/windows.rs (1)
  • win32_xstat (86-92)
🔇 Additional comments (3)
crates/vm/src/stdlib/os.rs (3)

723-740: Windows lstat pre-caching optimization looks correct.

The approach of pre-caching lstat metadata from directory entry on Windows is appropriate. Silently ignoring win32_xstat failures is acceptable since DirEntry.stat() will fall back to computing on demand when the cell is empty. This aligns with CPython's optimization strategy for Windows scandir.
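
To illustrate why dropping a failed pre-cache attempt is safe, here is a small sketch of the "pre-fill if possible, otherwise compute on demand" pattern. `LazyStat`, `prefilled`, and `get_or_compute` are hypothetical names chosen for this example and do not appear in the PR.

```rust
use std::cell::OnceCell;
use std::io;

/// Hypothetical cache wrapper around a value that may be pre-filled.
pub struct LazyStat<T> {
    cell: OnceCell<T>,
}

impl<T> LazyStat<T> {
    /// Try to pre-fill the cache. An error is dropped on purpose,
    /// so directory iteration itself never fails at this step.
    pub fn prefilled(attempt: io::Result<T>) -> Self {
        let cell = OnceCell::new();
        if let Ok(value) = attempt {
            let _ = cell.set(value);
        }
        Self { cell }
    }

    /// Return the cached value, or compute it now and cache it.
    /// An error from `compute` is returned to the caller and nothing is cached.
    pub fn get_or_compute(
        &self,
        compute: impl FnOnce() -> io::Result<T>,
    ) -> io::Result<&T> {
        if let Some(value) = self.cell.get() {
            return Ok(value);
        }
        let value = compute()?;
        Ok(self.cell.get_or_init(|| value))
    }
}
```

An error that was swallowed during iteration is not lost for good: if the entry is still inaccessible when its stat is requested, the on-demand call reports the failure at that point.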


742-754: DirEntry construction properly uses pre-cached lstat.

The pathval extraction moved earlier to share with the pre-caching logic, and the lstat field correctly uses the pre-cached OnceCell. The stat field remains separately cached since it may differ when following symlinks.
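
A sketch of why the two caches are kept apart, assuming a hypothetical `CachedEntry` type (not RustPython's DirEntry) and using `std::fs::metadata` and `std::fs::symlink_metadata` as the following and non-following lookups:

```rust
use std::cell::OnceCell;
use std::fs::{self, Metadata};
use std::io;
use std::path::PathBuf;

/// Hypothetical entry with one cache per lookup mode.
pub struct CachedEntry {
    path: PathBuf,
    stat: OnceCell<Metadata>,  // result when following symlinks
    lstat: OnceCell<Metadata>, // result for the link itself
}

impl CachedEntry {
    pub fn new(path: PathBuf) -> Self {
        Self { path, stat: OnceCell::new(), lstat: OnceCell::new() }
    }

    /// Look up metadata, caching each mode independently.
    pub fn stat(&self, follow_symlinks: bool) -> io::Result<&Metadata> {
        let cell = if follow_symlinks { &self.stat } else { &self.lstat };
        if let Some(meta) = cell.get() {
            return Ok(meta);
        }
        let meta = if follow_symlinks {
            fs::metadata(&self.path)?
        } else {
            fs::symlink_metadata(&self.path)?
        };
        Ok(cell.get_or_init(|| meta))
    }
}
```

Because the cells are independent, filling one never satisfies a request for the other. For a symlink the two results describe different files, so sharing a single cell would return the wrong metadata for one of the modes.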


994-1001: Signature change correctly aligns with CPython's os.lstat() specification.

Python's os.lstat() accepts only path-like objects as the primary argument; it does not support file descriptors as the path. For file descriptor access, Python provides os.fstat(fd). The previous OsPathOrFd<'_> parameter was overly permissive and inconsistent with CPython's API. This change correctly restricts lstat() to paths only, matching the Python specification.
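
The forwarding described in this comment can be sketched in plain Rust as follows. This is an illustration under stated assumptions, not the RustPython implementation: the `FollowSymlinks` newtype here is a local stand-in modeled on the name used in the review, the functions take `&Path` instead of `OsPath`, and there is no `vm` argument.

```rust
use std::fs::{self, Metadata};
use std::io;
use std::path::Path;

/// Local stand-in for the FollowSymlinks flag named in the review.
#[derive(Clone, Copy)]
pub struct FollowSymlinks(pub bool);

/// Look up metadata, either following symlinks or describing the link itself.
pub fn stat(path: &Path, follow: FollowSymlinks) -> io::Result<Metadata> {
    if follow.0 {
        fs::metadata(path)
    } else {
        fs::symlink_metadata(path)
    }
}

/// Path-only lstat: the parameter type rules out file descriptors,
/// and the body just forwards to stat without following symlinks.
pub fn lstat(path: &Path) -> io::Result<Metadata> {
    stat(path, FollowSymlinks(false))
}
```

Narrowing the parameter type moves the "no file descriptors" rule from a runtime check into the signature, which is the effect the review attributes to the change from `OsPathOrFd<'_>` to `OsPath`.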


@youknowone youknowone marked this pull request as ready for review December 8, 2025 19:06
@youknowone youknowone merged commit bafaa1a into RustPython:main Dec 8, 2025
13 checks passed
@youknowone youknowone deleted the lstat branch December 8, 2025 19:32
This was referenced Dec 9, 2025