Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Dec 2, 2025

I am considering new subclass payload layout to provide

  1. Actually subclass typed function parameters, like using PyRef instead PyRef
  2. adding new fields to the subclass payload, which cause error right now.

I will left open questions on the code.

Summary by CodeRabbit

  • New Features

    • Centralized OS/errno error construction with OS-subtype exceptions; SSL/socket errors now surface as OS-derived errors.
    • New subclass/base support: safe upcast/into_base operations and PySubclass support for transparent subclasses.
  • Bug Fixes

    • Runtime validation for subclass base-field layout and size-aware downcast errors.
    • Exception payloads now expose errno/strerror/filename fields with getters/setters.
  • Refactor

    • Widespread conversion to transparent-wrapper/base-embedded patterns across ctypes, IO, bool/int, AST, struct-sequence and exception types.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 2, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Adds formal subclass support and base‑payload wiring across derive macros, VM core, exceptions, ctypes, stdlib, and IO: introduces PySubclass and PyRef upcast/into_base, validates pyclass base-field layout, converts many unit-like types to repr(transparent) newtypes or base‑embedded structs, centralizes OS/PyOSError subtype construction, and adds size-aware downcast checks.

Changes

Cohort / File(s) Summary
Derive macro & pyclass
crates/derive-impl/src/pyclass.rs
Validate that a pyclass base (if present) is the first field; add type_matches_path; detect #[repr(transparent)] and emit PySubclass impls for tuple-newtypes; forward base info into generated PyPayload and class-resolution; stop emitting payload in exception macro path.
Struct-sequence wrapper
crates/derive-impl/src/pystructseq.rs
Replace unit-like struct with #[repr(transparent)] newtype wrapping PyTuple; implement PyPayload, MaybeTraverse, and PySubclass delegating to inner PyTuple.
Core type system & payload
crates/vm/src/class.rs, crates/vm/src/object/core.rs, crates/vm/src/object/payload.rs
Add public PySubclass trait; add PyRef::into_base and PyRef::upcast; introduce cold_downcast_type_error; add size compatibility check in PyPayload::into_ref_with_type producing size-aware type errors.
Exceptions & VM constructors
crates/vm/src/exceptions.rs, crates/vm/src/exception_group.rs, crates/vm/src/vm/vm_new.rs
Convert many exception types to #[repr(transparent)] newtypes or field-backed structs; implement struct-backed PyOSError with fields/getters/setters, Traverse, py_new/slot_new; add new_os_error / new_os_subtype_error and update new_errno_error to return PyRef<PyOSError>.
Stdlib socket & ssl error paths
crates/stdlib/src/socket.rs, crates/stdlib/src/ssl.rs, crates/stdlib/src/ssl/compat.rs
Switch socket/SSL error construction to OS-subtype PyOSError paths (new_os_subtype_error); add From<PyRef<PyOSError>> conversions; update helpers to return PyRef<PyOSError> and upcast where needed; wrap SSL exception types as transparent newtypes.
Builtins: bool / int and structseq usage
crates/vm/src/builtins/bool.rs, crates/vm/src/builtins/int.rs, crates/vm/src/types/structseq.rs
Make PyBool a #[repr(transparent)] pub struct PyBool(pub PyInt) with ctx; update boolean construction/upcast and truth extraction; adjust PyStructSequence::from_data to use Self::Data::into_tuple(...) and guard tuple conversion.
AST node wrappers
crates/vm/src/stdlib/ast/pyast.rs
Emit #[repr(transparent)] single-field wrapper structs for AST node types instead of unit-like types.
ctypes base integration
crates/vm/src/stdlib/ctypes/*
(base.rs, array.rs, pointer.rs, structure.rs, union.rs, field.rs, function.rs, ctypes.rs)
Add CDataObject and PyCData::new; convert many ctypes metatypes to #[repr(transparent)] wrappers; remove PyPayload derives from concrete ctypes types and add explicit _base: PyCData fields; constructors now create a shared CDataObject and initialize _base and cdata consistently.
IO module refactor
crates/vm/src/stdlib/io.rs
Add unsupported_operation() accessor; embed _base fields across IO classes (_IOBase, _RawIOBase, _BufferedIOBase, _TextIOBase, FileIO, TextIOWrapper, StringIO, BytesIO, Buffered*); change __closed getter to return PyRef<PyBool>.
Misc. struct/newtype conversions
crates/vm/src/stdlib/ast/pyast.rs, crates/vm/src/exception_group.rs, crates/vm/src/stdlib/ctypes/*, crates/vm/src/vm/context.rs, crates/vm/src/stdlib/ctypes/*
Convert multiple unit-like types to repr(transparent) newtypes or add _base fields across various subsystems (AST, exceptions, ctypes, IO, context).
Test snippet
extra_tests/snippets/stdlib_ctypes.py
Add a module-level print("done") (test-only change).

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Macro as derive-impl (proc-macro)
    participant VM as VM / type registry
    participant Payload as PyPayload::into_ref_with_type
    participant PyRef as runtime PyRef

    Macro->>VM: emit type metadata (static_type, payload_type_id, base info) and PySubclass impl for repr(transparent) newtypes
    VM->>VM: register types and base/subclass relationships
    PyRef->>Payload: into_ref_with_type(obj, cls, exact_class)
    alt exact_class == cls
        Payload-->>PyRef: create reference via _into_ref
    else exact_class is subclass of cls
        Payload->>Payload: check size compatibility
        alt sizes match
            Payload-->>PyRef: create reference via _into_ref
        else sizes differ
            Payload-->>PyRef: Err(cold_downcast_type_error(vm, class, obj))
        end
    else not subclass
        Payload-->>PyRef: Err(downcast type error)
    end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Areas to focus during review:

  • crates/vm/src/exceptions.rs — PyOSError layout, constructor/slot_init, getters/setters, Traverse and PySubclass glue.
  • crates/vm/src/stdlib/ctypes/* — consistent creation of CDataObject and initialization of _base/cdata across constructors and factory paths.
  • crates/derive-impl/src/pyclass.rs — base-field validation, type_matches_path behavior, and correctness of emitted PySubclass/payload forwarding.
  • crates/vm/src/object/payload.rs & crates/vm/src/object/core.rs — size-aware downcast logic, cold_downcast_type_error, and safety of PyRef::into_base / upcast (use of unreachable/debug_assert).
  • stdlib socket/ssl paths — ensure upcasted PyOSError returns preserve expected Python-visible error types/messages.

Possibly related issues

  • #6054 — Overlaps with PyPayload/static-type generation and subclass/payload wiring; this PR's derive-impl and payload changes address that area.

Possibly related PRs

  • #6309 — Strong overlap in ctypes subsystem changes (types, constructors, CDataObject/PyCData wiring).
  • #6311 — Overlapping ctypes refactor concerns and base wiring patterns.
  • #6242 — Related derive-impl changes around pyclass base handling and generation.

Suggested reviewers

  • coolreader18
  • arihant2math

Poem

🐰
I hopped through structs with tiny feet,
Wrapped bases tidy, fields complete,
Exceptions donned transparent coats,
Upcast, downcast — nimble rabbit notes,
A little hop as types find their seat.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'New subclass payload layout' directly describes the main architectural change in this PR, which introduces a new subclass payload system via PySubclass trait and transparent wrappers.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9017be0 and 99a3b04.

📒 Files selected for processing (21)
  • crates/derive-impl/src/pyclass.rs (8 hunks)
  • crates/stdlib/src/socket.rs (4 hunks)
  • crates/stdlib/src/ssl.rs (20 hunks)
  • crates/stdlib/src/ssl/compat.rs (5 hunks)
  • crates/vm/src/builtins/int.rs (1 hunks)
  • crates/vm/src/exception_group.rs (1 hunks)
  • crates/vm/src/exceptions.rs (13 hunks)
  • crates/vm/src/object/core.rs (2 hunks)
  • crates/vm/src/stdlib/ctypes.rs (3 hunks)
  • crates/vm/src/stdlib/ctypes/array.rs (8 hunks)
  • crates/vm/src/stdlib/ctypes/base.rs (4 hunks)
  • crates/vm/src/stdlib/ctypes/field.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/function.rs (6 hunks)
  • crates/vm/src/stdlib/ctypes/pointer.rs (7 hunks)
  • crates/vm/src/stdlib/ctypes/structure.rs (6 hunks)
  • crates/vm/src/stdlib/ctypes/union.rs (6 hunks)
  • crates/vm/src/stdlib/io.rs (22 hunks)
  • crates/vm/src/stdlib/os.rs (1 hunks)
  • crates/vm/src/vm/context.rs (7 hunks)
  • crates/vm/src/vm/vm_new.rs (3 hunks)
  • extra_tests/snippets/stdlib_ctypes.py (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment on lines 92 to 94
#[inline]
fn payload_type_id() -> std::any::TypeId {
std::any::TypeId::of::<PyInt>()
}
Copy link
Member Author

@youknowone youknowone Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is okay here, because PyInt and PyBool shares same payload. The only difference is type.

But implementing this makes PyPayload::downcastable to be automatically true for any kind of PyInt. It doesn't make failing tests on this patch because we don't use that pattern. But it obviously will be a sort of trap for future developing.

Another question is that PyPayload::downcastable only takes &PyObject currently. We can't compare it to actual type... because we don't have vm.ctx.

#[pyclass(name = "bool", module = false, base = PyInt)]
pub struct PyBool;
#[repr(transparent)]
pub struct PyBool(pub PyInt);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we have to manually define the base layout to be always leading field.
The problem is Rust actually only guarantees it to be actually same layout only when #[repr(transparent)] is used.

What I didn't check yet: Can using #[repr(C)] helps? Is there any other repr options to keep the first field as same as its original layout?

@youknowone youknowone force-pushed the new-subclass branch 2 times, most recently from e312bbf to 8db2660 Compare December 2, 2025 05:20
Copy link
Collaborator

@ShaharNaveh ShaharNaveh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks super cool. I am a bit worried that we are not actually using it anywhere ATM so we can't validate that it's not a breaking change.

How difficult would it be to gradually transition to pattern?

@youknowone youknowone force-pushed the new-subclass branch 2 times, most recently from afe436d to 4e2edcc Compare December 2, 2025 12:21

#[inline]
fn validate_downcastable_from(obj: &::rustpython_vm::PyObject) -> bool {
<Self as ::rustpython_vm::class::PyClassDef>::BASICSIZE <= obj.class().slots.basicsize && obj.class().fast_issubclass(<Self as ::rustpython_vm::class::StaticType>::static_type())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I found another way to do with more tricks. Let's check it with OSError

@youknowone youknowone force-pushed the new-subclass branch 4 times, most recently from d2f933e to 84f4c37 Compare December 12, 2025 07:34
@github-actions
Copy link
Contributor

Code has been automatically formatted

The code in this PR has been formatted using cargo fmt --all.
Please pull the latest changes before pushing again:

git pull origin new-subclass

@youknowone youknowone force-pushed the new-subclass branch 8 times, most recently from f96cee4 to 86de80e Compare December 13, 2025 08:47
@youknowone
Copy link
Member Author

@ShaharNaveh This is almost done! Could you give a look again?

@youknowone
Copy link
Member Author

One question: Passing OSError to new_exception is a runtime error. Can this be checked earlier?

Copy link
Collaborator

@ShaharNaveh ShaharNaveh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I think it's a much needed feature.

My only comments are nitpicks really, not logic related

@youknowone youknowone force-pushed the new-subclass branch 2 times, most recently from 6efa248 to ca84ccf Compare December 14, 2025 01:30
@youknowone youknowone force-pushed the new-subclass branch 3 times, most recently from 64a8eea to 6564f23 Compare December 14, 2025 03:12
@youknowone youknowone marked this pull request as ready for review December 14, 2025 03:36
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
crates/vm/src/stdlib/ctypes/array.rs (2)

241-248: Same redundant storage pattern as PyCUnion.

PyCArray stores both _base: PyCData and cdata: PyRwLock<CDataObject>. The same concern applies here: cloning CDataObject duplicates the buffer memory and could lead to inconsistencies if modifications are made through one field but not reflected in the other.

This pattern appears in all initialization paths (lines 198-204, 307-313, 538-544, 606-612, 665-671, 752-758). Consider consolidating to use a single source of truth for the data.


198-214: Verify that cdata.base assignment doesn't cause issues with cloned data.

At line 210-211, after creating the instance with cloned CDataObject, the code sets array_ref.cdata.write().base = Some(dll). This only modifies the cdata field's CDataObject, not the one in _base. If _base is ever used to access the base reference, it won't have this information.

crates/stdlib/src/ssl.rs (1)

3298-3304: Critical: Incorrect return type signature.

The read method signature on line 3304 shows -> PyResult without a type parameter, but all return paths in the function body (lines 3333-3335, 3375, 3396, 3410, etc.) return PyObjectRef values. This is either a compilation error or a type mismatch.

Apply this fix:

         fn read(
             &self,
             len: OptionalArg<isize>,
             buffer: OptionalArg<ArgMemoryBuffer>,
             vm: &VirtualMachine,
-        ) -> PyResult {
+        ) -> PyResult<PyObjectRef> {
crates/vm/src/stdlib/io.rs (1)

20-74: Remove the unsafe transmute and avoid panics in error conversion.

unsafe { std::mem::transmute(exc) } (Line 73) is not a safe way to return PyBaseExceptionRef, and unwrap() during exception construction (Line 70) can panic while already handling an error.

Suggested fix (use proper upcast + don’t panic):

@@
-        #[allow(clippy::let_and_return)]
-        let exc = vm.new_errno_error(errno, msg);
+        let exc = vm.new_errno_error(errno, msg);
@@
-            exc.as_object()
-                .set_attr("winerror", vm.new_pyobj(winerror), vm)
-                .unwrap();
+            let _ = exc
+                .as_object()
+                .set_attr("winerror", vm.new_pyobj(winerror), vm);
         }
-        // FIXME:
-        unsafe { std::mem::transmute(exc) }
+        exc.upcast()
     }
 }
crates/vm/src/exceptions.rs (2)

1551-1601: Avoid unsafe swap(...) + unwrap() in PyOSError::slot_init (panic/UB risk).

  • zelf.downcast_ref::<PyOSError>().unwrap() (Line 1560) can panic during exception initialization.
  • unsafe { exc.<field>.swap(...) } (Line 1567+) relies on “only called once / fields are None”, which is not a safe invariant (Python can re-run __init__), and may violate whatever safety contract PyAtomicRef::swap has.

Consider switching to checked downcast + safe swap helpers (you already use swap_to_temporary_refs in setters):

-            #[allow(deprecated)]
-            let exc: &Py<PyOSError> = zelf.downcast_ref::<PyOSError>().unwrap();
+            #[allow(deprecated)]
+            let exc: &Py<PyOSError> = zelf
+                .downcast_ref::<PyOSError>()
+                .ok_or_else(|| vm.new_type_error("OSError.__init__ called on non-OSError instance"))?;
@@
-                    let _ = unsafe { exc.errno.swap(Some(new_args.args[0].clone())) };
-                    let _ = unsafe { exc.strerror.swap(Some(new_args.args[1].clone())) };
+                    exc.errno
+                        .swap_to_temporary_refs(Some(new_args.args[0].clone()), vm);
+                    exc.strerror
+                        .swap_to_temporary_refs(Some(new_args.args[1].clone()), vm);
@@
-                    let _ = unsafe { exc.filename.swap(Some(new_args.args[2].clone())) };
+                    exc.filename
+                        .swap_to_temporary_refs(Some(new_args.args[2].clone()), vm);

1603-1681: __str__ and __reduce__ have inconsistent handling of winerror on Windows.

__str__ displays winerror when set (lines 1615–1617), but __reduce__ doesn't preserve it during pickling. The __init__ method accepts winerror at args[3] (line 1575) and stores it separately, then truncates args to 2 elements (line 1597). When __reduce__ reconstructs the exception using the truncated args, winerror is lost. A Windows-only pickle round-trip test should verify that winerror is intentionally not preserved (matching CPython's default behavior) or add it to __reduce__ if preservation is desired.

♻️ Duplicate comments (1)
crates/derive-impl/src/pyclass.rs (1)

718-719: LGTM: Optimization to skip checking for Constructor after found.

This change correctly adds an early exit condition to avoid unnecessary iteration once the Constructor trait is found.

🧹 Nitpick comments (5)
crates/vm/src/object/payload.rs (1)

104-119: Error message may be misleading.

The error message states "cannot create '{}' from a subclass with a different size '{}'", but cls is the subclass and exact_class is the base type. The phrasing suggests the relationship is inverted.

Consider rephrasing for clarity:

-                    vm.new_type_error(format!(
-                        "cannot create '{}' from a subclass with a different size '{}'",
-                        cls.name(),
-                        exact_class.name()
-                    ))
+                    vm.new_type_error(format!(
+                        "cannot create '{}' instance: size differs from base type '{}'",
+                        cls.name(),
+                        exact_class.name()
+                    ))
crates/vm/src/stdlib/os.rs (1)

528-537: Consider simplifying the error conversion.

The current flow creates an error, converts to PyObjectRef, then downcasts back. This roundabout conversion appears to be a workaround for type constraints. If vm.new_errno_error() returns a type compatible with PyResult, the intermediate conversion could potentially be avoided.

However, if this pattern is required by the new subclass payload layout to ensure proper type relationships, please add a brief comment explaining why the conversion is necessary.

         if key.is_empty() || key.contains(&b'=') {
-            let x = vm.new_errno_error(
+            // Note: Convert through PyObjectRef for proper subclass type handling
+            let x = vm.new_errno_error(
                 22,
                 format!(
                     "Invalid argument: {}",
                     std::str::from_utf8(key).unwrap_or("<bytes encoding failure>")
                 ),
             );
-
             let x: PyObjectRef = x.into();
             return Err(x.downcast().unwrap());
         }
crates/vm/src/stdlib/ctypes.rs (1)

372-378: Consider avoiding the clone if CDataObject is large or frequently instantiated.

The pattern creates a CDataObject, then clones it for both _base (via PyCData::new) and cdata. Since CDataObject contains a Vec<u8> buffer, this results in duplicating the buffer allocation. If the intent is for _base and cdata to share the same underlying data, consider using Arc<PyRwLock<CDataObject>> or similar shared ownership. If independent copies are intentional, this is fine.

#!/bin/bash
# Check if other ctypes types follow the same clone pattern and whether buffer sizes are typically small
rg -n "CDataObject::from_bytes" --type=rs -A2 crates/vm/src/stdlib/ctypes/
crates/derive-impl/src/pyclass.rs (1)

363-385: Type matching logic may be overly permissive with last-segment matching.

The type_matches_path function compares types by checking if either the full paths match or if just the last segments match (line 384). The last-segment-only matching could lead to false positives if two different types happen to share the same final identifier (e.g., foo::Bar matching baz::Bar).

Consider whether this permissiveness is intentional for macro hygiene reasons, or if a stricter comparison would be safer.

crates/stdlib/src/ssl.rs (1)

1512-1512: Consider using SSLError for certificate store loading failures.

Line 1512 uses vm.new_os_error() for Windows certificate store failures, while other SSL-related errors in this file use new_os_subtype_error with PySSLError::class(). For consistency and to help users distinguish SSL-specific errors, consider using:

vm.new_os_subtype_error(
    PySSLError::class(&vm.ctx).to_owned(),
    None,
    "Failed to load certificates from Windows store",
)
.upcast()

The same applies to line 1536 in the non-Windows path.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21300f6 and 666979e.

📒 Files selected for processing (27)
  • crates/derive-impl/src/pyclass.rs (8 hunks)
  • crates/derive-impl/src/pystructseq.rs (2 hunks)
  • crates/stdlib/src/socket.rs (4 hunks)
  • crates/stdlib/src/ssl.rs (20 hunks)
  • crates/stdlib/src/ssl/compat.rs (5 hunks)
  • crates/vm/src/builtins/bool.rs (4 hunks)
  • crates/vm/src/builtins/int.rs (1 hunks)
  • crates/vm/src/class.rs (1 hunks)
  • crates/vm/src/exception_group.rs (1 hunks)
  • crates/vm/src/exceptions.rs (13 hunks)
  • crates/vm/src/object/core.rs (2 hunks)
  • crates/vm/src/object/payload.rs (3 hunks)
  • crates/vm/src/stdlib/ast/pyast.rs (14 hunks)
  • crates/vm/src/stdlib/ctypes.rs (3 hunks)
  • crates/vm/src/stdlib/ctypes/array.rs (8 hunks)
  • crates/vm/src/stdlib/ctypes/base.rs (4 hunks)
  • crates/vm/src/stdlib/ctypes/field.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/function.rs (6 hunks)
  • crates/vm/src/stdlib/ctypes/pointer.rs (7 hunks)
  • crates/vm/src/stdlib/ctypes/structure.rs (6 hunks)
  • crates/vm/src/stdlib/ctypes/union.rs (6 hunks)
  • crates/vm/src/stdlib/io.rs (22 hunks)
  • crates/vm/src/stdlib/os.rs (1 hunks)
  • crates/vm/src/types/structseq.rs (1 hunks)
  • crates/vm/src/vm/context.rs (7 hunks)
  • crates/vm/src/vm/vm_new.rs (3 hunks)
  • extra_tests/snippets/stdlib_ctypes.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/ctypes/field.rs
  • crates/vm/src/class.rs
  • crates/vm/src/types/structseq.rs
  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/object/core.rs
  • crates/stdlib/src/ssl/compat.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
  • crates/vm/src/object/payload.rs
  • crates/vm/src/stdlib/os.rs
  • crates/stdlib/src/socket.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/vm/src/exception_group.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/stdlib/ctypes/function.rs
  • crates/vm/src/stdlib/ctypes/union.rs
  • crates/vm/src/builtins/int.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/vm/src/stdlib/ctypes/array.rs
  • crates/vm/src/stdlib/ctypes/pointer.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/builtins/bool.rs
  • crates/vm/src/stdlib/ctypes/base.rs
  • crates/stdlib/src/ssl.rs
  • crates/vm/src/stdlib/ast/pyast.rs
  • crates/vm/src/exceptions.rs
  • crates/vm/src/stdlib/io.rs
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • extra_tests/snippets/stdlib_ctypes.py
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.

Applied to files:

  • crates/vm/src/types/structseq.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • crates/vm/src/types/structseq.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/stdlib/ctypes.rs
  • crates/stdlib/src/socket.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/builtins/bool.rs
  • crates/vm/src/stdlib/ast/pyast.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.

Applied to files:

  • crates/vm/src/object/payload.rs
  • crates/vm/src/stdlib/os.rs
  • crates/stdlib/src/socket.rs
  • crates/vm/src/exception_group.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/stdlib/ast/pyast.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-06-28T16:31:03.991Z
Learnt from: arihant2math
Repo: RustPython/RustPython PR: 5790
File: build.rs:2-2
Timestamp: 2025-06-28T16:31:03.991Z
Learning: In Cargo build scripts (build.rs), the environment variable CARGO_CFG_TARGET_OS is guaranteed to exist and is automatically set by Cargo during the build process, making unwrap() safe to use when accessing this variable.

Applied to files:

  • crates/vm/src/stdlib/os.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.

Applied to files:

  • crates/stdlib/src/socket.rs
  • crates/stdlib/src/ssl.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Minimize modifications to CPython standard library files in the `Lib/` directory; modifications should be minimal and only to work around RustPython limitations

Applied to files:

  • crates/vm/src/stdlib/io.rs
🧬 Code graph analysis (14)
crates/vm/src/class.rs (2)
crates/vm/src/exceptions.rs (1)
  • as_base (1451-1453)
crates/vm/src/object/core.rs (1)
  • Self (371-374)
crates/vm/src/stdlib/ctypes.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
  • from_bytes (184-191)
  • size (211-213)
  • new (223-227)
  • new (261-267)
crates/stdlib/src/ssl/compat.rs (2)
crates/stdlib/src/ssl.rs (4)
  • create_ssl_want_read_error (413-419)
  • create_ssl_want_write_error (421-427)
  • create_ssl_zero_return_error (437-443)
  • create_ssl_eof_error (429-435)
crates/stdlib/src/socket.rs (1)
  • timeout_error_msg (2427-2429)
crates/vm/src/stdlib/ctypes/structure.rs (2)
crates/vm/src/stdlib/ctypes/base.rs (4)
  • from_stg_info (174-181)
  • new (223-227)
  • new (261-267)
  • from_bytes (184-191)
crates/vm/src/stdlib/ctypes/util.rs (1)
  • new (56-67)
crates/vm/src/object/payload.rs (2)
crates/vm/src/object/core.rs (2)
  • class (668-670)
  • obj (1098-1098)
crates/vm/src/object/ext.rs (1)
  • class (488-490)
crates/vm/src/stdlib/os.rs (1)
crates/vm/src/builtins/type.rs (1)
  • x (1152-1152)
crates/derive-impl/src/pyclass.rs (3)
crates/vm/src/object/payload.rs (4)
  • std (31-31)
  • class (54-54)
  • payload_type_id (30-32)
  • validate_downcastable_from (41-43)
crates/vm/src/class.rs (1)
  • as_base (187-187)
crates/vm/src/builtins/type.rs (1)
  • fast_issubclass (629-631)
crates/vm/src/vm/vm_new.rs (1)
crates/vm/src/exceptions.rs (5)
  • errno (1532-1533)
  • errno (1534-1534)
  • errno (1684-1686)
  • errno_to_exc_type (1163-1189)
  • errno_to_exc_type (1192-1194)
crates/vm/src/stdlib/ctypes/union.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
  • from_stg_info (174-181)
  • new (223-227)
  • new (261-267)
  • from_bytes (184-191)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/builtins/type.rs (1)
  • fast_issubclass (629-631)
crates/vm/src/class.rs (1)
  • as_base (187-187)
crates/vm/src/vm/context.rs (2)
crates/vm/src/types/slot.rs (3)
  • from (1236-1238)
  • heap_type_flags (243-249)
  • default (262-264)
crates/vm/src/builtins/type.rs (2)
  • new_heap (190-213)
  • bases (1070-1087)
crates/vm/src/builtins/bool.rs (1)
crates/vm/src/builtins/int.rs (2)
  • class (52-54)
  • fmt (34-36)
crates/vm/src/stdlib/ctypes/base.rs (2)
crates/vm/src/stdlib/ctypes/field.rs (1)
  • new (37-54)
crates/vm/src/stdlib/ctypes/function.rs (1)
  • value (52-52)
crates/vm/src/exceptions.rs (2)
crates/vm/src/object/payload.rs (1)
  • class (54-54)
crates/vm/src/class.rs (1)
  • as_base (187-187)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
🔇 Additional comments (69)
crates/vm/src/types/structseq.rs (1)

192-192: Good improvement: expect() with descriptive message.

Replacing .unwrap() with .expect() and providing a clear error message is a best practice that improves debuggability. The message correctly identifies this as an internal RustPython bug, which helps with triage if this invariant is ever violated.

crates/vm/src/exception_group.rs (1)

47-50: LGTM! Wrapper pattern aligns with the new subclass layout.

The transition from a unit struct to a #[repr(transparent)] newtype wrapper around PyBaseException is consistent with the broader refactor. This enables proper base-class relationships while maintaining memory layout compatibility for safe casting.

crates/vm/src/object/payload.rs (1)

19-26: LGTM! Cold path helper for error generation.

The #[cold] annotation is appropriate for an error path that should rarely execute, helping branch prediction optimization.

crates/vm/src/vm/context.rs (5)

35-36: LGTM! Type change aligns with PyBool wrapper pattern.

Changing true_value and false_value from PyIntRef to PyRef<PyBool> correctly reflects the new type hierarchy where PyBool is a proper wrapper around PyInt.


282-285: LGTM! Simplified trait bound.

Removing the redundant PyPayload bound is correct since PyObjectPayload already requires it (as shown at line 141).


306-307: LGTM! Boolean singleton creation with wrapper pattern.

Creating PyBool(PyInt::from(1)) and PyBool(PyInt::from(0)) correctly establishes the boolean singletons using the new wrapper structure.


450-456: LGTM! Return type correctly updated.

The method now returns PyRef<PyBool> matching the updated field types, maintaining the singleton pattern for boolean values.


510-520: No changes needed — basicsize: 0 is appropriate for heap exception types.

The explicit basicsize: 0 in the PyTypeSlots is correctly handled by PyType::new_heap_inner, which automatically inherits the base class basicsize when it's set to 0 (lines 327-329 in crates/vm/src/builtins/type.rs). This ensures that by the time into_ref_with_type is called, both the exact exception class and the dynamically created exception type have matching basicsize values, preventing any size mismatch errors. This design is safe and consistent with how other heap types (like ExceptionGroup in exception_group.rs) are created.

crates/vm/src/stdlib/ast/pyast.rs (2)

6-79: LGTM! Macro updated for wrapper-based AST nodes.

The impl_node macro correctly generates #[repr(transparent)] wrapper structs with the base type, aligning with the new subclass payload layout. The pattern matching and recursive macro invocations are consistent across all arms.


81-82: LGTM! Base AST node types converted to wrapper pattern.

All intermediate AST base types (NodeMod, NodeStmt, NodeExpr, etc.) correctly use #[repr(transparent)] wrappers around NodeAst, establishing the proper type hierarchy for AST nodes.

crates/vm/src/builtins/int.rs (1)

256-263: LGTM! Proper use of upcast for bool subclass.

The .upcast() call correctly converts PyRef<PyBool> to PyRef<PyInt>, leveraging the new PySubclass trait to maintain type safety while allowing with_value to return a consistent PyRef<Self> type.

crates/vm/src/class.rs (1)

174-188: Clean trait design for subclass relationships.

The PySubclass trait provides a clear abstraction for base-class access. The documentation accurately describes the auto-implementation by #[pyclass] and the #[repr(transparent)] requirement for ownership transfer.

crates/derive-impl/src/pystructseq.rs (1)

448-513: Well-structured newtype wrapper implementation.

The generated code correctly:

  1. Uses #[repr(transparent)] for safe memory layout
  2. Delegates payload_type_id() to PyTuple for proper type identification
  3. Implements validate_downcastable_from using fast_issubclass for correct subtype checking
  4. Provides PySubclass implementation for base access

This aligns well with the PR's subclass payload layout goals.

crates/stdlib/src/ssl/compat.rs (2)

225-256: Consistent error construction with OS subtype pattern.

The changes correctly use vm.new_os_subtype_error() followed by .upcast() to return PyBaseExceptionRef. This pattern is consistently applied across all SSL error creation functions and aligns with the PR's exception handling refactoring.


546-561: Error conversion paths properly upcast to base exception type.

All SslError variants that create OS-subtype errors (WantRead, WantWrite, Timeout, ZeroReturn, Eof) correctly call .upcast() to convert from PyRef<PyOSError> to PyBaseExceptionRef.

crates/vm/src/stdlib/ctypes/pointer.rs (2)

12-14: Proper repr(transparent) wrapper for type object.

The #[repr(transparent)] attribute on PyCPointerType(PyType) ensures correct memory layout for the type wrapper pattern used throughout this PR.


64-68: No action needed—empty base initialization is correct for pointer types.

Pointer types intentionally use CDataObject::from_bytes(vec![], None) because they store their data in a separate contents field rather than in the CData buffer. In contrast, structures and unions (which store binary field data) use from_stg_info with properly sized buffers. This design is consistent across all pointer-like types (PyCPointer, PyCFuncPtr) and is architecturally sound.

crates/vm/src/stdlib/ctypes/field.rs (1)

10-16: The removal of PyPayload derive from PyCFieldType is correct. PyCFieldType is a metaclass (type) for field descriptors, not a payload-bearing instance class. Metaclass types with #[pyclass(base = PyType)] do not require PyPayload implementation. The actual field descriptor class PyCField correctly derives PyPayload. This pattern is consistent across all ctypes metaclasses (PyCStructType, PyCPointerType, PyCArrayType, etc.), which also use base = PyType without PyPayload derive.

crates/vm/src/stdlib/ctypes.rs (2)

18-18: LGTM: Public export expanded correctly.

Adding CDataObject to the public exports is consistent with the new base pattern introduced across ctypes types.


50-50: LGTM: Import aligned with construction changes.

The expanded import correctly brings in CDataObject and PyCData for use in new_simple_type.

crates/vm/src/vm/vm_new.rs (4)

96-99: Debug assertion may fail for user-defined exception types.

The debug_assert_eq! checks that exc_type.slots.basicsize equals size_of::<PyBaseException>(). Per the docstring, this function "should only be used with builtin exception types," but the assertion will panic in debug builds if misused. This is good for catching errors early, though consider adding a comment clarifying why this assertion exists.


109-141: LGTM: Well-structured OS error construction.

The new new_os_error and new_os_subtype_error methods properly handle OSError construction with optional errno. The nested new_os_subtype_error_impl function keeps the implementation contained, though it could be a private method if reused elsewhere.

One minor note: the .expect("new_os_error usage error") calls will panic on construction failure. This is acceptable for internal API misuse, but ensure callers pass valid exception types.


120-120: Debug assertion validates PyOSError construction.

This assertion ensures exc_type has the correct basicsize for PyOSError, which is essential for the new subclass payload layout. Good defensive programming.


259-264: LGTM: Errno-based error construction correctly delegates.

The updated new_errno_error properly maps errno to a specific exception subtype via errno_to_exc_type, falling back to os_error if no mapping exists, then delegates to new_os_subtype_error.

crates/vm/src/stdlib/ctypes/base.rs (4)

222-228: LGTM: Simple and effective constructor.

The PyCData::new constructor provides a clean way to wrap a CDataObject in a PyRwLock, supporting the new base pattern across ctypes types. Based on coding guidelines, this follows the macro system pattern appropriately.


424-428: New _base field establishes base-class relationship.

Adding _base: PyCData alongside cdata: PyRwLock<CDataObject> is consistent with the PR's goal of formalizing subclass-base relationships. The pyclass declaration correctly specifies base = PyCData.


659-664: Same clone pattern as other ctypes types.

The constructor creates a CDataObject and clones it for both _base and cdata fields. This is consistent with the pattern in new_simple_type and other ctypes types. The duplication concern raised earlier applies here as well.


238-241: The #[repr(transparent)] usage is correct and follows the established pattern.

All other metaclass types in ctypes (PyCStructType, PyCPointerType, PyCUnionType, PyCArrayType) use the same #[repr(transparent)] attribute with identical structure. This confirms that PyCSimpleType correctly maintains memory layout compatibility with PyType and is consistent with the codebase's metaclass design.

crates/vm/src/stdlib/ctypes/function.rs (3)

7-7: LGTM: Import added for CDataObject usage.

The import is required for initializing the _base field in PyCFuncPtr constructors.


164-176: LGTM: _base field added to PyCFuncPtr.

The new _base: PyCData field is correctly placed at the beginning of the struct, consistent with the base-class pattern across ctypes types. The struct maintains all existing fields.


197-197: Consistent _base initialization across all constructor paths.

All four constructor branches (empty args, integer address, tuple form, callable form) correctly initialize _base with PyCData::new(CDataObject::from_bytes(vec![], None)). Using an empty buffer is appropriate for function pointers since they don't store data like other ctypes types.

crates/vm/src/stdlib/ctypes/structure.rs (6)

18-20: Consistent transparent wrapper pattern for metaclass.

PyCStructType(PyType) with #[repr(transparent)] follows the same pattern as PyCSimpleType, ensuring consistent metaclass handling across ctypes types.


222-229: LGTM: _base field added to PyCStructure.

The structure correctly adds _base: PyCData while maintaining the existing cdata and fields fields. This aligns with the PR's goal of enabling more specific subclass-typed function parameters.


299-304: Consistent initialization pattern in slot_new.

Creates CDataObject from StgInfo, then clones for both _base and cdata. Consistent with other ctypes types.


370-377: LGTM: from_address correctly updated.

The from_address method properly initializes both _base and cdata from the same CDataObject.


423-430: Good: from_buffer preserves source reference.

The from_buffer method correctly passes Some(source) to CDataObject::from_bytes, which stores the reference in objects to keep the source buffer alive. This is important for memory safety when the structure references external buffer data.


468-475: LGTM: from_buffer_copy creates independent copy.

Unlike from_buffer, from_buffer_copy correctly passes None since it creates an independent copy of the data, not a reference to the source buffer.

crates/derive-impl/src/pyclass.rs (5)

308-361: Well-structured base field validation.

The validate_base_field function properly validates that when a base class is specified, the struct has the base type as its first field. This is essential for #[repr(transparent)] to work correctly and ensures proper memory layout for subclassing.

One minor observation: the error messages are clear and informative, providing the expected base type name.


431-449: LGTM: repr(transparent) detection and basicsize computation.

The logic correctly detects #[repr(transparent)] and sets basicsize to 0 when the type has a base class with transparent representation, ensuring the type shares the same memory layout as its base.


611-652: PyPayload generation for base types looks correct.

The conditional logic properly handles:

  1. Types with a base: forwards payload_type_id to the base and adds validate_downcastable_from check
  2. Types without a base but with a ctx type: generates simple class resolution
  3. Types without base or ctx: no additional impl needed

The #[allow(clippy::collapsible_else_if)] is appropriate here for readability.


1212-1213: Clarifying comment for cfg matching is helpful.

The added comment explains that getter and setter must have matching #[cfg] attributes since they're paired by the map key (name, cfgs).


485-502: Remove the comment addition and close as not an issue.

The generated PySubclass impl correctly assumes a tuple struct layout. The impl is only generated when is_repr_transparent is true, and Rust's #[repr(transparent)] attribute guarantees exactly one field. This field must be the base type (validated by validate_base_field), so accessing it via &self.0 is always correct, whether the struct uses named or unnamed field syntax. The validate_base_field function handles both cases to provide flexible validation, but the actual PySubclass impl generation is safely gated by the is_repr_transparent check, which ensures the memory layout assumption holds.

crates/vm/src/builtins/bool.rs (3)

21-34: TryFromBorrowedObject accepts any integer as a boolean.

The implementation now downcasts to PyInt and treats any non-zero integer as true. This differs from strict boolean checking and aligns with Python's truthy semantics for integers. However, this means bool.try_from_borrowed_object will accept plain integers, not just booleans.

Verify this is the intended behavior - if callers expect only actual bool objects, this could be a semantic change.


88-96: LGTM: PyBool as transparent wrapper around PyInt.

The #[repr(transparent)] ensures PyBool has the same memory layout as PyInt, which is correct for the subclass pattern. The ctx = "bool_type" and base = PyInt attributes properly wire up the class hierarchy.

The Debug implementation correctly extracts the boolean value from the inner PyInt.


222-228: No action needed—all call sites include proper type checks.

The get_value function is safe despite the .unwrap(). All callers verify the object is a PyBool before calling: external callers in py_serde.rs (line 70) and jit.rs (line 141) use fast_isinstance or cls.is() checks, and internal calls within bool.rs occur in bool-specific methods protected by type guards (e.g., lines 136–137 in __or__ check fast_isinstance(vm.ctx.types.bool_type) for both operands). The function is pub(crate) scoped and never receives non-PyBool objects in practice.

Likely an incorrect or invalid review comment.

crates/vm/src/stdlib/ctypes/union.rs (1)

14-18: LGTM: PyCUnionType as transparent wrapper around PyType.

The metaclass is correctly defined as a transparent wrapper, consistent with the new subclass pattern.

crates/stdlib/src/socket.rs (4)

18-18: LGTM: PyOSError added to imports.

The import aligns with the new error handling pattern that uses PyOSError directly.


1829-1833: LGTM: From implementation for PyRef.

The implementation correctly upcasts PyRef<PyOSError> to PyBaseExceptionRef for integration with the IoOrPyException enum.


1852-1852: LGTM: Timeout error upcast.

The .upcast() call correctly converts PyRef<PyOSError> to the expected PyBaseExceptionRef type.


2420-2428: LGTM: OS subtype error creation and timeout functions.

The changes consistently use vm.new_os_subtype_error to create properly typed OS exceptions. The function signatures now return PyRef<PyOSError>, which provides better type safety.

crates/vm/src/stdlib/ctypes/array.rs (2)

26-28: LGTM: PyCArrayType as transparent wrapper around PyType.

Consistent with the new subclass pattern used throughout the codebase.


162-172: Verify unsafe raw pointer read from symbol address.

The code reads data from a symbol address obtained from a shared library. While there's a null check, verify that:

  1. The symbol address points to valid readable memory of the expected size
  2. The symbol data remains valid for the lifetime of the PyCArray instance (the dll reference helps, but data could still be modified)
crates/stdlib/src/ssl.rs (4)

346-410: LGTM: Exception type restructuring follows newtype wrapper pattern correctly.

The transparent newtype wrappers around base exception types (PySSLError wrapping PyOSError, subclasses wrapping PySSLError) align with the PR's goal of enabling more specific subclass-typed parameters. The __str__ implementation correctly preserves OSError-compatible behavior by checking for strerror attribute first.


1236-1242: LGTM: Error construction follows OS-subtype pattern.

The password decryption error correctly uses new_os_subtype_error with None errno (since this is an SSL/crypto error, not a system error) and calls upcast() to convert to the expected exception type.


1870-1880: LGTM: FileNotFoundError construction is correct.

The error correctly sets errno to 2 (ENOENT), creates the FileNotFoundError subtype, sets the filename attribute, and calls upcast(). This matches CPython's FileNotFoundError behavior.


3347-3353: LGTM: Error construction in read/write operations is consistent.

The shutdown, EOF, and operation errors correctly use the OS-subtype error pattern with appropriate errno values (None for protocol errors) and descriptive messages. The use of helper functions with .upcast() properly converts between exception types.

Also applies to: 3400-3406, 3478-3484

crates/vm/src/stdlib/io.rs (7)

83-98: Module wiring via _io::unsupported_operation() looks fine.
Pulling the type via _io::unsupported_operation() and exporting it (Line 91-95) is a nice cleanup over ad-hoc statics.


148-156: Import tweak is fine.
Switching to PyBool (Line 152) matches the new more-specific return types in this PR.


205-208: new_unsupported_operation() integration looks good.
Creating an OSError subtype with the centralized UnsupportedOperation type (Line 205-208) is consistent with the refactor direction.


426-462: _IOBase.__closed type change is OK; just ensure it matches Python-visible expectations.
Returning PyRef<PyBool> (Line 459-461) aligns with the “more specific subclass-typed parameters” goal; worth running stdlib _io behavioral tests around closed/__closed since this is a core attribute.


640-645: Transparent base wrappers are a good approach for the new layout.
The #[repr(transparent)] wrappers for _RawIOBase, _BufferedIOBase, _TextIOBase plus Default (Line 640-645, 700-705, 764-770) look consistent with the subclass payload embedding strategy.

Also applies to: 700-705, 764-770


1766-2032: Embedded _base fields: looks consistent; please clippy/test for any accidental default-constructed “uninitialized” states.
The _base: <BaseType> pattern across buffered/text/raw/file IO types (Line 1766-2032, 2376-2383, 3657-3695, 3802-3827, 4441-4474) matches the PR objective. Main risk is behavior of any Default/DefaultConstructor paths that could create instances without the expected init invariants (e.g., mutex state, “uninitialized object” errors) — worth validating constructors and __init__ flows still gate operations properly.

Also applies to: 2376-2383, 3657-3695, 3802-3827, 4441-4474


4246-4268: unsupported_operation() is correctly implemented. RustPython is a single-context architecture where all VirtualMachine instances share the same global Context::genesis(). The static cell pattern here is safe and intentional—there is no scenario with multiple independent contexts. The concern about multi-context support does not apply to this codebase.

crates/vm/src/exceptions.rs (6)

966-969: extend_exception!(PyOSError, ...) change is fine.
Removing dynamic attributes in favor of struct fields + getsets matches the refactor.


1196-1210: Types-module import adjustments look fine.
No concerns with the added imports to support the new PyOSError implementation.


1235-1437: #[repr(transparent)] wrapper migration looks consistent; just ensure subclass/base expectations are tested.
The broad conversion of exception types into transparent wrappers matches the new subclass payload design and should enable more precise typed refs.

Also applies to: 1736-2202


1682-1734: Field getsets look good; nice use of swap_to_temporary_refs.
The getters/setters for errno/strerror/filename/filename2(/winerror) are straightforward and align with the new field-backed model.


1-341: Process: please run formatting + clippy on this PR.
Given the scale of unsafe and layout changes, it’s worth ensuring cargo fmt + cargo clippy are clean and that exception/stdlib test suites cover the new constructor + pickle/GC paths.


1487-1521: Non-Windows filename2 handling appears incorrect (len==4 should populate filename2).

In py_new, filename2 is only set when len == 5 (Line 1500-1504). On non-Windows, the 4th positional arg is typically filename2, so this loses data. Similar issue in slot_init where filename2 is only set on len == 5 (Line 1591-1593).

Suggested fix (platform-specific indexing):

@@
-            let filename2 = if len == 5 {
-                args.args.get(4).cloned()
-            } else {
-                None
-            };
+            #[cfg(windows)]
+            let filename2 = if len == 5 { args.args.get(4).cloned() } else { None };
+            #[cfg(not(windows))]
+            let filename2 = if len == 4 { args.args.get(3).cloned() } else { None };
@@
-                if len == 5 {
-                    let _ = unsafe { exc.filename2.swap(new_args.args.get(4).cloned()) };
-                }
+                #[cfg(windows)]
+                if len == 5 {
+                    exc.filename2
+                        .swap_to_temporary_refs(new_args.args.get(4).cloned(), vm);
+                }
+                #[cfg(not(windows))]
+                if len == 4 {
+                    exc.filename2
+                        .swap_to_temporary_refs(new_args.args.get(3).cloned(), vm);
+                }

Also applies to: 1551-1601

⛔ Skipped due to learnings
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.

Comment on lines +1076 to +1103
impl<T: crate::class::PySubclass + std::fmt::Debug> PyRef<T>
where
T::Base: std::fmt::Debug,
{
/// Converts this reference to the base type (ownership transfer).
/// # Safety
/// T and T::Base must have compatible layouts in size_of::<T::Base>() bytes.
#[inline]
pub fn into_base(self) -> PyRef<T::Base> {
let obj: PyObjectRef = self.into();
match obj.downcast() {
Ok(base_ref) => base_ref,
Err(_) => unsafe { std::hint::unreachable_unchecked() },
}
}
#[inline]
pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U>
where
T: StaticType,
{
debug_assert!(T::static_type().is_subtype(U::static_type()));
let obj: PyObjectRef = self.into();
match obj.downcast::<U>() {
Ok(upcast_ref) => upcast_ref,
Err(_) => unsafe { std::hint::unreachable_unchecked() },
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Unsafe unreachable_unchecked could cause UB if invariants are violated.

Both into_base() and upcast() use unsafe { std::hint::unreachable_unchecked() } on downcast failure. While the type system constraints should prevent misuse, if invariants are violated (e.g., due to macro bugs or incorrect implementations), this will cause undefined behavior rather than a clear panic.

Consider using unreachable!() with a descriptive message for safer error handling, or at minimum document why this is safe:

     pub fn into_base(self) -> PyRef<T::Base> {
         let obj: PyObjectRef = self.into();
         match obj.downcast() {
             Ok(base_ref) => base_ref,
-            Err(_) => unsafe { std::hint::unreachable_unchecked() },
+            Err(_) => unreachable!("PySubclass invariant violated: into_base downcast failed"),
         }
     }
     #[inline]
     pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U>
     where
         T: StaticType,
     {
         debug_assert!(T::static_type().is_subtype(U::static_type()));
         let obj: PyObjectRef = self.into();
         match obj.downcast::<U>() {
             Ok(upcast_ref) => upcast_ref,
-            Err(_) => unsafe { std::hint::unreachable_unchecked() },
+            Err(_) => unreachable!("PySubclass invariant violated: upcast to {} failed", std::any::type_name::<U>()),
         }
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
impl<T: crate::class::PySubclass + std::fmt::Debug> PyRef<T>
where
T::Base: std::fmt::Debug,
{
/// Converts this reference to the base type (ownership transfer).
/// # Safety
/// T and T::Base must have compatible layouts in size_of::<T::Base>() bytes.
#[inline]
pub fn into_base(self) -> PyRef<T::Base> {
let obj: PyObjectRef = self.into();
match obj.downcast() {
Ok(base_ref) => base_ref,
Err(_) => unsafe { std::hint::unreachable_unchecked() },
}
}
#[inline]
pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U>
where
T: StaticType,
{
debug_assert!(T::static_type().is_subtype(U::static_type()));
let obj: PyObjectRef = self.into();
match obj.downcast::<U>() {
Ok(upcast_ref) => upcast_ref,
Err(_) => unsafe { std::hint::unreachable_unchecked() },
}
}
}
impl<T: crate::class::PySubclass + std::fmt::Debug> PyRef<T>
where
T::Base: std::fmt::Debug,
{
/// Converts this reference to the base type (ownership transfer).
/// # Safety
/// T and T::Base must have compatible layouts in size_of::<T::Base>() bytes.
#[inline]
pub fn into_base(self) -> PyRef<T::Base> {
let obj: PyObjectRef = self.into();
match obj.downcast() {
Ok(base_ref) => base_ref,
Err(_) => unreachable!("PySubclass invariant violated: into_base downcast failed"),
}
}
#[inline]
pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U>
where
T: StaticType,
{
debug_assert!(T::static_type().is_subtype(U::static_type()));
let obj: PyObjectRef = self.into();
match obj.downcast::<U>() {
Ok(upcast_ref) => upcast_ref,
Err(_) => unreachable!("PySubclass invariant violated: upcast to {} failed", std::any::type_name::<U>()),
}
}
}
🤖 Prompt for AI Agents
In crates/vm/src/object/core.rs around lines 1076 to 1103, the methods
into_base() and upcast() call unsafe { std::hint::unreachable_unchecked() } when
a downcast fails, which can produce undefined behavior if invariants are ever
violated; replace the unreachable_unchecked() calls with a safe panic (e.g.,
unreachable!() or panic! with a descriptive message) so a violation yields a
clear panic and backtrace rather than UB, and add a short comment above each
branch documenting the invariant that guarantees the downcast should succeed (or
keep a debug_assert verifying the subtype relation before the downcast).

Comment on lines 124 to 129
#[pyclass(module = "_ctypes", name = "Union", base = PyCData, metaclass = "PyCUnionType")]
#[derive(PyPayload)]
pub struct PyCUnion {
_base: PyCData,
/// Common CDataObject for memory buffer
pub(super) cdata: PyRwLock<CDataObject>,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Redundant storage: _base and cdata hold cloned copies of the same CDataObject.

The struct stores both _base: PyCData and cdata: PyRwLock<CDataObject>. Looking at the initialization code (lines 177-180), CDataObject is cloned and stored in both fields. Since CDataObject contains a Vec<u8> buffer, this duplicates the memory allocation.

This pattern is repeated in from_address, from_buffer, and from_buffer_copy. Consider whether _base should hold the canonical data and cdata should be removed, or if they should share a reference.

 pub struct PyCUnion {
-    _base: PyCData,
-    /// Common CDataObject for memory buffer
-    pub(super) cdata: PyRwLock<CDataObject>,
+    /// Base PyCData containing the memory buffer
+    _base: PyCData,
 }

If both are needed, consider having cdata be a reference or accessor to _base's data rather than a clone.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/union.rs around lines 124 to 129, the struct
stores duplicated CDataObject data in both `_base: PyCData` and `cdata:
PyRwLock<CDataObject>`, causing cloned Vec<u8> buffers; fix by consolidating
storage: either remove `cdata` and use `_base` as the canonical CDataObject
everywhere, or make `cdata` a non-owning accessor to `_base` (e.g., point to
`_base`'s inner CDataObject via a shared reference/Arc or expose RwLock on
`_base`), then update the constructors
(from_address/from_buffer/from_buffer_copy) to stop cloning the buffer into two
places and adjust any methods to access the single canonical storage; ensure
thread-safety by using a shared lock or reference type consistently and update
types/signatures where needed.

Comment on lines +177 to 183
let cdata = CDataObject::from_stg_info(&stg_info);
PyCUnion {
cdata: PyRwLock::new(CDataObject::from_stg_info(&stg_info)),
_base: PyCData::new(cdata.clone()),
cdata: PyRwLock::new(cdata),
}
.into_ref_with_type(vm, cls)
.map(Into::into)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

CDataObject is cloned unnecessarily.

The code creates a CDataObject, clones it for _base, and then wraps the original in cdata. This means two separate Vec<u8> buffers exist with identical content. If one is modified, the other won't reflect the change, which could lead to subtle bugs.

🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/union.rs around lines 177 to 183, the code
constructs a CDataObject and clones it so two independent buffers exist; instead
share the same underlying CDataObject to avoid duplicate Vec<u8> allocations and
divergent state. Change the code to create one shared CDataObject wrapped in a
reference-counted pointer (e.g., Arc or Rc as appropriate for the threading
model) and pass clones of that Arc to both PyCData::new and PyRwLock::new so
both fields reference the same buffer rather than performing a deep clone.

Comment on lines +187 to +188
let tuple =
<Self::Data as ::rustpython_vm::types::PyStructSequenceData>::into_tuple(data, vm);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how PyStructSequenceData is organized in the types module
echo "=== Checking types module structure ==="
fd -t f "mod.rs" crates/vm/src/types/ -exec cat {} \;

echo ""
echo "=== Checking for PyStructSequenceData definition and re-exports ==="
rg -n "trait PyStructSequenceData|pub use.*PyStructSequenceData" crates/vm/src/types/ --type rust

echo ""
echo "=== Check the current file structseq.rs for trait definition ==="
rg -n "trait PyStructSequenceData|pub.*struct.*Data" crates/vm/src/types/structseq.rs --type rust

Repository: RustPython/RustPython

Length of output: 635


🏁 Script executed:

#!/bin/bash
# Check the context around lines 187-188 to understand the change
echo "=== Context around lines 187-188 in structseq.rs ==="
sed -n '180,200p' crates/vm/src/types/structseq.rs

echo ""
echo "=== Check what Data is bound to (look for the full method signature) ==="
sed -n '175,195p' crates/vm/src/types/structseq.rs

Repository: RustPython/RustPython

Length of output: 1939


Remove the unnecessary fully qualified trait path; use simpler syntax.

The trait PyStructSequenceData is defined in this file and re-exported in crates/vm/src/types/mod.rs, so it is already in scope. The fully qualified syntax <Self::Data as ::rustpython_vm::types::PyStructSequenceData>::into_tuple(data, vm) is unnecessarily verbose.

Use simpler syntax instead: Self::Data::into_tuple(data, vm) or just rely on method call syntax data.into_tuple(vm).

🤖 Prompt for AI Agents
In crates/vm/src/types/structseq.rs around lines 187 to 188, the code uses a
fully qualified trait path `<Self::Data as
::rustpython_vm::types::PyStructSequenceData>::into_tuple(data, vm)` which is
unnecessarily verbose; replace it with the simpler form
`Self::Data::into_tuple(data, vm)` or call the method directly as
`data.into_tuple(vm)` since PyStructSequenceData is already in scope via the
module re-export; update that line accordingly to remove the fully qualified
path.

Comment on lines +400 to +401

print("done")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# First, find the snippet runner code and output handling
echo "=== Searching for snippet runner code ==="
rg -n "extra_tests.*snippets|snippets.*runner" --hidden -g '!target' . | head -20

echo -e "\n=== Searching for references to stdlib_ctypes ==="
rg -n "stdlib_ctypes" . --hidden -g '!target' | head -20

echo -e "\n=== Looking for extra_tests directory structure ==="
find . -path './target' -prune -o -type f -name "*.py" -path "*extra_tests*" -print | head -20

Repository: RustPython/RustPython

Length of output: 1153


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for snippet harness/runner configuration
echo "=== Searching for snippet runner or harness code ==="
fd -H "snippet" --type f -e py -e sh -e toml -e yaml . | grep -v target | head -30

echo -e "\n=== Checking extra_tests directory ==="
ls -la extra_tests/ 2>/dev/null | head -20

echo -e "\n=== Looking for expected output or snapshot files ==="
find extra_tests -type f \( -name "*.expected" -o -name "*.out" -o -name "*.txt" -o -name "*snapshot*" \) 2>/dev/null | head -20

Repository: RustPython/RustPython

Length of output: 945


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Read the file around lines 400-401
echo "=== Context around lines 400-401 in stdlib_ctypes.py ==="
if [ -f "extra_tests/snippets/stdlib_ctypes.py" ]; then
  wc -l extra_tests/snippets/stdlib_ctypes.py
  echo ""
  # Show context: lines 390-410
  sed -n '390,410p' extra_tests/snippets/stdlib_ctypes.py | cat -n
else
  echo "File not found"
fi

echo -e "\n=== Searching for how extra_tests snippets are run ==="
rg -n "import|exec|subprocess|test" -A 3 -B 3 "extra_tests" . --hidden -g '!target' -g '!*.pyc' | head -50

Repository: RustPython/RustPython

Length of output: 2790


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there's a test runner script or CI configuration
echo "=== Looking for test runner scripts ==="
find . -path './target' -prune -o -type f \( -name "*.py" -o -name "*.sh" \) -print | xargs grep -l "extra_tests.*snippets\|snippets.*extra_tests" 2>/dev/null | head -20

echo -e "\n=== Checking CI configuration ==="
ls -la .github/ 2>/dev/null | grep -E "\.yml|\.yaml"

echo -e "\n=== Checking for Makefile or build scripts ==="
ls -la | grep -E "Makefile|makefile|build|test"

Repository: RustPython/RustPython

Length of output: 405


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# First check the actual file
echo "=== Reading stdlib_ctypes.py file size and end ==="
if [ -f "extra_tests/snippets/stdlib_ctypes.py" ]; then
  echo "File exists"
  tail -50 extra_tests/snippets/stdlib_ctypes.py | cat -n
else
  echo "File not found at that path"
  # Try to find it
  find . -name "stdlib_ctypes.py" -path "*extra_tests*" 2>/dev/null
fi

echo -e "\n=== Searching for how snippets are invoked ==="
rg -n "\.py.*test|exec|subprocess" extra_tests/ -g '!*.pyc' -g '!target' 2>/dev/null | head -30

Repository: RustPython/RustPython

Length of output: 4671


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for Python files that might be test runners or invoke extra_tests
echo "=== Searching for test runner files ==="
find . -path ./target -prune -o -type f -name "*.py" -print | xargs grep -l "extra_tests" 2>/dev/null

echo -e "\n=== Checking if there's a conftest or pytest config ==="
find . -path ./target -prune -o -type f \( -name "conftest.py" -o -name "pytest.ini" -o -name "pyproject.toml" \) -print

echo -e "\n=== Looking at the actual file to understand context ==="
head -1 extra_tests/snippets/stdlib_ctypes.py 2>/dev/null || echo "File not found"
tail -20 extra_tests/snippets/stdlib_ctypes.py 2>/dev/null | cat -n

Repository: RustPython/RustPython

Length of output: 1250


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for any documentation or readme
echo "=== Checking for documentation ==="
find . -path ./target -prune -o -type f \( -name "README*" -o -name "*.md" \) -path "*extra_tests*" -print

# Look for Rust code that might handle snippet execution
echo -e "\n=== Searching for Rust test/snippet handling ==="
find . -path ./target -prune -o -type f -name "*.rs" -print | xargs grep -l "extra_tests\|snippet" 2>/dev/null | head -10

Repository: RustPython/RustPython

Length of output: 212


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# List the extra_tests directory
echo "=== Extra tests directory structure ==="
ls -la extra_tests/ 2>/dev/null

echo -e "\n=== Snippets directory ==="
ls -la extra_tests/snippets/ 2>/dev/null | head -30

echo -e "\n=== Check if stdlib_ctypes.py exists ==="
if [ -f "extra_tests/snippets/stdlib_ctypes.py" ]; then
  echo "File exists"
  echo "File size: $(wc -l < extra_tests/snippets/stdlib_ctypes.py) lines"
  echo "Last 10 lines:"
  tail -10 extra_tests/snippets/stdlib_ctypes.py
else
  echo "File does not exist at expected location"
  find . -name "stdlib_ctypes.py" 2>/dev/null
fi

Repository: RustPython/RustPython

Length of output: 2704


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Try to find any test-related configuration or runner
find . -type f \( -name "*.toml" -o -name "*.yaml" -o -name "*.yml" -o -name "*.cfg" \) ! -path "*/target/*" | xargs grep -l "test\|snippet" 2>/dev/null | head -10

Repository: RustPython/RustPython

Length of output: 363


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check git to see recent changes to stdlib_ctypes.py
echo "=== Git status and recent changes ==="
git status 2>/dev/null | head -30 || echo "Git not available in this view"

# Try to get the actual file listing to verify directory exists
echo -e "\n=== Checking directory structure directly ==="
test -d "extra_tests" && echo "extra_tests exists" || echo "extra_tests not found"
test -d "extra_tests/snippets" && echo "extra_tests/snippets exists" || echo "extra_tests/snippets not found"

# List any Python files in extra_tests
echo -e "\n=== Python files in extra_tests ==="
find extra_tests -name "*.py" 2>/dev/null | head -20 || echo "Could not find extra_tests"

Repository: RustPython/RustPython

Length of output: 1066


🌐 Web query:

RustPython extra_tests snippets stdout handling

💡 Result:

Brief summary and tips

  • Rust test harness (and many Rust test runners) capture stdout of passing tests by default; use --nocapture / -- --show-output to see printed output. This explains why println!/stdout may be hidden during tests. [1]

  • RustPython’s extra_tests/snippets are Python tests included in the repo; they run under RustPython’s test-execution infrastructure and can be affected by the Rust-side test runner’s output-capture. See the RustPython contribution guide for running/unskipping CPython unittest files. [2]

  • If you need deterministic capture behavior when running tests with nextest/other runners, configure the runner’s capture strategy (Split, Combined, None) or pass the runner flag to disable capture. For nextest, CaptureStrategy controls stdout/stderr capture. [3]

Practical actions

  • Run cargo test with the test-runner flag to disable capture: cargo test -- --nocapture (or cargo nextest run with capture strategy None / appropriate flag for your runner). [1][3]
  • If you want output from failing Python snippets only, rely on the default behavior (captured output is printed when a test fails) or explicitly write to stderr from the snippet so it’s more likely to appear. [1]
  • When contributing to RustPython extra_tests, follow the repo guidance (skip/expectedFailure) to avoid noisy output while iterating. [2]

Sources
[1] Stack Overflow — Why doesn't println! work in Rust unit tests? (explanation of test stdout capture and --nocapture).
[2] RustPython blog — How to contribute by CPython unittest (notes about extra_tests/snippets).
[3] nextest-runner docs — CaptureStrategy (Split/Combined/None) for controlling stdout/stderr capture.


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check for any documentation about testing
echo "=== Root level documentation ==="
ls -la *.md *.txt *.rst 2>/dev/null | head -20

# Check for Makefile or test scripts in root
echo -e "\n=== Root level test/build scripts ==="
ls -la Makefile* makefile* test* run* 2>/dev/null || echo "None found"

# Try to find any Cargo.toml that might reference test configuration
echo -e "\n=== Checking Cargo.toml for test configuration ==="
if [ -f "Cargo.toml" ]; then
  head -50 Cargo.toml
fi

Repository: RustPython/RustPython

Length of output: 288


🏁 Script executed:

#!/bin/bash
# Simple listing to understand structure
pwd
ls -la | head -30

Repository: RustPython/RustPython

Length of output: 1913


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Examine the test_snippets.py to understand how snippets are executed
echo "=== test_snippets.py content ==="
head -80 extra_tests/test_snippets.py

Repository: RustPython/RustPython

Length of output: 2197


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for any stdout/stderr assertions or expected output handling
echo "=== Checking for output validation in test framework ==="
rg -n "stdout|stderr|expected|output|assert" extra_tests/test_snippets.py -A 2 -B 2

Repository: RustPython/RustPython

Length of output: 121


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check extra_tests/README.md for documentation on snippet testing
echo "=== extra_tests/README.md ==="
cat extra_tests/README.md

Repository: RustPython/RustPython

Length of output: 735


Remove the unconditional print("done") — it adds unnecessary noise to test output without providing validation value.

The snippet test harness (extra_tests/test_snippets.py) uses subprocess.check_call() which only validates exit codes, not stdout content. While this print won't cause test failures, it clutters CI logs with output that serves no testing purpose. The commented-out print on the preceding line demonstrates the correct pattern when testing is not needed.

Suggested diff:

-
-print("done")
🤖 Prompt for AI Agents
In extra_tests/snippets/stdlib_ctypes.py around lines 400-401, remove the
unconditional print("done") which only clutters CI logs; delete that line (or
revert it to the commented example on the previous line) so the test produces no
unnecessary stdout while preserving the existing behavior validated by exit
codes.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
crates/vm/src/vm/context.rs (1)

511-526: Custom exception types created with basicsize=0 will trigger the debug_assert in vm.new_exception().

Exception types created via new_exception_type() (lines 511–526) set slots.basicsize = 0, but vm.new_exception() asserts that exc_type.slots.basicsize == std::mem::size_of::<PyBaseException>(). This affects all stdlib exception types created with new_exception_type() (e.g., binascii.Error, socket.error, etc.), which are raised via vm.new_exception_msg()vm.new_exception(). Debug builds will fail when these exceptions are raised.

Set basicsize to std::mem::size_of::<PyBaseException>() instead of 0 to match the assertion.

crates/vm/src/stdlib/ctypes/base.rs (1)

423-428: Remove redundant storage: PyCSimple unnecessarily maintains two independent copies of CDataObject.

PyCSimple stores _base: PyCData (containing PyRwLock<CDataObject>) and a separate cdata: PyRwLock<CDataObject>, both initialized from the same CDataObject. This pattern appears across all ctypes classes (PyCStructure, PyCUnion, PyCArray, etc.). The _base field is never accessed after initialization; all code uses the cdata field directly. This creates two independent copies that can diverge if either is modified, and suggests a design issue with how the PyO3 class inheritance pattern is being used.

Consider:

  • Whether the _base field can be removed if the Python inheritance contract can be satisfied differently
  • Consolidating to a single source of truth for CDataObject
  • Documenting why both copies are intentionally maintained if this pattern is required
crates/vm/src/stdlib/ctypes/array.rs (1)

197-214: Two independent CDataObjects per instance + in_dll keepalive written to only one copy.

Right now _base and cdata hold separate CDataObject clones. Beyond memory overhead, this can break correctness for in_dll: you set base = Some(dll) on PyCArray.cdata (Line 209-211 / 762-765), but _base’s copy won’t get that update—so any base-class logic (or GC traversal) that reads _base’s backing store may miss the DLL keepalive reference.

Recommendation: make _base the canonical storage and have all reads/writes go through it (or share storage via Arc), and ensure the base keepalive is applied to the canonical CDataObject.

Also applies to: 241-248, 307-317, 538-547, 606-615, 665-674, 751-768

crates/vm/src/stdlib/io.rs (1)

20-73: Prefer not to unwrap() when setting winerror attr (avoid panic during exception construction).

exc.as_object().set_attr(...).unwrap() can panic while handling an I/O error, which is a pretty bad failure mode. Return/ignore the error explicitly instead (you already treat this as a best-effort “FIXME” path).

-            exc.as_object()
-                .set_attr("winerror", vm.new_pyobj(winerror), vm)
-                .unwrap();
+            let _ = exc
+                .as_object()
+                .set_attr("winerror", vm.new_pyobj(winerror), vm);
crates/vm/src/exceptions.rs (1)

1548-1602: The downcast_ref::<PyOSError>().unwrap() will panic when called with OSError subclass instances.

The code comment claims all OSError subclasses use #[repr(transparent)] wrapping, enabling safe downcasting. However, downcast_ref checks type identity via TypeId comparison, not memory layout. When slot_init is inherited and called with a FileNotFoundError instance, the TypeId is FileNotFoundError, not PyOSError, so downcast_ref returns None and unwrap() panics.

Replace the unwrap with safe error handling:

-            #[allow(deprecated)]
-            let exc: &Py<PyOSError> = zelf.downcast_ref::<PyOSError>().unwrap();
+            #[allow(deprecated)]
+            let Some(exc) = zelf.downcast_ref::<PyOSError>() else {
+                return Err(vm.new_runtime_error(
+                    "internal error: OSError slot_init called with non-OSError payload",
+                ));
+            };
♻️ Duplicate comments (5)
crates/vm/src/types/structseq.rs (1)

186-193: Simplify the fully-qualified trait call in from_data() (already suggested earlier).
You can call Self::Data::into_tuple(data, vm) or data.into_tuple(vm) instead of <Self::Data as ::rustpython_vm::types::PyStructSequenceData>::into_tuple(...).

crates/vm/src/object/core.rs (1)

1076-1103: Replace unreachable_unchecked() in into_base() / upcast() to avoid UB on invariant violations (raised earlier).
Even if the invariant “should be impossible” holds, a macro/layout bug would become instant UB instead of a debuggable panic.

     pub fn into_base(self) -> PyRef<T::Base> {
         let obj: PyObjectRef = self.into();
         match obj.downcast() {
             Ok(base_ref) => base_ref,
-            Err(_) => unsafe { std::hint::unreachable_unchecked() },
+            Err(_) => unreachable!("PySubclass invariant violated: into_base downcast failed"),
         }
     }

     pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U>
     where
         T: StaticType,
     {
         debug_assert!(T::static_type().is_subtype(U::static_type()));
         let obj: PyObjectRef = self.into();
         match obj.downcast::<U>() {
             Ok(upcast_ref) => upcast_ref,
-            Err(_) => unsafe { std::hint::unreachable_unchecked() },
+            Err(_) => unreachable!(
+                "PySubclass invariant violated: upcast to {} failed",
+                std::any::type_name::<U>()
+            ),
         }
     }
crates/vm/src/stdlib/ctypes/union.rs (1)

124-129: CDataObject is duplicated (deep-cloned) between _base and cdata → wasted memory + potential state divergence.

Each constructor builds one CDataObject, then clones it into _base and also stores another copy in cdata. If either side is mutated (buffer/base/objects), they can silently diverge.

Prefer a single canonical storage:

  • Either remove cdata and route all access through _base’s storage, or
  • Change PyCData to share the same underlying buffer/lock (e.g., store an Arc<PyRwLock<CDataObject>> and clone the Arc, not the CDataObject>).

(Already raised in prior review on earlier commits.)

Also applies to: 175-184, 209-216, 257-264, 296-303

crates/stdlib/src/ssl.rs (1)

412-427: WantRead/WantWrite should not set errno (set it to None).

SSL_ERROR_WANT_READ/WRITE are SSL-layer codes, not OS errno values. CPython’s ssl.SSLWantReadError / ssl.SSLWantWriteError typically expose errno = None, so passing Some(SSL_ERROR_WANT_READ/WRITE) is misleading and diverges from expected behavior (as discussed in the earlier review).

Proposed change:

 pub(super) fn create_ssl_want_read_error(vm: &VirtualMachine) -> PyRef<PyOSError> {
     vm.new_os_subtype_error(
         PySSLWantReadError::class(&vm.ctx).to_owned(),
-        Some(SSL_ERROR_WANT_READ),
+        None,
         "The operation did not complete (read)",
     )
 }

 pub(super) fn create_ssl_want_write_error(vm: &VirtualMachine) -> PyRef<PyOSError> {
     vm.new_os_subtype_error(
         PySSLWantWriteError::class(&vm.ctx).to_owned(),
-        Some(SSL_ERROR_WANT_WRITE),
+        None,
         "The operation did not complete (write)",
     )
 }
crates/vm/src/exceptions.rs (1)

1438-1482: GC traversal for PyOSError now includes base — fixes the earlier GC-safety concern.
This addresses the missing-base traversal issue from prior review feedback.

🧹 Nitpick comments (7)
crates/derive-impl/src/pyclass.rs (1)

431-502: Generated PySubclass impl assumes tuple struct layout (&self.0)—guard this or you’ll get hard compile errors later.
Right now, subclass_impl is generated for any #[repr(transparent)] + base, but it uses &self.0, which only works for tuple structs. Since validate_base_field() permits named structs, a future #[repr(transparent)] struct X { base: Base, ... } will break codegen.

Practical fix: plumb a “is_tuple_struct” boolean into generate_class_def(...) (or generate subclass_impl inside impl_pyclass(...) where you have access to the Item::Struct fields).

crates/vm/src/stdlib/ast/pyast.rs (1)

81-108: Consider making NodeMod explicitly #[repr(transparent)] too (consistency / future-proof).
NodeStmt is explicitly transparent, but NodeMod isn’t even though it’s also a single-field wrapper. If any codegen or safety checks start keying off repr(transparent), this difference may become surprising.

 #[pyclass(module = "_ast", name = "mod", base = NodeAst)]
-pub(crate) struct NodeMod(NodeAst);
+#[repr(transparent)]
+pub(crate) struct NodeMod(NodeAst);
crates/vm/src/builtins/bool.rs (1)

222-228: Potential panic if called on non-PyBool object.

The get_value function uses unwrap() on the downcast result, which will panic if called with an object that isn't a PyBool. All current call sites in this file check fast_isinstance(vm.ctx.types.bool_type) before calling get_value, but this is a fragile contract.

Consider either:

  1. Adding a debug assertion or comment documenting the precondition
  2. Returning an Option<bool> or using expect() with a descriptive message
 pub(crate) fn get_value(obj: &PyObject) -> bool {
-    !obj.downcast_ref::<PyBool>()
-        .unwrap()
-        .0
-        .as_bigint()
-        .is_zero()
+    !obj.downcast_ref::<PyBool>()
+        .expect("get_value called on non-bool object")
+        .0
+        .as_bigint()
+        .is_zero()
 }
crates/stdlib/src/ssl.rs (1)

1868-1881: Avoid hardcoding ENOENT as 2.

Using Some(2) is brittle; prefer the platform constant (e.g., libc::ENOENT or a helper that maps std::io::ErrorKind::NotFound to the VM’s errno plumbing).

At minimum, please confirm this behaves correctly on Windows targets in your CI matrix.

crates/vm/src/stdlib/io.rs (1)

425-461: _IOBase::__closed returning PyRef<PyBool> is the right typing improvement; consider using a shared false if available.
The typed return aligns with the PR’s “subclass-typed parameters” goal. If Context has an interned false object, prefer that to avoid repeated allocations.

crates/vm/src/exceptions.rs (2)

1524-1546: slot_new errno→subtype normalization is good; be careful about silently swallowing constructor failures.
vm.invoke_exception(...).ok() drops error details; if that’s intentional (best-effort normalization), consider a comment to clarify why failures should fall back rather than propagate.


1565-1595: Prefer using the safe swap_to_temporary_refs path in slot_init, not unsafe ...swap(...).
You already use swap_to_temporary_refs in setters; doing the same in slot_init would likely let you delete the unsafe blocks and reduce the chance of GC-barrier footguns.

Also applies to: 1683-1735

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 666979e and 941b373.

📒 Files selected for processing (27)
  • crates/derive-impl/src/pyclass.rs (8 hunks)
  • crates/derive-impl/src/pystructseq.rs (2 hunks)
  • crates/stdlib/src/socket.rs (4 hunks)
  • crates/stdlib/src/ssl.rs (20 hunks)
  • crates/stdlib/src/ssl/compat.rs (5 hunks)
  • crates/vm/src/builtins/bool.rs (4 hunks)
  • crates/vm/src/builtins/int.rs (1 hunks)
  • crates/vm/src/class.rs (1 hunks)
  • crates/vm/src/exception_group.rs (1 hunks)
  • crates/vm/src/exceptions.rs (13 hunks)
  • crates/vm/src/object/core.rs (2 hunks)
  • crates/vm/src/object/payload.rs (3 hunks)
  • crates/vm/src/stdlib/ast/pyast.rs (14 hunks)
  • crates/vm/src/stdlib/ctypes.rs (3 hunks)
  • crates/vm/src/stdlib/ctypes/array.rs (8 hunks)
  • crates/vm/src/stdlib/ctypes/base.rs (4 hunks)
  • crates/vm/src/stdlib/ctypes/field.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/function.rs (6 hunks)
  • crates/vm/src/stdlib/ctypes/pointer.rs (7 hunks)
  • crates/vm/src/stdlib/ctypes/structure.rs (6 hunks)
  • crates/vm/src/stdlib/ctypes/union.rs (6 hunks)
  • crates/vm/src/stdlib/io.rs (22 hunks)
  • crates/vm/src/stdlib/os.rs (1 hunks)
  • crates/vm/src/types/structseq.rs (1 hunks)
  • crates/vm/src/vm/context.rs (7 hunks)
  • crates/vm/src/vm/vm_new.rs (3 hunks)
  • extra_tests/snippets/stdlib_ctypes.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • crates/vm/src/stdlib/os.rs
  • crates/vm/src/class.rs
  • crates/stdlib/src/ssl/compat.rs
  • crates/vm/src/stdlib/ctypes.rs
  • extra_tests/snippets/stdlib_ctypes.py
  • crates/vm/src/exception_group.rs
  • crates/vm/src/stdlib/ctypes/field.rs
  • crates/vm/src/builtins/int.rs
  • crates/vm/src/stdlib/ctypes/function.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/derive-impl/src/pystructseq.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/object/payload.rs
  • crates/vm/src/stdlib/ctypes/pointer.rs
  • crates/vm/src/stdlib/ctypes/base.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
  • crates/stdlib/src/ssl.rs
  • crates/vm/src/types/structseq.rs
  • crates/stdlib/src/socket.rs
  • crates/vm/src/stdlib/ctypes/array.rs
  • crates/vm/src/stdlib/ast/pyast.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/stdlib/ctypes/union.rs
  • crates/vm/src/builtins/bool.rs
  • crates/vm/src/stdlib/io.rs
  • crates/vm/src/exceptions.rs
🧠 Learnings (6)
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/derive-impl/src/pystructseq.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/vm/src/vm/context.rs
  • crates/stdlib/src/socket.rs
  • crates/vm/src/stdlib/ast/pyast.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/builtins/bool.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.

Applied to files:

  • crates/vm/src/object/payload.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/vm/src/vm/context.rs
  • crates/stdlib/src/socket.rs
  • crates/vm/src/stdlib/ast/pyast.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.

Applied to files:

  • crates/stdlib/src/ssl.rs
  • crates/stdlib/src/socket.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.

Applied to files:

  • crates/vm/src/types/structseq.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.

Applied to files:

  • crates/vm/src/types/structseq.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Minimize modifications to CPython standard library files in the `Lib/` directory; modifications should be minimal and only to work around RustPython limitations

Applied to files:

  • crates/vm/src/stdlib/io.rs
🧬 Code graph analysis (9)
crates/derive-impl/src/pystructseq.rs (3)
crates/vm/src/object/payload.rs (4)
  • payload_type_id (30-32)
  • std (31-31)
  • validate_downcastable_from (41-43)
  • class (54-54)
crates/vm/src/builtins/type.rs (1)
  • fast_issubclass (629-631)
crates/vm/src/builtins/str.rs (1)
  • try_traverse (1928-1930)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (3)
  • new (223-227)
  • new (261-267)
  • from_bytes (184-191)
crates/vm/src/stdlib/ctypes/base.rs (1)
crates/vm/src/stdlib/ctypes/function.rs (1)
  • value (52-52)
crates/derive-impl/src/pyclass.rs (2)
crates/vm/src/object/payload.rs (4)
  • std (31-31)
  • class (54-54)
  • payload_type_id (30-32)
  • validate_downcastable_from (41-43)
crates/vm/src/class.rs (1)
  • as_base (187-187)
crates/vm/src/stdlib/ctypes/structure.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
  • from_stg_info (174-181)
  • new (223-227)
  • new (261-267)
  • from_bytes (184-191)
crates/vm/src/stdlib/ctypes/array.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (5)
  • from_bytes (184-191)
  • new (223-227)
  • new (261-267)
  • buffer (1032-1032)
  • buffer (1042-1042)
crates/vm/src/stdlib/ast/pyast.rs (1)
crates/derive-impl/src/util.rs (1)
  • base (416-418)
crates/vm/src/vm/vm_new.rs (1)
crates/vm/src/exceptions.rs (5)
  • errno (1533-1534)
  • errno (1535-1535)
  • errno (1685-1687)
  • errno_to_exc_type (1163-1189)
  • errno_to_exc_type (1192-1194)
crates/vm/src/stdlib/ctypes/union.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
  • from_stg_info (174-181)
  • new (223-227)
  • new (261-267)
  • from_bytes (184-191)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (37)
crates/derive-impl/src/pyclass.rs (2)

718-729: Good: stop scanning with(...) once Constructor is found.
This reduces wasted work and matches the intent expressed in prior review.


611-638: Verify the semantic intent of the BASICSIZE check for repr(transparent) wrappers.

The size/subclass check is sound, but for #[repr(transparent)] types with base=, the BASICSIZE validation effectively becomes a no-op: the macro generates BASICSIZE = 0 (since the wrapper has identical layout to its base), and at runtime slots.basicsize is automatically inherited to match the base's size. This means the check BASICSIZE <= obj.class().slots.basicsize becomes 0 <= base_size, which always passes.

This design is correct—BASICSIZE = 0 semantically signals "no additional payload"—but confirm that:

  1. The size check isn't intended as a safety guard against under-allocated objects (since it's effectively always-true for repr(transparent))
  2. The actual validation responsibility falls entirely to the fast_issubclass() call for these types
  3. All code paths that construct repr(transparent) types correctly inherit slots.basicsize from their base
crates/vm/src/stdlib/ast/pyast.rs (1)

6-16: Macro wrapper layout looks correct for the new base-first-field scheme.
The generated #[repr(transparent)] struct $name($base); is consistent with the base-field validation and PySubclass expectations.

crates/vm/src/object/payload.rs (2)

19-26: Good refactor: central cold_downcast_type_error() improves call-site clarity.

Also applies to: 50-52


102-121: Good safety check: reject subclass construction when basicsize differs.
This closes a class of “valid subtype but incompatible payload layout” issues and provides a clear error.

crates/vm/src/stdlib/ctypes/pointer.rs (2)

11-15: PyCPointerType wrapper layout looks consistent with the new subclass/payload approach.


58-68: The _base field design for pointer types is intentional and correct; no refactoring is needed.

The concern about pointer backing storage is based on a misunderstanding of the implementation. PyCPointer types intentionally store their value in a separate contents field (a PyRwLock) rather than in the _base.buffer. This design is consistent with how other complex ctypes like PyCArray also separate the inherited base class (_base: PyCData) from type-specific data storage. The empty buffer in _base is not a limitation—it's the correct representation for pointer types, which are value wrappers rather than memory buffers. All five pointer instantiation sites consistently use this pattern, there are no buffer size checks for pointers, and no downstream code attempts to access or depend on _base.buffer for pointer types.

crates/vm/src/vm/context.rs (1)

35-37: Bool caching via PyRef<PyBool> looks consistent and keeps new_bool() fast.

Also applies to: 306-307, 450-457

crates/vm/src/vm/vm_new.rs (1)

95-107: Good: new_exception() now asserts payload size matches PyBaseException.
This is helpful for catching layout regressions during the subclass-payload transition.

crates/vm/src/stdlib/ctypes/base.rs (3)

222-228: LGTM - New PyCData constructor.

The new constructor properly wraps the CDataObject in a PyRwLock, providing a clean API for creating PyCData instances.


238-241: LGTM - repr(transparent) for PyCSimpleType.

The #[repr(transparent)] attribute ensures PyCSimpleType has the same memory layout as its inner PyType, which is correct for the newtype wrapper pattern being adopted across this PR.


659-664: Data cloned into both _base and cdata.

The CDataObject is cloned and stored in both _base (via PyCData::new(cdata.clone())) and cdata (via PyRwLock::new(cdata)). This means modifications to one won't reflect in the other. If this is intentional (e.g., _base is only for type hierarchy and cdata is the working copy), please add a comment clarifying this design choice.

crates/derive-impl/src/pystructseq.rs (3)

448-452: LGTM - Newtype wrapper for PyStructSequence.

The generated struct correctly uses #[repr(transparent)] to wrap PyTuple, ensuring the memory layout matches the inner type. The public inner field (pub) allows direct access when needed.


481-496: LGTM - PyPayload implementation for struct sequences.

The implementation correctly:

  • Delegates payload_type_id() to PyTuple to share the same type ID
  • Uses fast_issubclass in validate_downcastable_from to properly handle subclass relationships
  • Returns the static type from class()

This aligns with the external code snippets showing similar patterns.


498-513: LGTM - MaybeTraverse and PySubclass implementations.

The MaybeTraverse implementation correctly delegates to the inner PyTuple for GC traversal. The PySubclass implementation properly defines the base type relationship and provides access to the inner tuple via as_base().

crates/stdlib/src/socket.rs (5)

18-18: LGTM - PyOSError import added.

The import aligns with the updated error handling that now uses PyOSError subtypes.


1852-1852: LGTM - Timeout error now uses upcast.

The timeout_error(vm).upcast() correctly converts the PyRef<PyOSError> to PyBaseExceptionRef for the exception handling flow.


2420-2421: LGTM - Improved socket error construction.

Using new_os_subtype_error instead of new_exception preserves the specific error subtype (gaierror/herror), includes the error number, and maintains the error message. This provides better error information for debugging.


2424-2429: LGTM - Updated timeout error functions.

Both timeout_error and timeout_error_msg now return PyRef<PyOSError>, using new_os_subtype_error for consistent error construction with the timeout exception class. The None for errno is appropriate since timeouts don't have a specific system error number.


1829-1833: LGTM - From implementation for PyOSError.

This enables seamless conversion from PyRef<PyOSError> to IoOrPyException via upcast(), which safely converts the specialized exception type to its base PyBaseExceptionRef type through the generic method defined in crate::vm::object::core.

crates/vm/src/stdlib/ctypes/structure.rs (4)

17-20: LGTM - repr(transparent) for PyCStructType.

Consistent with the pattern applied to PyCSimpleType in base.rs, this ensures proper memory layout for the newtype wrapper.


222-229: Same duplication pattern as PyCSimple.

PyCStructure now has both _base: PyCData and cdata: PyRwLock<CDataObject>, following the same pattern as PyCSimple. The concern about potential inconsistencies between the two copies applies here as well.


299-304: Consistent construction pattern with PyCSimple.

The construction correctly creates a CDataObject and uses it for both _base and cdata. This is consistent with the approach in base.rs.


370-377: LGTM - Consistent construction in factory methods.

All factory methods (from_address, from_buffer, from_buffer_copy) follow the same pattern of creating a CDataObject and populating both _base and cdata fields consistently.

Also applies to: 423-430, 468-475

crates/vm/src/builtins/bool.rs (3)

4-4: LGTM - Import consolidation.

Minor formatting change to consolidate imports on one line.


88-97: LGTM - PyBool as repr(transparent) wrapper around PyInt.

The #[repr(transparent)] attribute ensures PyBool has the same layout as PyInt, enabling proper subclass payload handling. The Debug implementation correctly derives the boolean value from the inner bigint.

Note: The past review comment mentioned concerns about #[repr(C)] vs #[repr(transparent)] for base field layout. Using #[repr(transparent)] here is correct since it guarantees the same layout as the inner type.


21-34: Behavior change: TryFromBorrowedObject now accepts PyInt in addition to bool.

The implementation downcasts to PyInt instead of checking for the bool_type, accepting any integer where zero is falsy and non-zero is truthy. This aligns with Python semantics where bool is a subclass of int and bool(0) returns False while bool(n) returns True for any non-zero n.

Verify that this reflects the intended behavior: does the VM's type system follow Python's class hierarchy where bool is a subclass of int, or should bool conversion only accept actual bool instances? Check call sites and tests to confirm the expected usage.

crates/vm/src/stdlib/ctypes/union.rs (1)

15-44: Metaclass newtype looks fine; ensure fmt/clippy are clean.

The #[repr(transparent)] PyCUnionType(PyType) pattern is sensible for the new subclass payload layout; just make sure cargo fmt + cargo clippy are run since these metaclass wrappers can surface lint/trait/derive edge cases quickly.

crates/vm/src/stdlib/ctypes/array.rs (1)

25-28: PyCArrayType(PyType) wrapper is consistent with the PR direction.

crates/vm/src/stdlib/io.rs (5)

204-207: new_unsupported_operation() using new_os_subtype_error() is a nice consolidation; double-check the “errno None” semantics.
This matches the direction of the PR (centralized OS subtype construction). Just ensure callers don’t rely on UnsupportedOperation.errno being set in cases where CPython would leave it unset.


639-769: repr(transparent) base wrappers (_RawIOBase/_BufferedIOBase/_TextIOBase) look consistent; verify macro expectations around layout.
This is a core mechanism for the new payload layout. Please sanity-check that the #[pyclass(base = _IOBase)] machinery treats these as valid base payloads (esp. ZST base + transparent wrapper) across all supported compilers/targets.


1765-1950: Embedded _base fields: good direction; keep _base first everywhere to avoid layout surprises.
Most structs correctly place _base as the first field—please keep this invariant for any follow-ups (it matters a lot for transparent/base-embedded subclass payload schemes).

Also applies to: 2375-2382, 3656-3663, 3801-3810, 4440-4473


4245-4267: Singleton UnsupportedOperation type: avoid Context::genesis() if “multiple contexts” is a supported future.
If multiple Contexts are possible, consider keying the cache by Context (or storing on Context itself) rather than a process-global static created from genesis. If multiple contexts are impossible by design, a short comment here would prevent future refactors from breaking it.


82-97: The singleton Context assumption is architectural, not a bug. The code correctly uses Context::genesis() which is a true singleton via static initialization. All exception types and special objects in the codebase consistently rely on this same singleton Context design. This is an intentional architectural choice, not a hidden assumption. No changes are needed unless RustPython's design fundamentally shifts to support independent per-VM Contexts—which would require refactoring the entire context system, not just this function.

crates/vm/src/exceptions.rs (3)

966-969: Exception zoo extension for PyOSError is correctly simplified.
Dropping dynamic attribute wiring now that PyOSError has real fields is the right cleanup.


1235-1437: Broad #[repr(transparent)] wrapper rewrite looks consistent; please keep wrappers truly “transparent.”
Given how much relies on layout here, ensure none of these wrapper types ever gain extra fields (even accidentally) without also updating the subclass-payload assumptions. A brief doc comment near the pattern could help.

Also applies to: 1737-2204


1485-1522: No changes needed—current implementation is CPython-compatible.

The code correctly handles filename2 at index 4 when 5 arguments are provided. CPython uses the same argument structure across all platforms: OSError(errno, strerror, filename, winerror, filename2), where winerror at index 3 is simply unused/ignored on POSIX systems rather than being repurposed. There is no separate 4-argument form for POSIX systems.

Likely an incorrect or invalid review comment.

Comment on lines +109 to +142
pub fn new_os_error(&self, msg: impl ToPyObject) -> PyRef<PyBaseException> {
self.new_os_subtype_error(self.ctx.exceptions.os_error.to_owned(), None, msg)
.upcast()
}

pub fn new_os_subtype_error(
&self,
exc_type: PyTypeRef,
errno: Option<i32>,
msg: impl ToPyObject,
) -> PyRef<PyOSError> {
debug_assert_eq!(exc_type.slots.basicsize, std::mem::size_of::<PyOSError>());
let msg = msg.to_pyobject(self);

fn new_os_subtype_error_impl(
vm: &VirtualMachine,
exc_type: PyTypeRef,
errno: Option<i32>,
msg: PyObjectRef,
) -> PyRef<PyOSError> {
let args = match errno {
Some(e) => vec![vm.new_pyobj(e), msg],
None => vec![msg],
};
let payload =
PyOSError::py_new(&exc_type, args.into(), vm).expect("new_os_error usage error");
payload
.into_ref_with_type(vm, exc_type)
.expect("new_os_error usage error")
}

new_os_subtype_error_impl(self, exc_type, errno, msg)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Searching for call sites of new_os_subtype_error..."
rg -n --type=rust "new_os_subtype_error\\(" -S || true

echo
echo "Searching for new_exception usages referencing os_error or OSError..."
rg -n --type=rust "new_exception\\(.*os_error|new_exception\\(.*OSError" -S || true

echo
echo "Show vm/src/vm/vm_new.rs around the function for context (lines 1-220)..."
sed -n '1,220p' crates/vm/src/vm/vm_new.rs || true

Repository: RustPython/RustPython

Length of output: 10670


Avoid panics when constructing OSError subtypes

new_os_subtype_error_impl() calls PyOSError::py_new(...).expect(...) and into_ref_with_type(...).expect(...), which will panic in release builds if exc_type isn't a proper OSError subtype. Change the API to avoid hard panics — either return PyResult<PyRef> or keep the current signature but replace the expects with a non-panicking path that raises a Python TypeError (e.g., vm.new_type_error(...)) so misuse produces a Python exception instead of aborting the process. Applies to crates/vm/src/vm/vm_new.rs (lines ~109–141) and the similar usage around lines ~259–264.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (10)
crates/vm/src/stdlib/ctypes/function.rs (1)

212-227: Don’t silently coerce non-usize / negative “address” into NULL.
to_usize().unwrap_or(0) turns overflow/negative values into 0, which can later trigger confusing “Function pointer not set” / NULL-call behavior. Return a Python exception instead.

-        if let Ok(addr) = first_arg.try_int(vm) {
-            let ptr_val = addr.as_bigint().to_usize().unwrap_or(0);
+        if let Ok(addr) = first_arg.try_int(vm) {
+            let ptr_val = addr
+                .as_bigint()
+                .to_usize()
+                .ok_or_else(|| vm.new_overflow_error("function address does not fit in usize".to_owned()))?;
             return PyCFuncPtr {
                 _base: PyCData::new(CDataObject::from_bytes(vec![], None)),
                 ptr: PyRwLock::new(Some(CodePtr(ptr_val as *mut _))),
                 // ...
             }
             .into_ref_with_type(vm, cls)
             .map(Into::into);
         }
crates/vm/src/stdlib/ctypes/base.rs (1)

423-428: Derived cdata duplicates (and can desync from) base _base.cdata.
Because CDataObject contains an owned Vec<u8>, cdata.clone() copies the buffer; mutations through one path won’t be reflected in the other. This is especially risky since base methods (defined on _CData) will observe _base.cdata, while many derived ops mutate self.cdata.

Concrete fix direction: ensure a single shared backing store (e.g., Arc<PyRwLock<CDataObject>> in PyCData) or eliminate the duplicated cdata field from derived types and always use _base.cdata.

Also applies to: 658-668

crates/vm/src/stdlib/ctypes/structure.rs (1)

215-229: Same deep-clone backing-store split: _base and cdata won’t stay in sync.
All constructor paths build _base from cdata.clone(), but Clone on CDataObject copies the Vec<u8>. If any behavior reads/writes through the base _CData API, it can observe different bytes/objects than the structure’s own cdata.

Suggested fix is the same as for PyCSimple: unify on a single shared backing store (or store only _base and access its lock everywhere).

Also applies to: 296-304, 369-377, 422-431, 467-476

crates/vm/src/stdlib/ctypes/array.rs (5)

95-124: Avoid unwrap() and usize overflow in __mul__ (can panic / mis-size allocations).

  • let type_ref = zelf.downcast_ref::<PyType>().unwrap(); should be a TypeError instead of a panic.
  • let total_size = new_element_size * (n as usize); should be checked_mul, returning OverflowError on overflow.
-        let type_ref = zelf.downcast_ref::<PyType>().unwrap();
+        let type_ref = zelf
+            .downcast_ref::<PyType>()
+            .ok_or_else(|| vm.new_type_error("expected a type object".to_owned()))?;

         let new_element_size = inner_size;
-        let total_size = new_element_size * (n as usize);
+        let total_size = new_element_size
+            .checked_mul(n as usize)
+            .ok_or_else(|| vm.new_overflow_error("array size too large".to_owned()))?;

197-214: CDataObject is stored twice (deep clone) → double memory + potential divergent state.

This pattern:

  • let cdata = ...;
  • _base: PyCData::new(cdata.clone()),
  • cdata: PyRwLock::new(cdata),

very likely duplicates the Vec<u8> buffer and allows _base vs self.cdata to get out of sync.

Strongly consider making _base the only owner of the CDataObject and routing all accesses through it (or refactor PyCData to share a single lock/Arc with PyCArray).

 pub struct PyCArray {
     _base: PyCData,
-    pub(super) cdata: PyRwLock<CDataObject>,
+    // cdata should live in _base (single source of truth)
 }

241-248: PyCArray layout likely correct for subclass payloads, but cdata duplication should be removed.

Same root issue as above: keeping both _base: PyCData and cdata: PyRwLock<CDataObject> suggests two separate buffers.


290-317: Unchecked element_size * length can overflow; allocations should fail as Python errors (not wrap/panic).

let total_size = element_size * length; let mut buffer = vec![0u8; total_size]; should use checked_mul + map allocation failure to MemoryError / OverflowError.

-        let total_size = element_size * length;
-        let mut buffer = vec![0u8; total_size];
+        let total_size = element_size
+            .checked_mul(length)
+            .ok_or_else(|| vm.new_overflow_error("array size too large".to_owned()))?;
+        let mut buffer = vec![0u8; total_size];

(If you want to be stricter, consider Vec::try_reserve_exact to avoid aborting on OOM.)


507-548: All constructors repeat the _base = PyCData::new(cdata.clone()) / cdata = PyRwLock::new(cdata) duplication.

Please consolidate to a single canonical CDataObject storage (ideally via _base) across:

  • from_address
  • from_buffer
  • from_buffer_copy
  • in_dll

Otherwise modifications via buffer protocol / setters may not reflect in _base and vice versa.

Also applies to: 550-616, 617-675, 676-769

crates/vm/src/stdlib/io.rs (1)

20-73: Avoid unwrap() on set_attr for Windows winerror assignment (panic path).
Line 70 will abort the VM on an unexpected attribute failure; prefer propagating/ignoring error consistently (you already tolerate stderr failures elsewhere).

-            exc.as_object()
-                .set_attr("winerror", vm.new_pyobj(winerror), vm)
-                .unwrap();
+            let _ = exc
+                .as_object()
+                .set_attr("winerror", vm.new_pyobj(winerror), vm);
crates/vm/src/exceptions.rs (1)

1556-1602: Critical: slot_init will panic when called on OSError subclass instances.

This __init__ is inherited by all OSError subclasses. When invoked with zelf being a subclass instance (e.g., FileNotFoundError), downcast_ref::<PyOSError>() returns None because it checks exact typeid match, not subclass relationship. The unwrap() then panics.

The comment claims "pointer cast" but the code uses downcast_ref, which is type-ID based. Since OSError subclasses are #[repr(transparent)] wrappers with identical memory layout, use unsafe pointer casting instead:

-            #[allow(deprecated)]
-            let exc: &Py<PyOSError> = zelf.downcast_ref::<PyOSError>().unwrap();
+            // SAFETY: OSError subclasses are #[repr(transparent)] wrappers around PyOSError.
+            // Memory layout is guaranteed identical, enabling safe pointer cast access.
+            let exc: &Py<PyOSError> = unsafe {
+                &*(zelf.as_object_ref() as *const _ as *const Py<PyOSError>)
+            };

(Adjust the casting pattern to match the project's conventions for transparent wrapper access.)

♻️ Duplicate comments (5)
crates/vm/src/object/core.rs (1)

1076-1103: Unsafe unreachable_unchecked concern already raised in prior review.

The past review correctly identified that using unsafe { std::hint::unreachable_unchecked() } can cause undefined behavior if type invariants are violated (e.g., due to macro bugs). The suggested fix to use unreachable!() with descriptive messages remains applicable.

crates/vm/src/vm/vm_new.rs (1)

109-141: Avoid panics when constructing OSError subtypes (still present).
This is the same issue previously raised: new_os_subtype_error_impl() uses .expect(...) and can abort the process if the passed exc_type is wrong. Please switch to a non-panicking path (return PyResult<PyRef<PyOSError>> or raise a Python TypeError).

Also applies to: 259-264

crates/derive-impl/src/pyclass.rs (1)

308-385: type_matches_path() is still too permissive for a layout-safety gate.
Matching only the last segment can accept the wrong base type from another module, undermining the base-field/layout invariant.

crates/vm/src/stdlib/ctypes/union.rs (1)

124-129: _base: PyCData + cdata: PyRwLock<CDataObject> still deep-clones the buffer (double alloc + divergence).

This repeats the previously raised issue: PyCData::new(cdata.clone()) very likely duplicates the Vec<u8> backing store and can lead to _base and cdata observing different bytes after mutation.

Recommendation remains: make _base the single source of truth for CDataObject, and route buffer access through it (or refactor PyCData to share a single lock/Arc with the outer type).

Also applies to: 176-184, 209-217, 257-264, 296-303

crates/stdlib/src/ssl.rs (1)

412-427: Regression: SSLWantReadError / SSLWantWriteError should not set errno (CPython keeps it None).

This reintroduces the earlier mismatch by using Some(SSL_ERROR_WANT_READ) / Some(SSL_ERROR_WANT_WRITE).

 pub(super) fn create_ssl_want_read_error(vm: &VirtualMachine) -> PyRef<PyOSError> {
     vm.new_os_subtype_error(
         PySSLWantReadError::class(&vm.ctx).to_owned(),
-        Some(SSL_ERROR_WANT_READ),
+        None,
         "The operation did not complete (read)",
     )
 }

 pub(super) fn create_ssl_want_write_error(vm: &VirtualMachine) -> PyRef<PyOSError> {
     vm.new_os_subtype_error(
         PySSLWantWriteError::class(&vm.ctx).to_owned(),
-        Some(SSL_ERROR_WANT_WRITE),
+        None,
         "The operation did not complete (write)",
     )
 }
In CPython's ssl module, do SSLWantReadError / SSLWantWriteError set the OSError errno attribute, or is it typically None?
🧹 Nitpick comments (4)
crates/vm/src/stdlib/ctypes/pointer.rs (1)

80-80: Consider extracting repeated base initialization to a helper.

All five construction sites use identical initialization:

_base: PyCData::new(CDataObject::from_bytes(vec![], None))

This could be simplified with a helper method like PyCData::empty() or similar, which would reduce duplication and make the intent clearer. This matches the pattern seen in other ctypes types and would improve maintainability.

Example helper (could be added to PyCData):

impl PyCData {
    pub fn empty() -> Self {
        Self::new(CDataObject::from_bytes(vec![], None))
    }
}

Also applies to: 130-130, 175-175, 212-212, 268-268

crates/vm/src/stdlib/ctypes/function.rs (1)

164-176: Avoid repeating ad-hoc _base initialization; also consider whether a 0-byte base buffer is sufficient for CFuncPtr semantics.
Right now every constructor path re-creates _base with an empty CDataObject. If CFuncPtr is expected to behave like other _CData instances (e.g., byref/addressof-style ops), a consistent base init helper will reduce drift and make it easier to adjust buffer sizing later.

 impl Constructor for PyCFuncPtr {
     type Args = FuncArgs;

     fn slot_new(cls: PyTypeRef, args: FuncArgs, vm: &VirtualMachine) -> PyResult {
+        fn new_base() -> PyCData {
+            PyCData::new(CDataObject::from_bytes(vec![], None))
+        }
         // ...
         if args.args.is_empty() {
             return PyCFuncPtr {
-                _base: PyCData::new(CDataObject::from_bytes(vec![], None)),
+                _base: new_base(),
                 // ...
             }
             // ...
         }
         // ...
         if let Ok(addr) = first_arg.try_int(vm) {
             let ptr_val = addr.as_bigint().to_usize().unwrap_or(0);
             return PyCFuncPtr {
-                _base: PyCData::new(CDataObject::from_bytes(vec![], None)),
+                _base: new_base(),
                 // ...
             }
             // ...
         }
         // ... apply similarly to the other constructor arms ...
     }
 }

Also applies to: 195-207, 215-227, 275-287, 319-331

crates/vm/src/stdlib/io.rs (1)

639-769: #[repr(transparent)] base wrappers: confirm derive-macro/layout expectations.
_RawIOBase(_IOBase) / _BufferedIOBase(_IOBase) / _TextIOBase(_IOBase) are a big semantic shift; please ensure the #[pyclass(base = _IOBase)] machinery expects an embedded base payload and that the tuple-field order matches what the new subclass payload layout requires.

crates/vm/src/exceptions.rs (1)

1683-1735: Getset backing fields via PyAtomicRef is fine; consider making setters return PyResult<()> if they can fail.
If swap_to_temporary_refs is infallible, current signature is fine; if it can allocate/GC-root in a fallible way, prefer surfacing errors.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 941b373 and 9017be0.

📒 Files selected for processing (21)
  • crates/derive-impl/src/pyclass.rs (8 hunks)
  • crates/stdlib/src/socket.rs (4 hunks)
  • crates/stdlib/src/ssl.rs (20 hunks)
  • crates/stdlib/src/ssl/compat.rs (5 hunks)
  • crates/vm/src/builtins/int.rs (1 hunks)
  • crates/vm/src/exception_group.rs (1 hunks)
  • crates/vm/src/exceptions.rs (13 hunks)
  • crates/vm/src/object/core.rs (2 hunks)
  • crates/vm/src/stdlib/ctypes.rs (3 hunks)
  • crates/vm/src/stdlib/ctypes/array.rs (8 hunks)
  • crates/vm/src/stdlib/ctypes/base.rs (4 hunks)
  • crates/vm/src/stdlib/ctypes/field.rs (1 hunks)
  • crates/vm/src/stdlib/ctypes/function.rs (6 hunks)
  • crates/vm/src/stdlib/ctypes/pointer.rs (7 hunks)
  • crates/vm/src/stdlib/ctypes/structure.rs (6 hunks)
  • crates/vm/src/stdlib/ctypes/union.rs (6 hunks)
  • crates/vm/src/stdlib/io.rs (22 hunks)
  • crates/vm/src/stdlib/os.rs (1 hunks)
  • crates/vm/src/vm/context.rs (7 hunks)
  • crates/vm/src/vm/vm_new.rs (3 hunks)
  • extra_tests/snippets/stdlib_ctypes.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/vm/src/builtins/int.rs
  • crates/vm/src/stdlib/ctypes/field.rs
  • crates/stdlib/src/ssl/compat.rs
  • extra_tests/snippets/stdlib_ctypes.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/stdlib/ctypes/pointer.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/vm/src/stdlib/ctypes/base.rs
  • crates/vm/src/stdlib/os.rs
  • crates/stdlib/src/socket.rs
  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/stdlib/ctypes/function.rs
  • crates/stdlib/src/ssl.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/stdlib/ctypes/union.rs
  • crates/vm/src/exception_group.rs
  • crates/vm/src/stdlib/ctypes/array.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/stdlib/io.rs
  • crates/vm/src/stdlib/ctypes/structure.rs
  • crates/vm/src/exceptions.rs
🧠 Learnings (6)
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.

Applied to files:

  • crates/derive-impl/src/pyclass.rs
  • crates/stdlib/src/socket.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/exception_group.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/derive-impl/src/pyclass.rs
  • crates/stdlib/src/socket.rs
  • crates/vm/src/stdlib/ctypes.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-06-28T16:31:03.991Z
Learnt from: arihant2math
Repo: RustPython/RustPython PR: 5790
File: build.rs:2-2
Timestamp: 2025-06-28T16:31:03.991Z
Learning: In Cargo build scripts (build.rs), the environment variable CARGO_CFG_TARGET_OS is guaranteed to exist and is automatically set by Cargo during the build process, making unwrap() safe to use when accessing this variable.

Applied to files:

  • crates/vm/src/stdlib/os.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.

Applied to files:

  • crates/stdlib/src/socket.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/exceptions.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Follow Rust best practices for error handling and memory management

Applied to files:

  • crates/vm/src/vm/vm_new.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Minimize modifications to CPython standard library files in the `Lib/` directory; modifications should be minimal and only to work around RustPython limitations

Applied to files:

  • crates/vm/src/stdlib/io.rs
🧬 Code graph analysis (8)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (3)
  • new (223-227)
  • new (261-267)
  • from_bytes (184-191)
crates/vm/src/stdlib/ctypes/base.rs (4)
crates/vm/src/stdlib/ctypes/field.rs (1)
  • new (37-54)
crates/vm/src/stdlib/ctypes/function.rs (1)
  • value (52-52)
crates/vm/src/builtins/type.rs (2)
  • value (154-154)
  • value (964-964)
crates/vm/src/builtins/object.rs (1)
  • value (460-460)
crates/vm/src/stdlib/ctypes.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
  • from_bytes (184-191)
  • size (211-213)
  • new (223-227)
  • new (261-267)
crates/vm/src/stdlib/ctypes/function.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
  • ffi_type_from_str (23-46)
  • new (223-227)
  • new (261-267)
  • from_bytes (184-191)
crates/vm/src/vm/vm_new.rs (1)
crates/vm/src/exceptions.rs (7)
  • new (547-555)
  • new (1061-1063)
  • errno (1533-1534)
  • errno (1535-1535)
  • errno (1685-1687)
  • errno_to_exc_type (1163-1189)
  • errno_to_exc_type (1192-1194)
crates/vm/src/object/core.rs (2)
crates/vm/src/stdlib/ast/other.rs (1)
  • object (39-39)
crates/vm/src/class.rs (1)
  • static_type (24-32)
crates/vm/src/exception_group.rs (1)
crates/vm/src/stdlib/builtins.rs (1)
  • repr (777-779)
crates/vm/src/exceptions.rs (2)
crates/vm/src/vm/mod.rs (3)
  • class (553-562)
  • value (646-646)
  • value (648-648)
crates/common/src/os.rs (1)
  • winerror_to_errno (129-227)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (36)
crates/vm/src/stdlib/os.rs (1)

528-536: LGTM - Correct adaptation to new error type hierarchy.

The intermediate variable and .upcast() call properly converts the more specific PyRef<PyOSError> returned by new_errno_error to the generic PyBaseExceptionRef required by PyResult. This aligns with the PR's broader changes to return typed OS error references.

crates/vm/src/exception_group.rs (1)

49-50: LGTM - Correct newtype wrapper for subclass support.

The #[repr(transparent)] newtype wrapper around PyBaseException enables the PySubclass trait machinery while maintaining ABI compatibility. This allows PyBaseExceptionGroup to leverage the new into_base() and upcast() methods on PyRef<T>.

crates/vm/src/object/core.rs (1)

29-32: LGTM - Required imports for new upcast/downcast utilities.

These imports support the new into_base() and upcast() methods below.

crates/vm/src/vm/context.rs (6)

4-7: LGTM - Import updates for PyBool type changes.

The imports correctly reflect the new PyBool newtype wrapper and the updated StaticType path.

Also applies to: 17-17


35-36: LGTM - Type-safe boolean value fields.

Changing true_value and false_value from PyIntRef to PyRef<PyBool> provides stronger type guarantees, enabling callers to work with the more specific boolean type when needed.


283-285: LGTM - Simplified trait bound.

The PyPayload bound is removed since PyObjectPayload (or the type's usage context) already implies the necessary capabilities.


306-307: LGTM - PyBool newtype construction.

The boolean values are now created as PyBool(PyInt::from(...)) wrappers, aligning with the new subclass payload layout where PyBool wraps PyInt as its base.


450-456: LGTM - Updated return type for new_bool.

The signature now returns PyRef<PyBool> instead of PyIntRef, providing type-safe access to boolean values. Callers can use .upcast() or .into() if they need the value as a PyIntRef or PyObjectRef.


511-521: LGTM - Explicit slot construction for exception types.

Constructing PyTypeSlots explicitly with heap_type_flags() and HAS_DICT is appropriate for dynamically created exception types. This provides clearer control over the type's capabilities compared to deriving from a base's make_slots().

crates/vm/src/stdlib/ctypes/pointer.rs (3)

7-7: LGTM - Required imports for base payload pattern.

Importing CDataObject and PyCData enables the subclass payload layout used across ctypes types.


12-14: LGTM - repr(transparent) newtype for metaclass.

The PyCPointerType as a transparent wrapper around PyType enables proper metaclass behavior while supporting the PySubclass trait machinery.


64-67: LGTM - Base payload field for subclass layout.

Adding the _base: PyCData field follows the established pattern for ctypes types, enabling proper inheritance and memory layout for the subclass hierarchy.

crates/vm/src/stdlib/ctypes.rs (1)

18-18: Public re-export + internal import changes look consistent with the new base model.

Also applies to: 50-57

crates/vm/src/stdlib/ctypes/base.rs (2)

222-228: PyCData::new(...) helper is a nice addition.


238-242: repr(transparent) wrapper for PyCSimpleType matches the new base-field validation expectations.

crates/derive-impl/src/pyclass.rs (1)

431-449: Please double-check BASICSIZE = 0 semantics for repr(transparent)+base in the runtime allocator.
If 0 is meant to mean “no extra beyond base”, make sure the allocator/class machinery still uses the base’s actual basicsize for instance allocation (otherwise this risks under-allocation). A targeted test exercising allocation + field access for a repr(transparent) subclass would be ideal here.

crates/vm/src/stdlib/ctypes/structure.rs (1)

17-21: repr(transparent) PyCStructType(PyType) is aligned with the new subclass/base rules.

crates/stdlib/src/socket.rs (2)

1829-1833: From<PyRef<PyOSError>> for IoOrPyException is a clean integration for the new error path.

Also applies to: 1849-1856


2390-2422: Please ensure gaierror/herror/timeout are guaranteed OSError subtypes, or this can currently panic via new_os_subtype_error.
Given new_os_subtype_error still uses .expect(...), any accidental non-OSError subtype here would abort instead of raising a Python exception. Once new_os_subtype_error is made non-panicking, this becomes much safer.

Also applies to: 2424-2429

crates/vm/src/stdlib/ctypes/array.rs (1)

25-28: PyCArrayType(PyType) wrapper looks consistent with the new subclass/base layout.

crates/vm/src/stdlib/ctypes/union.rs (1)

15-18: PyCUnionType(PyType) wrapper looks aligned with the new base-payload approach.

crates/stdlib/src/ssl.rs (3)

40-43: Exception newtype wrappers + updated imports look consistent with the subclass payload refactor.

Also applies to: 346-410


1619-1622: set_ciphers correctly maps parse errors to SSLError (OS-subtype path).


1983-2000: Error mapping to SSLError/subtypes and .upcast() usage looks coherent after the refactor.

The protocol-mismatch errors, shutdown read/write guards, SNI invalid return-type error, EOF mapping, timeout mapping, and “unsupported PHA” path are all consistently expressed via new_os_subtype_error(...).upcast() and subtype classes.

Also applies to: 2053-2070, 2788-2801, 3346-3354, 3398-3422, 3474-3485, 3538-3542, 4096-4107

crates/vm/src/stdlib/io.rs (8)

204-207: New new_unsupported_operation() construction looks good, but confirm None errno is desired.
If new_os_subtype_error(type, errno, msg) uses errno for subclass selection or message formatting, ensure None doesn’t accidentally produce a default errno-dependent subtype.


425-503: _IOBase::__closed now returns bool — good, but double-check attribute semantics.
Returning PyRef<PyBool> aligns with Python closed: bool. Please verify no callers relied on int-like behavior (e.g., arithmetic) and that closed property still round-trips correctly through __closed.


1765-1950: Embedded _base fields on buffered types look consistent.
The _base: _BufferedIOBase additions + Default derivations appear coherent with the “base embedding” goal.


2375-2442: TextIOWrapper base embedding looks fine; watch locking API consistency.
The new _base: _TextIOBase field is fine; ensure Default doesn’t accidentally create a usable-but-uninitialized instance in Python space (you already gate operations on Option<TextIOData>).


3656-3826: StringIO/BytesIO constructors updated correctly for embedded base.
_base: Default::default() in both constructors matches the new layout and keeps initialization explicit.


4440-4473: FileIO base embedding + Default updated cleanly.
_base: Default::default() and the new field layout are consistent with the rest of the IO refactor.


82-97: Consider documenting the implicit assumption that all VirtualMachine instances share Context::genesis().

The pattern of using _io::unsupported_operation().to_owned() — which stores a PyType created from Context::genesis() in a static singleton — is consistent with how exception_group.rs handles exception_group(). However, this relies on an implicit design invariant: the public Interpreter::with_init() API always initializes VirtualMachine with Context::genesis(). While VirtualMachine::new() accepts arbitrary contexts, this invariant is currently undocumented. Consider adding a comment to Context::genesis() or the VirtualMachine constructor clarifying this assumption, or verify the design prevents custom contexts from being used with the standard module initialization.


4245-4267: The current design is intentional: Context::genesis() creates a single global context with shared builtin types. The use of Context::genesis() in unsupported_operation() is consistent with how other builtin exception types (e.g., exception_group()) are implemented. RustPython's architecture assumes one genesis context per application; multiple contexts are not a supported usage pattern. Identity checks via fast_isinstance work correctly because all VMs use the same shared static type instance, not separate per-context instances.

Likely an incorrect or invalid review comment.

crates/vm/src/exceptions.rs (4)

966-969: Dropping dynamic attrs for PyOSError is a good simplification.
Relying on getsets for standard fields should reduce per-instance dict traffic and make layout more predictable.


1438-1482: GC traversal delegation via self.base.try_traverse(...) looks right.
Good to see the embedded base being traversed before visiting the extra optional fields.


1524-1546: Subtype selection in slot_new is reasonable; ensure it can’t recurse/loop.
The vm.invoke_exception(typ, args_vec) call path should not come back into the errno-specialization branch for non-OSError classes (your cls.name() guard seems to prevent it). Worth sanity-checking with a small test for OSError(EPIPE, ...) etc.


1597-1601: __reduce__ / args truncation: verify it matches CPython for filename2 on non-Windows.
Once filename2 is fixed for 4-arg non-Windows, ensure __reduce__ includes it (and that truncating base args to 2 doesn’t lose reconstruction fidelity).

Also applies to: 1646-1681

Comment on lines +485 to +503
// Generate PySubclass impl for #[repr(transparent)] types with base class
// (tuple struct assumed, so &self.0 works)
let subclass_impl = if !is_pystruct && is_repr_transparent {
base.as_ref().map(|typ| {
quote! {
impl ::rustpython_vm::class::PySubclass for #ident {
type Base = #typ;

#[inline]
fn as_base(&self) -> &Self::Base {
&self.0
}
}
}
})
} else {
None
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

PySubclass::as_base() assumes tuple structs (self.0) but validate_base_field also allows named-field structs.
For #[repr(transparent)] struct S { base: Base } this will generate invalid code. Either (a) restrict repr(transparent)+base to tuple structs, or (b) generate as_base() using the actual first field (named vs unnamed) discovered during validation.

🤖 Prompt for AI Agents
In crates/derive-impl/src/pyclass.rs around lines 485-503, the generated
PySubclass::as_base() unconditionally uses tuple struct access (&self.0) which
is invalid for named-field repr(transparent) structs; update the codegen to use
the actual first field determined during validation: have the validator record
whether the single field is named or unnamed and its identifier or index, then
emit as_base() using &self.<field_ident> for named fields and &self.0 for
unnamed tuple fields (alternatively, you may tighten validation to only allow
tuple structs and emit a clear compile error for named fields).

Comment on lines 1272 to 1280
cert::validate_cert_key_match(&certs, &key).map_err(|e| {
vm.new_exception_msg(
PySSLError::class(&vm.ctx).to_owned(),
if e.contains("key values mismatch") {
"[SSL: KEY_VALUES_MISMATCH] key values mismatch".to_owned()
} else {
e
},
)
let msg = if e.contains("key values mismatch") {
"[SSL: KEY_VALUES_MISMATCH] key values mismatch".to_owned()
} else {
e
};
vm.new_os_subtype_error(PySSLError::class(&vm.ctx).to_owned(), Some(0), msg)
.upcast()
})?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

CPython ssl module SSLError key values mismatch errno

💡 Result:

Short answer: that SSLError comes from OpenSSL’s X509_check_private_key failing — the private key doesn’t match the certificate (or the certificate chain is ordered/packed incorrectly). Python’s ssl module surfaces that OpenSSL error as ssl.SSLError (often shown as "key values mismatch"); the underlying OpenSSL error queue can make the reported errno/message come from an earlier call, so the numeric errno/text may look mismatched. [1][2][3][4]

How to confirm and fix (quick):

  • Verify the certificate and private key match (compare moduli or public keys):
    • openssl x509 -noout -modulus -in cert.crt | openssl sha256
    • openssl rsa -noout -modulus -in privkey.key | openssl sha256
      (or compare public keys with openssl pkey -pubout / openssl x509 -noout -pubkey | openssl sha256). [1][2]
  • If using a chain file, ensure the leaf certificate appears first and intermediates follow; wrong order can trigger checks to fail. [2][3]
  • If you still see confusing errno/text, it can be caused by stale errors left on OpenSSL’s error queue by earlier API calls; clearing/consuming the queue (or fixing the code path that left the error) is required — this is a known cause in other language bindings (see bug reports). [4]

If you want, paste the exact SSLError traceback and the openssl command outputs and I’ll help interpret them.


🌐 Web query:

SSL_ERROR_* constants CPython ssl.SSLError errno

💡 Result:

Short answer — which SSL_ERROR_* constants exist in CPython's ssl module, and how they relate to ssl.SSLError.errno:

  • Common SSL_ERROR_* constants exposed by the ssl module (enum ssl.SSLErrorNumber) include:

    • SSL_ERROR_ZERO_RETURN
    • SSL_ERROR_WANT_READ
    • SSL_ERROR_WANT_WRITE
    • SSL_ERROR_WANT_X509_LOOKUP
    • SSL_ERROR_SYSCALL
    • SSL_ERROR_SSL
    • SSL_ERROR_WANT_CONNECT
    • SSL_ERROR_EOF
    • SSL_ERROR_INVALID_ERROR_CODE. [1][2]
  • How they appear in exceptions:

    • ssl.SSLError (and SSLError.args) can contain one of those SSL_ERROR_* codes (available as the ssl.SSLErrorNumber IntEnum since Python 3.6). [1]
    • The SSLError.errno attribute is not always an OS errno; it may be one of the SSL_ERROR_* codes or None for many SSL errors (some SSL errors do not correspond to a single OS error). See Python issue discussion that explains errno can be None for non-OS errors. [3]

References

  • Python ssl module docs (constants & SSLErrorNumber). [1]
  • Extracted constants listing from stdlib/third-party mirror. [2]
  • Python issue explaining SSLError.errno semantics (errno may be None). [3]

If you want, I can:

  • show these constants programmatically from your installed Python version (via a small snippet you can run), or
  • map typical cases (e.g. SSL_ERROR_SYSCALL) to when underlying OS errno is used.

🏁 Script executed:

# Search the codebase to understand how SSLError errno is typically set in this file
rg -n "new_os_subtype_error" crates/stdlib/src/ssl.rs -A 2 -B 2

Repository: RustPython/RustPython

Length of output: 5989


🏁 Script executed:

# Check how errno values are used elsewhere in ssl.rs for context
rg -n "Some\(" crates/stdlib/src/ssl.rs | grep -E "(errno|error|SSL)" | head -20

Repository: RustPython/RustPython

Length of output: 565


🏁 Script executed:

# Search for SSL_ERROR_SSL or other SSL_ERROR constants defined in the file
rg -n "SSL_ERROR" crates/stdlib/src/ssl.rs | head -30

Repository: RustPython/RustPython

Length of output: 595


🏁 Script executed:

# Check what constants are defined for SSL errors
rg -n "const.*SSL_ERROR" crates/stdlib/src/ssl.rs

Repository: RustPython/RustPython

Length of output: 508


🏁 Script executed:

# Look at cert validation error handling pattern in context
sed -n '1300,1320p' crates/stdlib/src/ssl.rs

Repository: RustPython/RustPython

Length of output: 972


🏁 Script executed:

# Check function context for line 1278
sed -n '1250,1285p' crates/stdlib/src/ssl.rs

Repository: RustPython/RustPython

Length of output: 1556


🏁 Script executed:

# Check function context for line 1310
sed -n '1290,1325p' crates/stdlib/src/ssl.rs

Repository: RustPython/RustPython

Length of output: 1709


Change Some(0) to None for consistent SSLError errno semantics.

Line 1278 uses Some(0) (SSL_ERROR_NONE) for the key mismatch error, but identical key validation errors on lines 1303 and 1314 use None. Since SSL_ERROR_NONE should never be used as an errno for an actual error condition, and CPython's ssl module raises SSLError with errno as None for generic SSL errors, this should be None for consistency with the rest of the codebase.

vm.new_os_subtype_error(PySSLError::class(&vm.ctx).to_owned(), None, msg)
🤖 Prompt for AI Agents
In crates/stdlib/src/ssl.rs around lines 1272 to 1280, the SSLError construction
uses Some(0) for the errno which is inconsistent with the other key validation
errors at lines ~1303 and ~1314 and wrongly represents an actual error with
SSL_ERROR_NONE; change the errno argument from Some(0) to None when calling
vm.new_os_subtype_error so the SSLError is created with errno None, matching
CPython semantics and the surrounding code.

Comment on lines +1870 to 1881
// Create FileNotFoundError with errno=ENOENT (2)
let exc = vm.new_os_subtype_error(
vm.ctx.exceptions.file_not_found_error.to_owned(),
vec![
vm.ctx.new_int(2).into(), // errno = ENOENT (2)
vm.ctx.new_str("No such file or directory").into(),
vm.ctx.new_str(path_str.clone()).into(), // filename
],
Some(2), // errno = ENOENT (2)
"No such file or directory",
);
return Err(exc);
// Set filename attribute
let _ = exc
.as_object()
.set_attr("filename", vm.ctx.new_str(path_str.clone()), vm);
return Err(exc.upcast());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -name "ssl.rs" -type f 2>/dev/null | head -5

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# First, find the ssl.rs file
find . -path "*/stdlib/src/ssl.rs" -type f

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Search for ssl.rs in the repository
fd -t f "ssl.rs" | head -10

Repository: RustPython/RustPython

Length of output: 118


🏁 Script executed:

# Read the ssl.rs file around lines 1870-1881 with context
sed -n '1850,1900p' ./crates/stdlib/src/ssl.rs | cat -n

Repository: RustPython/RustPython

Length of output: 2713


🏁 Script executed:

# Get more context - read from line 1850 to 1900 to see the function signature
head -1900 ./crates/stdlib/src/ssl.rs | tail -60 | cat -n

Repository: RustPython/RustPython

Length of output: 3016


🏁 Script executed:

# Let's check the file size first
wc -l ./crates/stdlib/src/ssl.rs

Repository: RustPython/RustPython

Length of output: 96


🏁 Script executed:

# Search for how other modules use new_os_subtype_error and errno
rg -A5 "new_os_subtype_error" --type rs | head -80

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Search for FileNotFoundError usage patterns
rg -B3 -A3 "file_not_found_error" --type rs | head -100

Repository: RustPython/RustPython

Length of output: 91


🏁 Script executed:

# Check how CPython's ssl module handles file not found errors
web_search

Repository: RustPython/RustPython

Length of output: 115


🏁 Script executed:

# Search for how other modules use new_os_subtype_error
rg "new_os_subtype_error" -A5 | head -100

Repository: RustPython/RustPython

Length of output: 5611


🏁 Script executed:

# Search for FileNotFoundError patterns in the codebase
rg "file_not_found_error" -B3 -A3 | head -100

Repository: RustPython/RustPython

Length of output: 7049


🏁 Script executed:

# Check if there's any errno constant definition or usage patterns
rg "errno.*2\|ENOENT" --max-count 20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check how std::fs errors are handled elsewhere
rg "std::fs::" | grep -i "error\|map_err" | head -20

Repository: RustPython/RustPython

Length of output: 483


🏁 Script executed:

# Check for errno/ENOENT constants
rg "ENOENT|errors::" crates/vm/src/exceptions.rs | head -20

Repository: RustPython/RustPython

Length of output: 1148


🏁 Script executed:

# Look at the full socket.rs pattern for errno extraction
rg -B5 -A5 "error_num\(\)" crates/stdlib/src/socket.rs | head -40

Repository: RustPython/RustPython

Length of output: 966


🏁 Script executed:

# Check the error mapping in exceptions.rs more carefully
rg -B2 -A2 "ENOENT" crates/vm/src/exceptions.rs

Repository: RustPython/RustPython

Length of output: 364


🌐 Web query:

CPython ssl.load_dh_params FileNotFoundError errno Windows POSIX behavior

💡 Result:

Short answer

  • ssl.SSLContext.load_dh_params(dhfile) opens the file with the C runtime and calls OpenSSL's PEM_read_DHparams; if fopen/_Py_fopen_obj fails Python exposes the OS error (FileNotFoundError / OSError) with errno. [1][2]
  • On POSIX this typically behaves as you expect (fopen fails → errno = ENOENT → FileNotFoundError). On Windows the same underlying fopen can also fail for other reasons (path encoding/mbcs issues, wide/ANSI conversion, or Windows-specific fopen behavior), so you can also get FileNotFoundError or other OSError variants even when a path that looks valid to Python strings is used. This is a known class of platform differences for ssl file-loading APIs. [2][3]

Why this happens (brief technical outline)

  • CPython's _ssl implementation calls _Py_fopen_obj(filepath, "rb") and then PEM_read_DHparams on the FILE*. If _Py_fopen_obj returns NULL, CPython inspects errno and raises an OSError (FileNotFoundError if errno==ENOENT). If PEM_read_DHparams returns NULL with errno==0, a different SSL error is raised. (Code and reproduction notes shown in community / source snippets). [2]
  • Differences in how Windows converts Python filename objects to the C fopen call (MBCS/ANSI vs wide APIs used by Python internals, and what OpenSSL or the C runtime expects) produce platform-dependent failures; similar issues have been reported and fixed historically for other ssl file-loading functions (e.g. load_verify_locations). [3][4]

What to do

  • Verify the file exists and is readable from Python (os.path.exists, open(...,'rb')). If that succeeds but load_dh_params still fails, the problem is likely an encoding/ABI/path-form issue on Windows (non-ASCII characters, reserved names, long paths). Use a plain ASCII path or a path exposed via a file descriptor opened in Python and re-create the file in a known-good location. [1][2]
  • If you need a robust workaround on Windows for non-encodable paths, open the file yourself in Python (open(path,'rb')) to confirm accessibility; report/attach a minimal reproducer to the CPython issue tracker if fopen succeeds but load_dh_params still fails. See linked reports about related Windows path/encoding behavior in the ssl module. [3][4]

Sources

  • CPython ssl docs (SSLContext.load_dh_params). [1]
  • Community excerpt showing the _ssl.c implementation calling _Py_fopen_obj and handling errno / PEM_read_DHparams. [2]
  • Python bug tracker / issue about Windows path/encoding differences for ssl.load_verify_locations (same class of problem). [3]
  • Python ssl module documentation note about platform-dependent behavior (OpenSSL/OS differences). [4]

References
[1] Python ssl docs — SSLContext.load_dh_params. (docs.python.org)
[2] Community discussion / snippet showing _ssl__SSLContext_load_dh_params implementation and errno handling. (community.letsencrypt.org)
[3] Python issue: SSLContext.load_verify_locations Windows encoding/path behavior. (bugs.python.org)
[4] Python ssl module notes on platform/OpenSSL differences. (docs.python.org)

Citations:


🏁 Script executed:

# Check the stat module mentioned in learnings for platform-specific handling pattern
rg -B5 -A5 "SF_SUPPORTED\|SF_SYNTHETIC" crates/stdlib/src/stat.rs | head -50

Repository: RustPython/RustPython

Length of output: 130


🏁 Script executed:

# Check if there's a utility function to convert std::io::Error to proper errno
rg -B3 -A3 "e.kind\(\)" crates/stdlib/src/ | head -60

Repository: RustPython/RustPython

Length of output: 3925


🏁 Script executed:

# Check how into_pyexception() works
rg -B3 -A5 "into_pyexception" crates/stdlib/src/ssl.rs | head -60

Repository: RustPython/RustPython

Length of output: 1890


🏁 Script executed:

# Check the trait definition or implementation
rg "into_pyexception" crates/ | grep -E "(fn|impl|trait)" | head -20

Repository: RustPython/RustPython

Length of output: 1166


🏁 Script executed:

# Look at the specific load_dh_params code more carefully
sed -n '1840,1895p' ./crates/stdlib/src/ssl.rs | cat -n

Repository: RustPython/RustPython

Length of output: 2880


🏁 Script executed:

# Check the into_pyexception implementation for std::io::Error
rg -B5 -A15 "impl.*into_pyexception.*std::io::Error|fn into_pyexception.*io::Error" crates/

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check os.rs implementation
sed -n '1,100p' crates/vm/src/stdlib/os.rs | grep -A20 "into_pyexception"

Repository: RustPython/RustPython

Length of output: 982


🏁 Script executed:

# Find the exact implementation in os.rs
rg -B2 -A10 "fn into_pyexception" crates/vm/src/stdlib/os.rs

Repository: RustPython/RustPython

Length of output: 668


🏁 Script executed:

# Find the io::Error into_pyexception implementation
rg -B5 -A15 "impl.*IntoPyException.*for.*io::Error" crates/

Repository: RustPython/RustPython

Length of output: 1183


🏁 Script executed:

# Check the convert/to_pyobject.rs file for io::Error
rg -B5 -A20 "io::Error" crates/vm/src/convert/to_pyobject.rs | head -80

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Find to_pyexception implementation
rg -B5 -A20 "fn to_pyexception" crates/vm/src/stdlib/io.rs | head -100

Repository: RustPython/RustPython

Length of output: 1160


🏁 Script executed:

# Check the full to_pyexception implementation to see how it maps errno to exception types
rg -B5 -A50 "fn to_pyexception" crates/vm/src/stdlib/io.rs | tail -80

Repository: RustPython/RustPython

Length of output: 2135


Refactor to eliminate the existence check and use proper error handling. Instead of checking Path::exists() and hardcoding errno=2, try to read the file directly and match on std::io::ErrorKind to convert errors properly via into_pyexception(). This approach:

  • Properly extracts errno from the actual OS error (works on Windows and POSIX)
  • Matches the pattern used elsewhere in the same file
  • Fixes the race condition where the file could be deleted between the existence check and read
🤖 Prompt for AI Agents
In crates/stdlib/src/ssl.rs around lines 1870 to 1881, remove the Path::exists()
pre-check and instead attempt to open/read the file directly, match on the
resulting std::io::Error (using error.kind()) and convert it to the appropriate
Python exception via into_pyexception() (as done elsewhere in this file), then
set the "filename" attribute on that created exception and return
Err(exc.upcast()); this preserves the actual OS errno/info (works on
Windows/POSIX), avoids the TOCTOU race, and follows the existing
error-conversion pattern.

Comment on lines 1484 to 1522
// OS Errors:
#[pyexception]
impl PyOSError {
#[cfg(not(target_arch = "wasm32"))]
fn optional_new(args: Vec<PyObjectRef>, vm: &VirtualMachine) -> Option<PyBaseExceptionRef> {
let len = args.len();
if (2..=5).contains(&len) {
let errno = &args[0];
errno
.downcast_ref::<PyInt>()
.and_then(|errno| errno.try_to_primitive::<i32>(vm).ok())
.and_then(|errno| super::errno_to_exc_type(errno, vm))
.and_then(|typ| vm.invoke_exception(typ.to_owned(), args.to_vec()).ok())
impl Constructor for PyOSError {
type Args = FuncArgs;

fn py_new(_cls: &Py<PyType>, args: FuncArgs, vm: &VirtualMachine) -> PyResult<Self> {
let len = args.args.len();
// CPython only sets errno/strerror when args len is 2-5
let (errno, strerror) = if (2..=5).contains(&len) {
(Some(args.args[0].clone()), Some(args.args[1].clone()))
} else {
(None, None)
};
let filename = if (3..=5).contains(&len) {
Some(args.args[2].clone())
} else {
None
}
};
let filename2 = if len == 5 {
args.args.get(4).cloned()
} else {
None
};
// Truncate args for base exception when 3-5 args
let base_args = if (3..=5).contains(&len) {
args.args[..2].to_vec()
} else {
args.args.to_vec()
};
let base_exception = PyBaseException::new(base_args, vm);
Ok(Self {
base: PyException(base_exception),
errno: errno.into(),
strerror: strerror.into(),
filename: filename.into(),
filename2: filename2.into(),
#[cfg(windows)]
winerror: None.into(),
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bug: filename2 arg indexing is wrong on non-Windows (4-arg OSError).
On non-Windows, CPython uses OSError(errno, strerror, filename, filename2) (4 args). Current logic only sets filename2 when len == 5 and uses index 4, so filename2 will be silently dropped for the common 4-arg case.

-            let filename2 = if len == 5 {
-                args.args.get(4).cloned()
-            } else {
-                None
-            };
+            #[cfg(windows)]
+            let filename2 = if len == 5 { args.args.get(4).cloned() } else { None };
+            #[cfg(not(windows))]
+            let filename2 = if len >= 4 { args.args.get(3).cloned() } else { None };
🤖 Prompt for AI Agents
In crates/vm/src/exceptions.rs around lines 1484 to 1522, filename2 is only set
when len == 5 which drops the 4-argument OSError form; change the filename2
assignment to also handle the common 4-arg case by checking for len == 4 (use
args.get(3).cloned()) and keep the len == 5 branch (args.get(4).cloned()), so
filename2 is populated for both 4- and 5-argument OS errors.

Comment on lines 371 to 379
let size = get_size(&tp_str);
let cdata = CDataObject::from_bytes(vec![0u8; size], None);
Ok(PyCSimple {
_base: PyCData::new(cdata.clone()),
_type_: tp_str,
value: AtomicCell::new(vm.ctx.none()),
cdata: rustpython_common::lock::PyRwLock::new(CDataObject::from_bytes(
vec![0u8; size],
None,
)),
cdata: rustpython_common::lock::PyRwLock::new(cdata),
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

cdata.clone() deep-copies the buffer; base/derived will diverge.
If the goal is “one underlying C data backing store”, current wiring creates two independent CDataObjects (because Vec<u8> clones). Anything operating via the base (PyCData) can see different bytes/objects than code operating via the derived field.

Actionable direction: make the backing storage shared (e.g., Arc<PyRwLock<CDataObject>> inside PyCData, and have derived types hold only _base: PyCData and access self._base.cdata), or remove the duplicated cdata field from derived types entirely and standardize on the base field.

🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes.rs around lines 371–379, the code clones the
CDataObject into both PyCData (base) and the derived PyCSimple, creating two
independent buffers; instead make the backing storage shared so base and derived
view the same bytes. Fix by removing the duplicated cdata field from derived
types (or change PyCData to hold an Arc<PyRwLock<CDataObject>> and have derived
types keep only _base: PyCData and access the shared cdata via self._base), and
stop calling .clone() to create separate Vecs; initialize a single CDataObject
wrapped in the shared container and use that same handle for both base and
derived.

Comment on lines 95 to 108
pub fn new_exception(&self, exc_type: PyTypeRef, args: Vec<PyObjectRef>) -> PyBaseExceptionRef {
debug_assert_eq!(
exc_type.slots.basicsize,
std::mem::size_of::<PyBaseException>()
);
// TODO: add repr of args into logging?

PyRef::new_ref(
// TODO: this constructor might be invalid, because multiple
// exception (even builtin ones) are using custom constructors,
// see `OSError` as an example:
PyBaseException::new(args, self),
exc_type,
Some(self.ctx.new_dict()),
)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Update doc/contract for new_exception: it now explicitly assumes PyBaseException payload size.
The new debug_assert_eq!(exc_type.slots.basicsize, size_of::<PyBaseException>()) is a good guardrail; consider tightening the doc comment to say “builtin types whose payload is PyBaseException (not OSError-family payloads)” to prevent accidental debug-only failures.

🤖 Prompt for AI Agents
In crates/vm/src/vm/vm_new.rs around lines 95 to 107, the debug_assert checks
that exc_type.slots.basicsize equals size_of::<PyBaseException>(), but the
public API/docs don't state this payload assumption; update the function
doc/contract to explicitly require callers to pass builtin exception types whose
payload is PyBaseException (i.e., not OSError-family or other sized payloads) so
the debug-only assert cannot be accidentally violated, and mention that callers
must only use types with basicsize == size_of::<PyBaseException>() (or else
construct the exception differently).

Copy link
Collaborator

@ShaharNaveh ShaharNaveh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

@youknowone youknowone merged commit d36a2cf into RustPython:main Dec 14, 2025
12 of 13 checks passed
@youknowone youknowone deleted the new-subclass branch December 14, 2025 10:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants