-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New subclass payload layout #6319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Caution Review failedThe pull request is closed. WalkthroughAdds formal subclass support and base‑payload wiring across derive macros, VM core, exceptions, ctypes, stdlib, and IO: introduces PySubclass and PyRef upcast/into_base, validates pyclass base-field layout, converts many unit-like types to repr(transparent) newtypes or base‑embedded structs, centralizes OS/PyOSError subtype construction, and adds size-aware downcast checks. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Macro as derive-impl (proc-macro)
participant VM as VM / type registry
participant Payload as PyPayload::into_ref_with_type
participant PyRef as runtime PyRef
Macro->>VM: emit type metadata (static_type, payload_type_id, base info) and PySubclass impl for repr(transparent) newtypes
VM->>VM: register types and base/subclass relationships
PyRef->>Payload: into_ref_with_type(obj, cls, exact_class)
alt exact_class == cls
Payload-->>PyRef: create reference via _into_ref
else exact_class is subclass of cls
Payload->>Payload: check size compatibility
alt sizes match
Payload-->>PyRef: create reference via _into_ref
else sizes differ
Payload-->>PyRef: Err(cold_downcast_type_error(vm, class, obj))
end
else not subclass
Payload-->>PyRef: Err(downcast type error)
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120+ minutes Areas to focus during review:
Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro 📒 Files selected for processing (21)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
557b273 to
91975e2
Compare
crates/vm/src/builtins/bool.rs
Outdated
| #[inline] | ||
| fn payload_type_id() -> std::any::TypeId { | ||
| std::any::TypeId::of::<PyInt>() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is okay here, because PyInt and PyBool shares same payload. The only difference is type.
But implementing this makes PyPayload::downcastable to be automatically true for any kind of PyInt. It doesn't make failing tests on this patch because we don't use that pattern. But it obviously will be a sort of trap for future developing.
Another question is that PyPayload::downcastable only takes &PyObject currently. We can't compare it to actual type... because we don't have vm.ctx.
| #[pyclass(name = "bool", module = false, base = PyInt)] | ||
| pub struct PyBool; | ||
| #[repr(transparent)] | ||
| pub struct PyBool(pub PyInt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we have to manually define the base layout to be always leading field.
The problem is Rust actually only guarantees it to be actually same layout only when #[repr(transparent)] is used.
What I didn't check yet: Can using #[repr(C)] helps? Is there any other repr options to keep the first field as same as its original layout?
e312bbf to
8db2660
Compare
ShaharNaveh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks super cool. I am a bit worried that we are not actually using it anywhere ATM so we can't validate that it's not a breaking change.
How difficult would it be to gradually transition to pattern?
afe436d to
4e2edcc
Compare
|
|
||
| #[inline] | ||
| fn validate_downcastable_from(obj: &::rustpython_vm::PyObject) -> bool { | ||
| <Self as ::rustpython_vm::class::PyClassDef>::BASICSIZE <= obj.class().slots.basicsize && obj.class().fast_issubclass(<Self as ::rustpython_vm::class::StaticType>::static_type()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I found another way to do with more tricks. Let's check it with OSError
d2f933e to
84f4c37
Compare
|
Code has been automatically formatted The code in this PR has been formatted using git pull origin new-subclass |
f96cee4 to
86de80e
Compare
|
@ShaharNaveh This is almost done! Could you give a look again? |
|
One question: Passing OSError to new_exception is a runtime error. Can this be checked earlier? |
ShaharNaveh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! I think it's a much needed feature.
My only comments are nitpicks really, not logic related
6efa248 to
ca84ccf
Compare
64a8eea to
6564f23
Compare
6564f23 to
666979e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
crates/vm/src/stdlib/ctypes/array.rs (2)
241-248: Same redundant storage pattern as PyCUnion.
PyCArraystores both_base: PyCDataandcdata: PyRwLock<CDataObject>. The same concern applies here: cloningCDataObjectduplicates the buffer memory and could lead to inconsistencies if modifications are made through one field but not reflected in the other.This pattern appears in all initialization paths (lines 198-204, 307-313, 538-544, 606-612, 665-671, 752-758). Consider consolidating to use a single source of truth for the data.
198-214: Verify thatcdata.baseassignment doesn't cause issues with cloned data.At line 210-211, after creating the instance with cloned
CDataObject, the code setsarray_ref.cdata.write().base = Some(dll). This only modifies thecdatafield'sCDataObject, not the one in_base. If_baseis ever used to access the base reference, it won't have this information.crates/stdlib/src/ssl.rs (1)
3298-3304: Critical: Incorrect return type signature.The
readmethod signature on line 3304 shows-> PyResultwithout a type parameter, but all return paths in the function body (lines 3333-3335, 3375, 3396, 3410, etc.) returnPyObjectRefvalues. This is either a compilation error or a type mismatch.Apply this fix:
fn read( &self, len: OptionalArg<isize>, buffer: OptionalArg<ArgMemoryBuffer>, vm: &VirtualMachine, - ) -> PyResult { + ) -> PyResult<PyObjectRef> {crates/vm/src/stdlib/io.rs (1)
20-74: Remove theunsafe transmuteand avoid panics in error conversion.
unsafe { std::mem::transmute(exc) }(Line 73) is not a safe way to returnPyBaseExceptionRef, andunwrap()during exception construction (Line 70) can panic while already handling an error.Suggested fix (use proper upcast + don’t panic):
@@ - #[allow(clippy::let_and_return)] - let exc = vm.new_errno_error(errno, msg); + let exc = vm.new_errno_error(errno, msg); @@ - exc.as_object() - .set_attr("winerror", vm.new_pyobj(winerror), vm) - .unwrap(); + let _ = exc + .as_object() + .set_attr("winerror", vm.new_pyobj(winerror), vm); } - // FIXME: - unsafe { std::mem::transmute(exc) } + exc.upcast() } }crates/vm/src/exceptions.rs (2)
1551-1601: Avoidunsafe swap(...)+unwrap()inPyOSError::slot_init(panic/UB risk).
zelf.downcast_ref::<PyOSError>().unwrap()(Line 1560) can panic during exception initialization.unsafe { exc.<field>.swap(...) }(Line 1567+) relies on “only called once / fields are None”, which is not a safe invariant (Python can re-run__init__), and may violate whatever safety contractPyAtomicRef::swaphas.Consider switching to checked downcast + safe swap helpers (you already use
swap_to_temporary_refsin setters):- #[allow(deprecated)] - let exc: &Py<PyOSError> = zelf.downcast_ref::<PyOSError>().unwrap(); + #[allow(deprecated)] + let exc: &Py<PyOSError> = zelf + .downcast_ref::<PyOSError>() + .ok_or_else(|| vm.new_type_error("OSError.__init__ called on non-OSError instance"))?; @@ - let _ = unsafe { exc.errno.swap(Some(new_args.args[0].clone())) }; - let _ = unsafe { exc.strerror.swap(Some(new_args.args[1].clone())) }; + exc.errno + .swap_to_temporary_refs(Some(new_args.args[0].clone()), vm); + exc.strerror + .swap_to_temporary_refs(Some(new_args.args[1].clone()), vm); @@ - let _ = unsafe { exc.filename.swap(Some(new_args.args[2].clone())) }; + exc.filename + .swap_to_temporary_refs(Some(new_args.args[2].clone()), vm);
1603-1681:__str__and__reduce__have inconsistent handling ofwinerroron Windows.
__str__displayswinerrorwhen set (lines 1615–1617), but__reduce__doesn't preserve it during pickling. The__init__method acceptswinerroratargs[3](line 1575) and stores it separately, then truncatesargsto 2 elements (line 1597). When__reduce__reconstructs the exception using the truncated args,winerroris lost. A Windows-only pickle round-trip test should verify thatwinerroris intentionally not preserved (matching CPython's default behavior) or add it to__reduce__if preservation is desired.
♻️ Duplicate comments (1)
crates/derive-impl/src/pyclass.rs (1)
718-719: LGTM: Optimization to skip checking for Constructor after found.This change correctly adds an early exit condition to avoid unnecessary iteration once the Constructor trait is found.
🧹 Nitpick comments (5)
crates/vm/src/object/payload.rs (1)
104-119: Error message may be misleading.The error message states "cannot create '{}' from a subclass with a different size '{}'", but
clsis the subclass andexact_classis the base type. The phrasing suggests the relationship is inverted.Consider rephrasing for clarity:
- vm.new_type_error(format!( - "cannot create '{}' from a subclass with a different size '{}'", - cls.name(), - exact_class.name() - )) + vm.new_type_error(format!( + "cannot create '{}' instance: size differs from base type '{}'", + cls.name(), + exact_class.name() + ))crates/vm/src/stdlib/os.rs (1)
528-537: Consider simplifying the error conversion.The current flow creates an error, converts to
PyObjectRef, then downcasts back. This roundabout conversion appears to be a workaround for type constraints. Ifvm.new_errno_error()returns a type compatible withPyResult, the intermediate conversion could potentially be avoided.However, if this pattern is required by the new subclass payload layout to ensure proper type relationships, please add a brief comment explaining why the conversion is necessary.
if key.is_empty() || key.contains(&b'=') { - let x = vm.new_errno_error( + // Note: Convert through PyObjectRef for proper subclass type handling + let x = vm.new_errno_error( 22, format!( "Invalid argument: {}", std::str::from_utf8(key).unwrap_or("<bytes encoding failure>") ), ); - let x: PyObjectRef = x.into(); return Err(x.downcast().unwrap()); }crates/vm/src/stdlib/ctypes.rs (1)
372-378: Consider avoiding the clone ifCDataObjectis large or frequently instantiated.The pattern creates a
CDataObject, then clones it for both_base(viaPyCData::new) andcdata. SinceCDataObjectcontains aVec<u8>buffer, this results in duplicating the buffer allocation. If the intent is for_baseandcdatato share the same underlying data, consider usingArc<PyRwLock<CDataObject>>or similar shared ownership. If independent copies are intentional, this is fine.#!/bin/bash # Check if other ctypes types follow the same clone pattern and whether buffer sizes are typically small rg -n "CDataObject::from_bytes" --type=rs -A2 crates/vm/src/stdlib/ctypes/crates/derive-impl/src/pyclass.rs (1)
363-385: Type matching logic may be overly permissive with last-segment matching.The
type_matches_pathfunction compares types by checking if either the full paths match or if just the last segments match (line 384). The last-segment-only matching could lead to false positives if two different types happen to share the same final identifier (e.g.,foo::Barmatchingbaz::Bar).Consider whether this permissiveness is intentional for macro hygiene reasons, or if a stricter comparison would be safer.
crates/stdlib/src/ssl.rs (1)
1512-1512: Consider using SSLError for certificate store loading failures.Line 1512 uses
vm.new_os_error()for Windows certificate store failures, while other SSL-related errors in this file usenew_os_subtype_errorwithPySSLError::class(). For consistency and to help users distinguish SSL-specific errors, consider using:vm.new_os_subtype_error( PySSLError::class(&vm.ctx).to_owned(), None, "Failed to load certificates from Windows store", ) .upcast()The same applies to line 1536 in the non-Windows path.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (27)
crates/derive-impl/src/pyclass.rs(8 hunks)crates/derive-impl/src/pystructseq.rs(2 hunks)crates/stdlib/src/socket.rs(4 hunks)crates/stdlib/src/ssl.rs(20 hunks)crates/stdlib/src/ssl/compat.rs(5 hunks)crates/vm/src/builtins/bool.rs(4 hunks)crates/vm/src/builtins/int.rs(1 hunks)crates/vm/src/class.rs(1 hunks)crates/vm/src/exception_group.rs(1 hunks)crates/vm/src/exceptions.rs(13 hunks)crates/vm/src/object/core.rs(2 hunks)crates/vm/src/object/payload.rs(3 hunks)crates/vm/src/stdlib/ast/pyast.rs(14 hunks)crates/vm/src/stdlib/ctypes.rs(3 hunks)crates/vm/src/stdlib/ctypes/array.rs(8 hunks)crates/vm/src/stdlib/ctypes/base.rs(4 hunks)crates/vm/src/stdlib/ctypes/field.rs(1 hunks)crates/vm/src/stdlib/ctypes/function.rs(6 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(7 hunks)crates/vm/src/stdlib/ctypes/structure.rs(6 hunks)crates/vm/src/stdlib/ctypes/union.rs(6 hunks)crates/vm/src/stdlib/io.rs(22 hunks)crates/vm/src/stdlib/os.rs(1 hunks)crates/vm/src/types/structseq.rs(1 hunks)crates/vm/src/vm/context.rs(7 hunks)crates/vm/src/vm/vm_new.rs(3 hunks)extra_tests/snippets/stdlib_ctypes.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/field.rscrates/vm/src/class.rscrates/vm/src/types/structseq.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/object/core.rscrates/stdlib/src/ssl/compat.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/object/payload.rscrates/vm/src/stdlib/os.rscrates/stdlib/src/socket.rscrates/derive-impl/src/pyclass.rscrates/vm/src/exception_group.rscrates/vm/src/vm/vm_new.rscrates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/builtins/int.rscrates/derive-impl/src/pystructseq.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/vm/context.rscrates/vm/src/builtins/bool.rscrates/vm/src/stdlib/ctypes/base.rscrates/stdlib/src/ssl.rscrates/vm/src/stdlib/ast/pyast.rscrates/vm/src/exceptions.rscrates/vm/src/stdlib/io.rs
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
extra_tests/snippets/stdlib_ctypes.py
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
crates/vm/src/types/structseq.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/types/structseq.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/stdlib/src/socket.rscrates/derive-impl/src/pyclass.rscrates/vm/src/vm/vm_new.rscrates/derive-impl/src/pystructseq.rscrates/vm/src/vm/context.rscrates/vm/src/builtins/bool.rscrates/vm/src/stdlib/ast/pyast.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.
Applied to files:
crates/vm/src/object/payload.rscrates/vm/src/stdlib/os.rscrates/stdlib/src/socket.rscrates/vm/src/exception_group.rscrates/vm/src/vm/vm_new.rscrates/vm/src/vm/context.rscrates/vm/src/stdlib/ast/pyast.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-06-28T16:31:03.991Z
Learnt from: arihant2math
Repo: RustPython/RustPython PR: 5790
File: build.rs:2-2
Timestamp: 2025-06-28T16:31:03.991Z
Learning: In Cargo build scripts (build.rs), the environment variable CARGO_CFG_TARGET_OS is guaranteed to exist and is automatically set by Cargo during the build process, making unwrap() safe to use when accessing this variable.
Applied to files:
crates/vm/src/stdlib/os.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/stdlib/src/socket.rscrates/stdlib/src/ssl.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Minimize modifications to CPython standard library files in the `Lib/` directory; modifications should be minimal and only to work around RustPython limitations
Applied to files:
crates/vm/src/stdlib/io.rs
🧬 Code graph analysis (14)
crates/vm/src/class.rs (2)
crates/vm/src/exceptions.rs (1)
as_base(1451-1453)crates/vm/src/object/core.rs (1)
Self(371-374)
crates/vm/src/stdlib/ctypes.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
from_bytes(184-191)size(211-213)new(223-227)new(261-267)
crates/stdlib/src/ssl/compat.rs (2)
crates/stdlib/src/ssl.rs (4)
create_ssl_want_read_error(413-419)create_ssl_want_write_error(421-427)create_ssl_zero_return_error(437-443)create_ssl_eof_error(429-435)crates/stdlib/src/socket.rs (1)
timeout_error_msg(2427-2429)
crates/vm/src/stdlib/ctypes/structure.rs (2)
crates/vm/src/stdlib/ctypes/base.rs (4)
from_stg_info(174-181)new(223-227)new(261-267)from_bytes(184-191)crates/vm/src/stdlib/ctypes/util.rs (1)
new(56-67)
crates/vm/src/object/payload.rs (2)
crates/vm/src/object/core.rs (2)
class(668-670)obj(1098-1098)crates/vm/src/object/ext.rs (1)
class(488-490)
crates/vm/src/stdlib/os.rs (1)
crates/vm/src/builtins/type.rs (1)
x(1152-1152)
crates/derive-impl/src/pyclass.rs (3)
crates/vm/src/object/payload.rs (4)
std(31-31)class(54-54)payload_type_id(30-32)validate_downcastable_from(41-43)crates/vm/src/class.rs (1)
as_base(187-187)crates/vm/src/builtins/type.rs (1)
fast_issubclass(629-631)
crates/vm/src/vm/vm_new.rs (1)
crates/vm/src/exceptions.rs (5)
errno(1532-1533)errno(1534-1534)errno(1684-1686)errno_to_exc_type(1163-1189)errno_to_exc_type(1192-1194)
crates/vm/src/stdlib/ctypes/union.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
from_stg_info(174-181)new(223-227)new(261-267)from_bytes(184-191)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/builtins/type.rs (1)
fast_issubclass(629-631)crates/vm/src/class.rs (1)
as_base(187-187)
crates/vm/src/vm/context.rs (2)
crates/vm/src/types/slot.rs (3)
from(1236-1238)heap_type_flags(243-249)default(262-264)crates/vm/src/builtins/type.rs (2)
new_heap(190-213)bases(1070-1087)
crates/vm/src/builtins/bool.rs (1)
crates/vm/src/builtins/int.rs (2)
class(52-54)fmt(34-36)
crates/vm/src/stdlib/ctypes/base.rs (2)
crates/vm/src/stdlib/ctypes/field.rs (1)
new(37-54)crates/vm/src/stdlib/ctypes/function.rs (1)
value(52-52)
crates/vm/src/exceptions.rs (2)
crates/vm/src/object/payload.rs (1)
class(54-54)crates/vm/src/class.rs (1)
as_base(187-187)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run snippets and cpython tests (windows-latest)
🔇 Additional comments (69)
crates/vm/src/types/structseq.rs (1)
192-192: Good improvement:expect()with descriptive message.Replacing
.unwrap()with.expect()and providing a clear error message is a best practice that improves debuggability. The message correctly identifies this as an internal RustPython bug, which helps with triage if this invariant is ever violated.crates/vm/src/exception_group.rs (1)
47-50: LGTM! Wrapper pattern aligns with the new subclass layout.The transition from a unit struct to a
#[repr(transparent)]newtype wrapper aroundPyBaseExceptionis consistent with the broader refactor. This enables proper base-class relationships while maintaining memory layout compatibility for safe casting.crates/vm/src/object/payload.rs (1)
19-26: LGTM! Cold path helper for error generation.The
#[cold]annotation is appropriate for an error path that should rarely execute, helping branch prediction optimization.crates/vm/src/vm/context.rs (5)
35-36: LGTM! Type change aligns with PyBool wrapper pattern.Changing
true_valueandfalse_valuefromPyIntReftoPyRef<PyBool>correctly reflects the new type hierarchy wherePyBoolis a proper wrapper aroundPyInt.
282-285: LGTM! Simplified trait bound.Removing the redundant
PyPayloadbound is correct sincePyObjectPayloadalready requires it (as shown at line 141).
306-307: LGTM! Boolean singleton creation with wrapper pattern.Creating
PyBool(PyInt::from(1))andPyBool(PyInt::from(0))correctly establishes the boolean singletons using the new wrapper structure.
450-456: LGTM! Return type correctly updated.The method now returns
PyRef<PyBool>matching the updated field types, maintaining the singleton pattern for boolean values.
510-520: No changes needed — basicsize: 0 is appropriate for heap exception types.The explicit
basicsize: 0in thePyTypeSlotsis correctly handled byPyType::new_heap_inner, which automatically inherits the base class basicsize when it's set to 0 (lines 327-329 incrates/vm/src/builtins/type.rs). This ensures that by the timeinto_ref_with_typeis called, both the exact exception class and the dynamically created exception type have matching basicsize values, preventing any size mismatch errors. This design is safe and consistent with how other heap types (likeExceptionGroupinexception_group.rs) are created.crates/vm/src/stdlib/ast/pyast.rs (2)
6-79: LGTM! Macro updated for wrapper-based AST nodes.The
impl_nodemacro correctly generates#[repr(transparent)]wrapper structs with the base type, aligning with the new subclass payload layout. The pattern matching and recursive macro invocations are consistent across all arms.
81-82: LGTM! Base AST node types converted to wrapper pattern.All intermediate AST base types (
NodeMod,NodeStmt,NodeExpr, etc.) correctly use#[repr(transparent)]wrappers aroundNodeAst, establishing the proper type hierarchy for AST nodes.crates/vm/src/builtins/int.rs (1)
256-263: LGTM! Proper use of upcast for bool subclass.The
.upcast()call correctly convertsPyRef<PyBool>toPyRef<PyInt>, leveraging the newPySubclasstrait to maintain type safety while allowingwith_valueto return a consistentPyRef<Self>type.crates/vm/src/class.rs (1)
174-188: Clean trait design for subclass relationships.The
PySubclasstrait provides a clear abstraction for base-class access. The documentation accurately describes the auto-implementation by#[pyclass]and the#[repr(transparent)]requirement for ownership transfer.crates/derive-impl/src/pystructseq.rs (1)
448-513: Well-structured newtype wrapper implementation.The generated code correctly:
- Uses
#[repr(transparent)]for safe memory layout- Delegates
payload_type_id()toPyTuplefor proper type identification- Implements
validate_downcastable_fromusingfast_issubclassfor correct subtype checking- Provides
PySubclassimplementation for base accessThis aligns well with the PR's subclass payload layout goals.
crates/stdlib/src/ssl/compat.rs (2)
225-256: Consistent error construction with OS subtype pattern.The changes correctly use
vm.new_os_subtype_error()followed by.upcast()to returnPyBaseExceptionRef. This pattern is consistently applied across all SSL error creation functions and aligns with the PR's exception handling refactoring.
546-561: Error conversion paths properly upcast to base exception type.All
SslErrorvariants that create OS-subtype errors (WantRead,WantWrite,Timeout,ZeroReturn,Eof) correctly call.upcast()to convert fromPyRef<PyOSError>toPyBaseExceptionRef.crates/vm/src/stdlib/ctypes/pointer.rs (2)
12-14: Proper repr(transparent) wrapper for type object.The
#[repr(transparent)]attribute onPyCPointerType(PyType)ensures correct memory layout for the type wrapper pattern used throughout this PR.
64-68: No action needed—empty base initialization is correct for pointer types.Pointer types intentionally use
CDataObject::from_bytes(vec![], None)because they store their data in a separatecontentsfield rather than in the CData buffer. In contrast, structures and unions (which store binary field data) usefrom_stg_infowith properly sized buffers. This design is consistent across all pointer-like types (PyCPointer,PyCFuncPtr) and is architecturally sound.crates/vm/src/stdlib/ctypes/field.rs (1)
10-16: The removal ofPyPayloadderive fromPyCFieldTypeis correct.PyCFieldTypeis a metaclass (type) for field descriptors, not a payload-bearing instance class. Metaclass types with#[pyclass(base = PyType)]do not requirePyPayloadimplementation. The actual field descriptor classPyCFieldcorrectly derivesPyPayload. This pattern is consistent across all ctypes metaclasses (PyCStructType,PyCPointerType,PyCArrayType, etc.), which also usebase = PyTypewithoutPyPayloadderive.crates/vm/src/stdlib/ctypes.rs (2)
18-18: LGTM: Public export expanded correctly.Adding
CDataObjectto the public exports is consistent with the new base pattern introduced across ctypes types.
50-50: LGTM: Import aligned with construction changes.The expanded import correctly brings in
CDataObjectandPyCDatafor use innew_simple_type.crates/vm/src/vm/vm_new.rs (4)
96-99: Debug assertion may fail for user-defined exception types.The
debug_assert_eq!checks thatexc_type.slots.basicsizeequalssize_of::<PyBaseException>(). Per the docstring, this function "should only be used with builtin exception types," but the assertion will panic in debug builds if misused. This is good for catching errors early, though consider adding a comment clarifying why this assertion exists.
109-141: LGTM: Well-structured OS error construction.The new
new_os_errorandnew_os_subtype_errormethods properly handle OSError construction with optional errno. The nestednew_os_subtype_error_implfunction keeps the implementation contained, though it could be a private method if reused elsewhere.One minor note: the
.expect("new_os_error usage error")calls will panic on construction failure. This is acceptable for internal API misuse, but ensure callers pass valid exception types.
120-120: Debug assertion validates PyOSError construction.This assertion ensures
exc_typehas the correct basicsize forPyOSError, which is essential for the new subclass payload layout. Good defensive programming.
259-264: LGTM: Errno-based error construction correctly delegates.The updated
new_errno_errorproperly maps errno to a specific exception subtype viaerrno_to_exc_type, falling back toos_errorif no mapping exists, then delegates tonew_os_subtype_error.crates/vm/src/stdlib/ctypes/base.rs (4)
222-228: LGTM: Simple and effective constructor.The
PyCData::newconstructor provides a clean way to wrap aCDataObjectin aPyRwLock, supporting the new base pattern across ctypes types. Based on coding guidelines, this follows the macro system pattern appropriately.
424-428: New_basefield establishes base-class relationship.Adding
_base: PyCDataalongsidecdata: PyRwLock<CDataObject>is consistent with the PR's goal of formalizing subclass-base relationships. Thepyclassdeclaration correctly specifiesbase = PyCData.
659-664: Same clone pattern as other ctypes types.The constructor creates a
CDataObjectand clones it for both_baseandcdatafields. This is consistent with the pattern innew_simple_typeand other ctypes types. The duplication concern raised earlier applies here as well.
238-241: The#[repr(transparent)]usage is correct and follows the established pattern.All other metaclass types in ctypes (
PyCStructType,PyCPointerType,PyCUnionType,PyCArrayType) use the same#[repr(transparent)]attribute with identical structure. This confirms thatPyCSimpleTypecorrectly maintains memory layout compatibility withPyTypeand is consistent with the codebase's metaclass design.crates/vm/src/stdlib/ctypes/function.rs (3)
7-7: LGTM: Import added for CDataObject usage.The import is required for initializing the
_basefield inPyCFuncPtrconstructors.
164-176: LGTM:_basefield added to PyCFuncPtr.The new
_base: PyCDatafield is correctly placed at the beginning of the struct, consistent with the base-class pattern across ctypes types. The struct maintains all existing fields.
197-197: Consistent_baseinitialization across all constructor paths.All four constructor branches (empty args, integer address, tuple form, callable form) correctly initialize
_basewithPyCData::new(CDataObject::from_bytes(vec![], None)). Using an empty buffer is appropriate for function pointers since they don't store data like other ctypes types.crates/vm/src/stdlib/ctypes/structure.rs (6)
18-20: Consistent transparent wrapper pattern for metaclass.
PyCStructType(PyType)with#[repr(transparent)]follows the same pattern asPyCSimpleType, ensuring consistent metaclass handling across ctypes types.
222-229: LGTM:_basefield added to PyCStructure.The structure correctly adds
_base: PyCDatawhile maintaining the existingcdataandfieldsfields. This aligns with the PR's goal of enabling more specific subclass-typed function parameters.
299-304: Consistent initialization pattern in slot_new.Creates
CDataObjectfromStgInfo, then clones for both_baseandcdata. Consistent with other ctypes types.
370-377: LGTM: from_address correctly updated.The
from_addressmethod properly initializes both_baseandcdatafrom the sameCDataObject.
423-430: Good: from_buffer preserves source reference.The
from_buffermethod correctly passesSome(source)toCDataObject::from_bytes, which stores the reference inobjectsto keep the source buffer alive. This is important for memory safety when the structure references external buffer data.
468-475: LGTM: from_buffer_copy creates independent copy.Unlike
from_buffer,from_buffer_copycorrectly passesNonesince it creates an independent copy of the data, not a reference to the source buffer.crates/derive-impl/src/pyclass.rs (5)
308-361: Well-structured base field validation.The
validate_base_fieldfunction properly validates that when a base class is specified, the struct has the base type as its first field. This is essential for#[repr(transparent)]to work correctly and ensures proper memory layout for subclassing.One minor observation: the error messages are clear and informative, providing the expected base type name.
431-449: LGTM: repr(transparent) detection and basicsize computation.The logic correctly detects
#[repr(transparent)]and setsbasicsizeto 0 when the type has a base class with transparent representation, ensuring the type shares the same memory layout as its base.
611-652: PyPayload generation for base types looks correct.The conditional logic properly handles:
- Types with a base: forwards
payload_type_idto the base and addsvalidate_downcastable_fromcheck- Types without a base but with a ctx type: generates simple class resolution
- Types without base or ctx: no additional impl needed
The
#[allow(clippy::collapsible_else_if)]is appropriate here for readability.
1212-1213: Clarifying comment for cfg matching is helpful.The added comment explains that getter and setter must have matching
#[cfg]attributes since they're paired by the map key(name, cfgs).
485-502: Remove the comment addition and close as not an issue.The generated
PySubclassimpl correctly assumes a tuple struct layout. The impl is only generated whenis_repr_transparentis true, and Rust's#[repr(transparent)]attribute guarantees exactly one field. This field must be the base type (validated byvalidate_base_field), so accessing it via&self.0is always correct, whether the struct uses named or unnamed field syntax. Thevalidate_base_fieldfunction handles both cases to provide flexible validation, but the actualPySubclassimpl generation is safely gated by theis_repr_transparentcheck, which ensures the memory layout assumption holds.crates/vm/src/builtins/bool.rs (3)
21-34: TryFromBorrowedObject accepts any integer as a boolean.The implementation now downcasts to
PyIntand treats any non-zero integer astrue. This differs from strict boolean checking and aligns with Python's truthy semantics for integers. However, this meansbool.try_from_borrowed_objectwill accept plain integers, not just booleans.Verify this is the intended behavior - if callers expect only actual
boolobjects, this could be a semantic change.
88-96: LGTM: PyBool as transparent wrapper around PyInt.The
#[repr(transparent)]ensuresPyBoolhas the same memory layout asPyInt, which is correct for the subclass pattern. Thectx = "bool_type"andbase = PyIntattributes properly wire up the class hierarchy.The Debug implementation correctly extracts the boolean value from the inner
PyInt.
222-228: No action needed—all call sites include proper type checks.The
get_valuefunction is safe despite the.unwrap(). All callers verify the object is aPyBoolbefore calling: external callers inpy_serde.rs(line 70) andjit.rs(line 141) usefast_isinstanceorcls.is()checks, and internal calls withinbool.rsoccur in bool-specific methods protected by type guards (e.g., lines 136–137 in__or__checkfast_isinstance(vm.ctx.types.bool_type)for both operands). The function is pub(crate) scoped and never receives non-PyBool objects in practice.Likely an incorrect or invalid review comment.
crates/vm/src/stdlib/ctypes/union.rs (1)
14-18: LGTM: PyCUnionType as transparent wrapper around PyType.The metaclass is correctly defined as a transparent wrapper, consistent with the new subclass pattern.
crates/stdlib/src/socket.rs (4)
18-18: LGTM: PyOSError added to imports.The import aligns with the new error handling pattern that uses
PyOSErrordirectly.
1829-1833: LGTM: From implementation for PyRef.The implementation correctly upcasts
PyRef<PyOSError>toPyBaseExceptionReffor integration with theIoOrPyExceptionenum.
1852-1852: LGTM: Timeout error upcast.The
.upcast()call correctly convertsPyRef<PyOSError>to the expectedPyBaseExceptionReftype.
2420-2428: LGTM: OS subtype error creation and timeout functions.The changes consistently use
vm.new_os_subtype_errorto create properly typed OS exceptions. The function signatures now returnPyRef<PyOSError>, which provides better type safety.crates/vm/src/stdlib/ctypes/array.rs (2)
26-28: LGTM: PyCArrayType as transparent wrapper around PyType.Consistent with the new subclass pattern used throughout the codebase.
162-172: Verify unsafe raw pointer read from symbol address.The code reads data from a symbol address obtained from a shared library. While there's a null check, verify that:
- The symbol address points to valid readable memory of the expected size
- The symbol data remains valid for the lifetime of the
PyCArrayinstance (the dll reference helps, but data could still be modified)crates/stdlib/src/ssl.rs (4)
346-410: LGTM: Exception type restructuring follows newtype wrapper pattern correctly.The transparent newtype wrappers around base exception types (PySSLError wrapping PyOSError, subclasses wrapping PySSLError) align with the PR's goal of enabling more specific subclass-typed parameters. The
__str__implementation correctly preserves OSError-compatible behavior by checking forstrerrorattribute first.
1236-1242: LGTM: Error construction follows OS-subtype pattern.The password decryption error correctly uses
new_os_subtype_errorwithNoneerrno (since this is an SSL/crypto error, not a system error) and callsupcast()to convert to the expected exception type.
1870-1880: LGTM: FileNotFoundError construction is correct.The error correctly sets errno to 2 (ENOENT), creates the FileNotFoundError subtype, sets the filename attribute, and calls upcast(). This matches CPython's FileNotFoundError behavior.
3347-3353: LGTM: Error construction in read/write operations is consistent.The shutdown, EOF, and operation errors correctly use the OS-subtype error pattern with appropriate errno values (None for protocol errors) and descriptive messages. The use of helper functions with
.upcast()properly converts between exception types.Also applies to: 3400-3406, 3478-3484
crates/vm/src/stdlib/io.rs (7)
83-98: Module wiring via_io::unsupported_operation()looks fine.
Pulling the type via_io::unsupported_operation()and exporting it (Line 91-95) is a nice cleanup over ad-hoc statics.
148-156: Import tweak is fine.
Switching toPyBool(Line 152) matches the new more-specific return types in this PR.
205-208:new_unsupported_operation()integration looks good.
Creating an OSError subtype with the centralized UnsupportedOperation type (Line 205-208) is consistent with the refactor direction.
426-462:_IOBase.__closedtype change is OK; just ensure it matches Python-visible expectations.
ReturningPyRef<PyBool>(Line 459-461) aligns with the “more specific subclass-typed parameters” goal; worth running stdlib_iobehavioral tests aroundclosed/__closedsince this is a core attribute.
640-645: Transparent base wrappers are a good approach for the new layout.
The#[repr(transparent)]wrappers for_RawIOBase,_BufferedIOBase,_TextIOBaseplusDefault(Line 640-645, 700-705, 764-770) look consistent with the subclass payload embedding strategy.Also applies to: 700-705, 764-770
1766-2032: Embedded_basefields: looks consistent; please clippy/test for any accidental default-constructed “uninitialized” states.
The_base: <BaseType>pattern across buffered/text/raw/file IO types (Line 1766-2032, 2376-2383, 3657-3695, 3802-3827, 4441-4474) matches the PR objective. Main risk is behavior of anyDefault/DefaultConstructorpaths that could create instances without the expected init invariants (e.g., mutex state, “uninitialized object” errors) — worth validating constructors and__init__flows still gate operations properly.Also applies to: 2376-2383, 3657-3695, 3802-3827, 4441-4474
4246-4268:unsupported_operation()is correctly implemented. RustPython is a single-context architecture where all VirtualMachine instances share the same globalContext::genesis(). The static cell pattern here is safe and intentional—there is no scenario with multiple independent contexts. The concern about multi-context support does not apply to this codebase.crates/vm/src/exceptions.rs (6)
966-969:extend_exception!(PyOSError, ...)change is fine.
Removing dynamic attributes in favor of struct fields + getsets matches the refactor.
1196-1210: Types-module import adjustments look fine.
No concerns with the added imports to support the newPyOSErrorimplementation.
1235-1437:#[repr(transparent)]wrapper migration looks consistent; just ensure subclass/base expectations are tested.
The broad conversion of exception types into transparent wrappers matches the new subclass payload design and should enable more precise typed refs.Also applies to: 1736-2202
1682-1734: Field getsets look good; nice use ofswap_to_temporary_refs.
The getters/setters forerrno/strerror/filename/filename2(/winerror)are straightforward and align with the new field-backed model.
1-341: Process: please run formatting + clippy on this PR.
Given the scale ofunsafeand layout changes, it’s worth ensuringcargo fmt+cargo clippyare clean and that exception/stdlib test suites cover the new constructor + pickle/GC paths.
1487-1521: Non-Windowsfilename2handling appears incorrect (len==4 should populatefilename2).In
py_new,filename2is only set whenlen == 5(Line 1500-1504). On non-Windows, the 4th positional arg is typicallyfilename2, so this loses data. Similar issue inslot_initwherefilename2is only set onlen == 5(Line 1591-1593).Suggested fix (platform-specific indexing):
@@ - let filename2 = if len == 5 { - args.args.get(4).cloned() - } else { - None - }; + #[cfg(windows)] + let filename2 = if len == 5 { args.args.get(4).cloned() } else { None }; + #[cfg(not(windows))] + let filename2 = if len == 4 { args.args.get(3).cloned() } else { None }; @@ - if len == 5 { - let _ = unsafe { exc.filename2.swap(new_args.args.get(4).cloned()) }; - } + #[cfg(windows)] + if len == 5 { + exc.filename2 + .swap_to_temporary_refs(new_args.args.get(4).cloned(), vm); + } + #[cfg(not(windows))] + if len == 4 { + exc.filename2 + .swap_to_temporary_refs(new_args.args.get(3).cloned(), vm); + }Also applies to: 1551-1601
⛔ Skipped due to learnings
Learnt from: youknowone Repo: RustPython/RustPython PR: 6358 File: crates/vm/src/exception_group.rs:173-185 Timestamp: 2025-12-09T08:46:58.660Z Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.
| impl<T: crate::class::PySubclass + std::fmt::Debug> PyRef<T> | ||
| where | ||
| T::Base: std::fmt::Debug, | ||
| { | ||
| /// Converts this reference to the base type (ownership transfer). | ||
| /// # Safety | ||
| /// T and T::Base must have compatible layouts in size_of::<T::Base>() bytes. | ||
| #[inline] | ||
| pub fn into_base(self) -> PyRef<T::Base> { | ||
| let obj: PyObjectRef = self.into(); | ||
| match obj.downcast() { | ||
| Ok(base_ref) => base_ref, | ||
| Err(_) => unsafe { std::hint::unreachable_unchecked() }, | ||
| } | ||
| } | ||
| #[inline] | ||
| pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U> | ||
| where | ||
| T: StaticType, | ||
| { | ||
| debug_assert!(T::static_type().is_subtype(U::static_type())); | ||
| let obj: PyObjectRef = self.into(); | ||
| match obj.downcast::<U>() { | ||
| Ok(upcast_ref) => upcast_ref, | ||
| Err(_) => unsafe { std::hint::unreachable_unchecked() }, | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsafe unreachable_unchecked could cause UB if invariants are violated.
Both into_base() and upcast() use unsafe { std::hint::unreachable_unchecked() } on downcast failure. While the type system constraints should prevent misuse, if invariants are violated (e.g., due to macro bugs or incorrect implementations), this will cause undefined behavior rather than a clear panic.
Consider using unreachable!() with a descriptive message for safer error handling, or at minimum document why this is safe:
pub fn into_base(self) -> PyRef<T::Base> {
let obj: PyObjectRef = self.into();
match obj.downcast() {
Ok(base_ref) => base_ref,
- Err(_) => unsafe { std::hint::unreachable_unchecked() },
+ Err(_) => unreachable!("PySubclass invariant violated: into_base downcast failed"),
}
}
#[inline]
pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U>
where
T: StaticType,
{
debug_assert!(T::static_type().is_subtype(U::static_type()));
let obj: PyObjectRef = self.into();
match obj.downcast::<U>() {
Ok(upcast_ref) => upcast_ref,
- Err(_) => unsafe { std::hint::unreachable_unchecked() },
+ Err(_) => unreachable!("PySubclass invariant violated: upcast to {} failed", std::any::type_name::<U>()),
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| impl<T: crate::class::PySubclass + std::fmt::Debug> PyRef<T> | |
| where | |
| T::Base: std::fmt::Debug, | |
| { | |
| /// Converts this reference to the base type (ownership transfer). | |
| /// # Safety | |
| /// T and T::Base must have compatible layouts in size_of::<T::Base>() bytes. | |
| #[inline] | |
| pub fn into_base(self) -> PyRef<T::Base> { | |
| let obj: PyObjectRef = self.into(); | |
| match obj.downcast() { | |
| Ok(base_ref) => base_ref, | |
| Err(_) => unsafe { std::hint::unreachable_unchecked() }, | |
| } | |
| } | |
| #[inline] | |
| pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U> | |
| where | |
| T: StaticType, | |
| { | |
| debug_assert!(T::static_type().is_subtype(U::static_type())); | |
| let obj: PyObjectRef = self.into(); | |
| match obj.downcast::<U>() { | |
| Ok(upcast_ref) => upcast_ref, | |
| Err(_) => unsafe { std::hint::unreachable_unchecked() }, | |
| } | |
| } | |
| } | |
| impl<T: crate::class::PySubclass + std::fmt::Debug> PyRef<T> | |
| where | |
| T::Base: std::fmt::Debug, | |
| { | |
| /// Converts this reference to the base type (ownership transfer). | |
| /// # Safety | |
| /// T and T::Base must have compatible layouts in size_of::<T::Base>() bytes. | |
| #[inline] | |
| pub fn into_base(self) -> PyRef<T::Base> { | |
| let obj: PyObjectRef = self.into(); | |
| match obj.downcast() { | |
| Ok(base_ref) => base_ref, | |
| Err(_) => unreachable!("PySubclass invariant violated: into_base downcast failed"), | |
| } | |
| } | |
| #[inline] | |
| pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U> | |
| where | |
| T: StaticType, | |
| { | |
| debug_assert!(T::static_type().is_subtype(U::static_type())); | |
| let obj: PyObjectRef = self.into(); | |
| match obj.downcast::<U>() { | |
| Ok(upcast_ref) => upcast_ref, | |
| Err(_) => unreachable!("PySubclass invariant violated: upcast to {} failed", std::any::type_name::<U>()), | |
| } | |
| } | |
| } |
🤖 Prompt for AI Agents
In crates/vm/src/object/core.rs around lines 1076 to 1103, the methods
into_base() and upcast() call unsafe { std::hint::unreachable_unchecked() } when
a downcast fails, which can produce undefined behavior if invariants are ever
violated; replace the unreachable_unchecked() calls with a safe panic (e.g.,
unreachable!() or panic! with a descriptive message) so a violation yields a
clear panic and backtrace rather than UB, and add a short comment above each
branch documenting the invariant that guarantees the downcast should succeed (or
keep a debug_assert verifying the subtype relation before the downcast).
| #[pyclass(module = "_ctypes", name = "Union", base = PyCData, metaclass = "PyCUnionType")] | ||
| #[derive(PyPayload)] | ||
| pub struct PyCUnion { | ||
| _base: PyCData, | ||
| /// Common CDataObject for memory buffer | ||
| pub(super) cdata: PyRwLock<CDataObject>, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant storage: _base and cdata hold cloned copies of the same CDataObject.
The struct stores both _base: PyCData and cdata: PyRwLock<CDataObject>. Looking at the initialization code (lines 177-180), CDataObject is cloned and stored in both fields. Since CDataObject contains a Vec<u8> buffer, this duplicates the memory allocation.
This pattern is repeated in from_address, from_buffer, and from_buffer_copy. Consider whether _base should hold the canonical data and cdata should be removed, or if they should share a reference.
pub struct PyCUnion {
- _base: PyCData,
- /// Common CDataObject for memory buffer
- pub(super) cdata: PyRwLock<CDataObject>,
+ /// Base PyCData containing the memory buffer
+ _base: PyCData,
}If both are needed, consider having cdata be a reference or accessor to _base's data rather than a clone.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/union.rs around lines 124 to 129, the struct
stores duplicated CDataObject data in both `_base: PyCData` and `cdata:
PyRwLock<CDataObject>`, causing cloned Vec<u8> buffers; fix by consolidating
storage: either remove `cdata` and use `_base` as the canonical CDataObject
everywhere, or make `cdata` a non-owning accessor to `_base` (e.g., point to
`_base`'s inner CDataObject via a shared reference/Arc or expose RwLock on
`_base`), then update the constructors
(from_address/from_buffer/from_buffer_copy) to stop cloning the buffer into two
places and adjust any methods to access the single canonical storage; ensure
thread-safety by using a shared lock or reference type consistently and update
types/signatures where needed.
| let cdata = CDataObject::from_stg_info(&stg_info); | ||
| PyCUnion { | ||
| cdata: PyRwLock::new(CDataObject::from_stg_info(&stg_info)), | ||
| _base: PyCData::new(cdata.clone()), | ||
| cdata: PyRwLock::new(cdata), | ||
| } | ||
| .into_ref_with_type(vm, cls) | ||
| .map(Into::into) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CDataObject is cloned unnecessarily.
The code creates a CDataObject, clones it for _base, and then wraps the original in cdata. This means two separate Vec<u8> buffers exist with identical content. If one is modified, the other won't reflect the change, which could lead to subtle bugs.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/union.rs around lines 177 to 183, the code
constructs a CDataObject and clones it so two independent buffers exist; instead
share the same underlying CDataObject to avoid duplicate Vec<u8> allocations and
divergent state. Change the code to create one shared CDataObject wrapped in a
reference-counted pointer (e.g., Arc or Rc as appropriate for the threading
model) and pass clones of that Arc to both PyCData::new and PyRwLock::new so
both fields reference the same buffer rather than performing a deep clone.
| let tuple = | ||
| <Self::Data as ::rustpython_vm::types::PyStructSequenceData>::into_tuple(data, vm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check how PyStructSequenceData is organized in the types module
echo "=== Checking types module structure ==="
fd -t f "mod.rs" crates/vm/src/types/ -exec cat {} \;
echo ""
echo "=== Checking for PyStructSequenceData definition and re-exports ==="
rg -n "trait PyStructSequenceData|pub use.*PyStructSequenceData" crates/vm/src/types/ --type rust
echo ""
echo "=== Check the current file structseq.rs for trait definition ==="
rg -n "trait PyStructSequenceData|pub.*struct.*Data" crates/vm/src/types/structseq.rs --type rustRepository: RustPython/RustPython
Length of output: 635
🏁 Script executed:
#!/bin/bash
# Check the context around lines 187-188 to understand the change
echo "=== Context around lines 187-188 in structseq.rs ==="
sed -n '180,200p' crates/vm/src/types/structseq.rs
echo ""
echo "=== Check what Data is bound to (look for the full method signature) ==="
sed -n '175,195p' crates/vm/src/types/structseq.rsRepository: RustPython/RustPython
Length of output: 1939
Remove the unnecessary fully qualified trait path; use simpler syntax.
The trait PyStructSequenceData is defined in this file and re-exported in crates/vm/src/types/mod.rs, so it is already in scope. The fully qualified syntax <Self::Data as ::rustpython_vm::types::PyStructSequenceData>::into_tuple(data, vm) is unnecessarily verbose.
Use simpler syntax instead: Self::Data::into_tuple(data, vm) or just rely on method call syntax data.into_tuple(vm).
🤖 Prompt for AI Agents
In crates/vm/src/types/structseq.rs around lines 187 to 188, the code uses a
fully qualified trait path `<Self::Data as
::rustpython_vm::types::PyStructSequenceData>::into_tuple(data, vm)` which is
unnecessarily verbose; replace it with the simpler form
`Self::Data::into_tuple(data, vm)` or call the method directly as
`data.into_tuple(vm)` since PyStructSequenceData is already in scope via the
module re-export; update that line accordingly to remove the fully qualified
path.
|
|
||
| print("done") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# First, find the snippet runner code and output handling
echo "=== Searching for snippet runner code ==="
rg -n "extra_tests.*snippets|snippets.*runner" --hidden -g '!target' . | head -20
echo -e "\n=== Searching for references to stdlib_ctypes ==="
rg -n "stdlib_ctypes" . --hidden -g '!target' | head -20
echo -e "\n=== Looking for extra_tests directory structure ==="
find . -path './target' -prune -o -type f -name "*.py" -path "*extra_tests*" -print | head -20Repository: RustPython/RustPython
Length of output: 1153
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for snippet harness/runner configuration
echo "=== Searching for snippet runner or harness code ==="
fd -H "snippet" --type f -e py -e sh -e toml -e yaml . | grep -v target | head -30
echo -e "\n=== Checking extra_tests directory ==="
ls -la extra_tests/ 2>/dev/null | head -20
echo -e "\n=== Looking for expected output or snapshot files ==="
find extra_tests -type f \( -name "*.expected" -o -name "*.out" -o -name "*.txt" -o -name "*snapshot*" \) 2>/dev/null | head -20Repository: RustPython/RustPython
Length of output: 945
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Read the file around lines 400-401
echo "=== Context around lines 400-401 in stdlib_ctypes.py ==="
if [ -f "extra_tests/snippets/stdlib_ctypes.py" ]; then
wc -l extra_tests/snippets/stdlib_ctypes.py
echo ""
# Show context: lines 390-410
sed -n '390,410p' extra_tests/snippets/stdlib_ctypes.py | cat -n
else
echo "File not found"
fi
echo -e "\n=== Searching for how extra_tests snippets are run ==="
rg -n "import|exec|subprocess|test" -A 3 -B 3 "extra_tests" . --hidden -g '!target' -g '!*.pyc' | head -50Repository: RustPython/RustPython
Length of output: 2790
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there's a test runner script or CI configuration
echo "=== Looking for test runner scripts ==="
find . -path './target' -prune -o -type f \( -name "*.py" -o -name "*.sh" \) -print | xargs grep -l "extra_tests.*snippets\|snippets.*extra_tests" 2>/dev/null | head -20
echo -e "\n=== Checking CI configuration ==="
ls -la .github/ 2>/dev/null | grep -E "\.yml|\.yaml"
echo -e "\n=== Checking for Makefile or build scripts ==="
ls -la | grep -E "Makefile|makefile|build|test"Repository: RustPython/RustPython
Length of output: 405
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# First check the actual file
echo "=== Reading stdlib_ctypes.py file size and end ==="
if [ -f "extra_tests/snippets/stdlib_ctypes.py" ]; then
echo "File exists"
tail -50 extra_tests/snippets/stdlib_ctypes.py | cat -n
else
echo "File not found at that path"
# Try to find it
find . -name "stdlib_ctypes.py" -path "*extra_tests*" 2>/dev/null
fi
echo -e "\n=== Searching for how snippets are invoked ==="
rg -n "\.py.*test|exec|subprocess" extra_tests/ -g '!*.pyc' -g '!target' 2>/dev/null | head -30Repository: RustPython/RustPython
Length of output: 4671
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for Python files that might be test runners or invoke extra_tests
echo "=== Searching for test runner files ==="
find . -path ./target -prune -o -type f -name "*.py" -print | xargs grep -l "extra_tests" 2>/dev/null
echo -e "\n=== Checking if there's a conftest or pytest config ==="
find . -path ./target -prune -o -type f \( -name "conftest.py" -o -name "pytest.ini" -o -name "pyproject.toml" \) -print
echo -e "\n=== Looking at the actual file to understand context ==="
head -1 extra_tests/snippets/stdlib_ctypes.py 2>/dev/null || echo "File not found"
tail -20 extra_tests/snippets/stdlib_ctypes.py 2>/dev/null | cat -nRepository: RustPython/RustPython
Length of output: 1250
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for any documentation or readme
echo "=== Checking for documentation ==="
find . -path ./target -prune -o -type f \( -name "README*" -o -name "*.md" \) -path "*extra_tests*" -print
# Look for Rust code that might handle snippet execution
echo -e "\n=== Searching for Rust test/snippet handling ==="
find . -path ./target -prune -o -type f -name "*.rs" -print | xargs grep -l "extra_tests\|snippet" 2>/dev/null | head -10Repository: RustPython/RustPython
Length of output: 212
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# List the extra_tests directory
echo "=== Extra tests directory structure ==="
ls -la extra_tests/ 2>/dev/null
echo -e "\n=== Snippets directory ==="
ls -la extra_tests/snippets/ 2>/dev/null | head -30
echo -e "\n=== Check if stdlib_ctypes.py exists ==="
if [ -f "extra_tests/snippets/stdlib_ctypes.py" ]; then
echo "File exists"
echo "File size: $(wc -l < extra_tests/snippets/stdlib_ctypes.py) lines"
echo "Last 10 lines:"
tail -10 extra_tests/snippets/stdlib_ctypes.py
else
echo "File does not exist at expected location"
find . -name "stdlib_ctypes.py" 2>/dev/null
fiRepository: RustPython/RustPython
Length of output: 2704
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Try to find any test-related configuration or runner
find . -type f \( -name "*.toml" -o -name "*.yaml" -o -name "*.yml" -o -name "*.cfg" \) ! -path "*/target/*" | xargs grep -l "test\|snippet" 2>/dev/null | head -10Repository: RustPython/RustPython
Length of output: 363
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check git to see recent changes to stdlib_ctypes.py
echo "=== Git status and recent changes ==="
git status 2>/dev/null | head -30 || echo "Git not available in this view"
# Try to get the actual file listing to verify directory exists
echo -e "\n=== Checking directory structure directly ==="
test -d "extra_tests" && echo "extra_tests exists" || echo "extra_tests not found"
test -d "extra_tests/snippets" && echo "extra_tests/snippets exists" || echo "extra_tests/snippets not found"
# List any Python files in extra_tests
echo -e "\n=== Python files in extra_tests ==="
find extra_tests -name "*.py" 2>/dev/null | head -20 || echo "Could not find extra_tests"Repository: RustPython/RustPython
Length of output: 1066
🌐 Web query:
RustPython extra_tests snippets stdout handling
💡 Result:
Brief summary and tips
-
Rust test harness (and many Rust test runners) capture stdout of passing tests by default; use --nocapture / -- --show-output to see printed output. This explains why println!/stdout may be hidden during tests. [1]
-
RustPython’s extra_tests/snippets are Python tests included in the repo; they run under RustPython’s test-execution infrastructure and can be affected by the Rust-side test runner’s output-capture. See the RustPython contribution guide for running/unskipping CPython unittest files. [2]
-
If you need deterministic capture behavior when running tests with nextest/other runners, configure the runner’s capture strategy (Split, Combined, None) or pass the runner flag to disable capture. For nextest, CaptureStrategy controls stdout/stderr capture. [3]
Practical actions
- Run cargo test with the test-runner flag to disable capture: cargo test -- --nocapture (or cargo nextest run with capture strategy None / appropriate flag for your runner). [1][3]
- If you want output from failing Python snippets only, rely on the default behavior (captured output is printed when a test fails) or explicitly write to stderr from the snippet so it’s more likely to appear. [1]
- When contributing to RustPython extra_tests, follow the repo guidance (skip/expectedFailure) to avoid noisy output while iterating. [2]
Sources
[1] Stack Overflow — Why doesn't println! work in Rust unit tests? (explanation of test stdout capture and --nocapture).
[2] RustPython blog — How to contribute by CPython unittest (notes about extra_tests/snippets).
[3] nextest-runner docs — CaptureStrategy (Split/Combined/None) for controlling stdout/stderr capture.
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check for any documentation about testing
echo "=== Root level documentation ==="
ls -la *.md *.txt *.rst 2>/dev/null | head -20
# Check for Makefile or test scripts in root
echo -e "\n=== Root level test/build scripts ==="
ls -la Makefile* makefile* test* run* 2>/dev/null || echo "None found"
# Try to find any Cargo.toml that might reference test configuration
echo -e "\n=== Checking Cargo.toml for test configuration ==="
if [ -f "Cargo.toml" ]; then
head -50 Cargo.toml
fiRepository: RustPython/RustPython
Length of output: 288
🏁 Script executed:
#!/bin/bash
# Simple listing to understand structure
pwd
ls -la | head -30Repository: RustPython/RustPython
Length of output: 1913
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Examine the test_snippets.py to understand how snippets are executed
echo "=== test_snippets.py content ==="
head -80 extra_tests/test_snippets.pyRepository: RustPython/RustPython
Length of output: 2197
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for any stdout/stderr assertions or expected output handling
echo "=== Checking for output validation in test framework ==="
rg -n "stdout|stderr|expected|output|assert" extra_tests/test_snippets.py -A 2 -B 2Repository: RustPython/RustPython
Length of output: 121
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check extra_tests/README.md for documentation on snippet testing
echo "=== extra_tests/README.md ==="
cat extra_tests/README.mdRepository: RustPython/RustPython
Length of output: 735
Remove the unconditional print("done") — it adds unnecessary noise to test output without providing validation value.
The snippet test harness (extra_tests/test_snippets.py) uses subprocess.check_call() which only validates exit codes, not stdout content. While this print won't cause test failures, it clutters CI logs with output that serves no testing purpose. The commented-out print on the preceding line demonstrates the correct pattern when testing is not needed.
Suggested diff:
-
-print("done")🤖 Prompt for AI Agents
In extra_tests/snippets/stdlib_ctypes.py around lines 400-401, remove the
unconditional print("done") which only clutters CI logs; delete that line (or
revert it to the commented example on the previous line) so the test produces no
unnecessary stdout while preserving the existing behavior validated by exit
codes.
666979e to
941b373
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
crates/vm/src/vm/context.rs (1)
511-526: Custom exception types created withbasicsize=0will trigger the debug_assert invm.new_exception().Exception types created via
new_exception_type()(lines 511–526) setslots.basicsize = 0, butvm.new_exception()asserts thatexc_type.slots.basicsize == std::mem::size_of::<PyBaseException>(). This affects all stdlib exception types created withnew_exception_type()(e.g.,binascii.Error,socket.error, etc.), which are raised viavm.new_exception_msg()→vm.new_exception(). Debug builds will fail when these exceptions are raised.Set
basicsizetostd::mem::size_of::<PyBaseException>()instead of0to match the assertion.crates/vm/src/stdlib/ctypes/base.rs (1)
423-428: Remove redundant storage:PyCSimpleunnecessarily maintains two independent copies ofCDataObject.
PyCSimplestores_base: PyCData(containingPyRwLock<CDataObject>) and a separatecdata: PyRwLock<CDataObject>, both initialized from the sameCDataObject. This pattern appears across all ctypes classes (PyCStructure, PyCUnion, PyCArray, etc.). The_basefield is never accessed after initialization; all code uses thecdatafield directly. This creates two independent copies that can diverge if either is modified, and suggests a design issue with how the PyO3 class inheritance pattern is being used.Consider:
- Whether the
_basefield can be removed if the Python inheritance contract can be satisfied differently- Consolidating to a single source of truth for
CDataObject- Documenting why both copies are intentionally maintained if this pattern is required
crates/vm/src/stdlib/ctypes/array.rs (1)
197-214: Two independentCDataObjects per instance +in_dllkeepalive written to only one copy.Right now
_baseandcdatahold separateCDataObjectclones. Beyond memory overhead, this can break correctness forin_dll: you setbase = Some(dll)onPyCArray.cdata(Line 209-211 / 762-765), but_base’s copy won’t get that update—so any base-class logic (or GC traversal) that reads_base’s backing store may miss the DLL keepalive reference.Recommendation: make
_basethe canonical storage and have all reads/writes go through it (or share storage viaArc), and ensure thebasekeepalive is applied to the canonicalCDataObject.Also applies to: 241-248, 307-317, 538-547, 606-615, 665-674, 751-768
crates/vm/src/stdlib/io.rs (1)
20-73: Prefer not tounwrap()when settingwinerrorattr (avoid panic during exception construction).
exc.as_object().set_attr(...).unwrap()can panic while handling an I/O error, which is a pretty bad failure mode. Return/ignore the error explicitly instead (you already treat this as a best-effort “FIXME” path).- exc.as_object() - .set_attr("winerror", vm.new_pyobj(winerror), vm) - .unwrap(); + let _ = exc + .as_object() + .set_attr("winerror", vm.new_pyobj(winerror), vm);crates/vm/src/exceptions.rs (1)
1548-1602: Thedowncast_ref::<PyOSError>().unwrap()will panic when called with OSError subclass instances.The code comment claims all OSError subclasses use
#[repr(transparent)]wrapping, enabling safe downcasting. However,downcast_refchecks type identity viaTypeIdcomparison, not memory layout. Whenslot_initis inherited and called with aFileNotFoundErrorinstance, theTypeIdisFileNotFoundError, notPyOSError, sodowncast_refreturnsNoneandunwrap()panics.Replace the unwrap with safe error handling:
- #[allow(deprecated)] - let exc: &Py<PyOSError> = zelf.downcast_ref::<PyOSError>().unwrap(); + #[allow(deprecated)] + let Some(exc) = zelf.downcast_ref::<PyOSError>() else { + return Err(vm.new_runtime_error( + "internal error: OSError slot_init called with non-OSError payload", + )); + };
♻️ Duplicate comments (5)
crates/vm/src/types/structseq.rs (1)
186-193: Simplify the fully-qualified trait call infrom_data()(already suggested earlier).
You can callSelf::Data::into_tuple(data, vm)ordata.into_tuple(vm)instead of<Self::Data as ::rustpython_vm::types::PyStructSequenceData>::into_tuple(...).crates/vm/src/object/core.rs (1)
1076-1103: Replaceunreachable_unchecked()ininto_base()/upcast()to avoid UB on invariant violations (raised earlier).
Even if the invariant “should be impossible” holds, a macro/layout bug would become instant UB instead of a debuggable panic.pub fn into_base(self) -> PyRef<T::Base> { let obj: PyObjectRef = self.into(); match obj.downcast() { Ok(base_ref) => base_ref, - Err(_) => unsafe { std::hint::unreachable_unchecked() }, + Err(_) => unreachable!("PySubclass invariant violated: into_base downcast failed"), } } pub fn upcast<U: PyPayload + StaticType>(self) -> PyRef<U> where T: StaticType, { debug_assert!(T::static_type().is_subtype(U::static_type())); let obj: PyObjectRef = self.into(); match obj.downcast::<U>() { Ok(upcast_ref) => upcast_ref, - Err(_) => unsafe { std::hint::unreachable_unchecked() }, + Err(_) => unreachable!( + "PySubclass invariant violated: upcast to {} failed", + std::any::type_name::<U>() + ), } }crates/vm/src/stdlib/ctypes/union.rs (1)
124-129:CDataObjectis duplicated (deep-cloned) between_baseandcdata→ wasted memory + potential state divergence.Each constructor builds one
CDataObject, then clones it into_baseand also stores another copy incdata. If either side is mutated (buffer/base/objects), they can silently diverge.Prefer a single canonical storage:
- Either remove
cdataand route all access through_base’s storage, or- Change
PyCDatato share the same underlying buffer/lock (e.g., store anArc<PyRwLock<CDataObject>>and clone theArc, not theCDataObject>).(Already raised in prior review on earlier commits.)
Also applies to: 175-184, 209-216, 257-264, 296-303
crates/stdlib/src/ssl.rs (1)
412-427: WantRead/WantWrite should not seterrno(set it toNone).
SSL_ERROR_WANT_READ/WRITEare SSL-layer codes, not OS errno values. CPython’sssl.SSLWantReadError/ssl.SSLWantWriteErrortypically exposeerrno = None, so passingSome(SSL_ERROR_WANT_READ/WRITE)is misleading and diverges from expected behavior (as discussed in the earlier review).Proposed change:
pub(super) fn create_ssl_want_read_error(vm: &VirtualMachine) -> PyRef<PyOSError> { vm.new_os_subtype_error( PySSLWantReadError::class(&vm.ctx).to_owned(), - Some(SSL_ERROR_WANT_READ), + None, "The operation did not complete (read)", ) } pub(super) fn create_ssl_want_write_error(vm: &VirtualMachine) -> PyRef<PyOSError> { vm.new_os_subtype_error( PySSLWantWriteError::class(&vm.ctx).to_owned(), - Some(SSL_ERROR_WANT_WRITE), + None, "The operation did not complete (write)", ) }crates/vm/src/exceptions.rs (1)
1438-1482: GC traversal forPyOSErrornow includesbase— fixes the earlier GC-safety concern.
This addresses the missing-base traversal issue from prior review feedback.
🧹 Nitpick comments (7)
crates/derive-impl/src/pyclass.rs (1)
431-502: GeneratedPySubclassimpl assumes tuple struct layout (&self.0)—guard this or you’ll get hard compile errors later.
Right now,subclass_implis generated for any#[repr(transparent)]+base, but it uses&self.0, which only works for tuple structs. Sincevalidate_base_field()permits named structs, a future#[repr(transparent)] struct X { base: Base, ... }will break codegen.Practical fix: plumb a “is_tuple_struct” boolean into
generate_class_def(...)(or generatesubclass_implinsideimpl_pyclass(...)where you have access to theItem::Structfields).crates/vm/src/stdlib/ast/pyast.rs (1)
81-108: Consider makingNodeModexplicitly#[repr(transparent)]too (consistency / future-proof).
NodeStmtis explicitly transparent, butNodeModisn’t even though it’s also a single-field wrapper. If any codegen or safety checks start keying offrepr(transparent), this difference may become surprising.#[pyclass(module = "_ast", name = "mod", base = NodeAst)] -pub(crate) struct NodeMod(NodeAst); +#[repr(transparent)] +pub(crate) struct NodeMod(NodeAst);crates/vm/src/builtins/bool.rs (1)
222-228: Potential panic if called on non-PyBool object.The
get_valuefunction usesunwrap()on the downcast result, which will panic if called with an object that isn't aPyBool. All current call sites in this file checkfast_isinstance(vm.ctx.types.bool_type)before callingget_value, but this is a fragile contract.Consider either:
- Adding a debug assertion or comment documenting the precondition
- Returning an
Option<bool>or usingexpect()with a descriptive messagepub(crate) fn get_value(obj: &PyObject) -> bool { - !obj.downcast_ref::<PyBool>() - .unwrap() - .0 - .as_bigint() - .is_zero() + !obj.downcast_ref::<PyBool>() + .expect("get_value called on non-bool object") + .0 + .as_bigint() + .is_zero() }crates/stdlib/src/ssl.rs (1)
1868-1881: Avoid hardcodingENOENTas2.Using
Some(2)is brittle; prefer the platform constant (e.g.,libc::ENOENTor a helper that mapsstd::io::ErrorKind::NotFoundto the VM’s errno plumbing).At minimum, please confirm this behaves correctly on Windows targets in your CI matrix.
crates/vm/src/stdlib/io.rs (1)
425-461:_IOBase::__closedreturningPyRef<PyBool>is the right typing improvement; consider using a sharedfalseif available.
The typed return aligns with the PR’s “subclass-typed parameters” goal. IfContexthas an internedfalseobject, prefer that to avoid repeated allocations.crates/vm/src/exceptions.rs (2)
1524-1546:slot_newerrno→subtype normalization is good; be careful about silently swallowing constructor failures.
vm.invoke_exception(...).ok()drops error details; if that’s intentional (best-effort normalization), consider a comment to clarify why failures should fall back rather than propagate.
1565-1595: Prefer using the safeswap_to_temporary_refspath inslot_init, notunsafe ...swap(...).
You already useswap_to_temporary_refsin setters; doing the same inslot_initwould likely let you delete theunsafeblocks and reduce the chance of GC-barrier footguns.Also applies to: 1683-1735
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (27)
crates/derive-impl/src/pyclass.rs(8 hunks)crates/derive-impl/src/pystructseq.rs(2 hunks)crates/stdlib/src/socket.rs(4 hunks)crates/stdlib/src/ssl.rs(20 hunks)crates/stdlib/src/ssl/compat.rs(5 hunks)crates/vm/src/builtins/bool.rs(4 hunks)crates/vm/src/builtins/int.rs(1 hunks)crates/vm/src/class.rs(1 hunks)crates/vm/src/exception_group.rs(1 hunks)crates/vm/src/exceptions.rs(13 hunks)crates/vm/src/object/core.rs(2 hunks)crates/vm/src/object/payload.rs(3 hunks)crates/vm/src/stdlib/ast/pyast.rs(14 hunks)crates/vm/src/stdlib/ctypes.rs(3 hunks)crates/vm/src/stdlib/ctypes/array.rs(8 hunks)crates/vm/src/stdlib/ctypes/base.rs(4 hunks)crates/vm/src/stdlib/ctypes/field.rs(1 hunks)crates/vm/src/stdlib/ctypes/function.rs(6 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(7 hunks)crates/vm/src/stdlib/ctypes/structure.rs(6 hunks)crates/vm/src/stdlib/ctypes/union.rs(6 hunks)crates/vm/src/stdlib/io.rs(22 hunks)crates/vm/src/stdlib/os.rs(1 hunks)crates/vm/src/types/structseq.rs(1 hunks)crates/vm/src/vm/context.rs(7 hunks)crates/vm/src/vm/vm_new.rs(3 hunks)extra_tests/snippets/stdlib_ctypes.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
- crates/vm/src/stdlib/os.rs
- crates/vm/src/class.rs
- crates/stdlib/src/ssl/compat.rs
- crates/vm/src/stdlib/ctypes.rs
- extra_tests/snippets/stdlib_ctypes.py
- crates/vm/src/exception_group.rs
- crates/vm/src/stdlib/ctypes/field.rs
- crates/vm/src/builtins/int.rs
- crates/vm/src/stdlib/ctypes/function.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/derive-impl/src/pystructseq.rscrates/vm/src/object/core.rscrates/vm/src/object/payload.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/derive-impl/src/pyclass.rscrates/vm/src/vm/context.rscrates/vm/src/stdlib/ctypes/structure.rscrates/stdlib/src/ssl.rscrates/vm/src/types/structseq.rscrates/stdlib/src/socket.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ast/pyast.rscrates/vm/src/vm/vm_new.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/builtins/bool.rscrates/vm/src/stdlib/io.rscrates/vm/src/exceptions.rs
🧠 Learnings (6)
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/derive-impl/src/pystructseq.rscrates/derive-impl/src/pyclass.rscrates/vm/src/vm/context.rscrates/stdlib/src/socket.rscrates/vm/src/stdlib/ast/pyast.rscrates/vm/src/vm/vm_new.rscrates/vm/src/builtins/bool.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.
Applied to files:
crates/vm/src/object/payload.rscrates/derive-impl/src/pyclass.rscrates/vm/src/vm/context.rscrates/stdlib/src/socket.rscrates/vm/src/stdlib/ast/pyast.rscrates/vm/src/vm/vm_new.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/stdlib/src/ssl.rscrates/stdlib/src/socket.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
crates/vm/src/types/structseq.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/types/structseq.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Minimize modifications to CPython standard library files in the `Lib/` directory; modifications should be minimal and only to work around RustPython limitations
Applied to files:
crates/vm/src/stdlib/io.rs
🧬 Code graph analysis (9)
crates/derive-impl/src/pystructseq.rs (3)
crates/vm/src/object/payload.rs (4)
payload_type_id(30-32)std(31-31)validate_downcastable_from(41-43)class(54-54)crates/vm/src/builtins/type.rs (1)
fast_issubclass(629-631)crates/vm/src/builtins/str.rs (1)
try_traverse(1928-1930)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (3)
new(223-227)new(261-267)from_bytes(184-191)
crates/vm/src/stdlib/ctypes/base.rs (1)
crates/vm/src/stdlib/ctypes/function.rs (1)
value(52-52)
crates/derive-impl/src/pyclass.rs (2)
crates/vm/src/object/payload.rs (4)
std(31-31)class(54-54)payload_type_id(30-32)validate_downcastable_from(41-43)crates/vm/src/class.rs (1)
as_base(187-187)
crates/vm/src/stdlib/ctypes/structure.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
from_stg_info(174-181)new(223-227)new(261-267)from_bytes(184-191)
crates/vm/src/stdlib/ctypes/array.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (5)
from_bytes(184-191)new(223-227)new(261-267)buffer(1032-1032)buffer(1042-1042)
crates/vm/src/stdlib/ast/pyast.rs (1)
crates/derive-impl/src/util.rs (1)
base(416-418)
crates/vm/src/vm/vm_new.rs (1)
crates/vm/src/exceptions.rs (5)
errno(1533-1534)errno(1535-1535)errno(1685-1687)errno_to_exc_type(1163-1189)errno_to_exc_type(1192-1194)
crates/vm/src/stdlib/ctypes/union.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
from_stg_info(174-181)new(223-227)new(261-267)from_bytes(184-191)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (37)
crates/derive-impl/src/pyclass.rs (2)
718-729: Good: stop scanningwith(...)onceConstructoris found.
This reduces wasted work and matches the intent expressed in prior review.
611-638: Verify the semantic intent of the BASICSIZE check for repr(transparent) wrappers.The size/subclass check is sound, but for
#[repr(transparent)]types withbase=, the BASICSIZE validation effectively becomes a no-op: the macro generatesBASICSIZE = 0(since the wrapper has identical layout to its base), and at runtimeslots.basicsizeis automatically inherited to match the base's size. This means the checkBASICSIZE <= obj.class().slots.basicsizebecomes0 <= base_size, which always passes.This design is correct—
BASICSIZE = 0semantically signals "no additional payload"—but confirm that:
- The size check isn't intended as a safety guard against under-allocated objects (since it's effectively always-true for repr(transparent))
- The actual validation responsibility falls entirely to the
fast_issubclass()call for these types- All code paths that construct repr(transparent) types correctly inherit
slots.basicsizefrom their basecrates/vm/src/stdlib/ast/pyast.rs (1)
6-16: Macro wrapper layout looks correct for the new base-first-field scheme.
The generated#[repr(transparent)] struct $name($base);is consistent with the base-field validation and PySubclass expectations.crates/vm/src/object/payload.rs (2)
19-26: Good refactor: centralcold_downcast_type_error()improves call-site clarity.Also applies to: 50-52
102-121: Good safety check: reject subclass construction whenbasicsizediffers.
This closes a class of “valid subtype but incompatible payload layout” issues and provides a clear error.crates/vm/src/stdlib/ctypes/pointer.rs (2)
11-15: PyCPointerType wrapper layout looks consistent with the new subclass/payload approach.
58-68: The _base field design for pointer types is intentional and correct; no refactoring is needed.The concern about pointer backing storage is based on a misunderstanding of the implementation. PyCPointer types intentionally store their value in a separate
contentsfield (a PyRwLock) rather than in the _base.buffer. This design is consistent with how other complex ctypes like PyCArray also separate the inherited base class (_base: PyCData) from type-specific data storage. The empty buffer in _base is not a limitation—it's the correct representation for pointer types, which are value wrappers rather than memory buffers. All five pointer instantiation sites consistently use this pattern, there are no buffer size checks for pointers, and no downstream code attempts to access or depend on _base.buffer for pointer types.crates/vm/src/vm/context.rs (1)
35-37: Bool caching viaPyRef<PyBool>looks consistent and keepsnew_bool()fast.Also applies to: 306-307, 450-457
crates/vm/src/vm/vm_new.rs (1)
95-107: Good:new_exception()now asserts payload size matchesPyBaseException.
This is helpful for catching layout regressions during the subclass-payload transition.crates/vm/src/stdlib/ctypes/base.rs (3)
222-228: LGTM - New PyCData constructor.The
newconstructor properly wraps theCDataObjectin aPyRwLock, providing a clean API for creatingPyCDatainstances.
238-241: LGTM - repr(transparent) for PyCSimpleType.The
#[repr(transparent)]attribute ensuresPyCSimpleTypehas the same memory layout as its innerPyType, which is correct for the newtype wrapper pattern being adopted across this PR.
659-664: Data cloned into both_baseandcdata.The
CDataObjectis cloned and stored in both_base(viaPyCData::new(cdata.clone())) andcdata(viaPyRwLock::new(cdata)). This means modifications to one won't reflect in the other. If this is intentional (e.g.,_baseis only for type hierarchy andcdatais the working copy), please add a comment clarifying this design choice.crates/derive-impl/src/pystructseq.rs (3)
448-452: LGTM - Newtype wrapper for PyStructSequence.The generated struct correctly uses
#[repr(transparent)]to wrapPyTuple, ensuring the memory layout matches the inner type. The public inner field (pub) allows direct access when needed.
481-496: LGTM - PyPayload implementation for struct sequences.The implementation correctly:
- Delegates
payload_type_id()toPyTupleto share the same type ID- Uses
fast_issubclassinvalidate_downcastable_fromto properly handle subclass relationships- Returns the static type from
class()This aligns with the external code snippets showing similar patterns.
498-513: LGTM - MaybeTraverse and PySubclass implementations.The
MaybeTraverseimplementation correctly delegates to the innerPyTuplefor GC traversal. ThePySubclassimplementation properly defines the base type relationship and provides access to the inner tuple viaas_base().crates/stdlib/src/socket.rs (5)
18-18: LGTM - PyOSError import added.The import aligns with the updated error handling that now uses
PyOSErrorsubtypes.
1852-1852: LGTM - Timeout error now uses upcast.The
timeout_error(vm).upcast()correctly converts thePyRef<PyOSError>toPyBaseExceptionReffor the exception handling flow.
2420-2421: LGTM - Improved socket error construction.Using
new_os_subtype_errorinstead ofnew_exceptionpreserves the specific error subtype (gaierror/herror), includes the error number, and maintains the error message. This provides better error information for debugging.
2424-2429: LGTM - Updated timeout error functions.Both
timeout_errorandtimeout_error_msgnow returnPyRef<PyOSError>, usingnew_os_subtype_errorfor consistent error construction with the timeout exception class. TheNonefor errno is appropriate since timeouts don't have a specific system error number.
1829-1833: LGTM - From implementation for PyOSError.This enables seamless conversion from
PyRef<PyOSError>toIoOrPyExceptionviaupcast(), which safely converts the specialized exception type to its basePyBaseExceptionReftype through the generic method defined incrate::vm::object::core.crates/vm/src/stdlib/ctypes/structure.rs (4)
17-20: LGTM - repr(transparent) for PyCStructType.Consistent with the pattern applied to
PyCSimpleTypein base.rs, this ensures proper memory layout for the newtype wrapper.
222-229: Same duplication pattern as PyCSimple.
PyCStructurenow has both_base: PyCDataandcdata: PyRwLock<CDataObject>, following the same pattern asPyCSimple. The concern about potential inconsistencies between the two copies applies here as well.
299-304: Consistent construction pattern with PyCSimple.The construction correctly creates a
CDataObjectand uses it for both_baseandcdata. This is consistent with the approach in base.rs.
370-377: LGTM - Consistent construction in factory methods.All factory methods (
from_address,from_buffer,from_buffer_copy) follow the same pattern of creating aCDataObjectand populating both_baseandcdatafields consistently.Also applies to: 423-430, 468-475
crates/vm/src/builtins/bool.rs (3)
4-4: LGTM - Import consolidation.Minor formatting change to consolidate imports on one line.
88-97: LGTM - PyBool as repr(transparent) wrapper around PyInt.The
#[repr(transparent)]attribute ensuresPyBoolhas the same layout asPyInt, enabling proper subclass payload handling. TheDebugimplementation correctly derives the boolean value from the inner bigint.Note: The past review comment mentioned concerns about
#[repr(C)]vs#[repr(transparent)]for base field layout. Using#[repr(transparent)]here is correct since it guarantees the same layout as the inner type.
21-34: Behavior change: TryFromBorrowedObject now accepts PyInt in addition to bool.The implementation downcasts to
PyIntinstead of checking for thebool_type, accepting any integer where zero is falsy and non-zero is truthy. This aligns with Python semantics whereboolis a subclass ofintandbool(0)returnsFalsewhilebool(n)returnsTruefor any non-zeron.Verify that this reflects the intended behavior: does the VM's type system follow Python's class hierarchy where
boolis a subclass ofint, or shouldboolconversion only accept actual bool instances? Check call sites and tests to confirm the expected usage.crates/vm/src/stdlib/ctypes/union.rs (1)
15-44: Metaclass newtype looks fine; ensure fmt/clippy are clean.The
#[repr(transparent)] PyCUnionType(PyType)pattern is sensible for the new subclass payload layout; just make surecargo fmt+cargo clippyare run since these metaclass wrappers can surface lint/trait/derive edge cases quickly.crates/vm/src/stdlib/ctypes/array.rs (1)
25-28:PyCArrayType(PyType)wrapper is consistent with the PR direction.crates/vm/src/stdlib/io.rs (5)
204-207:new_unsupported_operation()usingnew_os_subtype_error()is a nice consolidation; double-check the “errno None” semantics.
This matches the direction of the PR (centralized OS subtype construction). Just ensure callers don’t rely onUnsupportedOperation.errnobeing set in cases where CPython would leave it unset.
639-769:repr(transparent)base wrappers (_RawIOBase/_BufferedIOBase/_TextIOBase) look consistent; verify macro expectations around layout.
This is a core mechanism for the new payload layout. Please sanity-check that the#[pyclass(base = _IOBase)]machinery treats these as valid base payloads (esp. ZST base + transparent wrapper) across all supported compilers/targets.
1765-1950: Embedded_basefields: good direction; keep_basefirst everywhere to avoid layout surprises.
Most structs correctly place_baseas the first field—please keep this invariant for any follow-ups (it matters a lot for transparent/base-embedded subclass payload schemes).Also applies to: 2375-2382, 3656-3663, 3801-3810, 4440-4473
4245-4267: SingletonUnsupportedOperationtype: avoidContext::genesis()if “multiple contexts” is a supported future.
If multipleContexts are possible, consider keying the cache byContext(or storing onContextitself) rather than a process-global static created from genesis. If multiple contexts are impossible by design, a short comment here would prevent future refactors from breaking it.
82-97: The singleton Context assumption is architectural, not a bug. The code correctly usesContext::genesis()which is a true singleton via static initialization. All exception types and special objects in the codebase consistently rely on this same singleton Context design. This is an intentional architectural choice, not a hidden assumption. No changes are needed unless RustPython's design fundamentally shifts to support independent per-VM Contexts—which would require refactoring the entire context system, not just this function.crates/vm/src/exceptions.rs (3)
966-969: Exception zoo extension forPyOSErroris correctly simplified.
Dropping dynamic attribute wiring now thatPyOSErrorhas real fields is the right cleanup.
1235-1437: Broad#[repr(transparent)]wrapper rewrite looks consistent; please keep wrappers truly “transparent.”
Given how much relies on layout here, ensure none of these wrapper types ever gain extra fields (even accidentally) without also updating the subclass-payload assumptions. A brief doc comment near the pattern could help.Also applies to: 1737-2204
1485-1522: No changes needed—current implementation is CPython-compatible.The code correctly handles
filename2at index 4 when 5 arguments are provided. CPython uses the same argument structure across all platforms:OSError(errno, strerror, filename, winerror, filename2), wherewinerrorat index 3 is simply unused/ignored on POSIX systems rather than being repurposed. There is no separate 4-argument form for POSIX systems.Likely an incorrect or invalid review comment.
| pub fn new_os_error(&self, msg: impl ToPyObject) -> PyRef<PyBaseException> { | ||
| self.new_os_subtype_error(self.ctx.exceptions.os_error.to_owned(), None, msg) | ||
| .upcast() | ||
| } | ||
|
|
||
| pub fn new_os_subtype_error( | ||
| &self, | ||
| exc_type: PyTypeRef, | ||
| errno: Option<i32>, | ||
| msg: impl ToPyObject, | ||
| ) -> PyRef<PyOSError> { | ||
| debug_assert_eq!(exc_type.slots.basicsize, std::mem::size_of::<PyOSError>()); | ||
| let msg = msg.to_pyobject(self); | ||
|
|
||
| fn new_os_subtype_error_impl( | ||
| vm: &VirtualMachine, | ||
| exc_type: PyTypeRef, | ||
| errno: Option<i32>, | ||
| msg: PyObjectRef, | ||
| ) -> PyRef<PyOSError> { | ||
| let args = match errno { | ||
| Some(e) => vec![vm.new_pyobj(e), msg], | ||
| None => vec![msg], | ||
| }; | ||
| let payload = | ||
| PyOSError::py_new(&exc_type, args.into(), vm).expect("new_os_error usage error"); | ||
| payload | ||
| .into_ref_with_type(vm, exc_type) | ||
| .expect("new_os_error usage error") | ||
| } | ||
|
|
||
| new_os_subtype_error_impl(self, exc_type, errno, msg) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Searching for call sites of new_os_subtype_error..."
rg -n --type=rust "new_os_subtype_error\\(" -S || true
echo
echo "Searching for new_exception usages referencing os_error or OSError..."
rg -n --type=rust "new_exception\\(.*os_error|new_exception\\(.*OSError" -S || true
echo
echo "Show vm/src/vm/vm_new.rs around the function for context (lines 1-220)..."
sed -n '1,220p' crates/vm/src/vm/vm_new.rs || trueRepository: RustPython/RustPython
Length of output: 10670
Avoid panics when constructing OSError subtypes
new_os_subtype_error_impl() calls PyOSError::py_new(...).expect(...) and into_ref_with_type(...).expect(...), which will panic in release builds if exc_type isn't a proper OSError subtype. Change the API to avoid hard panics — either return PyResult<PyRef> or keep the current signature but replace the expects with a non-panicking path that raises a Python TypeError (e.g., vm.new_type_error(...)) so misuse produces a Python exception instead of aborting the process. Applies to crates/vm/src/vm/vm_new.rs (lines ~109–141) and the similar usage around lines ~259–264.
941b373 to
9017be0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (10)
crates/vm/src/stdlib/ctypes/function.rs (1)
212-227: Don’t silently coerce non-usize / negative “address” into NULL.
to_usize().unwrap_or(0)turns overflow/negative values into0, which can later trigger confusing “Function pointer not set” / NULL-call behavior. Return a Python exception instead.- if let Ok(addr) = first_arg.try_int(vm) { - let ptr_val = addr.as_bigint().to_usize().unwrap_or(0); + if let Ok(addr) = first_arg.try_int(vm) { + let ptr_val = addr + .as_bigint() + .to_usize() + .ok_or_else(|| vm.new_overflow_error("function address does not fit in usize".to_owned()))?; return PyCFuncPtr { _base: PyCData::new(CDataObject::from_bytes(vec![], None)), ptr: PyRwLock::new(Some(CodePtr(ptr_val as *mut _))), // ... } .into_ref_with_type(vm, cls) .map(Into::into); }crates/vm/src/stdlib/ctypes/base.rs (1)
423-428: Derivedcdataduplicates (and can desync from) base_base.cdata.
BecauseCDataObjectcontains an ownedVec<u8>,cdata.clone()copies the buffer; mutations through one path won’t be reflected in the other. This is especially risky since base methods (defined on_CData) will observe_base.cdata, while many derived ops mutateself.cdata.Concrete fix direction: ensure a single shared backing store (e.g.,
Arc<PyRwLock<CDataObject>>inPyCData) or eliminate the duplicatedcdatafield from derived types and always use_base.cdata.Also applies to: 658-668
crates/vm/src/stdlib/ctypes/structure.rs (1)
215-229: Same deep-clone backing-store split:_baseandcdatawon’t stay in sync.
All constructor paths build_basefromcdata.clone(), butCloneonCDataObjectcopies theVec<u8>. If any behavior reads/writes through the base_CDataAPI, it can observe different bytes/objects than the structure’s owncdata.Suggested fix is the same as for
PyCSimple: unify on a single shared backing store (or store only_baseand access its lock everywhere).Also applies to: 296-304, 369-377, 422-431, 467-476
crates/vm/src/stdlib/ctypes/array.rs (5)
95-124: Avoidunwrap()andusizeoverflow in__mul__(can panic / mis-size allocations).
let type_ref = zelf.downcast_ref::<PyType>().unwrap();should be a TypeError instead of a panic.let total_size = new_element_size * (n as usize);should bechecked_mul, returning OverflowError on overflow.- let type_ref = zelf.downcast_ref::<PyType>().unwrap(); + let type_ref = zelf + .downcast_ref::<PyType>() + .ok_or_else(|| vm.new_type_error("expected a type object".to_owned()))?; let new_element_size = inner_size; - let total_size = new_element_size * (n as usize); + let total_size = new_element_size + .checked_mul(n as usize) + .ok_or_else(|| vm.new_overflow_error("array size too large".to_owned()))?;
197-214:CDataObjectis stored twice (deep clone) → double memory + potential divergent state.This pattern:
let cdata = ...;_base: PyCData::new(cdata.clone()),cdata: PyRwLock::new(cdata),very likely duplicates the
Vec<u8>buffer and allows_basevsself.cdatato get out of sync.Strongly consider making
_basethe only owner of theCDataObjectand routing all accesses through it (or refactorPyCDatato share a single lock/Arc withPyCArray).pub struct PyCArray { _base: PyCData, - pub(super) cdata: PyRwLock<CDataObject>, + // cdata should live in _base (single source of truth) }
241-248:PyCArraylayout likely correct for subclass payloads, butcdataduplication should be removed.Same root issue as above: keeping both
_base: PyCDataandcdata: PyRwLock<CDataObject>suggests two separate buffers.
290-317: Uncheckedelement_size * lengthcan overflow; allocations should fail as Python errors (not wrap/panic).
let total_size = element_size * length; let mut buffer = vec![0u8; total_size];should usechecked_mul+ map allocation failure toMemoryError/OverflowError.- let total_size = element_size * length; - let mut buffer = vec![0u8; total_size]; + let total_size = element_size + .checked_mul(length) + .ok_or_else(|| vm.new_overflow_error("array size too large".to_owned()))?; + let mut buffer = vec![0u8; total_size];(If you want to be stricter, consider
Vec::try_reserve_exactto avoid aborting on OOM.)
507-548: All constructors repeat the_base = PyCData::new(cdata.clone())/cdata = PyRwLock::new(cdata)duplication.Please consolidate to a single canonical
CDataObjectstorage (ideally via_base) across:
from_addressfrom_bufferfrom_buffer_copyin_dllOtherwise modifications via buffer protocol / setters may not reflect in
_baseand vice versa.Also applies to: 550-616, 617-675, 676-769
crates/vm/src/stdlib/io.rs (1)
20-73: Avoidunwrap()onset_attrfor Windowswinerrorassignment (panic path).
Line 70 will abort the VM on an unexpected attribute failure; prefer propagating/ignoring error consistently (you already tolerate stderr failures elsewhere).- exc.as_object() - .set_attr("winerror", vm.new_pyobj(winerror), vm) - .unwrap(); + let _ = exc + .as_object() + .set_attr("winerror", vm.new_pyobj(winerror), vm);crates/vm/src/exceptions.rs (1)
1556-1602: Critical:slot_initwill panic when called on OSError subclass instances.This
__init__is inherited by all OSError subclasses. When invoked withzelfbeing a subclass instance (e.g.,FileNotFoundError),downcast_ref::<PyOSError>()returnsNonebecause it checks exact typeid match, not subclass relationship. Theunwrap()then panics.The comment claims "pointer cast" but the code uses
downcast_ref, which is type-ID based. Since OSError subclasses are#[repr(transparent)]wrappers with identical memory layout, use unsafe pointer casting instead:- #[allow(deprecated)] - let exc: &Py<PyOSError> = zelf.downcast_ref::<PyOSError>().unwrap(); + // SAFETY: OSError subclasses are #[repr(transparent)] wrappers around PyOSError. + // Memory layout is guaranteed identical, enabling safe pointer cast access. + let exc: &Py<PyOSError> = unsafe { + &*(zelf.as_object_ref() as *const _ as *const Py<PyOSError>) + };(Adjust the casting pattern to match the project's conventions for transparent wrapper access.)
♻️ Duplicate comments (5)
crates/vm/src/object/core.rs (1)
1076-1103: Unsafeunreachable_uncheckedconcern already raised in prior review.The past review correctly identified that using
unsafe { std::hint::unreachable_unchecked() }can cause undefined behavior if type invariants are violated (e.g., due to macro bugs). The suggested fix to useunreachable!()with descriptive messages remains applicable.crates/vm/src/vm/vm_new.rs (1)
109-141: Avoid panics when constructing OSError subtypes (still present).
This is the same issue previously raised:new_os_subtype_error_impl()uses.expect(...)and can abort the process if the passedexc_typeis wrong. Please switch to a non-panicking path (returnPyResult<PyRef<PyOSError>>or raise a PythonTypeError).Also applies to: 259-264
crates/derive-impl/src/pyclass.rs (1)
308-385:type_matches_path()is still too permissive for a layout-safety gate.
Matching only the last segment can accept the wrong base type from another module, undermining the base-field/layout invariant.crates/vm/src/stdlib/ctypes/union.rs (1)
124-129:_base: PyCData+cdata: PyRwLock<CDataObject>still deep-clones the buffer (double alloc + divergence).This repeats the previously raised issue:
PyCData::new(cdata.clone())very likely duplicates theVec<u8>backing store and can lead to_baseandcdataobserving different bytes after mutation.Recommendation remains: make
_basethe single source of truth forCDataObject, and route buffer access through it (or refactorPyCDatato share a single lock/Arc with the outer type).Also applies to: 176-184, 209-217, 257-264, 296-303
crates/stdlib/src/ssl.rs (1)
412-427: Regression:SSLWantReadError/SSLWantWriteErrorshould not seterrno(CPython keeps itNone).This reintroduces the earlier mismatch by using
Some(SSL_ERROR_WANT_READ)/Some(SSL_ERROR_WANT_WRITE).pub(super) fn create_ssl_want_read_error(vm: &VirtualMachine) -> PyRef<PyOSError> { vm.new_os_subtype_error( PySSLWantReadError::class(&vm.ctx).to_owned(), - Some(SSL_ERROR_WANT_READ), + None, "The operation did not complete (read)", ) } pub(super) fn create_ssl_want_write_error(vm: &VirtualMachine) -> PyRef<PyOSError> { vm.new_os_subtype_error( PySSLWantWriteError::class(&vm.ctx).to_owned(), - Some(SSL_ERROR_WANT_WRITE), + None, "The operation did not complete (write)", ) }In CPython's ssl module, do SSLWantReadError / SSLWantWriteError set the OSError errno attribute, or is it typically None?
🧹 Nitpick comments (4)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
80-80: Consider extracting repeated base initialization to a helper.All five construction sites use identical initialization:
_base: PyCData::new(CDataObject::from_bytes(vec![], None))This could be simplified with a helper method like
PyCData::empty()or similar, which would reduce duplication and make the intent clearer. This matches the pattern seen in other ctypes types and would improve maintainability.Example helper (could be added to
PyCData):impl PyCData { pub fn empty() -> Self { Self::new(CDataObject::from_bytes(vec![], None)) } }Also applies to: 130-130, 175-175, 212-212, 268-268
crates/vm/src/stdlib/ctypes/function.rs (1)
164-176: Avoid repeating ad-hoc_baseinitialization; also consider whether a 0-byte base buffer is sufficient for CFuncPtr semantics.
Right now every constructor path re-creates_basewith an emptyCDataObject. If CFuncPtr is expected to behave like other_CDatainstances (e.g., byref/addressof-style ops), a consistent base init helper will reduce drift and make it easier to adjust buffer sizing later.impl Constructor for PyCFuncPtr { type Args = FuncArgs; fn slot_new(cls: PyTypeRef, args: FuncArgs, vm: &VirtualMachine) -> PyResult { + fn new_base() -> PyCData { + PyCData::new(CDataObject::from_bytes(vec![], None)) + } // ... if args.args.is_empty() { return PyCFuncPtr { - _base: PyCData::new(CDataObject::from_bytes(vec![], None)), + _base: new_base(), // ... } // ... } // ... if let Ok(addr) = first_arg.try_int(vm) { let ptr_val = addr.as_bigint().to_usize().unwrap_or(0); return PyCFuncPtr { - _base: PyCData::new(CDataObject::from_bytes(vec![], None)), + _base: new_base(), // ... } // ... } // ... apply similarly to the other constructor arms ... } }Also applies to: 195-207, 215-227, 275-287, 319-331
crates/vm/src/stdlib/io.rs (1)
639-769:#[repr(transparent)]base wrappers: confirm derive-macro/layout expectations.
_RawIOBase(_IOBase)/_BufferedIOBase(_IOBase)/_TextIOBase(_IOBase)are a big semantic shift; please ensure the#[pyclass(base = _IOBase)]machinery expects an embedded base payload and that the tuple-field order matches what the new subclass payload layout requires.crates/vm/src/exceptions.rs (1)
1683-1735: Getset backing fields viaPyAtomicRefis fine; consider making setters returnPyResult<()>if they can fail.
Ifswap_to_temporary_refsis infallible, current signature is fine; if it can allocate/GC-root in a fallible way, prefer surfacing errors.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
crates/derive-impl/src/pyclass.rs(8 hunks)crates/stdlib/src/socket.rs(4 hunks)crates/stdlib/src/ssl.rs(20 hunks)crates/stdlib/src/ssl/compat.rs(5 hunks)crates/vm/src/builtins/int.rs(1 hunks)crates/vm/src/exception_group.rs(1 hunks)crates/vm/src/exceptions.rs(13 hunks)crates/vm/src/object/core.rs(2 hunks)crates/vm/src/stdlib/ctypes.rs(3 hunks)crates/vm/src/stdlib/ctypes/array.rs(8 hunks)crates/vm/src/stdlib/ctypes/base.rs(4 hunks)crates/vm/src/stdlib/ctypes/field.rs(1 hunks)crates/vm/src/stdlib/ctypes/function.rs(6 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(7 hunks)crates/vm/src/stdlib/ctypes/structure.rs(6 hunks)crates/vm/src/stdlib/ctypes/union.rs(6 hunks)crates/vm/src/stdlib/io.rs(22 hunks)crates/vm/src/stdlib/os.rs(1 hunks)crates/vm/src/vm/context.rs(7 hunks)crates/vm/src/vm/vm_new.rs(3 hunks)extra_tests/snippets/stdlib_ctypes.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- crates/vm/src/builtins/int.rs
- crates/vm/src/stdlib/ctypes/field.rs
- crates/stdlib/src/ssl/compat.rs
- extra_tests/snippets/stdlib_ctypes.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/pointer.rscrates/derive-impl/src/pyclass.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/os.rscrates/stdlib/src/socket.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/function.rscrates/stdlib/src/ssl.rscrates/vm/src/vm/vm_new.rscrates/vm/src/object/core.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/exception_group.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/vm/context.rscrates/vm/src/stdlib/io.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/exceptions.rs
🧠 Learnings (6)
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.
Applied to files:
crates/derive-impl/src/pyclass.rscrates/stdlib/src/socket.rscrates/vm/src/vm/vm_new.rscrates/vm/src/exception_group.rscrates/vm/src/vm/context.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/derive-impl/src/pyclass.rscrates/stdlib/src/socket.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/vm/vm_new.rscrates/vm/src/vm/context.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-06-28T16:31:03.991Z
Learnt from: arihant2math
Repo: RustPython/RustPython PR: 5790
File: build.rs:2-2
Timestamp: 2025-06-28T16:31:03.991Z
Learning: In Cargo build scripts (build.rs), the environment variable CARGO_CFG_TARGET_OS is guaranteed to exist and is automatically set by Cargo during the build process, making unwrap() safe to use when accessing this variable.
Applied to files:
crates/vm/src/stdlib/os.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/stdlib/src/socket.rscrates/vm/src/vm/vm_new.rscrates/vm/src/exceptions.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Follow Rust best practices for error handling and memory management
Applied to files:
crates/vm/src/vm/vm_new.rs
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Minimize modifications to CPython standard library files in the `Lib/` directory; modifications should be minimal and only to work around RustPython limitations
Applied to files:
crates/vm/src/stdlib/io.rs
🧬 Code graph analysis (8)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (3)
new(223-227)new(261-267)from_bytes(184-191)
crates/vm/src/stdlib/ctypes/base.rs (4)
crates/vm/src/stdlib/ctypes/field.rs (1)
new(37-54)crates/vm/src/stdlib/ctypes/function.rs (1)
value(52-52)crates/vm/src/builtins/type.rs (2)
value(154-154)value(964-964)crates/vm/src/builtins/object.rs (1)
value(460-460)
crates/vm/src/stdlib/ctypes.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
from_bytes(184-191)size(211-213)new(223-227)new(261-267)
crates/vm/src/stdlib/ctypes/function.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (4)
ffi_type_from_str(23-46)new(223-227)new(261-267)from_bytes(184-191)
crates/vm/src/vm/vm_new.rs (1)
crates/vm/src/exceptions.rs (7)
new(547-555)new(1061-1063)errno(1533-1534)errno(1535-1535)errno(1685-1687)errno_to_exc_type(1163-1189)errno_to_exc_type(1192-1194)
crates/vm/src/object/core.rs (2)
crates/vm/src/stdlib/ast/other.rs (1)
object(39-39)crates/vm/src/class.rs (1)
static_type(24-32)
crates/vm/src/exception_group.rs (1)
crates/vm/src/stdlib/builtins.rs (1)
repr(777-779)
crates/vm/src/exceptions.rs (2)
crates/vm/src/vm/mod.rs (3)
class(553-562)value(646-646)value(648-648)crates/common/src/os.rs (1)
winerror_to_errno(129-227)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (36)
crates/vm/src/stdlib/os.rs (1)
528-536: LGTM - Correct adaptation to new error type hierarchy.The intermediate variable and
.upcast()call properly converts the more specificPyRef<PyOSError>returned bynew_errno_errorto the genericPyBaseExceptionRefrequired byPyResult. This aligns with the PR's broader changes to return typed OS error references.crates/vm/src/exception_group.rs (1)
49-50: LGTM - Correct newtype wrapper for subclass support.The
#[repr(transparent)]newtype wrapper aroundPyBaseExceptionenables thePySubclasstrait machinery while maintaining ABI compatibility. This allowsPyBaseExceptionGroupto leverage the newinto_base()andupcast()methods onPyRef<T>.crates/vm/src/object/core.rs (1)
29-32: LGTM - Required imports for new upcast/downcast utilities.These imports support the new
into_base()andupcast()methods below.crates/vm/src/vm/context.rs (6)
4-7: LGTM - Import updates for PyBool type changes.The imports correctly reflect the new
PyBoolnewtype wrapper and the updatedStaticTypepath.Also applies to: 17-17
35-36: LGTM - Type-safe boolean value fields.Changing
true_valueandfalse_valuefromPyIntReftoPyRef<PyBool>provides stronger type guarantees, enabling callers to work with the more specific boolean type when needed.
283-285: LGTM - Simplified trait bound.The
PyPayloadbound is removed sincePyObjectPayload(or the type's usage context) already implies the necessary capabilities.
306-307: LGTM - PyBool newtype construction.The boolean values are now created as
PyBool(PyInt::from(...))wrappers, aligning with the new subclass payload layout wherePyBoolwrapsPyIntas its base.
450-456: LGTM - Updated return type fornew_bool.The signature now returns
PyRef<PyBool>instead ofPyIntRef, providing type-safe access to boolean values. Callers can use.upcast()or.into()if they need the value as aPyIntReforPyObjectRef.
511-521: LGTM - Explicit slot construction for exception types.Constructing
PyTypeSlotsexplicitly withheap_type_flags()andHAS_DICTis appropriate for dynamically created exception types. This provides clearer control over the type's capabilities compared to deriving from a base'smake_slots().crates/vm/src/stdlib/ctypes/pointer.rs (3)
7-7: LGTM - Required imports for base payload pattern.Importing
CDataObjectandPyCDataenables the subclass payload layout used across ctypes types.
12-14: LGTM - repr(transparent) newtype for metaclass.The
PyCPointerTypeas a transparent wrapper aroundPyTypeenables proper metaclass behavior while supporting thePySubclasstrait machinery.
64-67: LGTM - Base payload field for subclass layout.Adding the
_base: PyCDatafield follows the established pattern for ctypes types, enabling proper inheritance and memory layout for the subclass hierarchy.crates/vm/src/stdlib/ctypes.rs (1)
18-18: Public re-export + internal import changes look consistent with the new base model.Also applies to: 50-57
crates/vm/src/stdlib/ctypes/base.rs (2)
222-228:PyCData::new(...)helper is a nice addition.
238-242: repr(transparent) wrapper forPyCSimpleTypematches the new base-field validation expectations.crates/derive-impl/src/pyclass.rs (1)
431-449: Please double-checkBASICSIZE = 0semantics for repr(transparent)+base in the runtime allocator.
If0is meant to mean “no extra beyond base”, make sure the allocator/class machinery still uses the base’s actual basicsize for instance allocation (otherwise this risks under-allocation). A targeted test exercising allocation + field access for a repr(transparent) subclass would be ideal here.crates/vm/src/stdlib/ctypes/structure.rs (1)
17-21: repr(transparent)PyCStructType(PyType)is aligned with the new subclass/base rules.crates/stdlib/src/socket.rs (2)
1829-1833:From<PyRef<PyOSError>> for IoOrPyExceptionis a clean integration for the new error path.Also applies to: 1849-1856
2390-2422: Please ensuregaierror/herror/timeoutare guaranteed OSError subtypes, or this can currently panic vianew_os_subtype_error.
Givennew_os_subtype_errorstill uses.expect(...), any accidental non-OSError subtype here would abort instead of raising a Python exception. Oncenew_os_subtype_erroris made non-panicking, this becomes much safer.Also applies to: 2424-2429
crates/vm/src/stdlib/ctypes/array.rs (1)
25-28:PyCArrayType(PyType)wrapper looks consistent with the new subclass/base layout.crates/vm/src/stdlib/ctypes/union.rs (1)
15-18:PyCUnionType(PyType)wrapper looks aligned with the new base-payload approach.crates/stdlib/src/ssl.rs (3)
40-43: Exception newtype wrappers + updated imports look consistent with the subclass payload refactor.Also applies to: 346-410
1619-1622:set_cipherscorrectly maps parse errors toSSLError(OS-subtype path).
1983-2000: Error mapping toSSLError/subtypes and.upcast()usage looks coherent after the refactor.The protocol-mismatch errors, shutdown read/write guards, SNI invalid return-type error, EOF mapping, timeout mapping, and “unsupported PHA” path are all consistently expressed via
new_os_subtype_error(...).upcast()and subtype classes.Also applies to: 2053-2070, 2788-2801, 3346-3354, 3398-3422, 3474-3485, 3538-3542, 4096-4107
crates/vm/src/stdlib/io.rs (8)
204-207: Newnew_unsupported_operation()construction looks good, but confirmNoneerrno is desired.
Ifnew_os_subtype_error(type, errno, msg)useserrnofor subclass selection or message formatting, ensureNonedoesn’t accidentally produce a default errno-dependent subtype.
425-503:_IOBase::__closednow returnsbool— good, but double-check attribute semantics.
ReturningPyRef<PyBool>aligns with Pythonclosed: bool. Please verify no callers relied onint-like behavior (e.g., arithmetic) and thatclosedproperty still round-trips correctly through__closed.
1765-1950: Embedded_basefields on buffered types look consistent.
The_base: _BufferedIOBaseadditions +Defaultderivations appear coherent with the “base embedding” goal.
2375-2442:TextIOWrapperbase embedding looks fine; watch locking API consistency.
The new_base: _TextIOBasefield is fine; ensureDefaultdoesn’t accidentally create a usable-but-uninitialized instance in Python space (you already gate operations onOption<TextIOData>).
3656-3826:StringIO/BytesIOconstructors updated correctly for embedded base.
_base: Default::default()in both constructors matches the new layout and keeps initialization explicit.
4440-4473:FileIObase embedding +Defaultupdated cleanly.
_base: Default::default()and the new field layout are consistent with the rest of the IO refactor.
82-97: Consider documenting the implicit assumption that allVirtualMachineinstances shareContext::genesis().The pattern of using
_io::unsupported_operation().to_owned()— which stores aPyTypecreated fromContext::genesis()in a static singleton — is consistent with howexception_group.rshandlesexception_group(). However, this relies on an implicit design invariant: the publicInterpreter::with_init()API always initializesVirtualMachinewithContext::genesis(). WhileVirtualMachine::new()accepts arbitrary contexts, this invariant is currently undocumented. Consider adding a comment toContext::genesis()or theVirtualMachineconstructor clarifying this assumption, or verify the design prevents custom contexts from being used with the standard module initialization.
4245-4267: The current design is intentional:Context::genesis()creates a single global context with shared builtin types. The use ofContext::genesis()inunsupported_operation()is consistent with how other builtin exception types (e.g.,exception_group()) are implemented. RustPython's architecture assumes one genesis context per application; multiple contexts are not a supported usage pattern. Identity checks viafast_isinstancework correctly because all VMs use the same shared static type instance, not separate per-context instances.Likely an incorrect or invalid review comment.
crates/vm/src/exceptions.rs (4)
966-969: Dropping dynamic attrs forPyOSErroris a good simplification.
Relying on getsets for standard fields should reduce per-instance dict traffic and make layout more predictable.
1438-1482: GC traversal delegation viaself.base.try_traverse(...)looks right.
Good to see the embedded base being traversed before visiting the extra optional fields.
1524-1546: Subtype selection inslot_newis reasonable; ensure it can’t recurse/loop.
Thevm.invoke_exception(typ, args_vec)call path should not come back into the errno-specialization branch for non-OSErrorclasses (yourcls.name()guard seems to prevent it). Worth sanity-checking with a small test forOSError(EPIPE, ...)etc.
1597-1601:__reduce__/ args truncation: verify it matches CPython for filename2 on non-Windows.
Oncefilename2is fixed for 4-arg non-Windows, ensure__reduce__includes it (and that truncating base args to 2 doesn’t lose reconstruction fidelity).Also applies to: 1646-1681
| // Generate PySubclass impl for #[repr(transparent)] types with base class | ||
| // (tuple struct assumed, so &self.0 works) | ||
| let subclass_impl = if !is_pystruct && is_repr_transparent { | ||
| base.as_ref().map(|typ| { | ||
| quote! { | ||
| impl ::rustpython_vm::class::PySubclass for #ident { | ||
| type Base = #typ; | ||
|
|
||
| #[inline] | ||
| fn as_base(&self) -> &Self::Base { | ||
| &self.0 | ||
| } | ||
| } | ||
| } | ||
| }) | ||
| } else { | ||
| None | ||
| }; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PySubclass::as_base() assumes tuple structs (self.0) but validate_base_field also allows named-field structs.
For #[repr(transparent)] struct S { base: Base } this will generate invalid code. Either (a) restrict repr(transparent)+base to tuple structs, or (b) generate as_base() using the actual first field (named vs unnamed) discovered during validation.
🤖 Prompt for AI Agents
In crates/derive-impl/src/pyclass.rs around lines 485-503, the generated
PySubclass::as_base() unconditionally uses tuple struct access (&self.0) which
is invalid for named-field repr(transparent) structs; update the codegen to use
the actual first field determined during validation: have the validator record
whether the single field is named or unnamed and its identifier or index, then
emit as_base() using &self.<field_ident> for named fields and &self.0 for
unnamed tuple fields (alternatively, you may tighten validation to only allow
tuple structs and emit a clear compile error for named fields).
| cert::validate_cert_key_match(&certs, &key).map_err(|e| { | ||
| vm.new_exception_msg( | ||
| PySSLError::class(&vm.ctx).to_owned(), | ||
| if e.contains("key values mismatch") { | ||
| "[SSL: KEY_VALUES_MISMATCH] key values mismatch".to_owned() | ||
| } else { | ||
| e | ||
| }, | ||
| ) | ||
| let msg = if e.contains("key values mismatch") { | ||
| "[SSL: KEY_VALUES_MISMATCH] key values mismatch".to_owned() | ||
| } else { | ||
| e | ||
| }; | ||
| vm.new_os_subtype_error(PySSLError::class(&vm.ctx).to_owned(), Some(0), msg) | ||
| .upcast() | ||
| })?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
CPython ssl module SSLError key values mismatch errno
💡 Result:
Short answer: that SSLError comes from OpenSSL’s X509_check_private_key failing — the private key doesn’t match the certificate (or the certificate chain is ordered/packed incorrectly). Python’s ssl module surfaces that OpenSSL error as ssl.SSLError (often shown as "key values mismatch"); the underlying OpenSSL error queue can make the reported errno/message come from an earlier call, so the numeric errno/text may look mismatched. [1][2][3][4]
How to confirm and fix (quick):
- Verify the certificate and private key match (compare moduli or public keys):
- openssl x509 -noout -modulus -in cert.crt | openssl sha256
- openssl rsa -noout -modulus -in privkey.key | openssl sha256
(or compare public keys with openssl pkey -pubout / openssl x509 -noout -pubkey | openssl sha256). [1][2]
- If using a chain file, ensure the leaf certificate appears first and intermediates follow; wrong order can trigger checks to fail. [2][3]
- If you still see confusing errno/text, it can be caused by stale errors left on OpenSSL’s error queue by earlier API calls; clearing/consuming the queue (or fixing the code path that left the error) is required — this is a known cause in other language bindings (see bug reports). [4]
If you want, paste the exact SSLError traceback and the openssl command outputs and I’ll help interpret them.
🌐 Web query:
SSL_ERROR_* constants CPython ssl.SSLError errno
💡 Result:
Short answer — which SSL_ERROR_* constants exist in CPython's ssl module, and how they relate to ssl.SSLError.errno:
-
Common SSL_ERROR_* constants exposed by the ssl module (enum ssl.SSLErrorNumber) include:
- SSL_ERROR_ZERO_RETURN
- SSL_ERROR_WANT_READ
- SSL_ERROR_WANT_WRITE
- SSL_ERROR_WANT_X509_LOOKUP
- SSL_ERROR_SYSCALL
- SSL_ERROR_SSL
- SSL_ERROR_WANT_CONNECT
- SSL_ERROR_EOF
- SSL_ERROR_INVALID_ERROR_CODE. [1][2]
-
How they appear in exceptions:
- ssl.SSLError (and SSLError.args) can contain one of those SSL_ERROR_* codes (available as the ssl.SSLErrorNumber IntEnum since Python 3.6). [1]
- The SSLError.errno attribute is not always an OS errno; it may be one of the SSL_ERROR_* codes or None for many SSL errors (some SSL errors do not correspond to a single OS error). See Python issue discussion that explains errno can be None for non-OS errors. [3]
References
- Python ssl module docs (constants & SSLErrorNumber). [1]
- Extracted constants listing from stdlib/third-party mirror. [2]
- Python issue explaining SSLError.errno semantics (errno may be None). [3]
If you want, I can:
- show these constants programmatically from your installed Python version (via a small snippet you can run), or
- map typical cases (e.g. SSL_ERROR_SYSCALL) to when underlying OS errno is used.
🏁 Script executed:
# Search the codebase to understand how SSLError errno is typically set in this file
rg -n "new_os_subtype_error" crates/stdlib/src/ssl.rs -A 2 -B 2Repository: RustPython/RustPython
Length of output: 5989
🏁 Script executed:
# Check how errno values are used elsewhere in ssl.rs for context
rg -n "Some\(" crates/stdlib/src/ssl.rs | grep -E "(errno|error|SSL)" | head -20Repository: RustPython/RustPython
Length of output: 565
🏁 Script executed:
# Search for SSL_ERROR_SSL or other SSL_ERROR constants defined in the file
rg -n "SSL_ERROR" crates/stdlib/src/ssl.rs | head -30Repository: RustPython/RustPython
Length of output: 595
🏁 Script executed:
# Check what constants are defined for SSL errors
rg -n "const.*SSL_ERROR" crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 508
🏁 Script executed:
# Look at cert validation error handling pattern in context
sed -n '1300,1320p' crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 972
🏁 Script executed:
# Check function context for line 1278
sed -n '1250,1285p' crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 1556
🏁 Script executed:
# Check function context for line 1310
sed -n '1290,1325p' crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 1709
Change Some(0) to None for consistent SSLError errno semantics.
Line 1278 uses Some(0) (SSL_ERROR_NONE) for the key mismatch error, but identical key validation errors on lines 1303 and 1314 use None. Since SSL_ERROR_NONE should never be used as an errno for an actual error condition, and CPython's ssl module raises SSLError with errno as None for generic SSL errors, this should be None for consistency with the rest of the codebase.
vm.new_os_subtype_error(PySSLError::class(&vm.ctx).to_owned(), None, msg)
🤖 Prompt for AI Agents
In crates/stdlib/src/ssl.rs around lines 1272 to 1280, the SSLError construction
uses Some(0) for the errno which is inconsistent with the other key validation
errors at lines ~1303 and ~1314 and wrongly represents an actual error with
SSL_ERROR_NONE; change the errno argument from Some(0) to None when calling
vm.new_os_subtype_error so the SSLError is created with errno None, matching
CPython semantics and the surrounding code.
| // Create FileNotFoundError with errno=ENOENT (2) | ||
| let exc = vm.new_os_subtype_error( | ||
| vm.ctx.exceptions.file_not_found_error.to_owned(), | ||
| vec![ | ||
| vm.ctx.new_int(2).into(), // errno = ENOENT (2) | ||
| vm.ctx.new_str("No such file or directory").into(), | ||
| vm.ctx.new_str(path_str.clone()).into(), // filename | ||
| ], | ||
| Some(2), // errno = ENOENT (2) | ||
| "No such file or directory", | ||
| ); | ||
| return Err(exc); | ||
| // Set filename attribute | ||
| let _ = exc | ||
| .as_object() | ||
| .set_attr("filename", vm.ctx.new_str(path_str.clone()), vm); | ||
| return Err(exc.upcast()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
cd /tmp && find . -name "ssl.rs" -type f 2>/dev/null | head -5Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# First, find the ssl.rs file
find . -path "*/stdlib/src/ssl.rs" -type fRepository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Search for ssl.rs in the repository
fd -t f "ssl.rs" | head -10Repository: RustPython/RustPython
Length of output: 118
🏁 Script executed:
# Read the ssl.rs file around lines 1870-1881 with context
sed -n '1850,1900p' ./crates/stdlib/src/ssl.rs | cat -nRepository: RustPython/RustPython
Length of output: 2713
🏁 Script executed:
# Get more context - read from line 1850 to 1900 to see the function signature
head -1900 ./crates/stdlib/src/ssl.rs | tail -60 | cat -nRepository: RustPython/RustPython
Length of output: 3016
🏁 Script executed:
# Let's check the file size first
wc -l ./crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 96
🏁 Script executed:
# Search for how other modules use new_os_subtype_error and errno
rg -A5 "new_os_subtype_error" --type rs | head -80Repository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Search for FileNotFoundError usage patterns
rg -B3 -A3 "file_not_found_error" --type rs | head -100Repository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Check how CPython's ssl module handles file not found errors
web_searchRepository: RustPython/RustPython
Length of output: 115
🏁 Script executed:
# Search for how other modules use new_os_subtype_error
rg "new_os_subtype_error" -A5 | head -100Repository: RustPython/RustPython
Length of output: 5611
🏁 Script executed:
# Search for FileNotFoundError patterns in the codebase
rg "file_not_found_error" -B3 -A3 | head -100Repository: RustPython/RustPython
Length of output: 7049
🏁 Script executed:
# Check if there's any errno constant definition or usage patterns
rg "errno.*2\|ENOENT" --max-count 20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check how std::fs errors are handled elsewhere
rg "std::fs::" | grep -i "error\|map_err" | head -20Repository: RustPython/RustPython
Length of output: 483
🏁 Script executed:
# Check for errno/ENOENT constants
rg "ENOENT|errors::" crates/vm/src/exceptions.rs | head -20Repository: RustPython/RustPython
Length of output: 1148
🏁 Script executed:
# Look at the full socket.rs pattern for errno extraction
rg -B5 -A5 "error_num\(\)" crates/stdlib/src/socket.rs | head -40Repository: RustPython/RustPython
Length of output: 966
🏁 Script executed:
# Check the error mapping in exceptions.rs more carefully
rg -B2 -A2 "ENOENT" crates/vm/src/exceptions.rsRepository: RustPython/RustPython
Length of output: 364
🌐 Web query:
CPython ssl.load_dh_params FileNotFoundError errno Windows POSIX behavior
💡 Result:
Short answer
- ssl.SSLContext.load_dh_params(dhfile) opens the file with the C runtime and calls OpenSSL's PEM_read_DHparams; if fopen/_Py_fopen_obj fails Python exposes the OS error (FileNotFoundError / OSError) with errno. [1][2]
- On POSIX this typically behaves as you expect (fopen fails → errno = ENOENT → FileNotFoundError). On Windows the same underlying fopen can also fail for other reasons (path encoding/mbcs issues, wide/ANSI conversion, or Windows-specific fopen behavior), so you can also get FileNotFoundError or other OSError variants even when a path that looks valid to Python strings is used. This is a known class of platform differences for ssl file-loading APIs. [2][3]
Why this happens (brief technical outline)
- CPython's _ssl implementation calls _Py_fopen_obj(filepath, "rb") and then PEM_read_DHparams on the FILE*. If _Py_fopen_obj returns NULL, CPython inspects errno and raises an OSError (FileNotFoundError if errno==ENOENT). If PEM_read_DHparams returns NULL with errno==0, a different SSL error is raised. (Code and reproduction notes shown in community / source snippets). [2]
- Differences in how Windows converts Python filename objects to the C fopen call (MBCS/ANSI vs wide APIs used by Python internals, and what OpenSSL or the C runtime expects) produce platform-dependent failures; similar issues have been reported and fixed historically for other ssl file-loading functions (e.g. load_verify_locations). [3][4]
What to do
- Verify the file exists and is readable from Python (os.path.exists, open(...,'rb')). If that succeeds but load_dh_params still fails, the problem is likely an encoding/ABI/path-form issue on Windows (non-ASCII characters, reserved names, long paths). Use a plain ASCII path or a path exposed via a file descriptor opened in Python and re-create the file in a known-good location. [1][2]
- If you need a robust workaround on Windows for non-encodable paths, open the file yourself in Python (open(path,'rb')) to confirm accessibility; report/attach a minimal reproducer to the CPython issue tracker if fopen succeeds but load_dh_params still fails. See linked reports about related Windows path/encoding behavior in the ssl module. [3][4]
Sources
- CPython ssl docs (SSLContext.load_dh_params). [1]
- Community excerpt showing the _ssl.c implementation calling _Py_fopen_obj and handling errno / PEM_read_DHparams. [2]
- Python bug tracker / issue about Windows path/encoding differences for ssl.load_verify_locations (same class of problem). [3]
- Python ssl module documentation note about platform-dependent behavior (OpenSSL/OS differences). [4]
References
[1] Python ssl docs — SSLContext.load_dh_params. (docs.python.org)
[2] Community discussion / snippet showing _ssl__SSLContext_load_dh_params implementation and errno handling. (community.letsencrypt.org)
[3] Python issue: SSLContext.load_verify_locations Windows encoding/path behavior. (bugs.python.org)
[4] Python ssl module notes on platform/OpenSSL differences. (docs.python.org)
Citations:
- 1: https://docs.python.org/library/ssl.html?utm_source=openai
- 2: https://community.letsencrypt.org/t/renew-failure-unknown-error-ssl-c-3057/153930?utm_source=openai
- 3: https://bugs.python.org/issue27354?utm_source=openai
- 4: https://docs.python.org/library/ssl.html?utm_source=openai
🏁 Script executed:
# Check the stat module mentioned in learnings for platform-specific handling pattern
rg -B5 -A5 "SF_SUPPORTED\|SF_SYNTHETIC" crates/stdlib/src/stat.rs | head -50Repository: RustPython/RustPython
Length of output: 130
🏁 Script executed:
# Check if there's a utility function to convert std::io::Error to proper errno
rg -B3 -A3 "e.kind\(\)" crates/stdlib/src/ | head -60Repository: RustPython/RustPython
Length of output: 3925
🏁 Script executed:
# Check how into_pyexception() works
rg -B3 -A5 "into_pyexception" crates/stdlib/src/ssl.rs | head -60Repository: RustPython/RustPython
Length of output: 1890
🏁 Script executed:
# Check the trait definition or implementation
rg "into_pyexception" crates/ | grep -E "(fn|impl|trait)" | head -20Repository: RustPython/RustPython
Length of output: 1166
🏁 Script executed:
# Look at the specific load_dh_params code more carefully
sed -n '1840,1895p' ./crates/stdlib/src/ssl.rs | cat -nRepository: RustPython/RustPython
Length of output: 2880
🏁 Script executed:
# Check the into_pyexception implementation for std::io::Error
rg -B5 -A15 "impl.*into_pyexception.*std::io::Error|fn into_pyexception.*io::Error" crates/Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check os.rs implementation
sed -n '1,100p' crates/vm/src/stdlib/os.rs | grep -A20 "into_pyexception"Repository: RustPython/RustPython
Length of output: 982
🏁 Script executed:
# Find the exact implementation in os.rs
rg -B2 -A10 "fn into_pyexception" crates/vm/src/stdlib/os.rsRepository: RustPython/RustPython
Length of output: 668
🏁 Script executed:
# Find the io::Error into_pyexception implementation
rg -B5 -A15 "impl.*IntoPyException.*for.*io::Error" crates/Repository: RustPython/RustPython
Length of output: 1183
🏁 Script executed:
# Check the convert/to_pyobject.rs file for io::Error
rg -B5 -A20 "io::Error" crates/vm/src/convert/to_pyobject.rs | head -80Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Find to_pyexception implementation
rg -B5 -A20 "fn to_pyexception" crates/vm/src/stdlib/io.rs | head -100Repository: RustPython/RustPython
Length of output: 1160
🏁 Script executed:
# Check the full to_pyexception implementation to see how it maps errno to exception types
rg -B5 -A50 "fn to_pyexception" crates/vm/src/stdlib/io.rs | tail -80Repository: RustPython/RustPython
Length of output: 2135
Refactor to eliminate the existence check and use proper error handling. Instead of checking Path::exists() and hardcoding errno=2, try to read the file directly and match on std::io::ErrorKind to convert errors properly via into_pyexception(). This approach:
- Properly extracts errno from the actual OS error (works on Windows and POSIX)
- Matches the pattern used elsewhere in the same file
- Fixes the race condition where the file could be deleted between the existence check and read
🤖 Prompt for AI Agents
In crates/stdlib/src/ssl.rs around lines 1870 to 1881, remove the Path::exists()
pre-check and instead attempt to open/read the file directly, match on the
resulting std::io::Error (using error.kind()) and convert it to the appropriate
Python exception via into_pyexception() (as done elsewhere in this file), then
set the "filename" attribute on that created exception and return
Err(exc.upcast()); this preserves the actual OS errno/info (works on
Windows/POSIX), avoids the TOCTOU race, and follows the existing
error-conversion pattern.
| // OS Errors: | ||
| #[pyexception] | ||
| impl PyOSError { | ||
| #[cfg(not(target_arch = "wasm32"))] | ||
| fn optional_new(args: Vec<PyObjectRef>, vm: &VirtualMachine) -> Option<PyBaseExceptionRef> { | ||
| let len = args.len(); | ||
| if (2..=5).contains(&len) { | ||
| let errno = &args[0]; | ||
| errno | ||
| .downcast_ref::<PyInt>() | ||
| .and_then(|errno| errno.try_to_primitive::<i32>(vm).ok()) | ||
| .and_then(|errno| super::errno_to_exc_type(errno, vm)) | ||
| .and_then(|typ| vm.invoke_exception(typ.to_owned(), args.to_vec()).ok()) | ||
| impl Constructor for PyOSError { | ||
| type Args = FuncArgs; | ||
|
|
||
| fn py_new(_cls: &Py<PyType>, args: FuncArgs, vm: &VirtualMachine) -> PyResult<Self> { | ||
| let len = args.args.len(); | ||
| // CPython only sets errno/strerror when args len is 2-5 | ||
| let (errno, strerror) = if (2..=5).contains(&len) { | ||
| (Some(args.args[0].clone()), Some(args.args[1].clone())) | ||
| } else { | ||
| (None, None) | ||
| }; | ||
| let filename = if (3..=5).contains(&len) { | ||
| Some(args.args[2].clone()) | ||
| } else { | ||
| None | ||
| } | ||
| }; | ||
| let filename2 = if len == 5 { | ||
| args.args.get(4).cloned() | ||
| } else { | ||
| None | ||
| }; | ||
| // Truncate args for base exception when 3-5 args | ||
| let base_args = if (3..=5).contains(&len) { | ||
| args.args[..2].to_vec() | ||
| } else { | ||
| args.args.to_vec() | ||
| }; | ||
| let base_exception = PyBaseException::new(base_args, vm); | ||
| Ok(Self { | ||
| base: PyException(base_exception), | ||
| errno: errno.into(), | ||
| strerror: strerror.into(), | ||
| filename: filename.into(), | ||
| filename2: filename2.into(), | ||
| #[cfg(windows)] | ||
| winerror: None.into(), | ||
| }) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: filename2 arg indexing is wrong on non-Windows (4-arg OSError).
On non-Windows, CPython uses OSError(errno, strerror, filename, filename2) (4 args). Current logic only sets filename2 when len == 5 and uses index 4, so filename2 will be silently dropped for the common 4-arg case.
- let filename2 = if len == 5 {
- args.args.get(4).cloned()
- } else {
- None
- };
+ #[cfg(windows)]
+ let filename2 = if len == 5 { args.args.get(4).cloned() } else { None };
+ #[cfg(not(windows))]
+ let filename2 = if len >= 4 { args.args.get(3).cloned() } else { None };🤖 Prompt for AI Agents
In crates/vm/src/exceptions.rs around lines 1484 to 1522, filename2 is only set
when len == 5 which drops the 4-argument OSError form; change the filename2
assignment to also handle the common 4-arg case by checking for len == 4 (use
args.get(3).cloned()) and keep the len == 5 branch (args.get(4).cloned()), so
filename2 is populated for both 4- and 5-argument OS errors.
| let size = get_size(&tp_str); | ||
| let cdata = CDataObject::from_bytes(vec![0u8; size], None); | ||
| Ok(PyCSimple { | ||
| _base: PyCData::new(cdata.clone()), | ||
| _type_: tp_str, | ||
| value: AtomicCell::new(vm.ctx.none()), | ||
| cdata: rustpython_common::lock::PyRwLock::new(CDataObject::from_bytes( | ||
| vec![0u8; size], | ||
| None, | ||
| )), | ||
| cdata: rustpython_common::lock::PyRwLock::new(cdata), | ||
| }) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cdata.clone() deep-copies the buffer; base/derived will diverge.
If the goal is “one underlying C data backing store”, current wiring creates two independent CDataObjects (because Vec<u8> clones). Anything operating via the base (PyCData) can see different bytes/objects than code operating via the derived field.
Actionable direction: make the backing storage shared (e.g., Arc<PyRwLock<CDataObject>> inside PyCData, and have derived types hold only _base: PyCData and access self._base.cdata), or remove the duplicated cdata field from derived types entirely and standardize on the base field.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes.rs around lines 371–379, the code clones the
CDataObject into both PyCData (base) and the derived PyCSimple, creating two
independent buffers; instead make the backing storage shared so base and derived
view the same bytes. Fix by removing the duplicated cdata field from derived
types (or change PyCData to hold an Arc<PyRwLock<CDataObject>> and have derived
types keep only _base: PyCData and access the shared cdata via self._base), and
stop calling .clone() to create separate Vecs; initialize a single CDataObject
wrapped in the shared container and use that same handle for both base and
derived.
| pub fn new_exception(&self, exc_type: PyTypeRef, args: Vec<PyObjectRef>) -> PyBaseExceptionRef { | ||
| debug_assert_eq!( | ||
| exc_type.slots.basicsize, | ||
| std::mem::size_of::<PyBaseException>() | ||
| ); | ||
| // TODO: add repr of args into logging? | ||
|
|
||
| PyRef::new_ref( | ||
| // TODO: this constructor might be invalid, because multiple | ||
| // exception (even builtin ones) are using custom constructors, | ||
| // see `OSError` as an example: | ||
| PyBaseException::new(args, self), | ||
| exc_type, | ||
| Some(self.ctx.new_dict()), | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update doc/contract for new_exception: it now explicitly assumes PyBaseException payload size.
The new debug_assert_eq!(exc_type.slots.basicsize, size_of::<PyBaseException>()) is a good guardrail; consider tightening the doc comment to say “builtin types whose payload is PyBaseException (not OSError-family payloads)” to prevent accidental debug-only failures.
🤖 Prompt for AI Agents
In crates/vm/src/vm/vm_new.rs around lines 95 to 107, the debug_assert checks
that exc_type.slots.basicsize equals size_of::<PyBaseException>(), but the
public API/docs don't state this payload assumption; update the function
doc/contract to explicitly require callers to pass builtin exception types whose
payload is PyBaseException (i.e., not OSError-family or other sized payloads) so
the debug-only assert cannot be accidentally violated, and mention that callers
must only use types with basicsize == size_of::<PyBaseException>() (or else
construct the exception differently).
ShaharNaveh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
9017be0 to
99a3b04
Compare
I am considering new subclass payload layout to provide
I will left open questions on the code.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.