-
Notifications
You must be signed in to change notification settings - Fork 1.4k
ctypes buffer #6311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ctypes buffer #6311
Conversation
|
Caution Review failedThe pull request is closed. WalkthroughAdds StgInfo/CDataObject-backed storage across ctypes types, migrates instance buffers to CDataObject with AsBuffer support and address/buffer constructors, implements bytes↔Python serialization, alignment/size/cast helpers, updates pointers/arrays/structures/unions to use stg_info/cdata, and implements libffi-backed C-callable thunks. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro 📒 Files selected for processing (10)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
bff3045 to
e71e7a8
Compare
|
Code has been automatically formatted The code in this PR has been formatted using git pull origin ctypes-buffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/ctypes/base.rs (1)
45-53: Duplicate condition inset_primitivefor type "c".Lines 46-49 and 50-53 have identical conditions checking
downcast_exact::<PyBytes>withv.len() == 1. This appears to be a copy-paste error.if value .clone() .downcast_exact::<PyBytes>(vm) .is_ok_and(|v| v.len() == 1) - || value - .clone() - .downcast_exact::<PyBytes>(vm) - .is_ok_and(|v| v.len() == 1) || value .clone() .downcast_exact::<PyInt>(vm)
🧹 Nitpick comments (7)
crates/vm/src/stdlib/ctypes/field.rs (2)
106-125: Consider extracting duplicated buffer access logic.The structure and union branches have identical logic for reading from the cdata buffer. This could be refactored to reduce duplication, perhaps by extracting a helper that takes the buffer reference.
- // Instance attribute access - read value from the structure/union's buffer - if let Some(structure) = obj.downcast_ref::<PyCStructure>() { - let cdata = structure.cdata.read(); - let offset = zelf.byte_offset; - let size = zelf.byte_size; - - if offset + size <= cdata.buffer.len() { - let bytes = &cdata.buffer[offset..offset + size]; - return PyCField::bytes_to_value(bytes, size, vm); - } - } else if let Some(union) = obj.downcast_ref::<PyCUnion>() { - let cdata = union.cdata.read(); - let offset = zelf.byte_offset; - let size = zelf.byte_size; - - if offset + size <= cdata.buffer.len() { - let bytes = &cdata.buffer[offset..offset + size]; - return PyCField::bytes_to_value(bytes, size, vm); - } - } + // Instance attribute access - read value from the structure/union's buffer + let cdata_guard = if let Some(structure) = obj.downcast_ref::<PyCStructure>() { + Some(structure.cdata.read()) + } else if let Some(union) = obj.downcast_ref::<PyCUnion>() { + Some(union.cdata.read()) + } else { + None + }; + + if let Some(cdata) = cdata_guard { + let offset = zelf.byte_offset; + let size = zelf.byte_size; + if offset + size <= cdata.buffer.len() { + let bytes = &cdata.buffer[offset..offset + size]; + return PyCField::bytes_to_value(bytes, size, vm); + } + }
200-240: Duplicated write logic could be consolidated.Similar to
descr_get, the write logic is duplicated between structure and union branches. The differentiated delete error messages are appropriate, but the assign paths are identical.crates/vm/src/stdlib/ctypes/structure.rs (1)
318-324: Add safety documentation for unsafe block.The unsafe block reads from a raw pointer without any validation that the memory is valid or properly sized. While this is inherent to FFI, a safety comment documenting the requirements would be helpful.
// Read data from address if address == 0 || size == 0 { return Err(vm.new_value_error("NULL pointer access".to_owned())); } + // SAFETY: Caller must ensure `address` points to valid memory of at least + // `size` bytes that remains valid for the duration of this read. let data = unsafe { let ptr = address as *const u8; std::slice::from_raw_parts(ptr, size).to_vec() };crates/vm/src/stdlib/ctypes/union.rs (1)
87-111: Duplicatedget_field_sizeimplementation.This method is identical to
PyCStructType::get_field_sizeinstructure.rs. Consider extracting to a shared helper function to avoid code duplication.crates/vm/src/stdlib/ctypes/base.rs (2)
563-591: Verify the safety documentation forfrom_address.The unsafe block reads from an arbitrary memory address. While this matches CPython's behavior, consider adding a
# Safetydoc comment explaining the preconditions (valid readable address, correct size) for maintainability.
641-641: Address the TODO: Missing reference tracking could cause use-after-free.The
from_buffermethod creates a ctypes instance backed by the source buffer but doesn't store a reference to keep the buffer alive. If the source is garbage-collected while the ctypes instance is still in use, it could lead to undefined behavior.Would you like me to open an issue to track implementing proper reference tracking via
_objects?crates/vm/src/stdlib/ctypes/array.rs (1)
486-497: Consider validating_type_attribute exists before downcast.The
_type_attribute retrieval could fail silently if the class doesn't have this attribute, but the subsequent downcast provides an error. The error message could be more specific.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/ctypes/wintypes.pyis excluded by!Lib/**
📒 Files selected for processing (7)
crates/vm/src/stdlib/ctypes.rs(3 hunks)crates/vm/src/stdlib/ctypes/array.rs(12 hunks)crates/vm/src/stdlib/ctypes/base.rs(9 hunks)crates/vm/src/stdlib/ctypes/field.rs(3 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(3 hunks)crates/vm/src/stdlib/ctypes/structure.rs(5 hunks)crates/vm/src/stdlib/ctypes/union.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/array.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/array.rs
🧠 Learnings (3)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (4)
crates/vm/src/stdlib/ctypes.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (3)
size(178-180)new(161-167)new(205-211)
crates/vm/src/stdlib/ctypes/pointer.rs (4)
crates/vm/src/stdlib/ctypes/base.rs (8)
new(161-167)new(205-211)from_address(564-591)from_buffer(594-643)size(178-180)buffer(732-732)buffer(742-742)from_buffer_copy(646-686)crates/vm/src/stdlib/ctypes/array.rs (5)
from_address(404-440)from_buffer(443-506)buffer(579-579)buffer(589-589)from_buffer_copy(509-564)crates/vm/src/stdlib/ctypes/structure.rs (5)
from_address(310-333)from_buffer(336-385)buffer(436-436)buffer(446-446)from_buffer_copy(388-429)crates/vm/src/stdlib/ctypes/union.rs (5)
from_address(136-153)from_buffer(156-199)buffer(243-243)buffer(250-250)from_buffer_copy(202-237)
crates/vm/src/stdlib/ctypes/structure.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (8)
new(161-167)new(205-211)std(544-544)std(547-547)std(550-550)size(178-180)buffer(732-732)buffer(742-742)
crates/vm/src/stdlib/ctypes/union.rs (2)
crates/vm/src/stdlib/ctypes.rs (2)
get_size(140-159)size_of(361-374)crates/vm/src/stdlib/ctypes/pointer.rs (6)
std(29-29)std(115-115)std(156-156)from_address(88-98)from_buffer(101-142)from_buffer_copy(145-178)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run tests under miri
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
🔇 Additional comments (21)
crates/vm/src/stdlib/ctypes/field.rs (1)
8-8: LGTM!Import for
PyCUnioncorrectly added to support the new union field access.crates/vm/src/stdlib/ctypes.rs (3)
48-48: LGTM!Import correctly updated to include
CDataObjectfor the new buffer-backed storage model.
161-175: Dead code annotation - verify usage or remove.The
#[allow(dead_code)]attribute suggests this function isn't currently used. If it's intended for future use, consider adding a TODO comment explaining the planned usage. Otherwise, remove it to keep the codebase clean.
345-350: LGTM!The
PyCSimpleinitialization correctly creates aCDataObjectwith the appropriate size for the simple type, enabling buffer-backed storage.crates/vm/src/stdlib/ctypes/pointer.rs (5)
9-11: LGTM!Imports correctly added for the new buffer-backed storage model.
30-39: LGTM!The array type creation correctly uses the new storage model with object-wrapped type references and zero-sized
CDataObjectfor type-level metadata.
87-98: LGTM!The
from_addressimplementation correctly validates against NULL pointer access and stores the address value.
100-142: LGTM!The
from_bufferimplementation correctly validates offset, buffer writability, and bounds before reading the pointer value.
144-178: LGTM!The
from_buffer_copyimplementation correctly handles offset validation and bounds checking. Unlikefrom_buffer, it appropriately doesn't require a writable source since it's copying data.crates/vm/src/stdlib/ctypes/structure.rs (4)
1-9: LGTM!Imports correctly updated for the new buffer-backed storage model with
CDataObject, buffer protocol types, andAsBuffertrait.
335-385: LGTM!The
from_bufferimplementation correctly validates offset, buffer writability, and performs proper bounds checking before copying data.
387-429: LGTM!The
from_buffer_copyimplementation correctly handles offset validation and bounds checking for copying data from a bytes-like source.
432-466: LGTM!The buffer methods and
AsBufferimplementation correctly expose the underlyingcdatabuffer with proper locking semantics. Thereadonly=falsesetting is appropriate for ctypes which allows mutation.crates/vm/src/stdlib/ctypes/union.rs (4)
1-11: LGTM!Imports correctly updated for the new buffer-backed union implementation with
CDataObject, buffer protocol types, andAsBuffertrait.
155-199: LGTM!The
from_bufferimplementation correctly validates offset, buffer writability, performs bounds checking, and copies the data.
201-238: LGTM!The
from_buffer_copyimplementation correctly handles offset validation, bounds checking, and data copying.
240-268: LGTM!The buffer methods and
AsBufferimplementation correctly expose the union'scdatabuffer with proper locking semantics.crates/vm/src/stdlib/ctypes/base.rs (2)
148-181: LGTM!The
CDataObjectstruct provides a clean abstraction for ctypes memory-backed storage with appropriate constructor helpers.
728-762: LGTM!The
AsBufferimplementation correctly maps thePyRwLockguards to buffer access methods, enabling proper buffer protocol support forPyCSimple.crates/vm/src/stdlib/ctypes/array.rs (2)
403-440: Inconsistent error handling:from_addressreturns error for NULL butfrom_buffer/from_buffer_copywould proceed.
from_addressproperly validatesaddress == 0 || size == 0and returns an error. However, this is the only classmethod with explicit NULL protection. The validation is good here.
575-609: LGTM!The
AsBufferimplementation forPyCArrayfollows the same correct pattern asPyCSimple, properly mapping lock guards to buffer access.
cfa2fea to
9af05c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (3)
crates/vm/src/stdlib/ctypes/array.rs (1)
36-67: CDataObject integration for arrays and theto_argfix look correct
- Type and instance ctors now consistently wrap the underlying Vec in
CDataObject, andPyCArrayType/PyCArraystore the element type asPyObjectRef, which plays well with nested array types.to_argnow takes&cdata.bufferinstead of&cdata.buffer.clone(), fixing the previous dangling-pointer issue by ensuring the libffi Arg points into the live backing buffer.- ARRAY_BUFFER_METHODS and
AsBuffer for PyCArraymirror the patterns used for simple types and structures, providing a writable view overcdata.buffer.These changes align the array implementation with the new shared CData model and make the FFI path safer.
Also applies to: 211-219, 570-575, 578-612
crates/vm/src/stdlib/ctypes/base.rs (1)
14-15: Fix platform‑dependent size mismatches forc_wchar/c_long/c_ulonginvalue_to_bytesThe new
value_to_bytesdrives the backingCDataObjectsize for simple types, so size mistakes here now directly affect.cdata.bufferand the buffer protocol:
"u"(c_wchar) always emits 4 bytes, while_ctypes::get_size("u")and_ctypes::bytes_to_pyobjectuse the nativeWideCharwidth. On Windows (2‑bytewchar_t) this will makec_wcharinstances expose a 4‑byte buffer even thoughsizeof(c_wchar)is 2."l"/"L"still hardcode 4‑byte layouts, the same issue previously raised for 64‑bit platforms wherec_long/c_ulongare 8 bytes. With the new buffer backing this now also mis-sizesPyCSimple.cdataand AsBuffer views for these types.Consider making these arms respect the native C sizes, e.g.:
-use std::ffi::{c_uint, c_ulong, c_ulonglong, c_ushort}; +use std::ffi::{c_long, c_uint, c_ulong, c_ulonglong, c_ushort}; +use widestring::WideChar; @@ - "u" => { - // c_wchar - 4 bytes (wchar_t on most platforms) - if let Ok(s) = value.str(vm) - && let Some(c) = s.as_str().chars().next() - { - return (c as u32).to_ne_bytes().to_vec(); - } - vec![0; 4] - } + "u" => { + // c_wchar - native WideChar width + const SIZE: usize = std::mem::size_of::<WideChar>(); + if let Ok(s) = value.str(vm) + && let Some(c) = s.as_str().chars().next() + { + let code = c as u32; + if SIZE == 2 { + let w = code as u16; + return w.to_ne_bytes().to_vec(); + } else { + return code.to_ne_bytes().to_vec(); + } + } + vec![0; SIZE] + } @@ - "l" => { - // c_long (platform dependent, using 4 bytes) - if let Ok(int_val) = value.try_int(vm) - && let Some(v) = int_val.as_bigint().to_i32() - { - return v.to_ne_bytes().to_vec(); - } - vec![0; 4] - } - "L" => { - // c_ulong (platform dependent, using 4 bytes) - if let Ok(int_val) = value.try_int(vm) - && let Some(v) = int_val.as_bigint().to_u32() - { - return v.to_ne_bytes().to_vec(); - } - vec![0; 4] - } + "l" => { + // c_long (platform dependent) + let size = std::mem::size_of::<c_long>(); + if let Ok(int_val) = value.try_int(vm) { + let i = int_val.as_bigint(); + if size == 4 { + if let Some(v) = i.to_i32() { + return v.to_ne_bytes().to_vec(); + } + } else if size == 8 { + if let Some(v) = i.to_i64() { + return v.to_ne_bytes().to_vec(); + } + } + } + vec![0; size] + } + "L" => { + // c_ulong (platform dependent) + let size = std::mem::size_of::<c_ulong>(); + if let Ok(int_val) = value.try_int(vm) { + let i = int_val.as_bigint(); + if size == 4 { + if let Some(v) = i.to_u32() { + return v.to_ne_bytes().to_vec(); + } + } else if size == 8 { + if let Some(v) = i.to_u64() { + return v.to_ne_bytes().to_vec(); + } + } + } + vec![0; size] + }This also addresses the earlier review about 4‑byte
c_long/c_ulongwhile keeping behavior correct on both 32‑ and 64‑bit targets.Also applies to: 277-420
crates/vm/src/stdlib/ctypes/union.rs (1)
133-153:Union.from_addressstill doesn’t read from the given address (and still usessize_of(Either::A(cls)))This method is still constructing a zero-initialized buffer with
CDataObject::new(size)and never reading fromaddress, so the resulting union doesn’t reflect the memory at that location. It also usessize_of(Either::A(cls)), which is unlikely to hit thesize_of_instancespath on the type object.Align it with the
Structure.from_addresspattern instead:#[pyclassmethod] fn from_address(cls: PyTypeRef, address: isize, vm: &VirtualMachine) -> PyResult { use crate::function::Either; use crate::stdlib::ctypes::_ctypes::size_of; - // Get size from cls - let size = size_of(Either::A(cls.clone()), vm)?; - - // Create instance with data from address + // Get size from cls + let size = size_of(Either::B(cls.as_object().clone()), vm)?; + + // Create instance with data from address if address == 0 || size == 0 { return Err(vm.new_value_error("NULL pointer access".to_owned())); } - Ok(PyCUnion { - cdata: PyRwLock::new(CDataObject::new(size)), - } + let data = unsafe { + let ptr = address as *const u8; + std::slice::from_raw_parts(ptr, size).to_vec() + }; + + Ok(PyCUnion { + cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), + } .into_ref_with_type(vm, cls)? .into()) }This addresses the original concern and makes
from_addressactually reflect the pointed-to memory.
🧹 Nitpick comments (3)
crates/vm/src/stdlib/ctypes/field.rs (1)
106-125: Union support inCFieldlooks correct; consider surfacing OOB write errorsThe new structure/union branches correctly use
cdataand guard againstoffset + size > buffer.len(), matching the new buffer model. Right now, failed bounds checks just skip the write (and reads fall back to0), which can make descriptor/size bugs hard to notice. If you ever see mis-sized fields at runtime, it may be worth turning these silent skips into explicit errors or at least debug assertions.Also applies to: 200-238
crates/vm/src/stdlib/ctypes/pointer.rs (1)
30-37: Pointer constructors are safe and consistent with current.contentssemanticsThe new
from_address/from_buffer/from_buffer_copyhelpers perform the right null/offset/bounds checks and only stash the address as a Python int incontents, which is safe given that.contentsis still marked “TODO: not correct”. Once you implement real pointee semantics, you’ll likely want these paths to also consult_type_and keep a reference to the source buffer forfrom_buffer/from_buffer_copy, but for now the implementation is internally consistent.Also applies to: 87-178
crates/vm/src/stdlib/ctypes/array.rs (1)
279-295: Avoid cloning the whole buffer ingetitem_by_indexfor large arrays
getitem_by_indexcurrently does:let buffer = zelf.cdata.read().buffer.clone(); if offset + element_size <= buffer.len() { let bytes = &buffer[offset..offset + element_size]; ... }This clones the entire backing buffer on every element access. For large arrays, it would be cheaper to slice directly under the read lock:
- let buffer = zelf.cdata.read().buffer.clone(); - if offset + element_size <= buffer.len() { - let bytes = &buffer[offset..offset + element_size]; + let cdata = zelf.cdata.read(); + if offset + element_size <= cdata.buffer.len() { + let bytes = &cdata.buffer[offset..offset + element_size];The semantics stay the same, but you avoid an O(n) copy per access.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
crates/vm/src/stdlib/ctypes.rs(4 hunks)crates/vm/src/stdlib/ctypes/array.rs(12 hunks)crates/vm/src/stdlib/ctypes/base.rs(9 hunks)crates/vm/src/stdlib/ctypes/field.rs(3 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(3 hunks)crates/vm/src/stdlib/ctypes/structure.rs(5 hunks)crates/vm/src/stdlib/ctypes/union.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/union.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/union.rs
🧠 Learnings (5)
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/base.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/base.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/base.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (5)
crates/vm/src/stdlib/ctypes.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (3)
size(178-180)new(161-167)new(205-211)
crates/vm/src/stdlib/ctypes/field.rs (2)
crates/vm/src/stdlib/ctypes/base.rs (14)
size(178-180)value(46-48)value(50-52)value(54-56)value(86-86)value(96-96)value(106-106)value(114-114)value(115-115)value(126-126)value(137-137)value(138-138)value(282-282)value(471-516)crates/vm/src/stdlib/ctypes/array.rs (2)
value(367-371)value(375-375)
crates/vm/src/stdlib/ctypes/structure.rs (2)
crates/vm/src/stdlib/ctypes.rs (2)
get_size(140-159)size_of(380-393)crates/vm/src/stdlib/ctypes/base.rs (9)
new(161-167)new(205-211)std(544-544)std(547-547)std(550-550)size(178-180)from_bytes(169-175)buffer(732-732)buffer(742-742)
crates/vm/src/stdlib/ctypes/array.rs (2)
crates/vm/src/stdlib/ctypes/base.rs (9)
new(161-167)new(205-211)from_bytes(169-175)buffer(732-732)buffer(742-742)std(544-544)std(547-547)std(550-550)size(178-180)crates/vm/src/stdlib/ctypes.rs (1)
size_of(380-393)
crates/vm/src/stdlib/ctypes/union.rs (3)
crates/vm/src/stdlib/ctypes/structure.rs (9)
field(72-73)field(233-233)std(133-133)std(144-144)from_address(310-333)from_buffer(336-385)buffer(436-436)buffer(446-446)from_buffer_copy(388-429)crates/vm/src/stdlib/ctypes/base.rs (12)
std(544-544)std(547-547)std(550-550)from_address(564-591)size(178-180)new(161-167)new(205-211)from_buffer(594-643)buffer(732-732)buffer(742-742)from_bytes(169-175)from_buffer_copy(646-686)crates/vm/src/stdlib/ctypes/pointer.rs (6)
std(29-29)std(115-115)std(156-156)from_address(88-98)from_buffer(101-142)from_buffer_copy(145-178)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run tests under miri
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
🔇 Additional comments (6)
crates/vm/src/stdlib/ctypes.rs (1)
161-175: New size/byte helpers align with the buffer-backed model
get_size_from_typeandbytes_to_pyobjectcorrectly derive sizes from_type_and use platform C types (c_short,c_long,c_double,WideChar, etc.) for decoding, andnew_simple_typenow initializesPyCSimple.cdatawith an appropriately sizedCDataObject. This ties the high‑level ctypes API cleanly into the new shared buffer representation.Also applies to: 177-340, 364-369
crates/vm/src/stdlib/ctypes/base.rs (1)
148-181: CDataObject abstraction and PyCSimple buffer exposure look solidThe new
CDataObjectwrapper cleanly centralizes buffer/base/object tracking, and wiring it throughPyCData/PyCSimple.cdataplusSIMPLE_BUFFER_METHODSandAsBuffer for PyCSimplegives a consistent, writable buffer view for simple ctypes. The nestedPyMappedRwLock*Guard::mapusage correctly exposesbuffer.as_slice()/buffer.as_mut_slice()without leaking locks.Also applies to: 183-195, 256-267, 728-763
crates/vm/src/stdlib/ctypes/structure.rs (2)
187-207: cdata-backedPyCStructureand buffer protocol integration look goodSwitching
PyCStructureto a sharedCDataObjectand wiringSTRUCTURE_BUFFER_METHODS/AsBufferto exposecdata.buffergives a coherent, writable view of structure storage. The debug impl reportingcdata.size()is also a nice sanity aid when diagnosing layout issues.Also applies to: 432-466
309-385: Unable to verify through code access due to repository clone failures. Based on the provided code snippet and learnings, I cannot confirm the specific technical claims about Either enum behavior and PyBuffer structure through direct inspection.
from_address/from_buffershould query size via the type object and keep from_buffer's source alive — verification inconclusiveThe review raises concerns about:
- Using
size_of(Either::A(cls.clone()))vsEither::B(cls.as_object().clone())for accessing type'ssize_of_instanceshook- Passing
NonevsSome(buffer.obj.clone())to preserve buffer lifetimeHowever, I cannot directly verify these claims in the codebase due to infrastructure limitations. The concerns rest on:
- Whether the Either enum truly has branches A and B with the described semantics
- Whether PyBuffer has an
objfield accessible as described- Whether the distinction between Either variants actually affects how
size_ofresolves for structure typesThese are architectural decisions that require examination of the Either enum definition,
size_offunction implementation, and PyBuffer structure. Without this access, the validity of the proposed changes cannot be confirmed.crates/vm/src/stdlib/ctypes/union.rs (1)
117-123: cdata-backedPyCUnionand buffer protocol hook-in look consistentThe
PyCUnionstruct’scdatafield plusUNION_BUFFER_METHODSandAsBuffergive unions the same buffer behavior as simple types and structures. Mapping thePyRwLockguards down toCDataObject.bufferis clean and should work well with the rest of the new ctypes buffer model.Also applies to: 240-269
crates/vm/src/stdlib/ctypes/array.rs (1)
403-440: I apologize, but I'm encountering persistent infrastructure issues that prevent me from verifying this review comment:
- Repository cloning failed: Multiple attempts to clone the RustPython repository have failed
- Web search limitations: Generic searches don't return the specific implementation details needed to verify the
size_offunction behavior withEither::AvsEither::BThe review comment makes a specific technical claim about how
_ctypes::size_of()should be called with array types, but I cannot:
- Access the actual implementation of
size_of()to understand how it handlesEither::AvsEither::B- Verify whether array types have
size_of_instancesimplemented- Confirm the current code actually fails as claimed
- Validate that the proposed fix is correct
I recommend manual verification by:
- Checking the
_ctypes::size_of()implementation to understand how it processesEither::A(PyTypeRef)vsEither::B(PyObjectRef)- Verifying whether
size_of_instancesis implemented on array types- Testing whether the current code produces the claimed error or if it works correctly
- Confirming the proposed fix resolves any actual issue
9af05c3 to
274519c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (3)
crates/vm/src/stdlib/ctypes/union.rs (2)
197-214:from_addressdoesn't read data from the provided memory address.This implementation creates a zero-initialized buffer instead of reading the actual bytes at the given address. The
stg_infois used to create an emptyCDataObject, but the data ataddressis never read.+ // SAFETY: Caller must ensure address points to valid memory of at least size bytes + let data = unsafe { + let ptr = address as *const u8; + std::slice::from_raw_parts(ptr, size).to_vec() + }; + Ok(PyCUnion { - cdata: PyRwLock::new(CDataObject::from_stg_info(&stg_info)), + cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), }
250-259:from_buffershould preserve reference to source buffer.The
Nonepassed toCDataObject::from_bytesmeans no reference to the source buffer is kept. If the source buffer is garbage collected, data integrity could be compromised for shared buffer scenarios.- cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), + cdata: PyRwLock::new(CDataObject::from_bytes(data, Some(buffer.obj.clone()))),crates/vm/src/stdlib/ctypes/array.rs (1)
750-755: Critical:to_argstill creates a dangling pointer.This was flagged in a previous review.
Arg::new(&cdata.buffer)borrows a reference to data inside theRwLockReadGuard, but the guard is dropped when the function returns, making theArghold a dangling reference.The returned
Argwill point to freed memory. The caller must ensure the buffer outlives the FFI call:impl PyCArray { #[allow(unused)] - pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<libffi::middle::Arg> { - let cdata = self.cdata.read(); - Ok(libffi::middle::Arg::new(&cdata.buffer)) + pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<(Vec<u8>, libffi::middle::Arg)> { + let buffer = self.cdata.read().buffer.clone(); + // Caller must keep the Vec alive for the duration of the FFI call + let arg = libffi::middle::Arg::new(&buffer); + Ok((buffer, arg)) } }Note: This fix returns the owned buffer so the caller can keep it alive. The exact solution depends on how
to_argis used in FFI calls.
🧹 Nitpick comments (8)
crates/vm/src/stdlib/ctypes/field.rs (2)
106-129: Silent fallback on bounds check failure may hide bugs.When
offset + size > cdata.buffer.len(), the code falls through to returnvm.ctx.new_int(0)without any indication that the access was out of bounds. This could mask programming errors in field offset calculations.Consider returning an error for out-of-bounds access:
if offset + size <= cdata.buffer.len() { let bytes = &cdata.buffer[offset..offset + size]; return PyCField::bytes_to_value(bytes, size, vm); + } else { + return Err(vm.new_value_error(format!( + "field offset {} + size {} exceeds buffer length {}", + offset, size, cdata.buffer.len() + ))); }
200-240: Consider extracting common buffer write logic.The
PyCStructureandPyCUnionbranches share nearly identical logic for writing tocdata.buffer. A helper method or macro could reduce duplication:// Example helper approach fn write_to_cdata_buffer( cdata: &mut CDataObject, offset: usize, size: usize, bytes: &[u8], ) -> bool { if offset + size <= cdata.buffer.len() { cdata.buffer[offset..offset + size].copy_from_slice(bytes); true } else { false } }crates/vm/src/stdlib/ctypes/base.rs (2)
794-798: Address TODO: Store buffer reference to prevent premature GC.The TODO notes that
from_buffershould store a reference to the source buffer in_objects. While the value is copied, failing to keep the source alive could cause issues if the ctypes object is expected to share memory with the original buffer (which is the semantic offrom_buffervsfrom_buffer_copy).Consider implementing this before merge or tracking it as a follow-up issue.
Would you like me to open an issue to track implementing buffer reference storage?
959-978: Consider simplifying the double-map pattern.The nested
PyMappedRwLockReadGuard::map(PyRwLockReadGuard::map(...))pattern is verbose. A simpler approach would be:obj_bytes: |buffer| { rustpython_common::lock::PyRwLockReadGuard::map( buffer.obj_as::<PyCSimple>().cdata.read(), |x| x.buffer.as_slice(), ) .into() },The intermediate map to
&CDataObjectbefore mapping to the slice appears redundant.crates/vm/src/stdlib/ctypes/pointer.rs (2)
134-176: from_buffer reads pointer value correctly but doesn't track source reference.The implementation correctly reads a pointer-sized value from the buffer. However, unlike CPython's
from_buffer, this doesn't maintain a reference to the source buffer. For pointers this is less critical since only a usize is read, but for consistency with other ctypes, consider storing the source reference inCDataObject.objects.
249-259: Unsafe symbol access looks correct but lacks documentation.The unsafe block correctly retrieves the symbol address from the library. The dereference
*symbolgets the address stored in the symbol table. Consider adding a safety comment explaining why this is safe (library is locked, symbol lifetime is managed by libloading).let symbol_address = if let Some(lib) = &*inner_lib { unsafe { + // SAFETY: Library is held via lock, and symbol is valid for the lifetime + // of the library reference. We only read the address value. // Try to get the symbol from the library let symbol: Symbol<'_, *mut u8> = lib.get(symbol_name.as_bytes()).map_err(|e| { vm.new_attribute_error(format!("{}: symbol '{}' not found", e, name.as_str())) })?; *symbol as usize }crates/vm/src/stdlib/ctypes/structure.rs (1)
354-357: Unsafe memory read from arbitrary address - expected but dangerous.This correctly implements CPython's
from_addressbehavior, which is inherently unsafe. The caller is responsible for ensuring the address points to valid, properly-sized memory. Consider adding a safety comment documenting the contract.// Read data from address if address == 0 || size == 0 { return Err(vm.new_value_error("NULL pointer access".to_owned())); } + // SAFETY: Caller must ensure `address` points to valid, readable memory + // of at least `size` bytes. This matches CPython's from_address behavior. let data = unsafe { let ptr = address as *const u8; std::slice::from_raw_parts(ptr, size).to_vec() };crates/vm/src/stdlib/ctypes/array.rs (1)
657-747: Duplicatein_dllimplementation between PyCArrayType and PyCArray.There are two
in_dllimplementations with nearly identical logic - one onPyCArrayType(lines 123-202) and one onPyCArray(lines 657-747). Consider consolidating to reduce code duplication.The implementations are almost identical except for how they obtain element_type/length. Consider extracting the common DLL symbol resolution logic into a helper function.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.github/copilot-instructions.md(1 hunks)crates/vm/src/stdlib/ctypes.rs(7 hunks)crates/vm/src/stdlib/ctypes/array.rs(13 hunks)crates/vm/src/stdlib/ctypes/base.rs(10 hunks)crates/vm/src/stdlib/ctypes/field.rs(3 hunks)crates/vm/src/stdlib/ctypes/function.rs(5 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(3 hunks)crates/vm/src/stdlib/ctypes/structure.rs(9 hunks)crates/vm/src/stdlib/ctypes/thunk.rs(1 hunks)crates/vm/src/stdlib/ctypes/union.rs(3 hunks)crates/vm/src/stdlib/ctypes/util.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/util.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/structure.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/field.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/util.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/structure.rs
🧠 Learnings (7)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-11-25T11:05:02.900Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.900Z
Learning: When testing Python code, use RustPython (`cargo run -- script.py`) instead of the standard `python` command
Applied to files:
.github/copilot-instructions.md
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/*test*.py : CRITICAL: Do not modify test assertions, test logic, or test data
Applied to files:
.github/copilot-instructions.md
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (9)
crates/vm/src/stdlib/ctypes/union.rs (4)
crates/vm/src/stdlib/ctypes/structure.rs (7)
std(136-136)std(165-165)from_address(344-366)from_buffer(369-417)buffer(467-467)buffer(477-477)as_buffer(489-497)crates/vm/src/stdlib/ctypes/util.rs (2)
default(15-24)new(28-37)crates/vm/src/stdlib/ctypes/base.rs (14)
new(253-259)size(210-212)std(515-515)std(523-523)std(700-700)std(703-703)std(706-706)from_stg_info(173-180)from_address(720-747)from_buffer(750-799)buffer(963-963)buffer(973-973)from_bytes(183-190)as_buffer(985-993)crates/vm/src/stdlib/ctypes/function.rs (1)
f(368-368)
crates/vm/src/stdlib/ctypes/field.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (18)
size(210-212)value(54-56)value(58-60)value(62-64)value(94-94)value(104-104)value(114-114)value(122-122)value(123-123)value(134-134)value(145-145)value(146-146)value(296-296)value(316-316)value(327-327)value(335-335)value(344-344)value_to_bytes(431-571)
crates/vm/src/stdlib/ctypes/base.rs (1)
crates/vm/src/stdlib/ctypes.rs (25)
bytes_to_pyobject(185-347)new_simple_type(351-387)std(417-417)std(434-434)std(634-634)type_attr(651-651)get_size(142-161)get_align(164-166)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)mem(157-157)mem(158-158)mem(212-212)size_of(391-439)
crates/vm/src/stdlib/ctypes/util.rs (4)
crates/vm/src/stdlib/ctypes/union.rs (1)
default(22-26)crates/vm/src/stdlib/ctypes/base.rs (2)
new(253-259)size(210-212)crates/vm/src/stdlib/ctypes/thunk.rs (1)
new(212-297)crates/vm/src/stdlib/ctypes/field.rs (2)
new(36-53)size(259-261)
crates/vm/src/stdlib/ctypes/pointer.rs (2)
crates/vm/src/stdlib/ctypes/array.rs (1)
__mul__(96-121)crates/vm/src/stdlib/ctypes/util.rs (1)
new(28-37)
crates/vm/src/stdlib/ctypes/thunk.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (8)
std(515-515)std(523-523)std(700-700)std(703-703)std(706-706)ffi_type_from_str(25-48)get_type_code(18-23)t(22-22)
crates/vm/src/stdlib/ctypes/function.rs (1)
crates/vm/src/stdlib/ctypes/thunk.rs (3)
t(31-31)new(212-297)code_ptr(300-302)
crates/vm/src/stdlib/ctypes.rs (2)
crates/vm/src/stdlib/ctypes/base.rs (10)
std(515-515)std(523-523)std(700-700)std(703-703)std(706-706)size(210-212)new(253-259)from_bytes(183-190)t(22-22)get_stg_info(238-250)crates/vm/src/stdlib/ctypes/util.rs (1)
new(28-37)
crates/vm/src/stdlib/ctypes/structure.rs (5)
crates/vm/src/stdlib/ctypes.rs (22)
field(669-669)get_size(142-161)mem(144-144)mem(145-145)mem(146-146)mem(147-147)mem(148-148)mem(149-149)mem(150-150)mem(151-151)mem(152-152)mem(153-153)mem(154-154)mem(155-155)mem(156-156)mem(157-157)mem(158-158)mem(212-212)size_of(391-439)std(417-417)std(434-434)std(634-634)crates/vm/src/stdlib/ctypes/base.rs (17)
new(253-259)size(210-212)__mul__(382-384)std(515-515)std(523-523)std(700-700)std(703-703)std(706-706)from_stg_info(173-180)_objects(224-226)_objects(622-624)from_address(720-747)from_buffer(750-799)buffer(963-963)buffer(973-973)from_buffer_copy(802-842)as_buffer(985-993)crates/vm/src/stdlib/ctypes/util.rs (1)
new(28-37)crates/vm/src/stdlib/ctypes/array.rs (11)
__mul__(96-121)std(271-271)std(274-274)std(277-277)_objects(335-337)from_address(495-530)from_buffer(533-598)buffer(762-762)buffer(772-772)from_buffer_copy(601-655)as_buffer(784-792)crates/vm/src/stdlib/ctypes/pointer.rs (7)
__mul__(23-40)std(29-29)std(149-149)std(190-190)from_address(122-132)from_buffer(135-176)from_buffer_copy(179-212)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Run tests under miri
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run rust tests (windows-latest)
🔇 Additional comments (22)
.github/copilot-instructions.md (1)
47-47: LGTM!The feature flag example update from
ssltojitis consistent with the JIT support section documented later in the file.crates/vm/src/stdlib/ctypes/util.rs (1)
1-38: LGTM!Clean implementation of
StgInfothat mirrors CPython's storage information pattern. The struct appropriately captures size, alignment, and metadata needed for ctypes types. TheDefaultimplementation withalign: 1is correct as a minimal alignment constraint.crates/vm/src/stdlib/ctypes/function.rs (2)
274-315: LGTM! Proper thunk lifecycle management for Python callables.The implementation correctly:
- Creates a
PyCThunkto wrap the Python callable- Extracts and parses
_argtypes_and_restype_from the class- Stores the thunk as a
PyRef<PyCThunk>in thehandlerfield to keep the closure alive- Sets
needs_free: AtomicCell::new(true)to indicate resource ownershipThe thunk's
CodePtris properly extracted and stored inptrfor FFI calls.
78-135: LGTM!Comprehensive handling of FFI return type conversion covering all ctypes type codes. The fallback to
i32interpretation when_type_is absent provides sensible default behavior for FFI returns.crates/vm/src/stdlib/ctypes/base.rs (4)
156-213: LGTM! Clean CDataObject abstraction.The
CDataObjectstruct provides a well-designed backing store for ctypes data:
from_stg_infofor zero-initialized buffersfrom_bytesfor pre-existing datafrom_basefor sub-object scenariosThe separation of concerns between buffer storage and object references is appropriate.
510-525: LGTM! Platform-dependent sizing correctly implemented.The
c_longandc_ulonghandling now useslibc::c_long/libc::c_ulongwithstd::mem::size_offor correct platform-dependent sizing, addressing the previous concern about hardcoded 4-byte sizes.
719-747: LGTM!The
from_addressimplementation correctly reads bytes from the provided memory address, with appropriate null pointer checks and safety documentation. This is the expected behavior for ctypes address-based construction.
844-917: LGTM! Thoroughin_dllimplementation.The implementation correctly:
- Handles both integer handles and CDLL-like objects with
_handleattribute- Uses proper null-terminated symbol names
- Reads symbol values with appropriate safety considerations
- Stores the dll reference in
cdata.baseto prevent premature unloadingcrates/vm/src/stdlib/ctypes/thunk.rs (2)
183-303: Well-structured thunk implementation.The
PyCThunk::newimplementation follows a sound pattern:
- Parses argument and result types correctly
- Creates FFI CIF with appropriate types
- Manages userdata lifetime via
Box::into_rawwith cleanup inDrop- Stores both closure and userdata together to ensure proper lifecycle
The use of
'staticlifetime for the userdata reference is safe becauseThunkDatamaintains ownership of the leaked box and cleans it up on drop.
305-310: Based on my investigation, I've identified significant evidence that supports the original review concern, though I cannot directly access the codebase for complete verification.Key findings from external sources:
libffi closure thread-safety (confirmed via libffi documentation): Closures are NOT inherently thread-safe. The documented requirements include:
ffi_prep_cifmust be called before threads start or with serialization- Closure objects should not be concurrently allocated/freed
- User-data/environment must be made explicitly thread-safe
- libffi does not automatically make captured data Send/Sync
RustPython VM threading model (confirmed via RustPython embedding guidance):
- RustPython's VM is not thread-safe for arbitrary concurrent access
- VM APIs must be called from within
vm.enter()context on the VM's thread- Calling VM methods from arbitrary threads violates the threading contract
Critical concern: The callback executes from C code (libffi), potentially from an arbitrary thread. If it then calls
with_current_vmto invoke the Python callable, this violates RustPython's threading model unless there's explicit synchronization not reflected in the comment.The justification comment "thunk_data contains the libffi closure which is heap-allocated" is insufficient. Heap allocation alone does not provide thread-safety—explicit synchronization mechanisms or thread-pinning are required.
Verify that the callback mechanism includes proper synchronization (e.g., channels, locks, or thread-pinning) to ensure
with_current_vmis only invoked from the VM's thread, not from arbitrary C threads spawned by libffi.crates/vm/src/stdlib/ctypes.rs (5)
184-347: Well-structured type conversion with proper platform-aware sizes.The implementation correctly uses
mem::size_offor each C type, addressing the previous platform-dependent size concern. The pattern of usingconst SIZEwithfrom_ne_bytesis clean and efficient.
371-378: LGTM - proper CData initialization.The simple type now correctly allocates a zero-initialized buffer of the appropriate size based on the type character.
389-439: Good coverage of ctypes size computation.The function handles various ctypes types comprehensively. The logic correctly distinguishes between instances (using
cdata.read().size()) and type objects (usingstg_infoor computing from_type_).
163-166: Based on my verification using the web search, I can now provide my assessment:The alignment assumption in
get_alignis technically incorrect for some platforms.The web search confirms the review comment's concern is accurate: on 32-bit x86 Linux,
c_doublehas 8-byte size but only 4-byte alignment (GCC default behavior per the i386 ABI). This means the assumption that "alignment equals size" does not hold universally.However, I cannot access the repository to fully verify:
- How
get_alignis actually used in the codebase- Whether this causes actual bugs or is a documented limitation
- If tests or other code already account for this
The original review comment appropriately notes this is consistent with the existing approach, suggesting it may be an accepted trade-off rather than an undiscovered issue.
763-810: Based on my verification, the review comment is accurate. Let me provide the final rewritten review:
Cast implementation correctly matches CPython's permissive reinterpret_cast behavior.
The function properly validates that the destination type is a ctypes pointer type (via
_type_attribute check) and accepts pointer-convertible source objects (CData instances, arrays, and integers). It intentionally does not validate type compatibility between source and destination, which aligns with CPython's design—cast()exists precisely to bypass strict type checks and reinterpret memory. This implementation is sound and correct.crates/vm/src/stdlib/ctypes/pointer.rs (2)
74-88: Constructor allows any value as initial contents.The constructor accepts any
PyObjectRefas initial contents without validating it against the pointer's_type_. This is consistent with CPython's behavior but could lead to confusing runtime errors.
121-132: Pointer from_address correctly stores address without dereferencing.Unlike
Structure.from_addresswhich reads memory at the address,_Pointer.from_addresscorrectly just stores the address value. This matches CPython's pointer semantics.crates/vm/src/stdlib/ctypes/structure.rs (2)
463-497: Buffer protocol implementation looks correct.The
AsBufferimplementation properly exposes the structure's internal buffer through the Python buffer protocol. The use of mapped lock guards ensures the buffer reference remains valid while accessed.
166-178: I'll help you verify the review comment about the overflow check in the__mul__implementation. Let me start by examining the code context and searching for similar patterns in the ctypes module.
<function_calls>
#!/bin/bashFirst, get context around lines 166-178 in structure.rs
wc -l crates/vm/src/stdlib/ctypes/structure.rs
</function_calls>
<function_calls>
#!/bin/bashRead the section around line 165-180 to see the TODO and context
sed -n '160,185p' crates/vm/src/stdlib/ctypes/structure.rs
</function_calls>
<function_calls>
#!/bin/bashSearch for other places in ctypes where we multiply size/length values
that might need similar overflow checks
rg -n 'element_size.**|.checked_mul|overflow' crates/vm/src/stdlib/ctypes/ --type rs
</function_calls>crates/vm/src/stdlib/ctypes/array.rs (3)
588-597: Good - from_buffer correctly tracks source buffer reference.Unlike the structure implementation, this correctly stores a reference to the source buffer in
cdata.objects, ensuring the source stays alive. This is the correct CPython behavior forfrom_buffer.
758-792: LGTM - consistent AsBuffer implementation.The buffer protocol implementation follows the same correct pattern as
PyCStructure, properly exposing the internal buffer through mapped lock guards.
505-529: Unsafe memory read matches CPython semantics.Similar to
PyCStructure.from_address, this correctly implements the unsafefrom_addressbehavior. The caller is responsible for ensuring valid memory.
| fn python_to_ffi(obj: PyResult, ty: &PyTypeRef, result: *mut c_void, vm: &VirtualMachine) { | ||
| let obj = match obj { | ||
| Ok(o) => o, | ||
| Err(_) => return, // Exception occurred, leave result as-is | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exception in callable leaves FFI result uninitialized.
When the Python callable raises an exception, python_to_ffi returns early without writing to result. The C caller will receive whatever garbage was in the result buffer. Consider zero-initializing the result on error:
let obj = match obj {
Ok(o) => o,
Err(_) => {
// Zero-initialize result on error
unsafe { std::ptr::write_bytes(result, 0, 1); }
return;
}
};Note: CPython sets an exception flag that the caller should check; this may be acceptable if RustPython has similar semantics.
🤖 Prompt for AI Agents
crates/vm/src/stdlib/ctypes/thunk.rs lines 60-64: when the Python callable
returns Err the function returns without writing to the FFI result buffer,
leaving uninitialized garbage for the C caller; fix by zero-initializing the
result buffer before returning on error (use an unsafe pointer write that fills
result with zeros for the appropriate size of the target type) so the C side
sees a deterministic value, and keep existing exception propagation semantics.
| let size = PyCUnionType::get_field_size(&field_type, vm)?; | ||
| max_size = max_size.max(size); | ||
| // For simple types, alignment == size | ||
| max_align = max_align.max(size); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alignment calculation assumes alignment equals size, which may be incorrect.
Setting max_align = max_align.max(size) assumes every field's alignment equals its size. This isn't true for composite types like arrays (where alignment is the element's alignment, not the array size) or structures with padding.
Consider using the type's actual alignment:
// For simple types, get alignment from _type_ attribute or stg_info
let align = PyCUnionType::get_field_align(&field_type, vm)?;
max_align = max_align.max(align);🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/union.rs around lines 173 to 177, the code
incorrectly assumes field alignment equals field size when computing max_align;
instead obtain each field's actual alignment (e.g., via a new or existing
PyCUnionType::get_field_align(&field_type, vm)? helper or reading the type's
stg_info/_type_/alignment) and use that value to update max_align (max_align =
max_align.max(align)); ensure the helper handles composite types (arrays,
structs) and returns the element/contained type alignment rather than array
size.
a0d067e to
b8c7509
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/vm/src/stdlib/ctypes.rs (1)
563-571:addressofreturns the address of the Python value cell, not the underlying C bufferWith the new CDataObject backing,
addressofon_SimpleCDatanow does:let simple = obj.downcast_ref::<PyCSimple>().unwrap(); Ok(simple.value.as_ptr() as usize)This is the address of the
AtomicCell<PyObjectRef>storage, not the C data buffer, and ignores the newcdatafield. For FFI and CPython‑compatible semantics, this should point at the C storage (e.g.,simple.cdata.read().buffer.as_ptr()), and it currently rejects all non‑simple ctypes instances.Refactor
addressofto use the CDataObject buffer (ideally viaPyCData/cdata, so it works for structures/arrays/unions as well), and only fall back to a TypeError if the object is not a ctypes instance.crates/vm/src/stdlib/ctypes/base.rs (1)
142-145:c_wchar("u") uses different sizes invalue_to_bytesvsget_size/bytes_to_pyobject
get_size("u")(andbytes_to_pyobject’s"u"arm) usemem::size_of::<WideChar>(), butvalue_to_byteshardcodes 4 bytes for"u":"u" => { // c_wchar - 4 bytes (wchar_t on most platforms) … return (c as u32).to_ne_bytes().to_vec(); }On platforms where
wchar_t/WideCharis 2 bytes (notably Windows), the CDataObject buffer for_SimpleCData"u"instances will be 4 bytes whileget_size("u")and thefrom_address/from_bufferpaths expect 2 bytes. This leads to inconsistentsizeof, buffer lengths, and memory reads.Make the
"u"branch derive its width fromget_size("u")(ormem::size_of::<WideChar>()), e.g. by writing either a 2‑byteu16or 4‑byteu32depending on the configured size, and zero‑filling to that size when conversion fails.Also applies to: 448-456
♻️ Duplicate comments (1)
crates/vm/src/stdlib/ctypes/thunk.rs (1)
59-64: Result buffer is left uninitialized when the Python callable raisesIf the wrapped Python callable returns
Err,python_to_ffisimplyreturns without touchingresult, so C callers see whatever garbage was in the result slot previously.Even if the C side is expected to check for a Python exception, it’s safer and more deterministic to zero‑initialize the result buffer (according to the result type) before returning on error. This was already noted in earlier review; keeping it in mind while refining the thunk API would be beneficial.
🧹 Nitpick comments (5)
crates/vm/src/stdlib/ctypes.rs (1)
763-810:_castonly mutates.contentsand never touches pointer backing; semantics may diverge from CPython
pycfunction_castconstructs an instance ofctypeand then implements cast by doing:let result = ctype.call((), vm)?; … result.set_attr("contents", ptr_value, vm)?;This treats
castessentially as “make a fresh ctypes instance and assign itscontentsattribute”:
- It doesn’t validate that
ctypeis actually a pointer type (only checks_type_exists).- It doesn’t interact with
CDataObject/ buffer storage, so any code that relies on pointer value being encoded in the backing cdata may not see the casted address.- For non‑pointer types that define a
contentsattribute, this will behave oddly.If you intend to match CPython’s
ctypes.cast, consider:
- Verifying
ctypeis a subclass of_Pointer, not just something with_type_.- Setting the pointer value in the underlying cdata representation instead of (or in addition to) assigning
contents.- Adding tests that mirror CPython’s
ctypes.castbehaviour for ctypes instances vs integer addresses.crates/vm/src/stdlib/ctypes/base.rs (2)
566-569: Pointer-like simple types expose zeroed buffers independent of their valueFor
"P" | "z" | "Z"the newvalue_to_bytesimplementation currently returns:// Pointer types (platform pointer size) vec![0; std::mem::size_of::<usize>()]ignoring the actual Python value. That means memoryviews and any buffer‑based consumers of
_SimpleCDatapointer types will always see a zeroed buffer, even when the object holds a non‑NULL address or string.If you intend the buffer protocol to reflect the underlying C representation, consider encoding:
- For
"P": the integer address (or 0 forNone) as a native‑endianusize.- For
"z"/"Z": either the address of the underlying bytes/wchar buffer (if you model it), or at least an integer address consistent withbytes_to_pyobject/from_address.
628-672: Unsafe direct access toAtomicCell<PyObjectRef>is a concurrency footgun
PyCSimple::valueclones the stored value via:let raw_value = unsafe { (*zelf.value.as_ptr()).clone() };This bypasses
AtomicCell’s safe API and relies on manual unsafe pointer reads. While ctypes values are unlikely to be mutated concurrently in practice, this pattern is fragile and easy to get wrong if the representation evolves.Consider replacing this with a safer container (e.g.,
PyRwLock<PyObjectRef>) or wrapping the unsafety into a small, well‑documented helper so that concurrent mutations and future refactors don’t accidentally introduce UB.crates/vm/src/stdlib/ctypes/pointer.rs (1)
75-213:PyCPointer.contentsmixes raw addresses and referent objectsThe new constructor,
__init__,from_address,from_buffer*, andin_dllall write different kinds of values intocontents:
py_new/__init__store an arbitraryPyObjectRef(typically a ctypes instance).from_address/from_buffer*/in_dllstore an integer address in the same field.Downstream code will need to distinguish “pointer to object” vs “raw address” based solely on runtime type, which is error‑prone. It may be clearer to standardize representation (e.g., keep the raw address in cdata and treat
contentspurely as the dereferenced value, or vice versa) and have helpers consistently translate between the two.crates/vm/src/stdlib/ctypes/structure.rs (1)
140-156: Structure layout tracks alignment but never pads fields or final size
get_field_aligncomputes an alignment for each field (defaulting to 1), andPyCStructure::py_newkeeps amax_alignand stores it inStgInfo. However, field offsets are still computed as a simple running sum of sizes:let mut offset = 0usize; … offset += size; … total_size = offset; let mut stg_info = StgInfo::new(total_size, max_align);There’s no per‑field padding to alignment boundaries and no rounding of
total_sizeup tomax_align, so the actual in‑memory layout is tightly packed regardless of the computed alignments. If you intend to approximate C struct layout (as CPythonctypes.Structuredoes), you’ll likely need to:
- Align each field’s
offsetup to its alignment, and- Round the final
total_sizeup tomax_align.Otherwise, the tracked alignment metadata may be misleading.
Also applies to: 245-295
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
crates/vm/src/stdlib/ctypes.rs(7 hunks)crates/vm/src/stdlib/ctypes/base.rs(10 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(3 hunks)crates/vm/src/stdlib/ctypes/structure.rs(9 hunks)crates/vm/src/stdlib/ctypes/thunk.rs(1 hunks)crates/vm/src/stdlib/ctypes/util.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/vm/src/stdlib/ctypes/util.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes.rs
🧠 Learnings (4)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/structure.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes.rs
🧬 Code graph analysis (4)
crates/vm/src/stdlib/ctypes/thunk.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (8)
std(516-516)std(524-524)std(701-701)std(704-704)std(707-707)ffi_type_from_str(25-48)get_type_code(18-23)t(22-22)
crates/vm/src/stdlib/ctypes/pointer.rs (3)
crates/vm/src/stdlib/ctypes/structure.rs (1)
__mul__(159-179)crates/vm/src/stdlib/ctypes/array.rs (1)
__mul__(96-121)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
crates/vm/src/stdlib/ctypes/structure.rs (2)
crates/vm/src/stdlib/ctypes/base.rs (11)
new(254-260)size(210-212)std(516-516)std(524-524)std(701-701)std(704-704)std(707-707)instance(913-913)from_stg_info(173-180)buffer(964-964)buffer(974-974)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
crates/vm/src/stdlib/ctypes.rs (7)
crates/vm/src/stdlib/ctypes/base.rs (9)
std(516-516)std(524-524)std(701-701)std(704-704)std(707-707)size(210-212)new(254-260)t(22-22)get_stg_info(239-251)crates/vm/src/stdlib/ctypes/pointer.rs (3)
std(30-30)std(150-150)std(191-191)crates/vm/src/stdlib/ctypes/structure.rs (5)
std(137-137)std(166-166)field(76-77)field(259-259)field_type(274-275)crates/vm/src/stdlib/ctypes/array.rs (3)
std(271-271)std(274-274)std(277-277)crates/vm/src/stdlib/ctypes/field.rs (7)
size(259-261)new(36-53)name(279-281)obj(107-107)obj(116-116)obj(201-201)obj(218-218)crates/vm/src/stdlib/ctypes/thunk.rs (2)
new(213-298)t(31-31)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run rust tests (macos-latest)
🔇 Additional comments (4)
crates/vm/src/stdlib/ctypes/base.rs (1)
571-618: CDataObject-backed storage and buffer implementation for_SimpleCDatalook consistentThe introduction of
CDataObjectplusPyCSimple.cdataand theSIMPLE_BUFFER_METHODS/AsBufferimplementation cleanly route buffer access through the shared CData backing, with read/write views correctly mapped viaPyMappedRwLock*Guard.This design should work well for both value updates (
set_valueupdatingcdata.buffer) and externalmemoryview/buffer users.Also applies to: 960-995
crates/vm/src/stdlib/ctypes/pointer.rs (1)
21-41: Pointer__mul__and StgInfo integration look good
PyCPointerType::__mul__computes element size fromsize_of::<usize>(), populatesStgInfo(includinglength), and returns aPyCArrayTypewithtypset to the original pointer type and the expected length/element_size.This mirrors the array/structure patterns and should give consistent sizing/alignment for arrays of pointers.
crates/vm/src/stdlib/ctypes/structure.rs (1)
464-499: Buffer protocol integration for structures looks correct
STRUCTURE_BUFFER_METHODSand theAsBufferimpl forPyCStructureconsistently expose the underlyingCDataObject.bufferas a writable, simple buffer, matching the new CDataObject‑backed design.This should make
memoryview(Structure())and similar operations work as expected.crates/vm/src/stdlib/ctypes/thunk.rs (1)
105-113: Based on the web search results, the technical concern raised in the review is valid and well-founded. The search confirms that:
- On LP64 systems (Linux, macOS):
c_long= 8 bytes- On LLP64 systems (Windows x64):
c_long= 4 bytesi64is always 8 bytesThis means writing
i64values into memory slots where libffi expectsc_longwould indeed create undefined behavior on Windows 64-bit and 32-bit platforms.However, I cannot access the repository to verify the actual current state of the code due to persistent cloning failures. Therefore, I cannot confirm:
- Whether the code at lines 105-113 still has the problematic pattern shown
- Whether lines 34-55 in
ffi_to_pythonhave been updated to usec_long/c_ulong- Whether
ffi_type_from_strmatches the actual storage size- The full context of error handling paths
The review comment's technical analysis is sound, but I cannot verify the actual current code state or whether the issue still exists due to infrastructure limitations. Manual verification of the current code is needed to confirm the asymmetry persists and whether the suggested fix should be applied.
b8c7509 to
8933738
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/vm/src/stdlib/ctypes/base.rs (2)
54-62: Duplicate condition inset_primitive.The check for
PyByteswith length 1 is duplicated on lines 54-57 and 58-61. This appears to be a copy-paste error."c" => { if value .clone() .downcast_exact::<PyBytes>(vm) .is_ok_and(|v| v.len() == 1) - || value - .clone() - .downcast_exact::<PyBytes>(vm) - .is_ok_and(|v| v.len() == 1) || value .clone() .downcast_exact::<PyInt>(vm)
35-45: Perfect. The web searches confirm the review comment is correct. Here's what I've verified:
libffi-rs provides the required methods:
Type::c_long(),Type::c_ulong(), andType::pointer()all exist in the libffi-rs crate.Format code mappings are correct:
- 'l' = c_long, 'L' = c_ulong
- 'z' = c_char_p, 'Z' = c_wchar_p, 'P' = c_void_p (all pointer types)
The bug is real and critical:
- Hardcoding
i32/u32for 'l'/'L' is wrong on 64-bit systems where c_long is 8 bytes- Hardcoding
u64for pointers ('z'/'Z'/'P') breaks on 32-bit systems where pointers are 4 bytes- libffi correctly treats all pointer types uniformly as
FFI_TYPE_POINTERThe suggested fix is valid: Using
Type::c_long(),Type::c_ulong(), andType::pointer()will correctly adapt to the host platform.
Platform-dependent FFI type sizes are hardcoded incorrectly.
c_long/c_ulongmapped toi32/u32are 8 bytes on 64-bit platforms. Pointer typesz/Z/Phardcoded tou64fail on 32-bit platforms where pointers are 4 bytes. libffi-rs provides platform-aware methods that should be used instead:- "l" => Some(libffi::middle::Type::i32()), - "L" => Some(libffi::middle::Type::u32()), + "l" => Some(libffi::middle::Type::c_long()), + "L" => Some(libffi::middle::Type::c_ulong()), "q" => Some(libffi::middle::Type::i64()), "Q" => Some(libffi::middle::Type::u64()), "f" => Some(libffi::middle::Type::f32()), "d" => Some(libffi::middle::Type::f64()), "g" => Some(libffi::middle::Type::f64()), "?" => Some(libffi::middle::Type::u8()), - "z" => Some(libffi::middle::Type::u64()), - "Z" => Some(libffi::middle::Type::u64()), - "P" => Some(libffi::middle::Type::u64()), + "z" => Some(libffi::middle::Type::pointer()), + "Z" => Some(libffi::middle::Type::pointer()), + "P" => Some(libffi::middle::Type::pointer()),
♻️ Duplicate comments (3)
crates/vm/src/stdlib/ctypes/union.rs (1)
198-214:from_addresscreates zeroed buffer instead of reading from address.This implementation allocates a zero-initialized buffer rather than reading the actual data from the provided memory address. This is inconsistent with the expected behavior of
from_address.This issue was flagged in a previous review but appears to still be present. The implementation should read data from the address:
fn from_address(cls: PyTypeRef, address: isize, vm: &VirtualMachine) -> PyResult { use crate::stdlib::ctypes::_ctypes::size_of; // Get size from cls let size = size_of(cls.clone().into(), vm)?; // Create instance with data from address if address == 0 || size == 0 { return Err(vm.new_value_error("NULL pointer access".to_owned())); } - let stg_info = StgInfo::new(size, 1); + + // SAFETY: Caller must ensure address points to valid memory + let data = unsafe { + let ptr = address as *const u8; + std::slice::from_raw_parts(ptr, size).to_vec() + }; + Ok(PyCUnion { - cdata: PyRwLock::new(CDataObject::from_stg_info(&stg_info)), + cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), }crates/vm/src/stdlib/ctypes/thunk.rs (1)
60-64: Exception in callable leaves FFI result uninitialized.When the Python callable raises an exception, the function returns without writing to
result, leaving uninitialized memory that the C caller may read.This was flagged in a previous review. Consider zero-initializing:
let obj = match obj { Ok(o) => o, - Err(_) => return, // Exception occurred, leave result as-is + Err(_) => { + // Zero-initialize result on error to avoid undefined behavior + // Note: We don't know the exact size, but zeroing a reasonable max is safer + unsafe { std::ptr::write_bytes(result, 0, std::mem::size_of::<u64>()); } + return; + } };crates/vm/src/stdlib/ctypes/array.rs (1)
750-756: Critical:to_argcreates a dangling pointer.The
Arg::new(&cdata.buffer)borrows fromcdatawhich is aRwLockReadGuard. The guard is dropped at the end of the function, but the returnedArgstill holds a reference to the buffer data, causing undefined behavior.This was flagged in a previous review. The
Argholds a pointer to data that will be deallocated when the lock guard is dropped.impl PyCArray { #[allow(unused)] - pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<libffi::middle::Arg> { - let cdata = self.cdata.read(); - Ok(libffi::middle::Arg::new(&cdata.buffer)) + pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<(Vec<u8>, libffi::middle::Arg)> { + let buffer = self.cdata.read().buffer.clone(); + // SAFETY: Caller must ensure buffer outlives the Arg + let arg = unsafe { libffi::middle::Arg::new(&buffer) }; + Ok((buffer, arg)) } }
🧹 Nitpick comments (4)
crates/vm/src/stdlib/ctypes.rs (1)
163-166:get_alignassumes alignment equals size, which may be incorrect for some types.For most simple C types this holds, but some platforms may have different alignment requirements (e.g.,
doubleon 32-bit x86 may have 4-byte alignment despite 8-byte size).Consider documenting this assumption or using proper alignment intrinsics if precision is needed:
/// Get alignment for a simple type - for C types, alignment equals size +/// Note: This is a simplification that works for most platforms. +/// Some exotic platforms may have different alignment requirements. pub fn get_align(ty: &str) -> usize { get_size(ty) }crates/vm/src/stdlib/ctypes/pointer.rs (1)
75-89: Constructor and__init__have overlapping initialization logic.Both
py_new(Constructor) and__init__can set the contents value. When Python calls the constructor,py_newruns first settingcontents, then__init__runs potentially overwriting it with the same value. This is redundant but not incorrect.Consider having
py_newalways initialize withNoneand letting__init__handle the actual initialization:fn py_new(cls: PyTypeRef, args: Self::Args, vm: &VirtualMachine) -> PyResult { - // Get the initial contents value if provided - let initial_contents = args.0.into_option().unwrap_or_else(|| vm.ctx.none()); - - // Create a new PyCPointer instance with the provided value PyCPointer { - contents: PyRwLock::new(initial_contents), + contents: PyRwLock::new(vm.ctx.none()), } .into_ref_with_type(vm, cls) .map(Into::into) }crates/vm/src/stdlib/ctypes/array.rs (1)
657-747: Significant code duplication withPyCArrayType::in_dll.The
PyCArray::in_dllimplementation (lines 657-747) duplicates most of the logic fromPyCArrayType::in_dll(lines 123-202). Both methods perform identical steps for handle extraction, library lookup, and symbol resolution.Consider extracting the common logic into a shared helper function:
fn get_symbol_address( dll: PyObjectRef, name: &str, vm: &VirtualMachine, ) -> PyResult<usize> { // Common handle extraction and symbol lookup logic // ... }This would reduce duplication and make maintenance easier.
crates/vm/src/stdlib/ctypes/structure.rs (1)
372-464: Consider extracting commonfrom_buffer*patterns.The
from_buffer,from_buffer_copy, and related methods follow the same pattern acrossPyCArray,PyCStructure, andPyCPointer:
- Validate offset
- Get buffer/size
- Check buffer bounds
- Read/copy data
- Create instance
Consider creating helper functions in a utility module to reduce duplication:
// In util.rs fn validate_buffer_offset(offset: isize) -> PyResult<usize> { ... } fn check_buffer_bounds(buffer_len: usize, offset: usize, size: usize, vm: &VirtualMachine) -> PyResult<()> { ... }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
crates/vm/src/stdlib/ctypes.rs(7 hunks)crates/vm/src/stdlib/ctypes/array.rs(13 hunks)crates/vm/src/stdlib/ctypes/base.rs(10 hunks)crates/vm/src/stdlib/ctypes/function.rs(5 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(3 hunks)crates/vm/src/stdlib/ctypes/structure.rs(9 hunks)crates/vm/src/stdlib/ctypes/thunk.rs(1 hunks)crates/vm/src/stdlib/ctypes/union.rs(3 hunks)crates/vm/src/stdlib/ctypes/util.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/vm/src/stdlib/ctypes/util.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/array.rs
🧠 Learnings (5)
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (2)
crates/vm/src/stdlib/ctypes.rs (4)
crates/vm/src/stdlib/ctypes/array.rs (3)
std(271-271)std(274-274)std(277-277)crates/vm/src/stdlib/ctypes/base.rs (8)
std(548-548)std(556-556)std(761-761)std(764-764)std(767-767)size(213-215)new(257-263)t(22-22)crates/vm/src/stdlib/ctypes/field.rs (2)
size(259-261)new(36-53)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
crates/vm/src/stdlib/ctypes/array.rs (3)
crates/vm/src/stdlib/ctypes.rs (4)
std(417-417)std(448-448)std(696-696)size_of(391-453)crates/vm/src/stdlib/ctypes/base.rs (11)
std(548-548)std(556-556)std(761-761)std(764-764)std(767-767)new(257-263)from_bytes(186-193)buffer(1022-1022)buffer(1032-1032)instance(971-971)size(213-215)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (13)
crates/vm/src/stdlib/ctypes/base.rs (1)
1018-1041: LGTM: Buffer methods implementation is correct.The use of mapped lock guards ensures safe access to the underlying buffer while maintaining proper lifetime semantics.
crates/vm/src/stdlib/ctypes/function.rs (1)
274-315: Callback/thunk creation looks correct.The thunk lifecycle is properly managed by storing
thunk_refin thehandlerfield, ensuring the closure remains alive while thePyCFuncPtrexists.crates/vm/src/stdlib/ctypes/union.rs (1)
299-327: LGTM: AsBuffer implementation is consistent with PyCSimple.The buffer methods follow the same correct pattern used in
SIMPLE_BUFFER_METHODS.crates/vm/src/stdlib/ctypes/thunk.rs (1)
175-182: LGTM: ThunkData Drop correctly reclaims leaked memory.The
Dropimplementation properly reclaims theBox-leaked userdata, preventing memory leaks.crates/vm/src/stdlib/ctypes.rs (3)
184-347: LGTM:bytes_to_pyobjectcorrectly uses platform-specific types.The implementation properly handles platform-dependent sizes for
c_long,c_ulong, and other types by using the actual C type sizes viamem::size_of.
389-453: LGTM:size_ofnow properly handles struct/union types.The function now covers struct and union type objects by delegating to
calculate_struct_sizeandcalculate_union_size, addressing the previous review concern.
825-872: Cast implementation delegates to pointer type correctly.The
_castfunction creates a pointer instance and sets itscontents, which is the expected pattern for ctypes casting.crates/vm/src/stdlib/ctypes/array.rs (3)
588-597: Good: Source reference is now tracked infrom_buffer.The implementation correctly stores
buffer.obj.clone()in theCDataObject, keeping the source buffer alive. This addresses the CPython semantics forfrom_buffer.
505-529: Unsafe pointer read with minimal validation.The
from_addressmethod only checks for NULL pointer but cannot validate that the address points to valid, readable memory. This matches CPython's behavior where it's the caller's responsibility to ensure address validity, but it means invalid addresses will cause undefined behavior (likely segfault).This is expected behavior for ctypes FFI. The comment documenting the safety requirements would be helpful but not required.
758-791: AsBuffer implementation looks correct.The buffer methods properly map through the
PyRwLockguards to provide zero-copy access to the underlyingcdata.buffer. Settingreadonly=falseis appropriate for ctypes objects.crates/vm/src/stdlib/ctypes/structure.rs (2)
414-420: Good: Source reference tracking implemented.The
from_buffermethod now correctly storesSome(source)in theCDataObject, keeping the source buffer alive as per CPython semantics.
467-501: AsBuffer implementation is consistent with PyCArray.The buffer methods correctly expose the structure's memory buffer through the lock guards, matching the pattern used in
array.rs.crates/vm/src/stdlib/ctypes/pointer.rs (1)
246-260: Let me search for libloading API documentation:
<function_calls>
libloading crate Library::get method documentation Rust
</function_calls>
| let value = if address != 0 && size > 0 { | ||
| // Safety: This is inherently unsafe - reading from arbitrary memory address | ||
| unsafe { | ||
| let ptr = address as *const u8; | ||
| let bytes = std::slice::from_raw_parts(ptr, size); | ||
| // Convert bytes to appropriate Python value based on type | ||
| bytes_to_pyobject(&cls, bytes, vm)? | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsafe memory read lacks alignment check.
Reading from an arbitrary address without verifying alignment can cause undefined behavior on platforms requiring aligned access. The comment acknowledges the unsafety but doesn't document alignment requirements.
Consider adding an alignment check:
let value = if address != 0 && size > 0 {
// Safety: This is inherently unsafe - reading from arbitrary memory address
+ // Caller must ensure address is properly aligned for the type
unsafe {
let ptr = address as *const u8;
+ // Note: Reading as u8 bytes avoids alignment issues
let bytes = std::slice::from_raw_parts(ptr, size);
// Convert bytes to appropriate Python value based on type
bytes_to_pyobject(&cls, bytes, vm)?
}Alternatively, add a runtime alignment check:
let align = get_align(&type_str);
if (address as usize) % align != 0 {
return Err(vm.new_value_error("address is not properly aligned".to_owned()));
}| // Create instance | ||
| let instance = PyCSimple::py_new(cls.clone(), (OptionalArg::Present(value),), vm)?; | ||
|
|
||
| // TODO: Store reference to source in _objects to keep buffer alive | ||
| Ok(instance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_buffer doesn't preserve source buffer reference.
The TODO indicates a missing implementation. The source buffer object should be stored in _objects to prevent garbage collection while the ctypes object is alive. Without this, the underlying memory could be freed while still referenced.
// Create instance
let instance = PyCSimple::py_new(cls.clone(), (OptionalArg::Present(value),), vm)?;
- // TODO: Store reference to source in _objects to keep buffer alive
+ // Store reference to source buffer in _objects to keep it alive
+ if let Ok(simple_ref) = instance.clone().downcast::<PyCSimple>() {
+ simple_ref.cdata.write().objects = Some(source);
+ }
+
Ok(instance)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/base.rs around lines 854–858, the created ctypes
instance does not retain a reference to the source buffer per the TODO; to fix,
obtain or create the instance's internal _objects container (e.g., ensure
_objects is a list on the instance), append the source buffer object to that
list so the VM holds a strong reference, and then return the instance—make sure
you use the VM APIs that add a Python-level reference (not a temporary borrow)
so the buffer won't be GC'd while the ctypes object exists.
| #[pyclassmethod] | ||
| fn from_buffer( | ||
| cls: PyTypeRef, | ||
| source: PyObjectRef, | ||
| offset: crate::function::OptionalArg<isize>, | ||
| vm: &VirtualMachine, | ||
| ) -> PyResult { | ||
| use crate::TryFromObject; | ||
| use crate::protocol::PyBuffer; | ||
|
|
||
| let offset = offset.unwrap_or(0); | ||
| if offset < 0 { | ||
| return Err(vm.new_value_error("offset cannot be negative".to_owned())); | ||
| } | ||
| let offset = offset as usize; | ||
| let size = std::mem::size_of::<usize>(); | ||
|
|
||
| let buffer = PyBuffer::try_from_object(vm, source.clone())?; | ||
|
|
||
| if buffer.desc.readonly { | ||
| return Err(vm.new_type_error("underlying buffer is not writable".to_owned())); | ||
| } | ||
|
|
||
| let buffer_len = buffer.desc.len; | ||
| if offset + size > buffer_len { | ||
| return Err(vm.new_value_error(format!( | ||
| "Buffer size too small ({} instead of at least {} bytes)", | ||
| buffer_len, | ||
| offset + size | ||
| ))); | ||
| } | ||
|
|
||
| // Read pointer value from buffer | ||
| let bytes = buffer.obj_bytes(); | ||
| let ptr_bytes = &bytes[offset..offset + size]; | ||
| let ptr_val = usize::from_ne_bytes(ptr_bytes.try_into().expect("size is checked above")); | ||
|
|
||
| Ok(PyCPointer { | ||
| contents: PyRwLock::new(vm.ctx.new_int(ptr_val).into()), | ||
| } | ||
| .into_ref_with_type(vm, cls)? | ||
| .into()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_buffer doesn't track source reference.
CPython's from_buffer keeps the source buffer alive by storing a reference. This implementation reads the data but doesn't store a reference to source, which means the source could be garbage collected while the pointer still references memory from it.
Since PyCPointer only stores the pointer value (not a view into the buffer), this is less critical than for arrays/structures. However, for consistency with CPython semantics, consider whether source tracking is needed.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/pointer.rs around lines 135-177, from_buffer
reads a pointer value from the given buffer but does not retain a reference to
the source object so the underlying buffer may be freed; update PyCPointer to
hold an owner/source PyObjectRef (or re-use existing ownership slot if present),
set that owner to source.clone() when constructing the PyCPointer in
from_buffer, and ensure the struct constructors, into_ref_with_type path and any
Drop/clone semantics preserve that reference so the buffer remains alive for the
lifetime of the PyCPointer instance.
| /// Get the alignment of a ctypes type | ||
| fn get_field_align(field_type: &PyObjectRef, vm: &VirtualMachine) -> usize { | ||
| // Try to get _type_ attribute for simple types | ||
| if let Some(align) = field_type | ||
| .get_attr("_type_", vm) | ||
| .ok() | ||
| .and_then(|type_attr| type_attr.str(vm).ok()) | ||
| .and_then(|type_str| { | ||
| let s = type_str.to_string(); | ||
| (s.len() == 1).then(|| get_size(&s)) // alignment == size for simple types | ||
| }) | ||
| { | ||
| return align; | ||
| } | ||
| // Default alignment | ||
| 1 | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alignment calculation may be incomplete for complex types.
The get_field_align helper only handles simple types (where alignment equals size). For nested structures, arrays, or pointers, the default alignment of 1 may be incorrect and could cause misaligned field access.
Consider extending this to handle:
- Pointer types (alignment =
std::mem::align_of::<*const ()>()) - Nested structures (use their maximum field alignment)
- Arrays (use element alignment)
fn get_field_align(field_type: &PyObjectRef, vm: &VirtualMachine) -> usize {
// Try to get _type_ attribute for simple types
if let Some(align) = field_type
.get_attr("_type_", vm)
.ok()
.and_then(|type_attr| type_attr.str(vm).ok())
.and_then(|type_str| {
let s = type_str.to_string();
(s.len() == 1).then(|| get_size(&s)) // alignment == size for simple types
})
{
return align;
}
+ // Pointer types have pointer alignment
+ if field_type.class().fast_issubclass(super::pointer::PyCPointer::static_type()) {
+ return std::mem::align_of::<*const ()>();
+ }
// Default alignment
1
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Get the alignment of a ctypes type | |
| fn get_field_align(field_type: &PyObjectRef, vm: &VirtualMachine) -> usize { | |
| // Try to get _type_ attribute for simple types | |
| if let Some(align) = field_type | |
| .get_attr("_type_", vm) | |
| .ok() | |
| .and_then(|type_attr| type_attr.str(vm).ok()) | |
| .and_then(|type_str| { | |
| let s = type_str.to_string(); | |
| (s.len() == 1).then(|| get_size(&s)) // alignment == size for simple types | |
| }) | |
| { | |
| return align; | |
| } | |
| // Default alignment | |
| 1 | |
| } | |
| /// Get the alignment of a ctypes type | |
| fn get_field_align(field_type: &PyObjectRef, vm: &VirtualMachine) -> usize { | |
| // Try to get _type_ attribute for simple types | |
| if let Some(align) = field_type | |
| .get_attr("_type_", vm) | |
| .ok() | |
| .and_then(|type_attr| type_attr.str(vm).ok()) | |
| .and_then(|type_str| { | |
| let s = type_str.to_string(); | |
| (s.len() == 1).then(|| get_size(&s)) // alignment == size for simple types | |
| }) | |
| { | |
| return align; | |
| } | |
| // Pointer types have pointer alignment | |
| if field_type.class().fast_issubclass(super::pointer::PyCPointer::static_type()) { | |
| return std::mem::align_of::<*const ()>(); | |
| } | |
| // Default alignment | |
| 1 | |
| } |
| #[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor))] | ||
| impl PyCStructure { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AsBuffer trait implemented but not declared in with() clause.
The AsBuffer trait is implemented for PyCStructure (lines 492-501), but it's not listed in the #[pyclass] attribute's with() clause on line 334. Compare with PyCArray which has with(Constructor, AsSequence, AsBuffer).
-#[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor))]
+#[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor, AsBuffer))]
impl PyCStructure {Without this, the buffer protocol may not be properly exposed to Python.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor))] | |
| impl PyCStructure { | |
| #[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor, AsBuffer))] | |
| impl PyCStructure { |
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/structure.rs around lines 334-335, the
PyCStructure implements the AsBuffer trait but the #[pyclass(..., with(...))]
attribute only lists Constructor; update the with() clause to include AsBuffer
(e.g., with(Constructor, AsBuffer)) to expose the buffer protocol, and ensure
the AsBuffer trait is in scope/imported similarly to the PyCArray example.
| // SAFETY: PyCThunk is safe to send/sync because: | ||
| // - callable is a PyObjectRef which is Send+Sync | ||
| // - thunk_data contains the libffi closure which is heap-allocated | ||
| // - code_ptr is just a pointer to executable memory | ||
| unsafe impl Send for PyCThunk {} | ||
| unsafe impl Sync for PyCThunk {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
libffi-rs Closure thread safe Send Sync documentation
💡 Result:
Short answer: libffi-rs's Closure is NOT thread‑safe — it is neither Send nor Sync. [1]
Why: the Closure type ties a userdata reference lifetime to the closure and exposes a raw C-callable code pointer; the docs show auto‑trait implementations include impl<'a> !Send and impl<'a> !Sync. Use the Closure only on the thread where its userdata and lifetime guarantees hold, or wrap/transfer safety yourself. [1][2]
Sources:
- Closure docs (auto-traits: !Send, !Sync). [1]
- libffi-rs crate overview (closure/userdata lifetimes and usage). [2]
🏁 Script executed:
# First, let's examine the thunk.rs file to understand the structure of PyCThunk
cd crates/vm/src/stdlib/ctypes || exit 1
wc -l thunk.rsRepository: RustPython/RustPython
Length of output: 186
🌐 Web query:
libffi-rs Closure lifetime userdata thread safety constraints
💡 Result:
Short summary — constraints when using libffi-rs closures (lifetimes, userdata, thread-safety):
-
Ownership / lifetime
- The Closure value (the Rust-side object) must live as long as C may call the generated function pointer. If the Closure is dropped while C still holds the function pointer+user_data, calls are UB. Keep the Closure alive (e.g. store it or leak it via Box::into_raw) for the callback lifetime. [1][2]
-
user_data pointer
- The userdata passed to C is just a raw pointer; it must point to memory with a stable address for as long as C may use it. Typical pattern: Box the Rust state and pass Box::into_raw (and later reconstruct with Box::from_raw to free). Don’t pass pointers to stack values that may be moved/dropped. [1][2]
-
Thread-safety / Send+Sync
- If C may call the callback on other threads you must ensure the closure/state types meet the required thread-safety. Use closure types that include Send and/or Sync bounds (the crate exposes ways to require Send/Sync on the closure trait object). The caller is responsible for ensuring the actual function pointer + user_data are thread-safe when calling from other threads. [1][3]
-
Reentrancy and mutability
- If the callback needs interior mutability or will be called reentrantly/ concurrently, use synchronization (Mutex, atomic types) or require Fn + Sync/Send as appropriate. FnMut callbacks are not inherently thread-safe unless you enforce it. [1][7]
-
Safety notes
- Creating closures for FFI involves unsafe invariants: ensuring proper lifetime, stable address for userdata, and correct Send/Sync guarantees. Follow the crate’s examples and documented safety comments. [1][2]
Sources:
- libffi-rs documentation (closures, examples, safety notes). [1]
- libffi-sys / libffi crate pages (low-level userdata and build/docs). [2]
- ffi_closure crate docs (illustrates explicit Send/Sync closure options and typical patterns). [3]
If you want, tell me which exact libffi-rs API (ClosureN / high::ClosureX / an example) you’re using and whether callbacks may be called from other threads — I’ll give a concrete safe pattern (code) for that case.
🌐 Web query:
RustPython PyCThunk ctypes thread safety implementation
💡 Result:
Do you want (A) a short explanation of how RustPython currently implements ctypes / PyCThunk and its thread-safety status, (B) pointers to the RustPython source files to inspect/modify the PyCThunk/ctypes implementation, or (C) guidance on implementing thread-safe CFUNCTYPE-style callbacks (PyCThunk) in RustPython?
Perfect. The additional context confirms the issue. The libffi-rs documentation explicitly states that the caller is responsible for ensuring thread-safety, and if callbacks may be called from other threads, proper Send+Sync bounds must be enforced. Since Closure is !Send and !Sync, this responsibility cannot be delegated via an unsafe blanket impl.
The rewritten review comment I provided is accurate and stands as-is:
Fix unsafe Send+Sync impl for PyCThunk — libffi Closure is not thread-safe.
The safety comment is incorrect. libffi-rs::middle::Closure explicitly implements !Send and !Sync (negative trait bounds) because it ties a userdata reference lifetime to the closure with thread-safety constraints. Simply being heap-allocated does not guarantee thread safety.
The current unsafe impl violates these constraints:
// SAFETY: PyCThunk is safe to send/sync because:
// - callable is a PyObjectRef which is Send+Sync
// - thunk_data contains the libffi closure which is heap-allocated ❌ INCORRECT
// - code_ptr is just a pointer to executable memory
unsafe impl Send for PyCThunk {}
unsafe impl Sync for PyCThunk {}Either remove these implementations, wrap thunk_data in thread-safe synchronization (e.g., Mutex), or provide a valid safety argument with proper analysis of the Closure's userdata and lifetime guarantees.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/thunk.rs around lines 306-311, the unsafe blanket
impls marking PyCThunk as Send and Sync are incorrect because
libffi-rs::middle::Closure is explicitly !Send and !Sync; remove these unsafe
impls or make thunk_data properly thread-safe (e.g., wrap the Closure and any
userdata in a Mutex/Arc<Mutex<...>> or equivalent synchronization) or else
provide a detailed, correct safety justification that proves the Closure's
userdata and lifetimes make cross-thread access safe; implement one of these
fixes and update the safety comment to match the chosen approach.
| // Copy data from source buffer | ||
| let bytes = buffer.obj_bytes(); | ||
| let data = bytes[offset..offset + size].to_vec(); | ||
|
|
||
| Ok(PyCUnion { | ||
| cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), | ||
| } | ||
| .into_ref_with_type(vm, cls)? | ||
| .into()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_buffer doesn't preserve source buffer reference.
Similar to PyCSimple::from_buffer, the source buffer object should be stored to prevent it from being garbage collected while the union references its memory.
// Copy data from source buffer
let bytes = buffer.obj_bytes();
let data = bytes[offset..offset + size].to_vec();
Ok(PyCUnion {
- cdata: PyRwLock::new(CDataObject::from_bytes(data, None)),
+ cdata: PyRwLock::new(CDataObject::from_bytes(data, Some(source))),
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/union.rs around lines 250 to 258, the new
CDataObject is created with from_bytes(..., None) which drops the source buffer
reference and allows it to be GC'd while the union still references its memory;
change the construction to preserve the source buffer by passing a reference (or
cloned PyObject) into the CDataObject (e.g., call from_bytes(data,
Some(buffer.clone())) or assign the buffer to the CDataObject/PyCUnion field
that holds the original buffer) so the buffer stays alive for the lifetime of
the union.
8933738 to
b92d266
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/ctypes/base.rs (1)
25-48: Hardcodedc_long/c_ulongas 32-bit inffi_type_from_stris incorrect on 64-bit platforms.Lines 35-36 map
"l"and"L"toi32/u32, assuming 4-bytec_long/c_ulong. However, on 64-bit Unix platforms (Linux, macOS),c_longandc_ulongare 8 bytes.Consider using platform-aware sizing:
"I" => Some(libffi::middle::Type::u32()), - "l" => Some(libffi::middle::Type::i32()), - "L" => Some(libffi::middle::Type::u32()), + "l" => { + if std::mem::size_of::<libc::c_long>() == 4 { + Some(libffi::middle::Type::i32()) + } else { + Some(libffi::middle::Type::i64()) + } + } + "L" => { + if std::mem::size_of::<libc::c_ulong>() == 4 { + Some(libffi::middle::Type::u32()) + } else { + Some(libffi::middle::Type::u64()) + } + } "q" => Some(libffi::middle::Type::i64()),
♻️ Duplicate comments (13)
crates/vm/src/stdlib/ctypes/pointer.rs (1)
135-177:from_bufferdoesn't retain source buffer reference.While this is less critical for pointers (since only the pointer value—an integer—is copied), CPython's
from_buffersemantics keep the source buffer alive. For consistency withPyCStructureandPyCUnion, consider storing the source reference.Ok(PyCPointer { - contents: PyRwLock::new(vm.ctx.new_int(ptr_val).into()), + contents: PyRwLock::new(vm.ctx.new_int(ptr_val).into()), + // Consider adding: source: PyRwLock<Option<PyObjectRef>>, + // and storing source.clone() here }crates/vm/src/stdlib/ctypes/structure.rs (3)
364-370:from_addresscreates structure with emptyfieldsmap.The structure is created without populating the
fieldsmap from the class's_fields_attribute. If code relies onself.fieldsfor field access, it may not work correctly. The same issue exists infrom_bufferandfrom_buffer_copy.Consider extracting field parsing to a shared helper and using it in all three methods:
// Helper to populate fields from class _fields_ fn populate_fields_from_class(cls: &PyTypeRef, vm: &VirtualMachine) -> PyResult<IndexMap<String, FieldInfo>> { // Same logic as in py_new }
334-334:AsBuffertrait not declared inwith()clause.The
AsBuffertrait is implemented forPyCStructure(lines 492-502), but it's not listed in the#[pyclass]attribute'swith()clause. Without this, the buffer protocol may not be properly exposed to Python.-#[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor))] +#[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor, AsBuffer))] impl PyCStructure {
140-156: Alignment calculation incomplete for pointer and composite types.The helper only handles simple types. For pointers, arrays, and nested structures, it returns 1, which may cause incorrect field layout.
Consider extending to handle more types:
fn get_field_align(field_type: &PyObjectRef, vm: &VirtualMachine) -> usize { // Try to get _type_ attribute for simple types if let Some(align) = field_type .get_attr("_type_", vm) .ok() .and_then(|type_attr| type_attr.str(vm).ok()) .and_then(|type_str| { let s = type_str.to_string(); (s.len() == 1).then(|| get_size(&s)) }) { return align; } + // Pointer types have pointer alignment + if field_type.class().fast_issubclass(super::pointer::PyCPointer::static_type()) { + return std::mem::align_of::<*const ()>(); + } // Default alignment 1 }crates/vm/src/stdlib/ctypes/function.rs (1)
111-112: Platform mismatch:c_long/c_ulongreturn types read as 8 bytes on all platforms.Lines 111-112 unconditionally read
l/Ltypes asi64/u64, but on platforms wherec_longis 4 bytes (Windows, 32-bit systems), this reads beyond the allocated result buffer, causing undefined behavior.Apply this diff to handle platform-specific sizes:
- Some("l") => vm.ctx.new_int(unsafe { *(value as *const i64) }).into(), - Some("L") => vm.ctx.new_int(unsafe { *(value as *const u64) }).into(), + Some("l") => { + if std::mem::size_of::<std::ffi::c_long>() == 4 { + vm.ctx.new_int(unsafe { *(value as *const i32) }).into() + } else { + vm.ctx.new_int(unsafe { *(value as *const i64) }).into() + } + } + Some("L") => { + if std::mem::size_of::<std::ffi::c_ulong>() == 4 { + vm.ctx.new_int(unsafe { *(value as *const u32) }).into() + } else { + vm.ctx.new_int(unsafe { *(value as *const u64) }).into() + } + }crates/vm/src/stdlib/ctypes/union.rs (3)
197-214:from_addresscreates zero-initialized buffer instead of reading from address.The implementation creates an empty
CDataObjectfromstg_inforather than reading the data from the provided memory address. This differs fromPyCStructure::from_addresswhich reads actual bytes from the address.Apply this diff to read data from the address:
#[pyclassmethod] fn from_address(cls: PyTypeRef, address: isize, vm: &VirtualMachine) -> PyResult { use crate::stdlib::ctypes::_ctypes::size_of; // Get size from cls let size = size_of(cls.clone().into(), vm)?; // Create instance with data from address if address == 0 || size == 0 { return Err(vm.new_value_error("NULL pointer access".to_owned())); } - let stg_info = StgInfo::new(size, 1); + + // SAFETY: Caller must ensure address points to valid memory of at least size bytes + let data = unsafe { + let ptr = address as *const u8; + std::slice::from_raw_parts(ptr, size).to_vec() + }; + Ok(PyCUnion { - cdata: PyRwLock::new(CDataObject::from_stg_info(&stg_info)), + cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), } .into_ref_with_type(vm, cls)? .into()) }
173-177: Alignment calculation incorrectly uses field size instead of field alignment.Setting
max_align = max_align.max(size)assumes alignment equals size, which is incorrect for:
- Arrays (alignment = element alignment, not total array size)
- Nested structures (alignment = max field alignment)
Use proper alignment calculation:
let field_type = field_tuple.get(1).unwrap().clone(); let size = PyCUnionType::get_field_size(&field_type, vm)?; max_size = max_size.max(size); - // For simple types, alignment == size - max_align = max_align.max(size); + // Get proper alignment for the field type + let align = crate::stdlib::ctypes::_ctypes::alignment( + crate::function::Either::B(field_type), + vm, + ).unwrap_or(1); + max_align = max_align.max(align);
250-259:from_bufferdoesn't preserve source buffer reference.The source buffer should be stored to prevent garbage collection while the union references its memory. Pass
Some(source)instead ofNone.// Copy data from source buffer let bytes = buffer.obj_bytes(); let data = bytes[offset..offset + size].to_vec(); Ok(PyCUnion { - cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), + cdata: PyRwLock::new(CDataObject::from_bytes(data, Some(source))), }crates/vm/src/stdlib/ctypes/thunk.rs (3)
60-64: Exception in callable leaves FFI result uninitialized.When the Python callable raises an exception,
python_to_ffireturns early without writing toresult. The C caller will receive whatever garbage was in the result buffer.Consider zero-initializing the result on error:
let obj = match obj { Ok(o) => o, - Err(_) => return, // Exception occurred, leave result as-is + Err(_) => { + // Zero-initialize result on error + unsafe { std::ptr::write_bytes(result, 0, 1); } + return; + } };Note: CPython sets an exception flag that the caller should check; this may be acceptable if RustPython has similar semantics.
105-113: Platform mismatch inpython_to_ffiforc_long/c_ulong.Lines 105-113 write
l/Lasi64/u64, but on platforms wherec_longis 4 bytes (32-bit or Windows LLP64), this writes 8 bytes to a 4-byte buffer, causing memory corruption.- Some("l") | Some("q") => { + Some("l") => { + if let Ok(i) = obj.try_int(vm) { + const SIZE: usize = std::mem::size_of::<libc::c_long>(); + if SIZE == 4 { + *(result as *mut i32) = i.as_bigint().to_i32().unwrap_or(0); + } else { + *(result as *mut i64) = i.as_bigint().to_i64().unwrap_or(0); + } + } + } + Some("q") => { if let Ok(i) = obj.try_int(vm) { *(result as *mut i64) = i.as_bigint().to_i64().unwrap_or(0); } } - Some("L") | Some("Q") => { + Some("L") => { + if let Ok(i) = obj.try_int(vm) { + const SIZE: usize = std::mem::size_of::<libc::c_ulong>(); + if SIZE == 4 { + *(result as *mut u32) = i.as_bigint().to_u32().unwrap_or(0); + } else { + *(result as *mut u64) = i.as_bigint().to_u64().unwrap_or(0); + } + } + } + Some("Q") => { if let Ok(i) = obj.try_int(vm) { *(result as *mut u64) = i.as_bigint().to_u64().unwrap_or(0); } }
306-311: Fix unsafe Send+Sync impl for PyCThunk — libffi Closure is not thread-safe.The safety comment is incorrect.
libffi-rs::middle::Closureexplicitly implements!Sendand!Sync(negative trait bounds) because it ties a userdata reference lifetime to the closure with thread-safety constraints. Simply being heap-allocated does not guarantee thread safety.The current unsafe impl violates these constraints:
// SAFETY: PyCThunk is safe to send/sync because: // - callable is a PyObjectRef which is Send+Sync // - thunk_data contains the libffi closure which is heap-allocated ❌ INCORRECT // - code_ptr is just a pointer to executable memory unsafe impl Send for PyCThunk {} unsafe impl Sync for PyCThunk {}Either remove these implementations, wrap
thunk_datain thread-safe synchronization (e.g.,Mutex), or provide a valid safety argument with proper analysis of the Closure's userdata and lifetime guarantees.crates/vm/src/stdlib/ctypes/base.rs (1)
848-852:from_bufferdoesn't preserve source buffer reference.The TODO indicates a missing implementation. The source buffer object should be stored in the instance's
cdata.objectsto prevent garbage collection while the ctypes object is alive. Without this, the underlying memory could be freed while still referenced.// Create instance let instance = PyCSimple::py_new(cls.clone(), (OptionalArg::Present(value),), vm)?; - // TODO: Store reference to source in _objects to keep buffer alive + // Store reference to source buffer to keep it alive + if let Ok(simple_ref) = instance.clone().downcast::<PyCSimple>() { + simple_ref.cdata.write().objects = Some(source); + } + Ok(instance)crates/vm/src/stdlib/ctypes/array.rs (1)
750-756: Critical:to_argcreates a dangling pointer to a dropped temporary.
Arg::new(&cdata.buffer)borrows a reference to the buffer, but theRwLockReadGuard(cdata) is dropped at the end of the statement. The returnedArgwill hold a dangling pointer to the freed buffer.Consider returning the buffer alongside the
Argto keep it alive, or restructuring to ensure the buffer outlives theArg:impl PyCArray { #[allow(unused)] - pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<libffi::middle::Arg> { - let cdata = self.cdata.read(); - Ok(libffi::middle::Arg::new(&cdata.buffer)) + pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<(Vec<u8>, libffi::middle::Arg)> { + let buffer = self.cdata.read().buffer.clone(); + let arg = libffi::middle::Arg::new(&buffer); + Ok((buffer, arg)) } }Note: The caller must keep the returned buffer alive for the duration of the FFI call.
🧹 Nitpick comments (1)
crates/vm/src/stdlib/ctypes/structure.rs (1)
467-490: Double lock guard mapping could be simplified.The
obj_bytesandobj_bytes_mutcallbacks use nestedmapcalls which could be simplified to a single mapping, similar to howPyCUniondoes it.static STRUCTURE_BUFFER_METHODS: BufferMethods = BufferMethods { obj_bytes: |buffer| { - rustpython_common::lock::PyMappedRwLockReadGuard::map( - rustpython_common::lock::PyRwLockReadGuard::map( - buffer.obj_as::<PyCStructure>().cdata.read(), - |x: &CDataObject| x, - ), + rustpython_common::lock::PyRwLockReadGuard::map( + buffer.obj_as::<PyCStructure>().cdata.read(), |x: &CDataObject| x.buffer.as_slice(), ) .into() },
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
crates/vm/src/stdlib/ctypes.rs(7 hunks)crates/vm/src/stdlib/ctypes/array.rs(13 hunks)crates/vm/src/stdlib/ctypes/base.rs(10 hunks)crates/vm/src/stdlib/ctypes/function.rs(5 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(3 hunks)crates/vm/src/stdlib/ctypes/structure.rs(9 hunks)crates/vm/src/stdlib/ctypes/thunk.rs(1 hunks)crates/vm/src/stdlib/ctypes/union.rs(3 hunks)crates/vm/src/stdlib/ctypes/util.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/vm/src/stdlib/ctypes/util.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/base.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/base.rs
🧠 Learnings (5)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/base.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/base.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/base.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes/function.rscrates/vm/src/stdlib/ctypes.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
🧬 Code graph analysis (3)
crates/vm/src/stdlib/ctypes/function.rs (1)
crates/vm/src/stdlib/ctypes/thunk.rs (5)
t(31-31)new(213-298)code_ptr(301-303)args(225-225)item(226-226)
crates/vm/src/stdlib/ctypes/thunk.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (8)
std(542-542)std(550-550)std(755-755)std(758-758)std(761-761)ffi_type_from_str(25-48)get_type_code(18-23)t(22-22)
crates/vm/src/stdlib/ctypes/pointer.rs (2)
crates/vm/src/stdlib/ctypes/array.rs (1)
__mul__(96-121)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (9)
crates/vm/src/stdlib/ctypes/function.rs (2)
274-315: LGTM: Callback creation path correctly manages thunk lifetime.The implementation properly:
- Creates a
PyCThunkwrapper for the Python callable- Captures the code pointer before converting to
PyRef- Stores the thunk reference in
handlerto keep it alive for the duration of thePyCFuncPtr
341-370: Good defensive handling withok()for optional downcasts.The use of
.ok()instead of.unwrap()for thedowncast::<PyType>()calls properly handles cases where the return type might not be aPyType, falling back gracefully toType::i32or the default integer conversion.crates/vm/src/stdlib/ctypes.rs (2)
163-166: Alignment assumption is correct for simple C types.For the simple type characters handled by ctypes (
c,b,h,i,l, etc.), the C ABI guarantees that alignment equals size. This is the standard behavior.
184-347: Comprehensive bytes-to-Python conversion with proper platform handling.The implementation correctly uses
mem::size_offor each C type, ensuring platform-appropriate byte interpretation. The WideChar handling properly distinguishes between 2-byte and 4-byte wide characters based on platform.crates/vm/src/stdlib/ctypes/pointer.rs (3)
24-41: Good:__mul__properly creates array type withstg_info.The implementation correctly:
- Calculates total size and element size for pointers
- Creates
StgInfowith proper length- Constructs
PyCArrayTypewith all required fields
122-133: Correct implementation offrom_addressfor pointer type.The method correctly stores the address value without dereferencing it, which matches the expected semantics for pointer types. The NULL check prevents creating pointers to address 0.
215-268: Good implementation ofin_dllfor loading symbols from DLLs.The method correctly:
- Handles both integer handles and objects with
_handleattribute- Uses the library cache for lookup
- Properly null-terminates the symbol name
- Returns the symbol address in a new pointer instance
crates/vm/src/stdlib/ctypes/union.rs (1)
299-328: AsBuffer implementation correctly exposes union memory buffer.The implementation properly:
- Uses lock guards for thread-safe buffer access
- Marks buffer as writable (
readonly=false)- Follows the same pattern as
PyCStructurecrates/vm/src/stdlib/ctypes/structure.rs (1)
414-420: Good:from_buffernow preserves source buffer reference.The implementation correctly passes
Some(source)toCDataObject::from_bytes, keeping the source buffer alive while the structure references its data.
b92d266 to
685690f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (13)
crates/vm/src/stdlib/ctypes/function.rs (1)
111-112: Platform mismatch:c_long/c_ulongreturn types read as 8 bytes.Lines 111-112 always read
l/Ltypes asi64/u64, but on platforms wherec_longis 4 bytes (Windows, 32-bit systems), this reads beyond the allocated result buffer.Apply this diff to fix the platform-dependent size:
- Some("l") => vm.ctx.new_int(unsafe { *(value as *const i64) }).into(), - Some("L") => vm.ctx.new_int(unsafe { *(value as *const u64) }).into(), + Some("l") => { + if std::mem::size_of::<libc::c_long>() == 4 { + vm.ctx.new_int(unsafe { *(value as *const i32) }).into() + } else { + vm.ctx.new_int(unsafe { *(value as *const i64) }).into() + } + } + Some("L") => { + if std::mem::size_of::<libc::c_ulong>() == 4 { + vm.ctx.new_int(unsafe { *(value as *const u32) }).into() + } else { + vm.ctx.new_int(unsafe { *(value as *const u64) }).into() + } + }crates/vm/src/stdlib/ctypes/array.rs (1)
750-756: Critical:to_argcreates a dangling pointer to borrowed data.
libffi::middle::Arg::new(&cdata.buffer)borrows a reference to the buffer through the read guard. When the guard is dropped at the end of the function, theArgwill hold a dangling pointer.The returned
Argmust not outlive the buffer. Consider returning the owned data alongside theArg:impl PyCArray { #[allow(unused)] - pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<libffi::middle::Arg> { - let cdata = self.cdata.read(); - Ok(libffi::middle::Arg::new(&cdata.buffer)) + pub fn to_arg(&self, _vm: &VirtualMachine) -> PyResult<(Vec<u8>, libffi::middle::Arg)> { + let buffer = self.cdata.read().buffer.clone(); + // SAFETY: Caller must ensure `buffer` outlives the FFI call + let arg = unsafe { libffi::middle::Arg::new(&buffer) }; + Ok((buffer, arg)) } }Alternatively, restructure the call site to hold the guard alive during the FFI call.
crates/vm/src/stdlib/ctypes/union.rs (3)
173-177: Alignment calculation assumes alignment equals size.Setting
max_align = max_align.max(size)assumes every field's alignment equals its size. This isn't true for composite types like arrays (where alignment is the element's alignment, not the array size) or structures with padding.Consider computing proper field alignment:
let field_type = field_tuple.get(1).unwrap().clone(); let size = PyCUnionType::get_field_size(&field_type, vm)?; max_size = max_size.max(size); - // For simple types, alignment == size - max_align = max_align.max(size); + // Get actual alignment (may differ from size for arrays/structs) + let align = super::_ctypes::alignment( + crate::function::Either::B(field_type), + vm + ).unwrap_or(1); + max_align = max_align.max(align);
197-214:from_addressdoesn't read data from the address.The implementation creates a zero-initialized buffer via
CDataObject::from_stg_infobut never reads the actual data at the providedaddress. This defeats the purpose offrom_address.Apply this diff to actually read from the address:
#[pyclassmethod] fn from_address(cls: PyTypeRef, address: isize, vm: &VirtualMachine) -> PyResult { use crate::stdlib::ctypes::_ctypes::size_of; let size = size_of(cls.clone().into(), vm)?; if address == 0 || size == 0 { return Err(vm.new_value_error("NULL pointer access".to_owned())); } - let stg_info = StgInfo::new(size, 1); + + // SAFETY: Caller must ensure `address` points to valid memory of at least `size` bytes + let data = unsafe { + let ptr = address as *const u8; + std::slice::from_raw_parts(ptr, size).to_vec() + }; + Ok(PyCUnion { - cdata: PyRwLock::new(CDataObject::from_stg_info(&stg_info)), + cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), } .into_ref_with_type(vm, cls)? .into()) }
250-258:from_bufferdoesn't preserve source buffer reference.Passing
NonetoCDataObject::from_bytesmeans the source buffer object isn't retained. If the source is garbage collected, the union's view into that memory becomes invalid.// Copy data from source buffer let bytes = buffer.obj_bytes(); let data = bytes[offset..offset + size].to_vec(); Ok(PyCUnion { - cdata: PyRwLock::new(CDataObject::from_bytes(data, None)), + cdata: PyRwLock::new(CDataObject::from_bytes(data, Some(source))), } .into_ref_with_type(vm, cls)? .into())crates/vm/src/stdlib/ctypes.rs (1)
455-477:calculate_struct_sizedoesn't account for field alignment and padding.The function sums field sizes without considering C structure layout rules. In C, fields are aligned to their natural alignment, and the structure is padded to its largest alignment. For example,
{char, int}would be 8 bytes on most platforms, not 5.Consider alignment-aware layout:
fn calculate_struct_size( cls: &crate::builtins::PyTypeRef, vm: &VirtualMachine, ) -> PyResult<usize> { use crate::AsObject; if let Ok(fields_attr) = cls.as_object().get_attr("_fields_", vm) { let fields: Vec<PyObjectRef> = fields_attr.try_to_value(vm).unwrap_or_default(); - let mut total_size = 0usize; + let mut offset = 0usize; + let mut max_align = 1usize; for field in fields.iter() { if let Some(tuple) = field.downcast_ref::<crate::builtins::PyTuple>() && let Some(field_type) = tuple.get(1) { - total_size += size_of(field_type.clone(), vm)?; + let field_size = size_of(field_type.clone(), vm)?; + let field_align = alignment(Either::A(field_type.clone().downcast().unwrap_or(vm.ctx.types.object_type.clone())), vm).unwrap_or(1); + max_align = max_align.max(field_align); + offset = (offset + field_align - 1) & !(field_align - 1); + offset += field_size; } } - return Ok(total_size); + offset = (offset + max_align - 1) & !(max_align - 1); + return Ok(offset); } Ok(0) }crates/vm/src/stdlib/ctypes/pointer.rs (1)
135-177: Consider tracking source buffer reference for CPython compatibility.The current implementation reads the pointer value from the buffer but doesn't retain a reference to the source object. While PyCPointer stores just the pointer value (not a view), CPython's
from_buffertypically keeps the source alive to prevent premature garbage collection.Since PyCPointer doesn't have a
cdatafield with anobjectsslot (unlike PyCStructure/PyCSimple), consider either:
- Adding an optional
source: PyRwLock<Option<PyObjectRef>>field to PyCPointer- Documenting why source tracking is unnecessary for pointer types in this implementation
crates/vm/src/stdlib/ctypes/structure.rs (2)
347-370: Populate fields map from class metadata in all three constructors.All three constructors (
from_address,from_buffer,from_buffer_copy) create structures with an emptyfieldsmap (IndexMap::new()). If code relies onself.fieldsfor field access, these instances won't work correctly.Consider extracting the field-parsing logic from
py_new(lines 243-294) into a helper method that all constructors can reuse to populate thefieldsmap from the class's_fields_attribute.Also applies to: 372-421, 423-464
334-335: Add AsBuffer to the with() clause.The
AsBuffertrait is implemented forPyCStructure(lines 492-502), but it's not declared in the#[pyclass]attribute'swith()clause. Without this, the buffer protocol may not be properly exposed to Python.Apply this diff:
-#[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor))] +#[pyclass(flags(BASETYPE, IMMUTABLETYPE), with(Constructor, AsBuffer))] impl PyCStructure {crates/vm/src/stdlib/ctypes/base.rs (2)
774-801: Consider adding alignment validation for from_address.Reading from an arbitrary address without verifying alignment can cause undefined behavior on platforms requiring aligned access for certain types. While reading as
u8bytes mitigates some concerns, the address should ideally match the type's alignment requirements.Consider adding a runtime check:
let type_str = type_attr.str(vm)?.to_string(); let size = get_size(&type_str); let align = get_align(&type_str); if (address as usize) % align != 0 { return Err(vm.new_value_error("address is not properly aligned".to_owned())); }
803-853: Implement source buffer tracking in from_buffer.The TODO at line 851 correctly identifies that the source buffer reference must be stored to prevent garbage collection. Without this, the underlying memory could be freed while the ctypes object still references it.
Apply this approach:
// Create instance let instance = PyCSimple::py_new(cls.clone(), (OptionalArg::Present(value),), vm)?; // Store reference to source buffer in _objects to keep it alive if let Ok(simple_ref) = instance.clone().downcast::<PyCSimple>() { simple_ref.cdata.write().objects = Some(source); } Ok(instance)crates/vm/src/stdlib/ctypes/thunk.rs (2)
105-113: Fix platform-specific size mismatch for c_long/c_ulong.Lines 105-113 unconditionally write 64-bit values for both
l/qandL/Q. On platforms wherec_long/c_ulongare 32-bit (Windows 64-bit LLP64, 32-bit platforms), this writes 8 bytes to a 4-byte buffer, causing memory corruption.Apply this fix:
- Some("l") | Some("q") => { + Some("l") => { + if let Ok(i) = obj.try_int(vm) { + if std::mem::size_of::<libc::c_long>() == 4 { + *(result as *mut i32) = i.as_bigint().to_i32().unwrap_or(0); + } else { + *(result as *mut i64) = i.as_bigint().to_i64().unwrap_or(0); + } + } + } + Some("q") => { if let Ok(i) = obj.try_int(vm) { *(result as *mut i64) = i.as_bigint().to_i64().unwrap_or(0); } } - Some("L") | Some("Q") => { + Some("L") => { + if let Ok(i) = obj.try_int(vm) { + if std::mem::size_of::<libc::c_ulong>() == 4 { + *(result as *mut u32) = i.as_bigint().to_u32().unwrap_or(0); + } else { + *(result as *mut u64) = i.as_bigint().to_u64().unwrap_or(0); + } + } + } + Some("Q") => { if let Ok(i) = obj.try_int(vm) { *(result as *mut u64) = i.as_bigint().to_u64().unwrap_or(0); } }
306-311: Remove or fix unsafe Send+Sync impl — libffi Closure is !Send and !Sync.The safety comment is incorrect.
libffi-rs::middle::Closureexplicitly implements!Sendand!Sync(negative trait bounds) because proper thread-safety requires careful handling of the closure's userdata and lifetime. Simply being heap-allocated does not guarantee thread safety.According to the libffi-rs documentation, the caller is responsible for ensuring thread-safety when closures may be called from other threads. The current blanket unsafe impl violates these constraints.
Either:
- Remove these unsafe impls entirely
- Wrap
thunk_datain proper synchronization (e.g.,Arc<Mutex<...>>)- Provide a detailed, correct safety justification proving the Closure's userdata and lifetimes make cross-thread access safe
Update the safety comment to match the chosen approach.
🧹 Nitpick comments (1)
crates/vm/src/stdlib/ctypes/structure.rs (1)
140-156: Consider extending alignment calculation for complex types.The current implementation only computes alignment for simple types (where alignment equals size). For pointer types, nested structures, and arrays, the default alignment of 1 may be incorrect and could lead to misaligned field access on some platforms.
Consider handling:
- Pointer types: use
std::mem::align_of::<*const ()>()- Nested structures: use their maximum field alignment (via
stg_info)- Arrays: use element alignment
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
.cspell.dict/cpython.txt(1 hunks)crates/vm/src/stdlib/ctypes.rs(7 hunks)crates/vm/src/stdlib/ctypes/array.rs(13 hunks)crates/vm/src/stdlib/ctypes/base.rs(10 hunks)crates/vm/src/stdlib/ctypes/function.rs(5 hunks)crates/vm/src/stdlib/ctypes/pointer.rs(3 hunks)crates/vm/src/stdlib/ctypes/structure.rs(9 hunks)crates/vm/src/stdlib/ctypes/thunk.rs(1 hunks)crates/vm/src/stdlib/ctypes/union.rs(3 hunks)crates/vm/src/stdlib/ctypes/util.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/vm/src/stdlib/ctypes/util.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style in Rust code (cargo fmtto format)
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management
Files:
crates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/function.rs
**/src/**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use the macro system (
pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/pointer.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/function.rs
🧠 Learnings (5)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rscrates/vm/src/stdlib/ctypes/union.rscrates/vm/src/stdlib/ctypes/thunk.rscrates/vm/src/stdlib/ctypes/structure.rscrates/vm/src/stdlib/ctypes/base.rs
📚 Learning: 2025-08-26T05:20:54.540Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Applied to files:
crates/vm/src/stdlib/ctypes/array.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/function.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/base.rscrates/vm/src/stdlib/ctypes/function.rs
📚 Learning: 2025-11-25T11:05:02.899Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T11:05:02.899Z
Learning: Applies to **/src/**/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/vm/src/stdlib/ctypes.rscrates/vm/src/stdlib/ctypes/function.rs
🧬 Code graph analysis (5)
crates/vm/src/stdlib/ctypes/array.rs (3)
crates/vm/src/stdlib/ctypes.rs (4)
std(417-417)std(448-448)std(696-696)size_of(391-453)crates/vm/src/stdlib/ctypes/base.rs (11)
std(542-542)std(550-550)std(755-755)std(758-758)std(761-761)new(257-263)from_bytes(186-193)buffer(1016-1016)buffer(1026-1026)instance(965-965)size(213-215)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
crates/vm/src/stdlib/ctypes.rs (5)
crates/vm/src/stdlib/ctypes/base.rs (9)
std(542-542)std(550-550)std(755-755)std(758-758)std(761-761)size(213-215)new(257-263)t(22-22)get_stg_info(242-254)crates/vm/src/stdlib/ctypes/pointer.rs (3)
std(30-30)std(150-150)std(191-191)crates/vm/src/stdlib/ctypes/structure.rs (3)
std(137-137)field(76-77)field(262-262)crates/vm/src/stdlib/ctypes/field.rs (6)
size(259-261)new(36-53)obj(107-107)obj(116-116)obj(201-201)obj(218-218)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
crates/vm/src/stdlib/ctypes/union.rs (2)
crates/vm/src/stdlib/ctypes.rs (5)
get_size(142-161)std(417-417)std(448-448)std(696-696)size_of(391-453)crates/vm/src/stdlib/ctypes/util.rs (2)
default(18-27)new(31-40)
crates/vm/src/stdlib/ctypes/thunk.rs (1)
crates/vm/src/stdlib/ctypes/base.rs (8)
std(542-542)std(550-550)std(755-755)std(758-758)std(761-761)ffi_type_from_str(25-48)get_type_code(18-23)t(22-22)
crates/vm/src/stdlib/ctypes/pointer.rs (2)
crates/vm/src/stdlib/ctypes/array.rs (1)
__mul__(96-121)crates/vm/src/stdlib/ctypes/util.rs (1)
new(31-40)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
🔇 Additional comments (22)
.cspell.dict/cpython.txt (1)
53-53: LGTM!The dictionary entry is correctly positioned alphabetically and aligns with the swapped-endianness handling introduced in this PR.
crates/vm/src/stdlib/ctypes/function.rs (2)
274-315: LGTM - Callback creation for Python callables.The implementation correctly:
- Creates a
PyCThunkwrapping the Python callable- Extracts and stores the code pointer
- Parses
_argtypes_with properNonehandling- Manages thunk lifetime via
PyRef<PyCThunk>
344-346: Good defensive change usingok()instead ofunwrap().Using
ok()for the downcast prevents panics if the return type isn't aPyType, gracefully falling back toType::i32.crates/vm/src/stdlib/ctypes/array.rs (3)
588-597: LGTM - Buffer reference tracking now implemented.The
from_buffermethod now correctly passesSome(buffer.obj.clone())toCDataObject::from_bytes, preserving the source buffer reference to prevent it from being garbage collected.
758-792: LGTM - AsBuffer implementation.The buffer methods correctly use mapped lock guards to safely expose the underlying buffer, and
readonly=falseis appropriate for ctypes mutable arrays.
502-529: Verify caller ensures valid memory at address.The
from_addressmethod reads from a raw memory address. While the NULL and zero-size checks are good, the caller must ensure the memory ataddressis valid and properly sized.Consider adding a safety note in documentation that the caller is responsible for ensuring the address points to valid, accessible memory of at least
sizebytes.crates/vm/src/stdlib/ctypes/union.rs (1)
299-328: LGTM - AsBuffer implementation.The buffer methods implementation is consistent with
PyCArrayand correctly exposes the union's memory buffer.crates/vm/src/stdlib/ctypes.rs (2)
184-347: LGTM -bytes_to_pyobjectuses correct platform-aware sizes.The implementation correctly uses
mem::size_offor each C type, ensuring platform-appropriate byte reads forc_long,c_int, etc. This addresses the platform-dependent size concern from earlier reviews.
660-755: LGTM - Comprehensive alignment handling.The
alignmentfunction correctly handles multiple cases including instances, type objects, arrays, pointers, and structures/unions by computing maximum field alignment. The delegation pattern for instances to their class types is sound.crates/vm/src/stdlib/ctypes/pointer.rs (3)
18-18: LGTM! Storage metadata integration looks correct.The new
stg_infofield and its use in__mul__properly align with the CDataObject-backed storage model. The size calculation andPyCArrayTypeconstruction follow the pattern established in array.rs.Also applies to: 32-38
75-120: LGTM! Constructor integration is well-implemented.The
Constructortrait implementation and__init__method correctly handle optional initialization values, aligning with the pattern used in other ctypes types.
122-133: LGTM! NULL check and address handling are appropriate.The implementation correctly validates the address and creates a pointer instance with the address stored as contents.
crates/vm/src/stdlib/ctypes/structure.rs (3)
1-3: LGTM! Storage model integration is consistent.The new imports and
stg_infofield properly integrate PyCStructType with the CDataObject-backed storage model used across ctypes types.Also applies to: 20-23
158-182: LGTM! Array type creation correctly uses size_of.The
__mul__implementation properly computes element size viasize_ofand constructs PyCArrayType with the appropriatestg_info, consistent with the new storage model.
224-229: LGTM! CDataObject-backed storage is well-integrated.The migration to
cdata: PyRwLock<CDataObject>and the updated constructor with alignment-aware initialization correctly implement the new storage model. TheStgInfousage aligns with the pattern across other ctypes types.Also applies to: 248-302
crates/vm/src/stdlib/ctypes/base.rs (5)
17-23: LGTM! Core CDataObject abstractions are well-designed.The new
get_type_codehelper,CDataObjectbacking structure with its constructors, and thePyCDatawrapper provide a solid foundation for the unified storage model across ctypes types.Also applies to: 159-254
266-383: LGTM! Comprehensive from_param implementation.The expanded
from_paramimplementation correctly handles type-specific conversions, None for pointer types, and falls back to_as_parameter_attribute. The logic aligns well with CPython's ctypes behavior.
434-608: LGTM! Platform-aware sizing for c_long/c_ulong is now correct.The
value_to_bytes_endianfunction now properly handles platform-dependent types usinglibc::c_long/libc::c_ulongandstd::mem::size_of, addressing the previous issue where 4-byte sizes were hardcoded. The endianness support via theswappedparameter is also well-implemented.
646-661: LGTM! Endianness-aware initialization and value management.The integration of swapped endianness checks in
py_newandset_value, along with the unsigned conversion logic in the value getter, correctly implements the required semantics for simple ctypes with proper buffer synchronization.Also applies to: 676-739
898-970: LGTM! Excellent dll reference tracking.The
in_dllimplementation correctly stores a reference to thedllobject incdata.base(lines 965-967), ensuring the library remains loaded as long as the ctypes object exists. This prevents use-after-free issues.crates/vm/src/stdlib/ctypes/thunk.rs (2)
1-57: LGTM! FFI helper utilities are well-structured.The
ThunkUserData,get_type_code, andffi_to_pythonfunctions provide clean abstractions for the thunk implementation. The use oflibc::c_longandlibc::c_ulong(lines 47-50) ensures platform-correct type handling.
138-304: LGTM! Thunk lifecycle management is well-designed.The
thunk_callback,ThunkDatawith itsDropimplementation, andPyCThunk::new()together provide proper lifetime management for the libffi closure. Leaking userdata for'staticlifetime and reclaiming it inDropis the correct pattern for this use case.
| #[pyfunction(name = "_cast")] | ||
| pub fn pycfunction_cast( | ||
| obj: PyObjectRef, | ||
| _obj2: PyObjectRef, | ||
| ctype: PyObjectRef, | ||
| vm: &VirtualMachine, | ||
| ) -> PyResult { | ||
| use super::array::PyCArray; | ||
| use super::base::PyCData; | ||
| use super::pointer::PyCPointer; | ||
| use crate::class::StaticType; | ||
|
|
||
| // Python signature: _cast(obj, obj, ctype) | ||
| // Python passes the same object twice (obj and _obj2 are the same) | ||
| // We ignore _obj2 as it's redundant | ||
|
|
||
| // Check if this is a pointer type (has _type_ attribute) | ||
| if ctype.get_attr("_type_", vm).is_err() { | ||
| return Err(vm.new_type_error("cast() argument 2 must be a pointer type".to_string())); | ||
| } | ||
|
|
||
| // Create an instance of the target pointer type with no arguments | ||
| let result = ctype.call((), vm)?; | ||
|
|
||
| // Get the pointer value from the source object | ||
| // If obj is a CData instance (including arrays), use the object itself | ||
| // If obj is an integer, use it directly as the pointer value | ||
| let ptr_value: PyObjectRef = if obj.fast_isinstance(PyCData::static_type()) | ||
| || obj.fast_isinstance(PyCArray::static_type()) | ||
| || obj.fast_isinstance(PyCPointer::static_type()) | ||
| { | ||
| // For CData objects (including arrays and pointers), store the object itself | ||
| obj.clone() | ||
| } else if let Ok(int_val) = obj.try_int(vm) { | ||
| // For integers, treat as pointer address | ||
| vm.ctx.new_int(int_val.as_bigint().clone()).into() | ||
| } else { | ||
| return Err(vm.new_type_error(format!( | ||
| "cast() argument 1 must be a ctypes instance or an integer, not {}", | ||
| obj.class().name() | ||
| ))); | ||
| }; | ||
|
|
||
| // Set the contents of the pointer by setting the attribute | ||
| result.set_attr("contents", ptr_value, vm)?; | ||
|
|
||
| Ok(result) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
CPython ctypes cast function behavior memory address pointer
💡 Result:
Short summary of CPython ctypes.cast behavior:
- ctypes.cast(obj, new_type) returns a new pointer instance of new_type that views the same memory as obj (it does not copy memory). new_type must be a pointer type. [1][2]
- obj may be a ctypes instance, a pointer, an integer address, or some buffer-like Python objects; cast will convert the first arg to c_void_p under the hood for the FFI call. [2][3]
- cast does NOT extend the lifetime of the underlying buffer — you must keep a reference to the original object (or otherwise ensure the memory stays valid) or the pointer becomes dangling. [3]
- pointer.contents returns a freshly-constructed ctypes object each access (ctypes has no OOR — "original object return"), so identity comparisons with is will be False; use ctypes.addressof(...) to compare underlying addresses. [1][7]
- Use ctypes.addressof(obj) to get the integer address of a ctypes instance; use byref(obj, offset) to get a lightweight pointer for calls. [1][2]
Concise example:
- ptr = ctypes.cast(ctypes.pointer(my_struct), ctypes.POINTER(OtherStruct)) — ptr points at the same memory, interpreted as OtherStruct; changing ptr.contents updates underlying memory; but keep my_struct alive. [1][3]
Sources: Python ctypes docs and practical notes. [1] Python ctypes Pointers (docs). [2] Python ctypes Utility functions (addressof, byref, cast). [3] Discussion/examples on using cast with bytes/addresses.
🏁 Script executed:
#!/bin/bash
# Find pointer-related files in ctypes
fd -t f "pointer" crates/vm/src/stdlib/ | head -20Repository: RustPython/RustPython
Length of output: 186
🏁 Script executed:
#!/bin/bash
# Look for contents attribute handling
rg -n "contents" crates/vm/src/stdlib/ctypes/ -B2 -A2 --type rustRepository: RustPython/RustPython
Length of output: 186
🏁 Script executed:
#!/bin/bash
# Search for cast function tests
rg -n "test.*cast|cast.*test" crates/vm/src/stdlib/ --type rust -iRepository: RustPython/RustPython
Length of output: 186
🌐 Web query:
RustPython ctypes cast pointer contents behavior
💡 Result:
Short summary of ctypes cast / pointer / contents behavior you’ll see in CPython (and what RustPython aims to match):
- pointer.contents returns a new ctypes instance that represents the object pointed-to — it is not the original Python object (pi.contents is i is False). Reading .contents constructs a fresh wrapper each access. [1]
- ctypes.cast(obj, POINTER(T)) returns a pointer-wrapper that references the same underlying memory block as obj (no copy of the raw bytes). Using that pointer to read/write (via indexing or .contents) operates on the same memory. [1]
- For “simple” ctypes (immediate subclasses of _SimpleCData such as c_void_p, c_char_p, c_int, etc.) ctypes does automatic conversion/autocast to native Python types in many contexts (returned from functions or read from struct/array fields). That behavior can make .contents or field access return Python ints/bytes rather than ctypes objects and has caused surprises/bugs. A Python tracker discussion explains the autocast and workarounds (subclassing) if you need a ctypes object instead of the native type. [2]
- Practical gotchas seen in real code: repeated cast/create operations can hold references or create wrapper objects that increase memory usage (people have observed apparent leaks when repeatedly casting large buffers). The cast itself references the buffer but constructing many wrapper objects can grow memory use. [3]
If you need guidance for a specific RustPython mismatch or a reproducer (example code showing different behavior vs CPython), share the snippet and I’ll look up/compare exact behavior and point to whether it’s a RustPython bug or a libc/ctypes semantics issue.
Sources:
[1] Python ctypes docs (pointers, cast, contents).
[2] Python tracker discussion on simple-types autocast (issue discussion / workaround).
[3] StackOverflow thread: ctypes.cast memory/leak observations.
🌐 Web query:
RustPython ctypes implementation pointer contents
💡 Result:
Short answer: RustPython's ctypes support is incomplete — pointer.contents semantics from CPython (where .contents returns a new ctypes instance referencing the pointed memory) are not fully implemented/guaranteed in RustPython yet. See the RustPython repo/issues noting "Implement the ctypes module" and ongoing work; for CPython behavior refer to the official ctypes docs (pointer() / POINTER() / .contents). [1][2]
References
- RustPython repo / roadmap mentioning ctypes implementation work. [1]
- CPython ctypes documentation (pointer and .contents behavior). [2]
Based on my verification through web search, I can now provide a rewritten review comment. The investigation confirms that CPython's cast() and pointer.contents behavior differs from the current RustPython implementation, though RustPython's ctypes module is still incomplete.
Behavior differs from CPython: cast() should return a pointer viewing the same memory address, and contents should construct a fresh wrapper on each access.
Per CPython semantics, cast(obj, POINTER(T)) returns a pointer that views the same underlying memory as obj (not a copy), and accessing pointer.contents constructs a fresh ctypes wrapper each time—not the original object. The current implementation stores the source object reference directly in contents, which is semantically different from CPython's behavior.
While RustPython's ctypes module is incomplete and this approach may prevent dangling pointers (keeping the source object alive via the stored reference), it diverges from CPython's expected semantics. If RustPython intends to match CPython's ctypes behavior, contents should dynamically construct a fresh wrapper rather than return a stored reference.
🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes.rs around lines 825-872, the current
pycfunction_cast stores the source object itself into the pointer's "contents"
attribute which keeps the original object alive and returns that same wrapper,
but CPython semantics require cast() to create a pointer that views the same
memory address and pointer.contents to construct a fresh wrapper each access.
Change the implementation so the created pointer object stores the raw pointer
address (derived from the source CData/CPointer or taken from an int) instead of
the source object reference, and make the pointer's contents access generate and
return a new wrapper object for the pointee type on every call (construct a
fresh instance bound to the stored address), leaving no persistent reference to
the original wrapper.
| #[pyclassmethod] | ||
| fn in_dll( | ||
| cls: PyTypeRef, | ||
| dll: PyObjectRef, | ||
| name: crate::builtins::PyStrRef, | ||
| vm: &VirtualMachine, | ||
| ) -> PyResult { | ||
| use libloading::Symbol; | ||
|
|
||
| // Get the library handle from dll object | ||
| let handle = if let Ok(int_handle) = dll.try_int(vm) { | ||
| // dll is an integer handle | ||
| int_handle | ||
| .as_bigint() | ||
| .to_usize() | ||
| .ok_or_else(|| vm.new_value_error("Invalid library handle".to_owned()))? | ||
| } else { | ||
| // dll is a CDLL/PyDLL/WinDLL object with _handle attribute | ||
| dll.get_attr("_handle", vm)? | ||
| .try_int(vm)? | ||
| .as_bigint() | ||
| .to_usize() | ||
| .ok_or_else(|| vm.new_value_error("Invalid library handle".to_owned()))? | ||
| }; | ||
|
|
||
| // Get the library from cache | ||
| let library_cache = crate::stdlib::ctypes::library::libcache().read(); | ||
| let library = library_cache | ||
| .get_lib(handle) | ||
| .ok_or_else(|| vm.new_attribute_error("Library not found".to_owned()))?; | ||
|
|
||
| // Get symbol address from library | ||
| let symbol_name = format!("{}\0", name.as_str()); | ||
| let inner_lib = library.lib.lock(); | ||
|
|
||
| let symbol_address = if let Some(lib) = &*inner_lib { | ||
| unsafe { | ||
| // Try to get the symbol from the library | ||
| let symbol: Symbol<'_, *mut u8> = lib.get(symbol_name.as_bytes()).map_err(|e| { | ||
| vm.new_attribute_error(format!("{}: symbol '{}' not found", e, name.as_str())) | ||
| })?; | ||
| *symbol as usize | ||
| } | ||
| } else { | ||
| return Err(vm.new_attribute_error("Library is closed".to_owned())); | ||
| }; | ||
|
|
||
| // For pointer types, we return a pointer to the symbol address | ||
| Ok(PyCPointer { | ||
| contents: PyRwLock::new(vm.ctx.new_int(symbol_address).into()), | ||
| } | ||
| .into_ref_with_type(vm, cls)? | ||
| .into()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Store reference to dll object to prevent library unload.
The implementation retrieves a symbol address from the library but doesn't store a reference to the dll object. If the library is closed or garbage collected, the symbol address becomes invalid, potentially causing segmentation faults.
Consider adding a mechanism to track the dll reference, similar to how PyCSimple::in_dll stores dll in cdata.base (lines 965-967 in base.rs). Since PyCPointer doesn't currently have a cdata field, you may need to add an optional reference holder.
| /// Convert a Python object to a C value and store it at the result pointer | ||
| fn python_to_ffi(obj: PyResult, ty: &PyTypeRef, result: *mut c_void, vm: &VirtualMachine) { | ||
| let obj = match obj { | ||
| Ok(o) => o, | ||
| Err(_) => return, // Exception occurred, leave result as-is | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zero-initialize result buffer on exception.
When the Python callable raises an exception, the function returns without writing to result, leaving uninitialized garbage for the C caller. While RustPython may have exception propagation semantics that handle this, consider zero-initializing the result for defensive programming:
let obj = match obj {
Ok(o) => o,
Err(_) => {
// Zero-initialize result on error
unsafe { std::ptr::write_bytes(result, 0, 1); }
return;
}
};🤖 Prompt for AI Agents
In crates/vm/src/stdlib/ctypes/thunk.rs around lines 59-64, when the Python call
returns Err the function currently returns without writing to the result pointer
leaving uninitialized memory; change the Err branch to zero-initialize the
result buffer before returning: check that result is non-null, compute the byte
size of the target C type for this PyTypeRef (the size used for FFI storage) and
call unsafe std::ptr::write_bytes(result, 0, size) to clear that many bytes,
then return; ensure the write is performed inside an unsafe block and uses the
correct size for the FFI type rather than a fixed 1 byte.
685690f to
36b6d8f
Compare
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.