-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Update opcode from 3.13.7
#6156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the ✨ Finishing touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
_opcode. Update dis from 3.13.7opcode from 3.13.7
| Self::is_valid(opcode) | ||
| && matches!( | ||
| opcode, | ||
| 63 | 66 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where the magic numbers came from? will this be changed if we change instruction enum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I just ran:
import opcode
for name in (
"hasarg",
"hasconst",
"hasname",
"hasjump",
"hasjrel",
"hasjabs",
"hasfree",
"haslocal",
"hasexc",
):
lst = getattr(opcode, name)
print(f"{name}\n{lst}")And copied the results xD
But it's calculated based on https://github.com/python/cpython/blob/bcee1c322115c581da27600f2ae55e5439c027eb/Include/internal/pycore_opcode_metadata.h#L966-L1190
I've put a comment at https://github.com/ShaharNaveh/RustPython/blob/a7b9356da2a85cbc02df1f0fac9eda6a26869b6d/stdlib/src/opcode.rs#L46-L47 stating this:)
We should have our own metadata table, but I think that's a bit out of scope for this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At
RustPython/compiler/core/src/bytecode.rs
Lines 534 to 541 in aa0eb4b
| #[derive(Debug, Copy, Clone, PartialEq, Eq)] | |
| #[repr(u8)] | |
| pub enum Instruction { | |
| Nop, | |
| /// Importing by name | |
| ImportName { | |
| idx: Arg<NameIdx>, | |
| }, |
I tried doing something like:
#[repr(u8)]
pub enum Instruction {
Nop = 30,
ImportName {
idx: Arg<NameIdx>,
} = 75,
...
}Based on:
Nop: https://github.com/python/cpython/blob/bcee1c322115c581da27600f2ae55e5439c027eb/Include/opcode_ids.h#L43ImportName: https://github.com/python/cpython/blob/bcee1c322115c581da27600f2ae55e5439c027eb/Include/opcode_ids.h#L88
But this change alone was enough to break RustPython entirely.
@youknowone can you please point me to where any of:
RustPython/compiler/core/src/bytecode.rs
Lines 796 to 815 in aa0eb4b
| impl From<Instruction> for u8 { | |
| #[inline] | |
| fn from(ins: Instruction) -> Self { | |
| // SAFETY: there's no padding bits | |
| unsafe { std::mem::transmute::<Instruction, Self>(ins) } | |
| } | |
| } | |
| impl TryFrom<u8> for Instruction { | |
| type Error = crate::marshal::MarshalError; | |
| #[inline] | |
| fn try_from(value: u8) -> Result<Self, crate::marshal::MarshalError> { | |
| if value <= u8::from(LAST_INSTRUCTION) { | |
| Ok(unsafe { std::mem::transmute::<u8, Self>(value) }) | |
| } else { | |
| Err(crate::marshal::MarshalError::InvalidBytecode) | |
| } | |
| } | |
| } |
are being called? I've tried looking into it and got a bit lost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for explanation. Since we occasionally break bytecode compatibility, the numbers need to be regenerated.
Could you also attach the updating code around enum Instruction not to let it be desynced?
The From<Instruction> part is used form marshal. cc @coolreader18
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we occasionally break bytecode compatibility
Is there a reason for that? or it's something that's desired but not implemented yet?
the numbers need to be regenerated. Could you also attach the updating code around
enum Instructionnot to let it be desynced?
I haven't written any code to do that (yet, but was planning to). As I just changed those two enum members and I couldn't even do cargo run :/
The
From<Instruction>part is used form marshal. cc @coolreader18
ty:) will look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youknowone I have opened #6174 so we can discuss the next steps there and to not block this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for that? or it's something that's desired but not implemented yet?
No specific reason. We just add or remove instructions by following CPython update. The order changing is unintended side effect. Python doesn't guarantee bytecode compatibiltity anyway.
Does giving specific numbers to each instruction make it easier to be maintained? I guess so.
After #6174, then can we rewrite 63 | 66 to Instruction::Abc as u8 | Instruction::Def as u8 and so on?
|
finally we get |
| 'INSTRUMENTED_POP_JUMP_IF_FALSE': 251, | ||
| 'INSTRUMENTED_POP_JUMP_IF_NONE': 252, | ||
| 'INSTRUMENTED_POP_JUMP_IF_NOT_NONE': 253, | ||
| 'JUMP': 256, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know it can be over 256, huh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are pseudo opcodes:)
Real opcodes can't be more than 256
| Self::is_valid(opcode) | ||
| && matches!( | ||
| opcode, | ||
| 63 | 66 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for that? or it's something that's desired but not implemented yet?
No specific reason. We just add or remove instructions by following CPython update. The order changing is unintended side effect. Python doesn't guarantee bytecode compatibiltity anyway.
Does giving specific numbers to each instruction make it easier to be maintained? I guess so.
After #6174, then can we rewrite 63 | 66 to Instruction::Abc as u8 | Instruction::Def as u8 and so on?
|
For some reason I can't comment on the thread of #6156 (comment) so answering it here:
Yes! it also makes it easier for writing the helpers of:
but that wouldn't be needed as I intend #6174 to generate those automatically (just like CPython does it)
Well, yes you could. but I thought that we could have an API that looks similar to this: // This is inside the `_opcode` module
fn has_arg(opcode: i32) -> bool {
Instruction::try_from(opcode).map_or(false, |v| !v.is_pseudo() && v.has_arg())
}where:
are all auto generated. |
youknowone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general.
I am worrying about the newly introduced magic numbers. I wish it could be quickly resolved in next patch
I'll send a quick patch to resolve it in the next couple of days |
No description provided.