Skip to content

Conversation

@ShaharNaveh
Copy link
Collaborator

No description provided.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 17, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ShaharNaveh ShaharNaveh changed the title Impl _opcode. Update dis from 3.13.7 Update opcode from 3.13.7 Sep 18, 2025
@ShaharNaveh ShaharNaveh marked this pull request as ready for review September 18, 2025 14:53
Self::is_valid(opcode)
&& matches!(
opcode,
63 | 66
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where the magic numbers came from? will this be changed if we change instruction enum?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I just ran:

import opcode

for name in (
    "hasarg",
    "hasconst",
    "hasname",
    "hasjump",
    "hasjrel",
    "hasjabs",
    "hasfree",
    "haslocal",
    "hasexc",
):
    lst = getattr(opcode, name)
    print(f"{name}\n{lst}")

And copied the results xD


But it's calculated based on https://github.com/python/cpython/blob/bcee1c322115c581da27600f2ae55e5439c027eb/Include/internal/pycore_opcode_metadata.h#L966-L1190

I've put a comment at https://github.com/ShaharNaveh/RustPython/blob/a7b9356da2a85cbc02df1f0fac9eda6a26869b6d/stdlib/src/opcode.rs#L46-L47 stating this:)


We should have our own metadata table, but I think that's a bit out of scope for this PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At

#[derive(Debug, Copy, Clone, PartialEq, Eq)]
#[repr(u8)]
pub enum Instruction {
Nop,
/// Importing by name
ImportName {
idx: Arg<NameIdx>,
},

I tried doing something like:

 #[repr(u8)] 
 pub enum Instruction { 
     Nop = 30, 
     ImportName { 
         idx: Arg<NameIdx>, 
     } = 75, 
     ...
}

Based on:

But this change alone was enough to break RustPython entirely.

@youknowone can you please point me to where any of:

impl From<Instruction> for u8 {
#[inline]
fn from(ins: Instruction) -> Self {
// SAFETY: there's no padding bits
unsafe { std::mem::transmute::<Instruction, Self>(ins) }
}
}
impl TryFrom<u8> for Instruction {
type Error = crate::marshal::MarshalError;
#[inline]
fn try_from(value: u8) -> Result<Self, crate::marshal::MarshalError> {
if value <= u8::from(LAST_INSTRUCTION) {
Ok(unsafe { std::mem::transmute::<u8, Self>(value) })
} else {
Err(crate::marshal::MarshalError::InvalidBytecode)
}
}
}

are being called? I've tried looking into it and got a bit lost

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for explanation. Since we occasionally break bytecode compatibility, the numbers need to be regenerated.
Could you also attach the updating code around enum Instruction not to let it be desynced?

The From<Instruction> part is used form marshal. cc @coolreader18

Copy link
Collaborator Author

@ShaharNaveh ShaharNaveh Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we occasionally break bytecode compatibility

Is there a reason for that? or it's something that's desired but not implemented yet?

the numbers need to be regenerated. Could you also attach the updating code around enum Instruction not to let it be desynced?

I haven't written any code to do that (yet, but was planning to). As I just changed those two enum members and I couldn't even do cargo run :/

The From<Instruction> part is used form marshal. cc @coolreader18

ty:) will look into it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youknowone I have opened #6174 so we can discuss the next steps there and to not block this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for that? or it's something that's desired but not implemented yet?

No specific reason. We just add or remove instructions by following CPython update. The order changing is unintended side effect. Python doesn't guarantee bytecode compatibiltity anyway.

Does giving specific numbers to each instruction make it easier to be maintained? I guess so.

After #6174, then can we rewrite 63 | 66 to Instruction::Abc as u8 | Instruction::Def as u8 and so on?

@youknowone
Copy link
Member

finally we get _opcode, great!

'INSTRUMENTED_POP_JUMP_IF_FALSE': 251,
'INSTRUMENTED_POP_JUMP_IF_NONE': 252,
'INSTRUMENTED_POP_JUMP_IF_NOT_NONE': 253,
'JUMP': 256,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know it can be over 256, huh

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are pseudo opcodes:)
Real opcodes can't be more than 256

Self::is_valid(opcode)
&& matches!(
opcode,
63 | 66
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for that? or it's something that's desired but not implemented yet?

No specific reason. We just add or remove instructions by following CPython update. The order changing is unintended side effect. Python doesn't guarantee bytecode compatibiltity anyway.

Does giving specific numbers to each instruction make it easier to be maintained? I guess so.

After #6174, then can we rewrite 63 | 66 to Instruction::Abc as u8 | Instruction::Def as u8 and so on?

@ShaharNaveh
Copy link
Collaborator Author

ShaharNaveh commented Sep 25, 2025

For some reason I can't comment on the thread of #6156 (comment) so answering it here:

Does giving specific numbers to each instruction make it easier to be maintained? I guess so.

Yes! it also makes it easier for writing the helpers of:

  • is_valid
  • has_arg
  • has_const
  • has_name
  • etc...

but that wouldn't be needed as I intend #6174 to generate those automatically (just like CPython does it)

After #6174, then can we rewrite 63 | 66 to Instruction::Abc as u8 | Instruction::Def as u8 and so on?

Well, yes you could. but I thought that we could have an API that looks similar to this:

// This is inside the `_opcode` module
fn has_arg(opcode: i32) -> bool {
  Instruction::try_from(opcode).map_or(false, |v| !v.is_pseudo() && v.has_arg())
}

where:

  • Instruction::try_from
  • instruction.is_pseudo()
  • instruction.has_arg()

are all auto generated.

Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general.

I am worrying about the newly introduced magic numbers. I wish it could be quickly resolved in next patch

@youknowone youknowone merged commit 3a6fda4 into RustPython:main Oct 5, 2025
11 checks passed
@ShaharNaveh
Copy link
Collaborator Author

I am worrying about the newly introduced magic numbers. I wish it could be quickly resolved in next patch

I'll send a quick patch to resolve it in the next couple of days

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants