26 changes: 17 additions & 9 deletions benches/README.md
@@ -4,28 +4,36 @@ These are some files to determine performance of rustpython.

## Usage

Running `cargo bench` from the root of the repository will start the benchmarks. Once done, there will be a graphical
report under `target/criterion/report/index.html` that you can use to view the results.

`cargo bench` supports name matching to run a subset of the benchmarks. To
run only the sorted microbenchmark, you can run:

```shell
cargo bench sorted
```

To view Python tracebacks during benchmarks, run `RUST_BACKTRACE=1 cargo bench`. You can also bench against a
specific installed Python version by running:

```shell
PYTHON_SYS_EXECUTABLE=python3.13 cargo bench
```

### Adding a benchmark

Simply adding a file to the `benchmarks/` directory will add it to the set of files benchmarked. Each file is tested
in two ways:

1. The time to parse the file to AST
2. The time it takes to execute the file

### Adding a micro benchmark

Micro benchmarks are small snippets of code added under the `microbenchmarks/` directory. A microbenchmark file has
two sections:

1. Optional setup code
2. The code to be benchmarked

@@ -39,8 +47,8 @@ a_list = [1,2,3]
len(a_list)
```

Only `len(a_list)` will be timed. Setup or benchmarked code can optionally reference a variable called `ITERATIONS`. If
present then the benchmark code will be invoked 5 times with `ITERATIONS` set to a value between 100 and 1,000. For
example:
Comment on lines +50 to 52

💡 Verification agent

🧩 Analysis chain

Verify accuracy: does the harness invoke benchmarks exactly 5 times?

The microbenchmark example uses `5 * ITERATIONS` in its own code, but the harness’s number of invocations may differ. Past learnings note that `ITERATIONS` is injected at runtime and does not determine the exact repeat count. Please confirm before locking this to “5 times.” If uncertain, prefer “multiple times.”

Proposed wording if needed:

-Only `len(a_list)` will be timed. Setup or benchmarked code can optionally reference a variable called `ITERATIONS`. If
-present then the benchmark code will be invoked 5 times with `ITERATIONS` set to a value between 100 and 1,000. For
+Only `len(a_list)` will be timed. Setup or benchmarked code can optionally reference a variable called `ITERATIONS`. If
+present, then the benchmark code will be invoked multiple times with `ITERATIONS` set to a value between 100 and 1,000. For

Run this script to locate the harness logic that sets ITERATIONS and confirm the count/values:


🏁 Script executed:

#!/bin/bash
# Find where ITERATIONS is injected and how many distinct values are used.
rg -nC3 -S '\bITERATIONS\b' benches | sed -n '1,200p'
rg -nC2 -S 'microbenchmark|microbenchmarks' benches

Length of output: 9032


Clarify README: harness invokes benchmarks multiple times, not exactly five

Verified by inspecting benches/microbenchmarks.rs that the number of invocations is not hard-coded to five: ITERATIONS is injected for each run based on Criterion’s input values, and the harness loops over a range of iteration counts rather than invoking the benchmark exactly five times.

Please update benches/README.md accordingly:

-Only `len(a_list)` will be timed. Setup or benchmarked code can optionally reference a variable called `ITERATIONS`. If
-present then the benchmark code will be invoked 5 times with `ITERATIONS` set to a value between 100 and 1,000. For
+Only `len(a_list)` will be timed. Setup or benchmarked code can optionally reference a variable called `ITERATIONS`. If
+present, then the benchmark code will be invoked multiple times with `ITERATIONS` set to a value between 100 and 1,000. For

Relevant implementation details:

  • In benches/microbenchmarks.rs, the harness checks for ITERATIONS in each benchmark and injects it (lines ~65–67), but does not fix the count at five.
  • Criterion’s BenchmarkGroup then runs the benchmark for each input value, driving multiple invocations rather than exactly five.
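For readers unfamiliar with the harness, a rough Python-flavoured sketch of what happens per Criterion input value is shown below. The snippets, variable names, and iteration values are illustrative assumptions, not the actual implementation in benches/microbenchmarks.rs; only the overall shape (inject ITERATIONS, run setup, time the benchmark body once per input value) reflects the behaviour described above.

```python
# Conceptual sketch only: the real harness is Rust code in benches/microbenchmarks.rs
# driving a RustPython VM. The code strings and iteration values below are assumed
# for illustration.
setup_code = "unsorted_list = [i % 7 for i in range(5 * ITERATIONS)]"
bench_code = "sorted(unsorted_list)"

for iterations in (100, 250, 500, 1_000):   # several values in the 100-1,000 range
    namespace = {"ITERATIONS": iterations}  # ITERATIONS injected per run, count not fixed at five
    exec(setup_code, namespace)             # setup executes with ITERATIONS already bound
    exec(bench_code, namespace)             # timed section; Criterion repeats it many times per value
```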
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-Only `len(a_list)` will be timed. Setup or benchmarked code can optionally reference a variable called `ITERATIONS`. If
-present then the benchmark code will be invoked 5 times with `ITERATIONS` set to a value between 100 and 1,000. For
-example:
+Only `len(a_list)` will be timed. Setup or benchmarked code can optionally reference a variable called `ITERATIONS`. If
+present, then the benchmark code will be invoked multiple times with `ITERATIONS` set to a value between 100 and 1,000. For
+example:
🧰 Tools
🪛 LanguageTool

[grammar] ~50-~50: There might be a mistake here.
Context: ...rence a variable called ITERATIONS. If present then the benchmark code will be ...

(QB_NEW_EN)


[grammar] ~51-~51: There might be a mistake here.
Context: ...et to a value between 100 and 1,000. For example: ```python obj = [i for i in ra...

(QB_NEW_EN)

🤖 Prompt for AI Agents
In benches/README.md around lines 50 to 52, the README incorrectly states the
harness invokes benchmarks exactly five times; update the text to clarify that
the harness injects ITERATIONS per run and then Criterion runs the benchmark
across a range of input values (so benchmarks are invoked multiple times driven
by those inputs rather than a fixed count of five). Mention that
benches/microbenchmarks.rs injects ITERATIONS when present (lines ~65–67) and
that BenchmarkGroup executes the benchmark for each input value, and adjust the
example wording to reflect multiple runs over a range instead of “5 times.”


```python
@@ -49,7 +57,7 @@ obj = [i for i in range(ITERATIONS)]

`ITERATIONS` can appear in both the setup code and the benchmark code.
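
For example, a hypothetical microbenchmark could use `ITERATIONS` on both sides of the `# ---` separator; the file name and snippet below are purely illustrative:

```python
# hypothetical microbenchmarks/sum_slice.py (illustrative only)
values = list(range(ITERATIONS))  # setup: ITERATIONS sizes the input data

# ---

# benchmark: ITERATIONS also bounds the timed work
sum(values[:ITERATIONS])
```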

## MacOS setup

On MacOS you will need to add the following to a `.cargo/config` file:

@@ -63,4 +71,4 @@ rustflags = [

## Benchmark source

- <https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/nbody-python3-2.html>
9 changes: 9 additions & 0 deletions benches/microbenchmarks/sorted.py
@@ -0,0 +1,9 @@
from random import random, seed
seed(0)

unsorted_list = [random() for _ in range(5 * ITERATIONS)]

# ---

# Setup code only runs once so do not modify in-place
sorted(unsorted_list)