@youknowone
Member

@youknowone youknowone commented Dec 10, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Significantly expanded documentation coverage for Python built-in types, exceptions, modules, and system interfaces, including previously undocumented classes and methods.
  • Chores

    • Enhanced documentation generation infrastructure with improved type discovery and automated update handling across branches.


@coderabbitai
Contributor

coderabbitai bot commented Dec 10, 2025

Walkthrough

This PR updates the documentation generation workflow, expands builtin type discovery in the Python generation script, and adds comprehensive documentation entries for Python built-in types and related constructs to the static data file. The workflow now supports branch-specific updates, with file paths moved under crates/doc/.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Workflow updates: .github/workflows/update-doc-db.yml | Elevates permissions to write contents; adds ref input for target branch specification; relocates sparse-checkout and artifact paths from crates/rustpython-doc to crates/doc; adds commit-and-push step for non-main branches with git configuration and conditional logic |
| Documentation generation script: crates/doc/generate.py | Expands builtin type discovery by dynamically including all public types from types module; adds isinstance check to filter entries with non-string raw_doc values |
| Static documentation data: crates/doc/src/data.inc.rs | Adds comprehensive docstring entries to the DB hash map for multiple built-in types including DynamicClassAttribute, GenericAlias, Method, MethodDescriptor, _coroutine, _generator, _async_generator, cell, Property, range, slice, staticmethod, zip_longest, and others |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • crates/doc/src/data.inc.rs: Large volume of new static documentation entries requiring verification of completeness and accuracy
  • .github/workflows/update-doc-db.yml: Path and configuration changes across multiple steps; conditional logic for branch-specific commits
  • crates/doc/generate.py: Logic changes are straightforward, but verify that dynamic type discovery covers intended types and isinstance filtering doesn't exclude valid entries

Suggested reviewers

  • coolreader18

Poem

🐰 Hop along through crates of doc,
New types discovered, round the clock!
From builtins deep to modules tall,
We've gathered knowledge, one and all!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'doc db to include types.*' directly refers to the main change: expanding documentation database to include types from the types module. |

@youknowone youknowone added the skip:ci Skip running the ci label Dec 10, 2025
@youknowone youknowone removed the skip:ci Skip running the ci label Dec 10, 2025
@youknowone youknowone marked this pull request as ready for review December 10, 2025 08:44
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
crates/doc/generate.py (1)

212-217: isinstance(entry.raw_doc, str) guard is good; consider aligning the type annotation

Tightening the filter to entry.raw_doc is not None and isinstance(entry.raw_doc, str) is a solid defensive fix to avoid feeding non‑string docs into inspect.cleandoc. Given this, you might optionally relax DocEntry.raw_doc’s annotation from str | None to something like object | None (or typing.Any) so the type hints better reflect that non‑string values can flow through and are then narrowed at the comprehension.
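The guard described above can be sketched as follows. This is a minimal illustration, not the code in generate.py: the `DocEntry` dataclass, its `key` field, and the `clean_docs` helper are hypothetical stand-ins, and only `inspect.cleandoc` and the `isinstance` check come from the review. Since `isinstance(None, str)` is already false, the sketch relies on the `isinstance` check alone.

```python
import inspect
from dataclasses import dataclass
from typing import Any


@dataclass
class DocEntry:
    """Hypothetical stand-in for the entry type used in generate.py."""

    key: str
    raw_doc: Any  # annotated loosely because non-string values can arrive here


def clean_docs(entries):
    """Keep only entries whose raw_doc is a real string, then clean it."""
    return {
        entry.key: inspect.cleandoc(entry.raw_doc)
        for entry in entries
        if isinstance(entry.raw_doc, str)  # also rejects None
    }


entries = [
    DocEntry("builtins.len", "Return the number of items.\n    Extra indent."),
    DocEntry("builtins.missing", None),
    DocEntry("builtins.odd", property()),  # a non-string object in the doc slot
]
cleaned = clean_docs(entries)
```

Only the first entry survives; the `None` and `property` values are dropped before `inspect.cleandoc` ever sees them.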

.github/workflows/update-doc-db.yml (1)

104-107: Add error handling for git operations and explicit remote specification.

The git push on line 107 uses no explicit remote or branch reference. While this typically works with the current checkout, consider:

  1. Adding explicit error handling if the push fails
  2. Using git push origin HEAD:${{ inputs.ref || github.ref_name }} for clarity and robustness

Apply this diff to improve reliability and clarity:

         if [ -n "$(git status --porcelain)" ]; then
           git add crates/doc/src/data.inc.rs
           git commit -m "Update doc DB for CPython ${{ inputs.python-version }}"
-          git push
+          git push origin HEAD:${{ inputs.ref || github.ref_name }}
         fi
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c71c78c and b2367ac.

📒 Files selected for processing (3)
  • .github/workflows/update-doc-db.yml (5 hunks)
  • crates/doc/generate.py (2 hunks)
  • crates/doc/src/data.inc.rs (17 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • crates/doc/generate.py
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/doc/src/data.inc.rs
🔇 Additional comments (4)
crates/doc/generate.py (1)

180-208: Including public types.* in builtin_types looks correct; confirm how keys are consumed downstream

The dir(types) + isinstance(obj, type) + leading-underscore filter is a reasonable way to pull in ModuleType, FunctionType, etc., without dragging in non-type objects. One thing to double‑check: these entries are still keyed as ("builtins", typ.__name__) (Lines 206‑208), so the docs end up in the builtins namespace under each type's runtime __name__ (for example builtins.function for types.FunctionType), not under types.FunctionType. If the goal is to expose types.* docs, make sure the Rust side (or later processing) maps types.<Name> to these builtins.<__name__> entries rather than expecting this script to emit types.* keys directly.
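A rough sketch of the discovery filter and the resulting keys, assuming the shape described in this comment; the function name `public_types_from_types_module` is hypothetical and the real script may differ. It also shows that `typ.__name__` is the runtime class name, which is not always the attribute name used in the types module.

```python
import types


def public_types_from_types_module():
    """Collect public classes exposed by the stdlib `types` module."""
    found = []
    for name in dir(types):
        if name.startswith("_"):
            continue  # skip private and dunder names
        obj = getattr(types, name)
        if isinstance(obj, type):  # drops helper functions such as new_class
            found.append(obj)
    return found


discovered = public_types_from_types_module()
# Keys follow the ("builtins", typ.__name__) shape mentioned in the review.
keys = {("builtins", typ.__name__) for typ in discovered}
```

Because the key uses `__name__`, types.FunctionType is recorded as ("builtins", "function"), so any lookup by the types.* attribute name needs a separate mapping.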

crates/doc/src/data.inc.rs (1)

4053-4074: Documentation additions look good overall.

The new documentation entries for Python built-in types (DynamicClassAttribute, GenericAlias, NotImplementedType, SimpleNamespace, UnionType, async_generator, coroutine, generator, frame, cell, code, module, traceback, various descriptor types, etc.) are comprehensive and consistent with CPython's documentation.

The entries correctly document:

  • PEP 585 (GenericAlias) and PEP 604 (UnionType) type annotations
  • Generator/coroutine protocol methods (send, throw, close)
  • Descriptor protocol methods (__get__, __set__, __delete__)
  • Internal type attributes (e.g., gi_yieldfrom, cr_await, ag_await)

Also applies to: 4343-4369, 4805-4826, 5117-5138, 5553-5578, 5750-5778, 5852-5872, 6131-6177, 6219-6246, 6424-6444, 6553-6574, 6642-6695, 6872-6931, 7004-7094, 7490-7511, 7622-7643

.github/workflows/update-doc-db.yml (2)

72-75: Clarify JSON merge strategy for multi-platform artifacts.

The workflow merges JSON files from three different platforms (Ubuntu, Windows, macOS) using jq -s "add". This operator merges JSON objects by key, with later values overriding earlier ones. Verify that:

  1. Generated docs are identical or compatible across all platforms, or
  2. Platform-specific overrides are intentional and expected
  3. Key collisions are handled correctly

Consider adding a brief comment documenting the expected merge behavior, or verify that CI validates output consistency across platforms.
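The merge behavior in question can be modeled in a few lines. On objects, jq's add applies + pairwise, which is a shallow merge where the right-hand value wins. The sketch below mirrors that with dicts and adds a hypothetical `conflicting_keys` helper for spotting platform disagreements; the sample data and both function names are illustrative assumptions, not taken from the workflow.

```python
from functools import reduce


def merge_like_jq_add(documents):
    """Shallow-merge dicts left to right; on a key collision the later value wins."""
    return reduce(lambda merged, doc: {**merged, **doc}, documents, {})


def conflicting_keys(documents):
    """Return the keys whose values disagree between inputs."""
    first_seen, conflicts = {}, set()
    for doc in documents:
        for key, value in doc.items():
            # setdefault stores the first value seen and returns it afterwards
            if first_seen.setdefault(key, value) != value:
                conflicts.add(key)
    return conflicts


# Hypothetical per-platform outputs, in the order they would be merged.
ubuntu = {"builtins.len": "Return the number of items.", "os.fork": "Fork a child."}
windows = {"builtins.len": "Return the number of items.", "os.startfile": "Start a file."}
macos = {"builtins.len": "Return the number of items in a container."}

platform_docs = [ubuntu, windows, macos]
merged = merge_like_jq_add(platform_docs)
conflicts = conflicting_keys(platform_docs)
```

In this made-up data the macOS text for builtins.len silently replaces the other two, which is the kind of override a consistency check in CI would surface.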


100-100: Verify the condition logic prevents commits on intended branches.

The condition github.ref != 'refs/heads/main' && inputs.ref != 'main' prevents commits when either the current branch is main OR when inputs.ref is explicitly set to main. Since inputs.ref defaults to an empty string, the step will run on non-main branches unless explicitly overridden with ref: main.

Confirm this logic matches your intent—particularly that commits should not run on the main branch even if explicitly triggered via workflow dispatch.
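To make the cases concrete, the condition can be mirrored as a small predicate. This is only a model of the expression quoted above: `should_commit` is a hypothetical name, and GitHub's expression language compares strings case-insensitively, which this case-sensitive sketch does not reproduce.

```python
def should_commit(github_ref: str, inputs_ref: str = "") -> bool:
    """Model of: github.ref != 'refs/heads/main' && inputs.ref != 'main'."""
    return github_ref != "refs/heads/main" and inputs_ref != "main"


cases = {
    "run on a feature branch, no ref input": should_commit("refs/heads/doc-db"),
    "run on main, no ref input": should_commit("refs/heads/main"),
    "run on a feature branch with ref: main": should_commit("refs/heads/doc-db", "main"),
    "run on main with ref: doc-db": should_commit("refs/heads/main", "doc-db"),
}
```

The last case is worth noting: dispatching from main while targeting another branch through ref also skips the commit step, because the first half of the condition is already false.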

Comment on lines +99 to +108
- name: Commit and push (non-main branches only)
  if: github.ref != 'refs/heads/main' && inputs.ref != 'main'
  run: |
    git config user.name "github-actions[bot]"
    git config user.email "github-actions[bot]@users.noreply.github.com"
    if [ -n "$(git status --porcelain)" ]; then
      git add crates/doc/src/data.inc.rs
      git commit -m "Update doc DB for CPython ${{ inputs.python-version }}"
      git push
    fi
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find .github -name "update-doc-db.yml" -type f

Repository: RustPython/RustPython

Length of output: 100


🏁 Script executed:

cat -n .github/workflows/update-doc-db.yml | head -120

Repository: RustPython/RustPython

Length of output: 4365


🏁 Script executed:

ls -la .github/workflows/ | grep -i update

Repository: RustPython/RustPython

Length of output: 132


🏁 Script executed:

find . -name "*.yml" -o -name "*.yaml" | grep -i update

Repository: RustPython/RustPython

Length of output: 102


🏁 Script executed:

git ls-files .github/workflows/ | head -20

Repository: RustPython/RustPython

Length of output: 264


🏁 Script executed:

ls -la .github/ 2>/dev/null || echo "Checking git tracked files"; git ls-files .github/ | head -20

Repository: RustPython/RustPython

Length of output: 887


🏁 Script executed:

git ls-files | grep -E "\.github.*\.(yml|yaml)$" | sort

Repository: RustPython/RustPython

Length of output: 287


🏁 Script executed:

rg -l "Commit and push (non-main branches only)" --type yaml

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

rg "AUTO_COMMIT_PAT" --type yaml

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

rg "Update doc DB for CPython" --type yaml

Repository: RustPython/RustPython

Length of output: 47


🌐 Web query:

GitHub Actions sparse checkout git operations authentication

💡 Result:

Summary + quick examples

  1. Use actions/checkout sparse options to do sparse-checkout in a workflow:
  • inputs: sparse-checkout (newline-separated patterns), sparse-checkout-cone-mode (true/false), fetch-depth, filter (partial clone).
    Example:
    - uses: actions/checkout@v6
      with:
        fetch-depth: 0
        sparse-checkout: |
          .github
          src
        sparse-checkout-cone-mode: true
    (cite: actions/checkout). [1]
  2. Authenticated git operations in the job
  • By default actions/checkout persists an auth token into local git config so subsequent git commands (fetch, push) run authenticated. That token is the job-scoped GITHUB_TOKEN. Use persist-credentials: true to keep it (default). To push from a workflow you also need appropriate token permissions (contents: write) — set repository permissions in the workflow or use a PAT (secret) if you need broader scopes or create new workflow runs. GITHUB_TOKEN expires at job end and actions-triggered events from it don’t re-run workflows. [2]

Example push with GITHUB_TOKEN:
permissions:
  contents: write
jobs:
  push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
        with:
          persist-credentials: true
          fetch-depth: 0
          sparse-checkout: |
            README.md
      - run: |
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          # make changes...
          git add .
          git commit -m "Update"
          git push origin HEAD:main

If you need to use a PAT (for cross-repo write, finer scopes, or to trigger new workflows), store it as a secret and set actions/checkout.token or configure git remote HTTPS with that PAT. [1][2]

  3. Caveats & troubleshooting
  • Sparse/partial-clone behavior depends on Git and runner versions; some git versions and combinations of partial-clone/filter vs sparse-checkout have known issues (see issues where sparse may fall back to full checkout). If you see all files fetched, try changing cone-mode, fetch-depth, or use full fetch-depth: 0 and explicit git sparse-checkout commands on the runner. Test runner git version and update if needed. [1][3]

References
[1] actions/checkout README (sparse-checkout inputs & usage).
[2] GitHub Docs — GITHUB_TOKEN and using it for authentication in workflows.
[3] actions/checkout GitHub issues (sparse-checkout / git-version caveats).


🏁 Script executed:

git ls-files .github/ 2>/dev/null | head -50

Repository: RustPython/RustPython

Length of output: 509


🏁 Script executed:

test -d .github && find .github -type f | head -30 || echo ".github directory not found in current repo"

Repository: RustPython/RustPython

Length of output: 509


Ensure the AUTO_COMMIT_PAT secret is configured in repository settings with write access.

The merge job requires the AUTO_COMMIT_PAT secret to authenticate git push operations. Configure this secret in your repository settings with appropriate permissions to push commits to non-main branches. The workflow correctly uses the full repository checkout (without sparse checkout) for the merge and commit step, so git operations will work as expected.

🤖 Prompt for AI Agents
.github/workflows/update-doc-db.yml lines 99-108: the workflow assumes a
push-able token but doesn't ensure one is provided; configure the
AUTO_COMMIT_PAT secret in the repository settings with repo write permissions
and update the workflow to use it when pushing (for example export a GIT_TOKEN
env from secrets.AUTO_COMMIT_PAT and run git remote set-url origin
https://x-access-token:${GIT_TOKEN}@github.com/${{ github.repository }}.git
before git push); also ensure the workflow has repository write permissions
enabled so the push succeeds.

"builtins.bytes_iterator.__str__" => "Return str(self).",
"builtins.bytes_iterator.__subclasshook__" => "Abstract classes can override this to customize issubclass().\n\nThis is invoked early on by abc.ABCMeta.__subclasscheck__().\nIt should return True, False or NotImplemented. If it returns\nNotImplemented, the normal algorithm is used. Otherwise, it\noverrides the normal algorithm (and the outcome is cached).",
"builtins.callable" => "Return whether the object is callable (i.e., some kind of function).\n\nNote that classes are callable, as are instances of classes with a\n__call__() method.",
"builtins.cell" => "Create a new cell object.\n\n contents\n the contents of the cell. If not specified, the cell will be empty,\n and \nfurther attempts to access its cell_contents attribute will\n raise a ValueError.",
Contributor

⚠️ Potential issue | 🟡 Minor

Minor formatting issue in the cell docstring.

There's a stray newline in the documentation string: "and \nfurther" breaks the sentence mid-flow. This likely originates from the source docstring or the generation process in generate.py.

Since this is a generated file, consider investigating whether this malformation comes from CPython's docstring or if there's a processing issue in the generator script.

🤖 Prompt for AI Agents
In crates/doc/src/data.inc.rs around line 6086, the generated docstring for
"builtins.cell" contains a stray newline and space breaking "and \nfurther" into
two lines; update the generator (generate.py) to normalize docstring whitespace
for generated output—either fix the source docstring if it's malformed in
CPython, or in the generator collapse internal newlines and multiple spaces into
single spaces (e.g., join wrapped lines with a single space while preserving
paragraph breaks), then regenerate the file so the doc becomes "and further
attempts..." without the stray "\n" and extraneous space.
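One possible shape for the suggested normalization, written as a sketch rather than as the generator's actual code: `join_wrapped_lines` is a hypothetical helper that joins wrapped lines inside each paragraph with single spaces and keeps blank-line paragraph breaks. The input is the cell docstring quoted in this review.

```python
import re


def join_wrapped_lines(doc: str) -> str:
    """Collapse line wraps within paragraphs; keep blank-line paragraph breaks."""
    paragraphs = re.split(r"\n\s*\n", doc.strip())
    # str.split() with no arguments drops newlines and runs of spaces alike
    return "\n\n".join(" ".join(paragraph.split()) for paragraph in paragraphs)


cell_doc = (
    "Create a new cell object.\n\n contents\n the contents of the cell. "
    "If not specified, the cell will be empty,\n and \nfurther attempts to "
    "access its cell_contents attribute will\n raise a ValueError."
)
normalized = join_wrapped_lines(cell_doc)
```

The trade-off is visible in the result: the stray break disappears, but the indented "contents" parameter label is also folded into the sentence that follows, so a blanket collapse like this may not suit docstrings that rely on indentation for structure.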

Collaborator

@ShaharNaveh ShaharNaveh left a comment

Great! Thanks for fixing the workflow as well!

@youknowone youknowone merged commit 90717e5 into main Dec 10, 2025
13 checks passed
@youknowone youknowone deleted the doc-db branch December 10, 2025 10:23