url2pathname() doesn't handle URL query or fragment components #136874

@barneygale

Bug report

Bug description:

urllib.request.url2pathname() incorrectly treats the URL query (?a=b&c=d) and fragment (#anchor) components as part of the URL path:

>>> from urllib.request import url2pathname
>>> url2pathname('file://localhost/etc/hosts?foo=bar#badgers', require_scheme=True)
'/etc/hosts?foo=bar#badgers'  # expected '/etc/hosts'

I think they should be silently discarded as they have no bearing on the filesystem path (similar to how we discard the netloc if it's a local hostname).
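
Until that changes, a caller can strip those components before converting. The following is only a minimal sketch of the idea, not the proposed fix: file_url_to_path is a hypothetical helper name, it assumes the input is a file: URL, and it ignores the netloc entirely rather than checking whether it is a local hostname.

```python
from urllib.parse import urlsplit
from urllib.request import url2pathname


def file_url_to_path(url):
    """Hypothetical helper: convert a file: URL to a path, dropping query and fragment."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"not a file: URL: {url!r}")
    # urlsplit() has already separated parts.query and parts.fragment,
    # so only the (still percent-encoded) path is passed on.
    # Assumption: the netloc is ignored here, not validated.
    return url2pathname(parts.path)


path = file_url_to_path("file://localhost/etc/hosts?foo=bar#badgers")
print(path)
```

Splitting before decoding matters: a percent-encoded %3F or %23 in the path stays part of the path, because url2pathname() only unquotes it after the real query and fragment have been removed.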

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)
