TYP: np.char.array overloads not totally accurate for unicode arg #29376

@MarcoGorelli

Description

Describe the issue:

import numpy as np
from typing import reveal_type

reveal_type(np.char.array('foo', unicode=False))

outputs

t.py:4: note: Revealed type is "numpy._core.defchararray.chararray[builtins.tuple[Any, ...], numpy.dtype[numpy.str_]]"

I'd have expected

t.py:4: note: Revealed type is "numpy._core.defchararray.chararray[builtins.tuple[Any, ...], numpy.dtype[numpy.bytes_]]"

I think the issue is

@overload
def array(
    obj: U_co,
    itemsize: int | None = ...,
    copy: bool = ...,
    unicode: L[False] = ...,
    order: _OrderKACF = ...,
) -> _CharArray[str_]: ...

The runtime default for unicode is None, not False, so this overload should not declare L[False] with a default. The intended behavior seems to be:

  • unicode=True: return np.str_ type
  • unicode=False: return np.bytes_ type
  • unicode=None (default): return either of the above, depending on the input
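
The three cases above could be expressed with overloads along the following lines. This is only a rough sketch, not the NumPy stubs: StrArray and BytesArray are hypothetical stand-ins for chararray parametrized by str_ and bytes_, the function takes only the obj and unicode parameters, and the ASCII encode/decode in the body is an assumption made so the example runs without NumPy.

```python
from typing import Literal, Optional, overload


class StrArray:
    """Hypothetical stand-in for a chararray with a str_ dtype."""

    def __init__(self, value: str) -> None:
        self.value = value


class BytesArray:
    """Hypothetical stand-in for a chararray with a bytes_ dtype."""

    def __init__(self, value: bytes) -> None:
        self.value = value


# str input: str result unless unicode=False is passed explicitly.
@overload
def array(obj: str, unicode: Optional[Literal[True]] = ...) -> StrArray: ...
@overload
def array(obj: str, unicode: Literal[False]) -> BytesArray: ...
# bytes input: bytes result unless unicode=True is passed explicitly.
@overload
def array(obj: bytes, unicode: Optional[Literal[False]] = ...) -> BytesArray: ...
@overload
def array(obj: bytes, unicode: Literal[True]) -> StrArray: ...
def array(obj, unicode=None):
    # With unicode=None, the result kind follows the input type.
    if unicode is None:
        unicode = isinstance(obj, str)
    if unicode:
        # Assumption for this sketch: bytes are decoded as ASCII.
        return StrArray(obj if isinstance(obj, str) else obj.decode("ascii"))
    # Assumption for this sketch: str is encoded as ASCII.
    return BytesArray(obj if isinstance(obj, bytes) else obj.encode("ascii"))
```

Only the overloads whose result matches the input type give unicode a default; the ones that switch the result type require an explicit literal, so a call such as array("foo", unicode=False) can only resolve to the bytes variant.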

Spotted this while trying out https://github.com/MarcoGorelli/fix-overload-defaults

Reproduce the code example:

see above

Error message:

Python and NumPy Versions:

2.4.0.dev0+git20250714.cc92651
3.12.11 | packaged by conda-forge | (main, Jun 4 2025, 14:45:31) [GCC 13.3.0]

Type-checker version and settings:

mypy 1.16

Additional typing packages.

No response
