
BUG: Error in MaskedArray with StringDType when setting fill_value #29421

@phdparedes

Describe the issue:

Description of the Error

I get a TypeError when I have a MaskedArray whose dtype is StringDType and try to set a fill_value other than None. The same error occurs when calling the filled(fill_value) method with a string.

The error seems to come from a dtype check in _check_fill_value() (numpy/ma/core.py) that runs before the fill_value is set on the MaskedArray:

elif isinstance(fill_value, str) and (ndtype.char not in 'OSVU'):

Possible fix

The check currently accepts only O, S, V and U as valid numpy.dtype.char codes. However, https://numpy.org/doc/stable/reference/generated/numpy.dtype.kind.html lists T as the character code for StringDType, so T could be accepted as well.

A quick local test suggests that adding T to OSVU in the check works fine.
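
To make the proposed change concrete, here is a minimal pure-Python sketch of the string branch of the check. It is a simplified stand-in, not NumPy's actual code: the constants and the helper `rejects_string_fill` are hypothetical names, and it assumes (as described above) that `T` is the `dtype.char` of StringDType.

```python
# Simplified model of the `ndtype.char not in 'OSVU'` test; not NumPy's code.
CURRENT_STRING_CHARS = "OSVU"    # char codes accepted by the current check
PROPOSED_STRING_CHARS = "OSVUT"  # assumption: 'T' is StringDType's char code


def rejects_string_fill(dtype_char: str, accepted: str) -> bool:
    """Return True when a str fill_value would be rejected for this dtype char."""
    return dtype_char not in accepted


for char in "OSVUTd":
    print(
        char,
        "current:", "rejected" if rejects_string_fill(char, CURRENT_STRING_CHARS) else "accepted",
        "| proposed:", "rejected" if rejects_string_fill(char, PROPOSED_STRING_CHARS) else "accepted",
    )
```

Under this model only the `T` row changes between the two columns; a non-string code such as `d` (float64) is still rejected for string fill values.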

Reproduce the code example:

# Here the masked array is not created because `fill_value=''` (or any other string) is not accepted.

import numpy as np
from numpy.dtypes import StringDType

strdt_ma = np.ma.MaskedArray(['zero', 'one', 'two', '', 'four'], mask=[False, False, False, True, False], fill_value='', dtype=StringDType(na_object='', coerce=True))


# Here the masked array is created successfully because `fill_value=None`, but setting a string fill value afterwards fails

strdt_ma = np.ma.MaskedArray(['zero', 'one', 'two', '', 'four'], mask=[False, False, False, True, False], fill_value=None, dtype=StringDType(na_object='', coerce=True))

print(strdt_ma.dtype,strdt_ma.dtype.char)
strdt_ma.fill_value = '' # fails for any string
print(strdt_ma.filled('N/A')) # fails for any string
print(strdt_ma.filled()) # the missing value is cast as '?'
strdt_ma.filled()[3] == None # False
strdt_ma.filled()[3] =='?' # True

Error message:

Traceback (most recent call last)
Cell In[242], line 1
----> 1 strdt_ma = np.ma.MaskedArray(['zero', 'one', 'two', '', 'four'], mask=[False, False, False, True, False], fill_value='', dtype=StringDType(na_object='', coerce=True))

File /opt/anaconda3/envs/py313/lib/python3.13/site-packages/numpy/ma/core.py:3017, in MaskedArray.__new__(cls, data, mask, dtype, copy, subok, ndmin, fill_value, keep_mask, hard_mask, shrink, order)
   3015 # But don't run the check unless we have something to check.
   3016 if fill_value is not None:
-> 3017     _data._fill_value = _check_fill_value(fill_value, _data.dtype)
   3018 # Process extra options ..
   3019 if hard_mask is None:

File /opt/anaconda3/envs/py313/lib/python3.13/site-packages/numpy/ma/core.py:504, in _check_fill_value(fill_value, ndtype)
    501 elif isinstance(fill_value, str) and (ndtype.char not in 'OSVU'):
    502     # Note this check doesn't work if fill_value is not a scalar
    503     err_msg = "Cannot set fill value of string with array of dtype %s"
--> 504     raise TypeError(err_msg % ndtype)
    505 else:
    506     # In case we want to convert 1e20 to int.
    507     # Also in case of converting string arrays.
    508     try:

TypeError: Cannot set fill value of string with array of dtype StringDType(na_object='')

Python and NumPy Versions:

import sys, numpy; print(numpy.__version__); print(sys.version)
2.3.1
3.13.5 | packaged by Anaconda, Inc. | (main, Jun 12 2025, 11:23:37) [Clang 14.0.6 ]

Runtime Environment:

No response

Context for the issue:

I recently started working with numpy>=2 to use variable-width string arrays and so avoid unwanted string truncation during manipulation. I ran into this issue when trying to use MaskedArray with StringDType and noticed the restriction on which dtypes accept a string fill_value.

In conclusion, the issue should be solved by adding the T dtype character code to the accepted list in the _check_fill_value() function.

P.S.: I'm a long-time user and first-time contributor. I don't mind submitting the change myself following the contributor guidelines at https://numpy.org/devdocs/dev/index.html#development-process-summary, but I'm not sure whether I need to do anything else before I'm allowed to contribute. Thanks.
