Bug: pd.array fails on Arkouda-backed Categorical with dtype specified #5335

@ajpotts

Summary

Calling pd.array on an Arkouda-backed Categorical raises NotImplementedError when a target dtype is explicitly provided. As the stack trace below shows, pd.array dispatches to the extension array's _from_sequence, which passes the Categorical to Arkouda's array constructor, which in turn calls list(a) on it. Arkouda's Categorical.__iter__ intentionally disallows iteration to prevent implicit data transfer from the server, so the conversion fails.

This blocks what should be a valid and expected conversion path for Arkouda-backed categoricals.

Reproduction

import arkouda as ak
import pandas as pd
from arkouda.pandas import Categorical

pd.array(
    Categorical(ak.array(["a", "a", "b"])),
    dtype="ak_int64"
)

Observed Behavior

The call fails with:

NotImplementedError: Categorical does not support iteration.
To force data transfer from server, use to_ndarray

Relevant stack trace excerpt:

File .../pandas/core/construction.py:321, in array
    return cls._from_sequence(data, dtype=dtype, copy=copy)

File .../_arkouda_array.py:122, in _from_sequence
    return cls(ak_array(scalars, dtype=dtype, copy=copy))

File .../pdarraycreation.py:250, in array
    a = list(a)

File .../categorical.py:629, in __iter__
    raise NotImplementedError(...)
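
The mechanism in the trace can be illustrated without an Arkouda server. The sketch below is only a stand-in: `FakeCategorical` and `naive_array` are hypothetical names, not Arkouda classes or functions, and the mock's `to_ndarray` returns a plain list rather than a NumPy array. It assumes nothing beyond what the trace shows, namely that the constructor falls back to `list(a)` and that `__iter__` raises.

```python
class FakeCategorical:
    """Hypothetical stand-in for a server-backed Categorical (not Arkouda's class)."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        # Mirrors the guard from the report: refuse implicit data transfer.
        raise NotImplementedError(
            "Categorical does not support iteration. "
            "To force data transfer from server, use to_ndarray"
        )

    def to_ndarray(self):
        # Explicit, opt-in materialization (a plain list in this mock).
        return list(self._values)


def naive_array(a):
    """Hypothetical constructor that, like the trace, falls back to list(a)."""
    return list(a)


cat = FakeCategorical(["a", "a", "b"])
try:
    naive_array(cat)
    error = None
except NotImplementedError as exc:
    # list() calls iter(a), which invokes __iter__ and surfaces the guard's error.
    error = str(exc)

print(error)
```

Any code path that reaches `list(a)` hits the guard, no matter how many layers of pandas or extension-array code sit above it, which is why the error only appears deep inside the construction call.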

Expected Behavior

pd.array(Categorical(...), dtype=...) should either:

  1. Use a non-iterative conversion path for Arkouda-backed Categorical, or
  2. Explicitly and safely materialize data via a supported method (e.g., to_ndarray()), or
  3. Fail earlier with a clearer error stating that this conversion is unsupported, with guidance on alternatives.

In particular, pandas-style construction should not implicitly rely on Python iteration for Arkouda extension types.
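
Options 2 and 3 could be combined roughly as follows. This is a minimal sketch under stated assumptions, not a proposed patch: `to_local_sequence`, `ServerBackedStub`, and `Opaque` are hypothetical names, and the only thing taken from the report is that the Categorical exposes a `to_ndarray` method and refuses iteration. The stub's `to_ndarray` returns a plain list to keep the example dependency-free.

```python
class ServerBackedStub:
    """Hypothetical stand-in for an object that forbids implicit iteration."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        raise NotImplementedError("iteration would force an implicit transfer")

    def to_ndarray(self):
        # Explicit materialization (a plain list in this stub).
        return list(self._values)


class Opaque:
    """Hypothetical type with no supported conversion path."""


def to_local_sequence(data):
    """Hypothetical helper: choose a conversion path without calling iter() blindly."""
    # Option 2: prefer an explicit materialization method when one exists.
    to_ndarray = getattr(data, "to_ndarray", None)
    if callable(to_ndarray):
        return to_ndarray()
    # Plain local containers are safe to iterate.
    if isinstance(data, (list, tuple, range)):
        return list(data)
    # Option 3: fail early with guidance instead of a late iteration error.
    raise TypeError(
        f"Cannot convert {type(data).__name__}: no explicit materialization "
        "method found; call to_ndarray() or convert to a local sequence first."
    )


print(to_local_sequence(ServerBackedStub(["a", "a", "b"])))
print(to_local_sequence((1, 2)))
```

Checking for the explicit method before attempting iteration keeps the data transfer a visible, deliberate step, and unsupported inputs fail at the entry point with a message that names the offending type.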

Metadata

Labels

bug (Something isn't working)
