Normalize datetime and string dtypes to match NumPy/Pandas defaults #5327

@ajpotts

Summary

Recent CI failures indicate dtype mismatches and unsupported dtype errors that appear to be triggered by changes in NumPy/Pandas defaults or dependency versions. Code that passed previously now fails due to stricter or different dtype inference, particularly for datetimes, strings, and pandas Index dtypes.

Observed Failures

  • datetime64[us] vs datetime64[ns] mismatches
  • Errors like:
    • ValueError: dtype datetime64[us] is unsupported
    • ValueError: dtype str is unsupported
  • Pandas test failures where index dtypes differ:
    • object vs StringDtype
  • Groupby aggregation error:
    • numeric_only accepts only Boolean values
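
For illustration, the datetime mismatch can be reproduced with NumPy alone. This is a minimal sketch, not taken from the failing tests; the variable names are made up for the example. It shows that the time unit is part of the dtype, so exact equality fails even when the values are identical, while a kind-based check still passes.

```python
import numpy as np

# Same instant, stored at two different resolutions.
us = np.array(["2024-01-01T12:00:00"], dtype="datetime64[us]")
ns = np.array(["2024-01-01T12:00:00"], dtype="datetime64[ns]")

# Exact dtype equality is unit-sensitive, so this comparison is False.
exact_match = us.dtype == ns.dtype

# Both dtypes share kind "M" (datetime), regardless of unit.
same_kind = us.dtype.kind == ns.dtype.kind == "M"

# NumPy compares across units, so the values themselves agree.
same_values = bool((us == ns).all())

print(exact_match, same_kind, same_values)
```

Any assertion written as `arr.dtype == np.dtype("datetime64[ns]")` would therefore break as soon as an upstream library starts inferring a different unit.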

Likely Cause

  • CI environment pulled newer NumPy and/or Pandas versions (e.g. Pandas 3.x, as indicated by Pandas4Warning)
  • Upstream defaults or inference behavior changed
  • Arkouda currently assumes narrower dtype sets or exact dtype matches

Proposed Fix

  1. Normalize datetime and timedelta inputs

    • Cast all datetime64[*] → datetime64[ns]
    • Cast all timedelta64[*] → timedelta64[ns]
    • Use kind-based checks instead of exact dtype equality
  2. Broaden string dtype acceptance

    • Accept NumPy unicode (U), bytes (S), object-of-str, and pandas StringDtype
    • Convert consistently to Arkouda string
  3. Align pandas Index construction

    • Prefer pd.Index(data) and allow pandas to infer dtype
    • Avoid forcing StringDtype unless explicitly required
  4. Validate numeric_only arguments

    • Ensure only bool | None are accepted
    • Normalize numpy.bool_ to Python bool
  5. Optional stability improvement

    • Pin NumPy/Pandas versions in CI to avoid silent behavior changes
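
Items 1, 2, and 4 above could look roughly like the sketch below. It uses only NumPy; the helper names (`normalize_temporal`, `is_string_like`, `validate_numeric_only`) and the error message are hypothetical and are not existing Arkouda functions. pandas StringDtype handling and the actual conversion to Arkouda strings are left out.

```python
import numpy as np


def normalize_temporal(values):
    """Cast datetime64/timedelta64 arrays of any unit to nanoseconds.

    Hypothetical helper: other dtypes are returned unchanged.
    """
    arr = np.asarray(values)
    if arr.dtype.kind == "M":  # any datetime64[*]
        return arr.astype("datetime64[ns]")
    if arr.dtype.kind == "m":  # any timedelta64[*]
        return arr.astype("timedelta64[ns]")
    return arr


def is_string_like(values):
    """Return True for NumPy unicode (U), bytes (S), or object-of-str arrays.

    Hypothetical helper: an empty object array is treated as string-like.
    """
    arr = np.asarray(values)
    if arr.dtype.kind in ("U", "S"):
        return True
    if arr.dtype.kind == "O":
        return all(isinstance(v, str) for v in arr.ravel())
    return False


def validate_numeric_only(numeric_only):
    """Accept only bool or None, normalizing numpy.bool_ to Python bool."""
    if numeric_only is None:
        return None
    if isinstance(numeric_only, (bool, np.bool_)):
        return bool(numeric_only)
    # Hypothetical error type and message, chosen for this sketch.
    raise TypeError(
        f"numeric_only must be bool or None, got {type(numeric_only).__name__}"
    )
```

One caveat with the nanosecond cast: datetime64[ns] only covers roughly the years 1678 to 2262, so inputs at coarser resolution that fall outside that range would need separate handling.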

Expected Outcome

  • Restore test stability across CI
  • Make dtype handling robust to upstream NumPy/Pandas changes
  • Reduce future breakage from default or inference shifts
