Open
Description
Summary
Recent CI failures indicate dtype mismatches and unsupported dtype errors that appear to be triggered by changes in NumPy/Pandas defaults or dependency versions. Code that passed previously now fails due to stricter or different dtype inference, particularly for datetimes, strings, and pandas Index dtypes.
Observed Failures
- `datetime64[us]` vs `datetime64[ns]` mismatches
- Errors like:
  - `ValueError: dtype datetime64[us] is unsupported`
  - `ValueError: dtype str is unsupported`
- Pandas test failures where index dtypes differ:
  - `object` vs `StringDtype`
- Groupby aggregation error:
  - `numeric_only accepts only Boolean values`
Likely Cause
- CI environment pulled newer NumPy and/or Pandas versions (e.g. Pandas 3.x, as indicated by `Pandas4Warning`)
- Upstream defaults or inference behavior changed
- Arkouda currently assumes narrower dtype sets or exact dtype matches
Proposed Fix
- Normalize datetime and timedelta inputs
  - Cast all `datetime64[*]` → `datetime64[ns]`
  - Cast all `timedelta64[*]` → `timedelta64[ns]`
  - Use kind-based checks instead of exact dtype equality
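A minimal sketch of the datetime/timedelta normalization, assuming inputs can be converted with `np.asarray`. The helper name `normalize_temporal` is hypothetical and not part of Arkouda's API; values outside the range representable at nanosecond resolution (roughly years 1678 to 2262) are not handled here.

```python
import numpy as np


def normalize_temporal(values):
    """Return `values` as an ndarray, casting datetime/timedelta data to ns.

    Hypothetical helper for illustration only, not an Arkouda function.
    """
    arr = np.asarray(values)
    # dtype.kind is "M" for every datetime64 unit and "m" for every
    # timedelta64 unit, so the check does not depend on the resolution.
    if arr.dtype.kind == "M":
        return arr.astype("datetime64[ns]")
    if arr.dtype.kind == "m":
        return arr.astype("timedelta64[ns]")
    # Anything else is passed through unchanged.
    return arr


micros = np.array(["2024-01-01T00:00:00.000001"], dtype="datetime64[us]")
print(normalize_temporal(micros).dtype)  # datetime64[ns]
```

Checking `dtype.kind` rather than comparing against `datetime64[ns]` directly means a change in the unit that NumPy or pandas infers by default would no longer reach the unsupported-dtype branch.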
- Broaden string dtype acceptance
  - Accept NumPy unicode (`U`), bytes (`S`), object-of-str, and pandas `StringDtype`
  - Convert consistently to Arkouda `string`
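One way the broader string check could look, as a sketch. Both `is_string_like` and `to_unicode_array` are hypothetical names; the second only produces a NumPy unicode array as an intermediate form and leaves the actual conversion to an Arkouda string type out of scope. Missing values and non-UTF-8 bytes are assumed absent.

```python
import numpy as np
import pandas as pd


def is_string_like(values):
    """Hypothetical predicate: True if `values` holds string data."""
    dtype = getattr(values, "dtype", None)
    if dtype is None:
        values = np.asarray(values)
        dtype = values.dtype
    # pandas StringDtype, regardless of storage backend
    if isinstance(dtype, pd.StringDtype):
        return True
    if isinstance(dtype, np.dtype):
        # NumPy unicode ("U") and bytes ("S")
        if dtype.kind in ("U", "S"):
            return True
        # object arrays qualify only if every element is a str
        if dtype.kind == "O":
            return all(isinstance(v, str) for v in values)
    return False


def to_unicode_array(values):
    """Hypothetical converter to a NumPy unicode array (assumes UTF-8 bytes)."""
    arr = np.asarray(values)
    if arr.dtype.kind == "S":
        return np.char.decode(arr, "utf-8")
    return arr.astype(str)


print(is_string_like(np.array(["a", "b"], dtype=object)))  # True
print(is_string_like(np.array([1, 2])))                    # False
```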
- Align pandas Index construction
  - Prefer `pd.Index(data)` and allow pandas to infer dtype
  - Avoid forcing `StringDtype` unless explicitly required
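As an illustration of the Index item, the sketch below builds an index with inferred dtype and compares it to explicitly typed ones by value only. The helper `same_labels` is hypothetical, and the comparison assumes no missing values (NaN never compares equal to itself).

```python
import pandas as pd


def same_labels(left, right):
    """Hypothetical check: equal values in the same order, dtype ignored."""
    return list(left) == list(right)


labels = ["a", "b", "c"]

# Let the installed pandas choose the dtype for string data.
inferred = pd.Index(labels)

# Explicitly typed variants, as a test might construct them.
as_object = pd.Index(labels, dtype=object)
as_string = pd.Index(labels, dtype="string")

print(same_labels(inferred, as_object))  # True
print(same_labels(inferred, as_string))  # True
```

Comparing by value keeps such a test independent of whether the installed pandas infers `object` or a string dtype for `inferred`.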
- Validate `numeric_only` arguments
  - Ensure only `bool | None` are accepted
  - Normalize `numpy.bool_` to Python `bool`
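The validation could be sketched as follows. `normalize_numeric_only` is a hypothetical helper, and raising `TypeError` is an assumption; the exception type the project would actually want is not stated in this issue.

```python
import numpy as np


def normalize_numeric_only(numeric_only):
    """Hypothetical helper: return a Python bool or None, else raise."""
    if numeric_only is None:
        return None
    # np.bool_ is not a subclass of Python bool, so it is listed explicitly.
    if isinstance(numeric_only, (bool, np.bool_)):
        return bool(numeric_only)
    raise TypeError(
        f"numeric_only must be bool or None, got {type(numeric_only).__name__}"
    )


print(normalize_numeric_only(np.True_))  # True
print(normalize_numeric_only(None))      # None
```

Integers such as `1` are rejected on purpose: `isinstance(1, bool)` is `False`, so only genuine boolean values pass through.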
- Optional stability improvement
  - Pin NumPy/Pandas versions in CI to avoid silent behavior changes
Expected Outcome
- Restore test stability across CI
- Make dtype handling robust to upstream NumPy/Pandas changes
- Reduce future breakage from default or inference shifts