16 Dec 21:40

ajpotts

a7a063a

Release Notes v2025.12.16 Latest

Arkouda v2025.12.16

This release continues Arkouda’s push toward full NumPy and pandas compatibility, with major progress on multi-dimensional arrays, pandas ExtensionArray support, distributed performance, and developer tooling cleanup.

Supported environments and dependencies

This release was tested in CI with the following language versions:

Python: 3.10, 3.11, 3.12, 3.13
Chapel: 2.4.0, 2.5.0, 2.6.0

Notable dependency requirements

Runtime dependencies include:

NumPy ≥ 2.0
pandas ≥ 1.4.0, excluding 2.2.0 (!= 2.2.0)
pyarrow ≥ 6.0.1, < 21.0.0
tables (PyTables) ≥ 3.10.0
h5py ≥ 3.7.0
typeguard pinned to 2.10.0

For the full list of dependencies (including optional dev tools such as pytest, Sphinx, and linters), see pyproject.toml.
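
As a rough illustration of how the constraints above combine, the sketch below checks a plain `X.Y.Z` version string against each package's bounds. The `CONSTRAINTS` table and the `parse` and `satisfies` helpers are hypothetical names written for this example and are not part of Arkouda. Real installers apply the fuller PEP 440 rules, which also cover pre-releases and other cases this sketch ignores.

```python
import operator

# Hypothetical table transcribed from the runtime dependency list above.
CONSTRAINTS = {
    "numpy": [(">=", "2.0")],
    "pandas": [(">=", "1.4.0"), ("!=", "2.2.0")],
    "pyarrow": [(">=", "6.0.1"), ("<", "21.0.0")],
    "tables": [(">=", "3.10.0")],
    "h5py": [(">=", "3.7.0")],
    "typeguard": [("==", "2.10.0")],
}

OPS = {">=": operator.ge, "<": operator.lt, "!=": operator.ne, "==": operator.eq}


def parse(version):
    """Turn '3.10.0' into (3, 10, 0); pad to three fields so '2.0' equals '2.0.0'."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(package, version):
    """Return True if `version` meets every constraint listed for `package`."""
    v = parse(version)
    return all(OPS[op](v, parse(bound)) for op, bound in CONSTRAINTS[package])
```

Comparing integer tuples rather than raw strings matters here: as strings, "3.9.2" would sort after "3.10.0", while as tuples (3, 9, 2) is correctly below (3, 10, 0).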

Highlights

Multi-Dimensional Array Expansion

Multi-dimensional support is now significantly more complete across the API:

Multi-dimensional support added to or enhanced:
- ak.uniform, ak.poisson (#4956, #5023)
- ak.value_counts (#4962)
- ak.matmul, including fixes for correctness and stability (#5009, #5027, #5042)
Fixed Chapel instantiation limits for 3+ dimensions (#4227)
Reorganized broadcasting logic and internals (#4978, #4737)

Distributed Performance & Algorithms

New repartitionByHash API for distributed workflows (#4500)
Adopted Chapel standard sort for distributed sorting (#5039)
Refactored FeistelShuffle into innerArray for better performance (#5069)
Performance improvements to cumSum / cumProd (#4810)
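
To give a feel for what a hash-based repartition does, the sketch below assigns each key to one of `num_locales` buckets by hashing it, so that equal keys always end up in the same bucket. This is a generic plain-Python illustration, not Arkouda's Chapel implementation. The function name `repartition_by_hash`, the `num_locales` parameter, and the choice of `zlib.crc32` as the hash are all assumptions made for the example.

```python
import zlib


def repartition_by_hash(keys, values, num_locales):
    """Group (key, value) pairs into num_locales buckets by a hash of the key."""
    buckets = [[] for _ in range(num_locales)]
    for key, value in zip(keys, values):
        # crc32 is deterministic across runs, unlike Python's salted str hash.
        dest = zlib.crc32(str(key).encode("utf-8")) % num_locales
        buckets[dest].append((key, value))
    return buckets


keys = ["a", "b", "a", "c", "b", "a"]
values = [1, 2, 3, 4, 5, 6]
buckets = repartition_by_hash(keys, values, num_locales=3)
```

Because the destination depends only on the key, a later per-key operation (such as a group-by) can run on each bucket without further data movement.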

pandas Integration & ExtensionArray Progress

New Arkouda accessor for pandas Index (#5074, #5110)
pandas DataFrame accessor for Arkouda (#4983)
Renamed ArkoudaBaseArray → ArkoudaExtensionArray (#5001)
ExtensionArray API improvements: _from_sequence (#5078), copy (#5076), argsort (#4993)
Refactored factorize to remove pandas dependency (#4940)
Registered extension dtypes (#4946)

Developer Experience & CI Modernization

CI updated to support Chapel 2.6; dropped Chapel 2.0–2.3 (#4986, #4991)
Automated CI build improvements (#4892, #4893)
Improved Makefile structure and debug ergonomics (#5128, #5133)
Configurable compiled Arkouda dimensionality via make (#5091)
Updated Arrow / Parquet handling, including Arrow <19 compatibility (#5146, #5164)

Tooling Cleanup & Code Quality

Removed isort, darglint, and pydocstyle (#5060, #5072)
Reduced ruff ignores and resolved formatting issues (#4979, #4980, #4982)
Fixed mypy issues and improved type precision (#5093)
Removed deprecated tests and legacy code paths (#5031)

Bug Fixes & Correctness

Fixed edge-case failures for small sizes (size <= 10) (#5054, #5052, #5045)
Fixed ak.array negative number handling (#4984)
Fixed concatenate(axis=1) behavior (#5030)
Fixed CSV parsing for quoted and multiline records (#5080)
Improved numerical consistency with NumPy (allclose) (#2956)
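
For reference, NumPy documents `allclose` as the element-wise test |a - b| <= atol + rtol * |b|, reduced over all elements. The sketch below restates that rule in plain Python for equal-length sequences of floats. It is an illustration of the NumPy rule, not Arkouda's implementation, and it omits broadcasting; which parameters `ak.allclose` accepts is not stated in these notes.

```python
import math


def allclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    """Return True if every pair satisfies |x - y| <= atol + rtol * |y|."""

    def close(x, y):
        if math.isnan(x) or math.isnan(y):
            # NaNs only match each other, and only when equal_nan is set.
            return equal_nan and math.isnan(x) and math.isnan(y)
        if math.isinf(x) or math.isinf(y):
            # Infinities match only when they are equal (same sign).
            return x == y
        return abs(x - y) <= atol + rtol * abs(y)

    return len(a) == len(b) and all(close(x, y) for x, y in zip(a, b))
```

Note that the test is asymmetric: the relative tolerance scales with the second argument, so `allclose(a, b)` and `allclose(b, a)` can differ near the tolerance boundary.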

Full Changelog
v2025.09.30...v2025.12.16

Auto-Generated Release Notes

What's Changed

Closes #4973: Benchmark for segPdarrayIndex by @1RyanK in #4974
Adds multi-dim capability to the uniform random number generator by @drculhane in #4956
Resolves Issue 4810, cumSum and cumProd performance by @drculhane in #4819
Closes #4949: x.take by @ajpotts in #4950
Closes #4102: pdaarraymanipulation module needs unit tests by @ajpotts in #4977
remove D104 ignore code by @ajpotts in #4980
Closes #2689: Value mismatch with numpy on int(arr) // -float(arr) by @1RyanK in #4976
Closes #4902: Resolve E203 errors by @ajpotts in #4979
Closes #4981: Resolve D200 Errors by @ajpotts in #4982
Improve CI to automate builds (Part 2) by @jabraham17 in #4892
improve extension array base class by @ajpotts in #4927
Closes #4954: versioneer.py reference setup.cfg which was removed. by @ajpotts in #4955
Drop Chapel 2.0-2.3 from the CI build images and add 2.6 by @jabraham17 in #4986
Closes #4962, adds multi-dim to value_counts by @drculhane in #4988
Improve CI to automate builds (Part 3) by @jabraham17 in #4893
Reorganizes the broadcast functions in numpy and array_api by @drculhane in #4978
Closes #4994: Use innerArray record in Repartion.chpl by @1RyanK in #4995
Fix docs build by @jabraham17 in #4998
Part 3 of: #4348 Remove DAR103 errors by @ajpotts in #4992
Closes #5001: rename ArkoudaBaseArray to ArkoudaExtensionArray by @ajpotts in #5007
Closes #4984: ak.array with negative numbers still has problems by @1RyanK in #4985
Closes #4993: ArkoudaArray.argsort by @ajpotts in #4999
Closes #4227: Hitting Chapel's default instantiation limit when compiling for 3 or more dimensions by @1RyanK in #5002
Add chapel 2.6 to CI by @jabraham17 in #4991
Closes #4500: Create a repartitionByHash function by @1RyanK in #5005
Closes 4737 - renames the registered name broadcast in BroadcastMsg.chpl to gbbroadcast by @drculhane in #5014
Closes #2956: Implement allclose function by @jaketrookman in #5000
Add numNodes to server config to support colocale-based runs by @e-kayrakli in #5021
Closes #4940: Refactor factorize to avoid pandas reference by @ajpotts in #4941
Adds multi-dimensionality to ak.matmul to match numpy by @drculhane in #5009
Part 1 of #5008 upgrade to typeguard 4.3.0 by @ajpotts in #5012
Adds a fix to ak.matmul multi-dim by @drculhane in #5027
Temporarily disable pytables installation test in the CI by @ajpotts in #5036
fix some formatting errors identified by the pre-commit config by @ajpotts in #5034
Closes #5031: remove deprecated tests by @ajpotts in #5032
Aligns ak.poisson's handling of lam and size to that of numpy by @drculhane in #5023
Closes 5040, fixing a bug in multi-dim matrix multiplication by @drculhane in #5042
Closes #5054: test_set_jth when size==10 by @ajpotts in #5055
Fix sort benchmark names in benchmark v2 by @e-kayrakli in #5037
Closes #5028: Fix a small issue in interpretAsBytes by @1RyanK in #5029
Closes #5052: test_randint_array_dtype_multi_dim fails when size==10 by @ajpotts in #5053
Closes #5030: concatenate gives unexpected zeros when axis=1 by @1RyanK in #5066
Closes #5064 ak.cast overloads for more precise type checking by @ajpotts in #5065
Closes #5045: test_is_sorted fails when size<=10 by @ajpotts in #5046
Closes #5056: GroupBy.min to handle segment of all nans in skipNaN mode by @ajpotts in #5057
Closes #5049: Errors from pdarrayclass.del at end of unit test runs. by @ajpotts in #5051
More ruff ignore codes by @ajpotts in #5047
Omit runs folder from pre-commit checks by @ajpotts in #5048
Closes #5072: remove darglint and pydocstyle by @ajpotts in #5073
Closes #5062: ak.array overloads for more precise type checking by @ajpotts in #5063
Closes #5067: ak.generic_msg overloads for more precise type checking by @ajpotts in #5068
Closes #4569: Simplify and extend logic in doBigIntBinOpvv and doBigIntBinOpvvBoolReturn by @1RyanK in #4593
Closes #5060 remove isort by @ajpotts in #5061
Closes #5069: Investigate moving FeistelShuffle from list Repartition to innerArray by @1RyanK in #5070
Closes #5093: Fix mypy problems by @1RyanK in #5094
Closes #5078: ArkoudaExtensionArray._from_sequence by @ajpotts in #5079
Cl...

Contributors

e-kayrakli, ajpotts, and 5 other contributors

30 Sep 19:54

ajpotts

v2025.09.30

e077dc4

Release Notes v2025.09.30

Release Notes

This release introduces several major new features, performance improvements, and bug fixes across Arkouda’s Python and Chapel codebases.
Highlights include the new pandas ExtensionArray implementation, expanded random number generation features, and improvements to Parquet I/O performance.

Supported environments and dependencies

This release was tested in CI with the following language versions:

Python: 3.10, 3.11, 3.12, 3.13
Chapel: 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.4.0, 2.5.0

Notable dependency requirements

Runtime dependencies include:

NumPy ≥ 2.0
pandas ≥ 1.4.0, excluding 2.2.0 (!= 2.2.0)
pyarrow ≥ 6.0.1, < 21.0.0
tables (PyTables) ≥ 3.10.0
h5py ≥ 3.7.0
typeguard pinned to 2.10.0

For the full list of dependencies (including optional dev tools such as pytest, Sphinx, and linters), see pyproject.toml.

Major Changes

Implemented pandas ExtensionArray for Arkouda (Closes #4597, #4907, #4876, #4947) by @ajpotts

Added ak.rand to match np.random.rand (Closes #4736) by @drculhane

Added ak.shares_memory function (Closes #3284) by @ajpotts

Added ak.errstate context manager for error handling (Closes #3286) by @ajpotts

Added ak.Index.sort_values (Closes #3177) by @ajpotts

Added ak.fabs (Closes #4921) by @1RyanK

Added ascending argument to ak.argsort (Closes #4782) by @ajpotts

Improved Parquet read performance, especially for multiple column reads (Closes #4906) by @e-kayrakli

Enabled multi-dim output for ak.random.standard_exponential (Closes #4924) by @drculhane

Added destructors for Chapel-side and Python-side RNGs (Closes #4898) by @drculhane
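
The `ascending` argument to `ak.argsort` listed above selects the sort direction of the returned indices. The plain-Python sketch below shows the general idea on a list. It is not Arkouda's implementation, and how Arkouda orders tied values in descending mode is not specified in these notes.

```python
def argsort(values, ascending=True):
    """Return the indices that would sort `values` in the requested direction."""
    # sorted() is stable, and reverse=True keeps tied items in original order.
    return sorted(range(len(values)), key=lambda i: values[i], reverse=not ascending)


data = [30, 10, 20]
ascending_idx = argsort(data)                    # [1, 2, 0]
descending_idx = argsort(data, ascending=False)  # [0, 2, 1]
```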

Minor Changes

Expanded axis validation standardization across array API functions (Closes #4831, #4858, #4909, #4932) by @drculhane

Improved docstrings (Closes #3941, #3942, #4852, #4849, #4853, #4947) by @ajpotts, @1RyanK

Added global seed support for reproducibility (Closes #4777, #4726) by @drculhane

Improved shuffle benchmark with Feistel and alternatives (Closes #4818, #4845, #4787) by @1RyanK

Improved benchmark framework (Closes #4811, #4814, #4808, #4816, #4856) by @ajpotts

Added pytest-benchmark dependency (Closes #4821) by @jabraham17

Improved CI builds: Chapel 2.5 support, automated builds, Dockerfile fixes (Closes #4783, #4891, #4910, #4908) by @jaketrookman, @jabraham17

Added pyproject.toml for modern packaging (Closes #4209) by @ajpotts

Refined multi-dim build to reduce size (Closes #4791) by @ajpotts

Improved nbytes handling for bigint arrays (Closes #4850, #4896) by @1RyanK

Improved command registration (Closes #4953) by @e-kayrakli
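
The Feistel-based shuffle mentioned in the benchmark entry above can be pictured as a keyed bijection on indices: each index is scrambled by a small Feistel network, and results that fall outside the valid range are scrambled again ("cycle walking") until they land inside it. The sketch below is a generic illustration of that technique, not Arkouda's FeistelShuffle code. The SHA-256 round function, the round count, and all function names are assumptions chosen for the example.

```python
import hashlib


def _round_fn(value, round_no, seed, half_bits):
    """Hash (seed, round, value) down to half_bits pseudo-random bits."""
    data = f"{seed}:{round_no}:{value}".encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << half_bits) - 1)


def feistel_index(i, n, seed=0, rounds=4):
    """Map index i in [0, n) to a unique index in [0, n)."""
    # Use an even total bit width so the value splits into two equal halves.
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1
    x = i
    while True:
        left, right = x >> half_bits, x & mask
        for r in range(rounds):
            # Each round is invertible, so the whole network is a bijection.
            left, right = right, left ^ _round_fn(right, r, seed, half_bits)
        x = (left << half_bits) | right
        if x < n:  # cycle-walk until the result is back inside [0, n)
            return x


def feistel_shuffle(items, seed=0):
    """Return a permuted copy of `items` determined entirely by `seed`."""
    out = [None] * len(items)
    for i, item in enumerate(items):
        out[feistel_index(i, len(items), seed)] = item
    return out
```

One appeal of this approach for distributed settings is that each destination index is computed independently from the source index and the seed, with no shared state.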

Bug Fixes

Fixed ak.where for Categorical (Closes #4881) by @1RyanK

Fixed ak.randint behavior for bool (Closes #4872) by @1RyanK

Fixed conversion of numpy bigint zeros producing empty arrays (Closes #4884) by @1RyanK

Fixed cumsum vs cumulative_sum typo (Closes #4804) by @drculhane

Fixed handling of size/shape in ak.random.poisson (Closes #4916) by @drculhane

Fixed common type promotion in concat and stack (Closes #4889) by @drculhane

Fixed benchmark issues: average rate always zero, array_transfer.dat not populating, io_benchmark parsing (Closes #4824, #4863, #4862) by @ajpotts

Fixed doc build failures with Chapel 2.5.0 (Closes #4838) by @ajpotts

Fixed clang bitshift issue (Closes #4894) by @1RyanK

Fixed MaxArrayDims incorrectness (Closes #4565) by @1RyanK

Fixed negative server return values in rare cases (Closes #4157) by @ajpotts

Fixed intermittent test failures (test_set_uint) (Closes #4153) by @ajpotts

Fixed delGeneratorMsg bug (Closes #4933) by @ajpotts

Fixed PT003, T201, E127, Flake8 errors (Closes #4806, #4874, #4903, #4871) by @ajpotts

Fixed doctest failures in random and client modules (Closes #4798, #4860) by @ajpotts, @drculhane

Auto-generated release notes

What's Changed

Closes 4801, adds an assert to a line in tests/numpy/random_test.py by @drculhane in #4802
Closes 4804, fixing a "cumsum" vs "cumulative_sum" typo that was in PR 4755 by @drculhane in #4805
Closes #4597: implement pandas ExtensionArray for arkouda by @ajpotts in #4598
Closes #4806: Fix PT003 error by @ajpotts in #4807
Adds ak.rand function to match np.random.rand by @drculhane in #4736
Closes 4726, standardizing use of seed in unit tests by @drculhane in #4748
Closes #4808: Expand pytest.N usage in benchmarks_v2 by @ajpotts in #4809
Add pytest-benchmark dependency to setup.py by @jabraham17 in #4821
Closes #4816: sort benchmark_v2/datdir/configs/field_lookup_map.json… by @ajpotts in #4817
Closes #3284: shares_memory by @ajpotts in #4823
Closes #4818: Shuffle benchmark by @1RyanK in #4825
Closes #4798: doctest for random.generator module by @ajpotts in #4799
Closes #4791: trim down multi-dim build by @ajpotts in #4792
Fixes #4838: make doc fails when building with chapel 2.5.0 by @ajpotts in #4839
Closes #4811: parse results for benchmark_v2/split_benchmark.py by @ajpotts in #4813
Closes #4814: Refactor benchmarks/multiIO.py to new benchmark framework by @ajpotts in #4815
Closes 4828, fixes some parametrizations by @drculhane in #4829
Fixes #4824: In io_benchmark.py read Average rate is always zero by @ajpotts in #4826
Closes #4782: add ascending argument to argsort by @ajpotts in #4784
Closes #4787: Shuffling alternative by @1RyanK in #4789
Closes 4831, standardizes axis checking for cases where axis can only be an integer by @drculhane in #4834
update CI to use chapel 2.5 by @jaketrookman in #4783
Closes #4845: Add Feistel to Shuffle Benchmark by @1RyanK in #4846
Improved docstrings in client module by @ajpotts in #4849
Closes #4853: Problem in shuffle docstring by @1RyanK in #4854
Closes #4860: remove apollo skips on doctest unit tests by @ajpotts in #4861
Closes #4856: optional benchmark_v2 submodule by @ajpotts in #4859
Flake8 errors by @ajpotts in #4871
Closes #3941: ak.hist_all docs example produces an error by @ajpotts in #4866
Closes #4874 T201 errors by @ajpotts in #4875
Closes #4863: 16-array Average rate always returns zero under benchm… by @ajpotts in #4865
Part 2 of #4348: Remove the DAR103 errors from the docstrings by @ajpotts in #4852
Closes 4777, creating a global seed by @drculhane in #4848
Closes #4862: array_transfer.dat not populating with benchmark_v2 by @ajpotts in #4864
Closes #4850: nbytes is meaningless for bigint arrays by @1RyanK in #4851
Incorporates standard for axis validation to array_api/manipulation_functions.py by @drculhane in #4858
Closes 4894: clang issue with bitshifts by @1RyanK in #4895
Closes #4896: Update calc_num_bytes for bigint by @1RyanK in #4897
Improve CI to automate builds (Part 1) by @jabraham17 in #4891
Closes #3286: errstate by @ajpotts in #4886
Implements destructors for chapel-side and python-side rngs by @drculhane in #4898
Fix push for build-CI-container by @jabraham17 in #4908
Closes #4872: ak.randint doesn't behave the same for bool as numpy by @1RyanK in #4873
Closes #4876 improve extension array api take functions by @ajpotts in #4882
Fixes common type promotion in array_api concat and stack by @drculhane in #4889
Closes #4903: Resolve E127 Errors by @ajpotts in #4904
Closes #3177 Index.sort_values by @ajpotts in #3235
Fixes handling of size/shape parameter in ak.random.poisson by @drculhane in #4916
Closes #4921: fabs by @1RyanK in #4922
Standardizes axis validation and handling in array_api/statistical_functions.py by @drculhane in #4909
Fix paths in new CI image Dockerfile by @jabraham17 in #4910
Closes #4836: --saveU...

Contributors

e-kayrakli, ajpotts, and 4 other contributors

20 Aug 17:22

ajpotts

v2025.08.20

147a08b

Release Notes v2025.08.20

Introduction

This release delivers a mix of new functionality, performance improvements, infrastructure updates, and ongoing work to align Arkouda more closely with NumPy and modern Python standards.

Highlights include:

New array operations and utilities (Strings.argsort, Categorical.argsort, isnumeric, deepcopy for ak.array, max_bits_list, and a new searchsorted implementation).
Major system-level improvements such as MergeShuffle, repartitionByLocale, enhanced checkpointing (including a server heartbeat and bigint array support), and better configuration utilities.
Expanded test coverage and benchmarking, with many benchmarks refactored for maintainability and consistency.
Significant documentation work: missing docstrings filled in, doctests added, and adoption of NumPy-style docstring conventions with ruff-based linting.
CI and infrastructure updates to improve reliability, including fixes for intermittent failures, expanded multi-dimensional test support, and branch migration from master to main.
A number of important bug fixes addressing auto-checkpointing, Arrow dependency compatibility, type hinting, and CI stability.

Together, these changes improve Arkouda’s stability, usability, and developer experience, while continuing to advance its alignment with NumPy semantics.

Major Changes

Improvements to checkpointing of server state:
- Add heartbeat on the server (#4750, PR #4750)
- Checkpoint pdarrays of bigints (#4771, PR #4771)
- ak.get_registration_config() and tweaks (#4752, PR #4752)
Add MergeShuffle option to shuffle (#4075, PR #4094)
Create a repartitionByLocale function (#4498, PR #4647)
Create a max_bits_list function (#4621, PR #4622)
Add deep copying to ak.array (#4691, PR #4741)
Add isnumeric function (#2915, PR #4694)
Add Strings.argsort (#4642, PR #4643)
Add Categorical.argsort (#4724, PR #4727)
Improve performance of searchsorted (#4656, PR #4656)
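
For readers unfamiliar with `searchsorted`, the sketch below shows the NumPy-style semantics using Python's standard `bisect` module: for each query it returns the position at which the value could be inserted while keeping the array sorted. This illustrates the semantics on a single list only. It says nothing about how Arkouda's distributed implementation works, and the function shown is a stand-in written for this example.

```python
from bisect import bisect_left, bisect_right


def searchsorted(sorted_values, queries, side="left"):
    """Return insertion points for each query in an already-sorted list."""
    # side="left" inserts before equal values; side="right" inserts after them.
    find = bisect_left if side == "left" else bisect_right
    return [find(sorted_values, q) for q in queries]


haystack = [1, 2, 2, 3, 5]
left_positions = searchsorted(haystack, [2, 4])                 # [1, 4]
right_positions = searchsorted(haystack, [2, 4], side="right")  # [3, 4]
```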

Minor Changes

Benchmarks

Updates/refactors to: encoding, no_op, str-locality, split, setops_multiarray, scan, substring_search, flatten, reduce, io, in1d, bigint_conversion, csvIO, sort_cases, setops, groupby, parquet-fixed-strings, str_locality (#3567, #3573, #4670, #4672, #4679, #3575, #3581, #3568, #3574, #3572, #3571, #4675, #4684, #3578, #3577, #3570, #4682, #3579, PR #4610, PR #4662, PR #4671, PR #4673, PR #4680, PR #4664, PR #4665, PR #4611, PR #4663, PR #4652, PR #4651, PR #4676, PR #4685, PR #4650, PR #4646, PR #4613, PR #4683, PR #4712)
Reformat benchmark results.py to parse all benchmark results (#4606, PR #4751)
Create a benchmark for ak.find (#4743, PR #4745)
Strip out correctness_only mode from benchmarks (#4706, PR #4719)
Create a pytest.N for benchmark_v2/conftest.py (#4704, PR #4735)

Documentation

Adds missing docstrings across multiple modules (match, matcher, index, groupby, alignment, categorical, client dtypes, dataframe, logger, installers.py) (#4639, #4640, #4635, #4634, #4629, #4630, #4631, #4632, #4638, #4530, PR #4639, PR #4640, PR #4635, PR #4634, PR #4629, PR #4630, PR #4631, PR #4632, PR #4638, PR #4531)
Improved docstring rendering for accessor module (#4759, PR #4762)
Add doctest to series module (#4270, PR #4761)
Fix doc duplication issue for reorged modules (#4372, PR #4757)
Switch pydocstyle to NumPy convention (#4518, PR #4518)
Add ruff docstring linting (#4763, PR #4764)
Strip ak.connect from examples (#4779, PR #4780)

CI / Testing / Infra

Set max-parallel for multi-dim tests in CI (#4696, PR #4697)
Reactivate pytest timeout for unit tests (PR #4689)
Parameterize size in test_multi_col_merge (#4713, PR #4714)
Update CI to use a slim build for multi-dim testing (#4778, PR #4778)
Switch branch from master to main + port PRs (#4733, PR #4734, PR #4740)
Skip testing of auto-checkpoints.py on unsupported hardware (PR #4707)

Other

Remove redundant cumsum, align cumulative ops to NumPy (#4749, PR #4755)

Bug Fixes

Fix auto-checkpointing failure (#4700, PR #4700)
Fix mypy 17.0.0 errors (#4715, PR #4716)
Patch Arrow dependency issue (PR #4769)
Fix intermittent CI failures during driver compilation (#4688, PR #4776)
Fix DAR103 errors in docstrings (#4348, PR #4772)
Fix type hint issue with numeric_scalars (#2528, PR #4767)
Temporarily add ignore codes to .pre-commit-config.yaml (#4793, PR #4794)
Deletion of SegString’s values and offsets + more (PR #4773)
Support ak.array for empty strings (#4725, PR #4732)

Auto-generated release notes

What's Changed

Closes #4692: tolist to match numpy by @ajpotts in #4693
Closes #3567: Update encoding_benchmark by @ajpotts in #4610
adds missing docstrings to the match module by @ajpotts in #4639
add missing docstring to the matcher module by @ajpotts in #4640
Closes #3573: Update no_op_benchmark by @ajpotts in #4662
Closes #4677: refactor small-str-groupby by @ajpotts in #4678
Closes #4670: update str-locality benchmark by @ajpotts in #4671
Closes #4672: update split_benchmark.py by @ajpotts in #4673
Closes #4679: refactor setops_multiarray benchmark by @ajpotts in #4680
Fix auto-checkpointing failure by @vasslitvinov in #4700
Closes #3575: Update scan_benchmark by @ajpotts in #4664
Closes #3581: Update substring_search_benchmark by @ajpotts in #4665
Closes #4696: set max-parallel for multi-dim tests in CI by @ajpotts in #4697
Closes #3568: Update flatten_benchmark by @ajpotts in #4611
Revert "Temporarily don't timeout pytest for debugging" by @ajpotts in #4689
Closes #3574: Update reduce_benchmark by @ajpotts in https://github....

Contributors

vasslitvinov, ajpotts, and 4 other contributors

22 Jul 17:53

ajpotts

v2025.07.03

88f2a83

Release Notes v2025.07.03

Arkouda v2025.07.03

We're excited to announce a feature-packed release of Arkouda with enhanced NumPy compatibility, powerful new array functions, performance improvements, CI tooling, and major documentation progress.

Features

Array Functions

Added: append, argsort, diff, eye, newaxis, nextafter, percentile, quantile, repeat, result_type, take, tile, vecdot, xp.trapz
(#2998, #3000, #3003, #3004, #3292, #3755, #4393, #4418, #4419, #4458, #4483, #4484, #4502, PR #4101, PR #4127, PR #4146, PR #4219, PR #4361, PR #4393, PR #4394, PR #4418, PR #4419, PR #4552)
Improved: ak.diff, ak.nextafter, ak.repeat, ak.reshape, ak.take, ak.tile, ak.argsort
(#2998, #3000, #3004, #3755, #4101, #4146, #4147, #4165, #4394, #4418, #4419, #4458, PRs #4101, #4146, #4394, #4552)
Axis and Broadcasting Enhancements:
- Axis support in ak.mean, ak.var, ak.std (#4425, PR #4442)
- Negative axis handling in ak.squeeze, ak.repeat, ak.argmin, ak.argmax (#4407, #4421, PR #4406, PR #4408)
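
Negative axis handling in the NumPy sense means that `axis=-1` refers to the last dimension, `axis=-2` to the one before it, and so on. A minimal sketch of that normalization is shown below. The helper name `normalize_axis` is hypothetical and this is not Arkouda's code; NumPy itself raises its own `AxisError` for out-of-range values, where this sketch uses a plain `ValueError`.

```python
def normalize_axis(axis, ndim):
    """Convert a possibly negative axis into its non-negative equivalent."""
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} is out of bounds for array of dimension {ndim}")
    return axis + ndim if axis < 0 else axis


last_axis = normalize_axis(-1, ndim=3)   # 2
first_axis = normalize_axis(-3, ndim=3)  # 0
```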

Checkpointing and Logging

Introduced experimental checkpointing of server state, with support for numeric arrays and automatic checkpointing triggered by memory limits or idle time.
(#2384, PRs #3915, #4391, #4549, PR #4592, #4644)
Improved logging behavior:
- Logs can now be redirected to a file using the server’s logging mechanism (PR #4152)
- Reduced use of throws in logging routines (PR #4433)

Project Infrastructure

Upgraded Apache Arrow to 19.0.1 for compatibility and stability improvements
(#3981, PRs #3982, #4342, PR #4359)

Other

Introduced ak.apply, ak.result_type (now with bigint support), and ak.searchsorted
(#3005, #4483, #4235, PRs #3963, #4214, PR #4440, #4484)
Added ak.coargsort(ascending=...) keyword argument
(#4464, PR #4467)
Added standard gamma distribution function to ak.random
(#3846, PR #4089)

API Enhancements and Compatibility

Improved NumPy 2.0 compatibility:
- Upgraded numpy dependency to 2.0.0 (#4098, PR #4188, PR #4213)
- Added or aligned: ak.can_cast, ak.sign, ak.result_type, ak.dtype, ak.vecdot, ak.eye, ak.dot, ak.arange, ak.transpose, ak.hstack, ak.where, ak.full, ak.reshape
  (#3329, #3337, #4092, #4165, #4312, #4321, #4555, #4468, PR #4105, PR #4116, PR #4174, PR #4224, PR #4472, PR #4522, PR #4556)
- Improved parameter alignment to NumPy (ak.eye, ak.where, ak.histogram, etc.) (#4096, PR #4078, PR #4482)
- Enabled bool as alias for bool_; enhanced dtype detection for builtins bool, float, int (#4186, #4627, PR #4187, PR #4628)
  (#3337, #3329, #3981, #4092, #4096, #4105, #4116, #4124, #4165, #4186, #4188, #4213, #4224, #4321, #4312, #4468, #4481, #4483, #4501, #4520, #4555, #4556, #4552, #4627; PRs #4078, #4103, #4174, #4185, #4390, #4213, #4505, #4522, #4628)
Reorganized modules into dedicated numpy/, scipy/ directories for API clarity
(PRs #4103, #4185, #4390)
Miscellaneous API additions and improvements:
- ak.coargsort now supports ascending= keyword (#4464, PR #4467)
- comm_diagnostics now returns a DataFrame (#3970, PR #3971)
DataFrame and Merge Improvements
- DataFrame.merge now supports left_on and right_on
  (#4234, PR #4240)
- ak.merge now supports merging on Categorical columns
  (#4313, PR #4344)
- Fixed DataFrame.__getitem__ dispatch behavior during merges
  (#4360, PR #4362)
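
The `left_on` and `right_on` options follow the pandas convention of joining two tables whose key columns have different names. The sketch below shows the idea as a small inner hash join over lists of dictionaries. It is a generic illustration only: the function `merge_inner` and the sample column names are invented for this example, and it makes no claim about Arkouda's defaults or internals.

```python
def merge_inner(left_rows, right_rows, left_on, right_on):
    """Inner-join two lists of dict rows on differently named key columns."""
    # Index the right side by its key so each left row is matched in O(1).
    index = {}
    for row in right_rows:
        index.setdefault(row[right_on], []).append(row)

    joined = []
    for left in left_rows:
        for right in index.get(left[left_on], []):
            # Both key columns are kept, since their names differ.
            joined.append({**left, **right})
    return joined


employees = [
    {"emp": "ann", "dept_id": 1},
    {"emp": "bob", "dept_id": 2},
    {"emp": "cy", "dept_id": 3},
]
departments = [{"id": 1, "dept": "ops"}, {"id": 2, "dept": "dev"}]
result = merge_inner(employees, departments, left_on="dept_id", right_on="id")
```

Rows without a match on the other side (here, the employee with `dept_id` 3) are dropped, as in an inner join.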

Performance Improvements

Improved performance and stability in ak.permutation, distributed array creation, and sorting
(#3974, PRs #3975, #4242)

Deprecations and Refactors

Removed deprecated or obsolete features:
- Removed deprecated functions including lookup function and legacy server utilities
  (#4308, #4375, PRs #4309, #4374, #4376)
- Old registerND annotations removed from remaining modules
  (#3721, #3723, PR #3986)
Refactored and modernized core logic:
- ak.arange now uses instantiateAndRegister (#4382, PR #4383)
- Improved logic for binopvv and binopvs (#4459, #4460, PRs #4462, #4563)
- Reverted ak.zeros behavior to previous default (PR #4141)
- Refactored import and module layout (__init__.py, sort module, CHPL_HOME independence)
  (PRs #3972, #4453, #4551)
Simplified internals and extended platform support:
- Logic cleanup in parse_single_value, HistogramMsg, toSymEntry, and PrivateSpace domain usage
  (#4147, PRs #4150, #4176, #4180, PR #4427, #4633)
Added internal or system-level functionality:
- repartitionByLocaleString and repartitionByHashString server functions
  (#4497, #4499, PRs #4557, #4617)
- Set union function for Strings arrays (#4244, PR #4245)
- Compatibility module for Time.totalMicroseconds() (PR #4142)
- Added missing __all__ to ensure symbol export consistency (#4426, PR #4427)

Benchmark Refactor

Refactored benchmark infrastructure and running mode handling
(#3964, PRs #4358, #4373, #4385, #4471)
Improved and extended benchmark suite:
- Added where benchmark (#4581, [PR #4591](https://github.com/Bears...

Contributors

lydia-duncan, e-kayrakli, and 8 other contributors

13 Jan 16:06

ajpotts

v2025.01.13

a3aa4c3

Release Notes v2025.01.13

Bug Fixes

Issues #3931 and #3933: fixes bug in the Makefile preventing make install-arrow from successfully completing on some systems.
Issue #3947: fixes bug where reshape was failing for a single integer argument.

Major changes

Issues #3939 and #3957: refactors of the Makefile to streamline offline arkouda builds
Issue #3960: creates a comm_diagnostics module for querying comm diagnostic statistics.

Minor changes

Issue #3929: Adds chapel 2.1, 2.2 to the github CI
Issue #3911: minor performance improvement to reduction module
Issues #3881, #3882, and #3872: Completes the refactoring of all functions in EfuncMsg.chpl to the new interface.

Auto-generated release notes

Closes #3929: Add chapel 2.1, 2.2 to CI by @ajpotts in #3930
Part of #3931 bug in make install deps by @ajpotts in #3932
Closes #3939: install-deps to work offline by @ajpotts in #3940
Part 1 of #3933: failing make install-arrow by @ajpotts in #3936
Part 2 of #3933 failing make install arrow by @ajpotts in #3944
Part of #3911 reduction performance improvements by @ajpotts in #3914
Closes 3943 issue with reshape by @drculhane in #3947
Part 3 of #3933: failing make install-arrow by @ajpotts in #3946
Part 2 of #3957: simplify offline builds by @ajpotts in #3959
Part 4 of #3957: simplify offline builds by @ajpotts in #3965
Closes 3881 3882 3872 etc by @drculhane in #3937
Read multiple row groups in Parquet files correctly by @jhh67 in #3950
Revert "Read multiple row groups in Parquet files correctly" by @ajpotts in #3969
Closes #3960: python interface for CommDiagnostics by @ajpotts in #3966

New Contributors

@jhh67 made their first contribution in #3950

Full Changelog: v2024.12.06...v2025.01.13

Contributors

jhh67, ajpotts, and drculhane

07 Dec 02:31

ajpotts

v2024.12.06

9eba2ce

Release Notes v2024.12.06

Bug Fixes

Issue #3870 - fixes bug in reshape for bigint type
Issue #3821 - fixes bug in stridable indexing of Strings in multilocale
PR #3804 - fixes sparseMatToPdarray test failures for distributed arrays
PR #3857 - fixes file location reporting in register-commands.py
Issue #3842 - fixes mypy CI failures

Major changes

PRs #3840, #3877 - adds Sparse Matrix creation from Pdarrays
Issues #3823, #3827 - adds flatten function
Issues #3782, #3851, #3820 - adds flip function
Issue #3300 - adds shape function
Issue #3904 - adds function to return list of all compiled dimensions available
Issues #3886, #3813, #3866, #3855, #3809, and PRs #3874, #3854, #3847, #3845, #3841, #3832, #3799, #3878 - refactor and improve server side message argument handling and convert modules to new framework

Minor changes

Numpy Alignment
- Issues #3868, #3884, #3781 - code reorganization to align with numpy
- Issue #3864 - max and min of bool to return bool to match numpy
- Issue #3714 - pdarray.shape returns a tuple
- Issue #3283 - adds mixed types to work with histogram2d and match the return dtypes with Numpy
Issues #3839, #3560, #3796 - refactor benchmarks to use pytest framework and add to CI.
Issue #3815, PRs #3880, #3812, #3926, #3912, #3802 - unit test improvements
Issues #3902, #3896, #3818, #3883, #3887 - reduce warnings
Issue #3708 - refactors array_api to call functions from arkouda.pdarray_creation
PRs #3814 and #3826 - performance improvements to array function
PR #3862 - updated the hdf5 download link in the Makefile
Issue #3905 - assert_equivalent compares shapes of pdarrays
PR #3818 - improves documentation for LINUX_INSTALL
Issue #3849 - adds SortingAlgorithm enum to __all__ in sorting module

Auto-generated release notes

#3802: sporadic failures of test_assert_frame_equal_check_exact by @ajpotts in #3808
Closes #3796: Add benchmarks to CI by @ajpotts in #3810
Closes #3283: histogram2d between different dtypes by @jaketrookman in #3763
Closes #3811: Roll back test change to determine impact on testing by @bmcdonald3 in #3812
Fix performance regression in array transfer performance by @jeremiah-corrado in #3814
Closes #3782: flip function to match numpy by @ajpotts in #3791
Closes #3815: Disable client_test for nightly due to machine issues by @stress-tess in #3817
small instruction fix by @ItsQuinnMoore in #3818
Closes #3820: bug in flip multi-local by @ajpotts in #3822
remove Commands.chpl from tree by @jeremiah-corrado in #3799
Fix array transfer performance regression by @jeremiah-corrado in #3826
Closes #3714: pdarray.shape should be a tuple by @ajpotts in #3803
Fix sparseMatToPdarray test failures for distributed arrays by @jeremiah-corrado in #3804
Closes #3827: rename flatten to split by @ajpotts in #3828
Fixes #3821: Bug in stridable indexing of Strings in multilocale by @stress-tess in #3830
Closes #3823 flatten function to match numpy by @ajpotts in #3825
Closes 3818 -- eliminates warning messages about tilde vs not by @drculhane in #3829
Refactor SparseMatrixMsg to use automated registration by @jeremiah-corrado in #3832
Closes #3781 move random module to numpy submodule by @ajpotts in #3835
Closes #3842: Fixes mypy CI failures by @stress-tess in #3843
Part of argTypeReductionMessage refactor by @ajpotts in #3845
Refactor arg type reduction message pt2 by @ajpotts in #3847
Closes #3560 Update argsort_benchmark by @ajpotts in #3838
Creating Sparse Matrix from Pdarrays by @ShreyasKhandekar in #3840
Closes #3849: Add SortingAlgorithm enum to __all__ by @stress-tess in #3853
Fixes #3851: Error when running string flip by @stress-tess in #3852
Support where-clause evaluation in registration annotations by @jeremiah-corrado in #3841
Closes #3861: Update hdf5 download link in Makefile by @stress-tess in #3862
Part 3 of argTypeReductionMessage refactor by @ajpotts in #3854
Fix broken error reporting for Chapel 2.0 in register-commands.py by @jeremiah-corrado in #3857
Closes 3809 moves trig and hyp fns to new interface by @drculhane in #3863
Closes #3868: move squeeze functionality to arkouda.numpy. by @ajpotts in #3869
Part of #3708: array_api to call functions from arkouda.pdarray_crea… by @ajpotts in #3758
Fix performance regression in reductions benchmark by @jeremiah-corrado in #3874
Optimize creation of sparrays from pdarrays by @ShreyasKhandekar in #3877
Eliminates duplicates in tests/numpy/numeric_test.py by @drculhane in #3880
LayoutCS deprecation warning fix by @jeremiah-corrado in #3883
Closes #3884: Remove _squeeze function by @ajpotts in #3885
Fix build error caused by #3883 by @jeremiah-corrado in #3887
Closes #3855: refactor boolReductionMsg by @ajpotts in #3876
Closes 3813 and 3866 -- moves several new functions to the new interface (abs, square, all exp and log, isnan, isinf, isfinite) by @drculhane in #3873
Closes #3896 PytestUnknownMarkWarning for pytest.mark.skip_if_nl_grea… by @ajpotts in #3897
Closes #3300: shape function by @ajpotts in #3900
Closes #3886 refactor idx reduction msg by @ajpotts in #3889
Closes #3902: truth value of an empty array DeprecationWarning by @ajpotts in #3903
Closes 3878 - refactors rounding functions to new interface, pulls hash function into their own procs by @drculhane in #3898
Part of #3839 new benchmarks to output performance graph format by @ajpotts in #3894
Closes #3864 max and min of bool to return bool like numpy by @ajpotts in #3901
Closes #3870: bug in reshape for bigint type by @ajpotts in #3907
Closes #3904: function to return list of all compiled dimensions avai… by @ajpotts in #3909
Closes #3905: assert_equivalent to compare shapes of pdarrays by @ajpotts in #3906
Closes #3912: failing unit test test_is_locally_sorted_multidim unde… by @ajpotts in #3913
Closes #3926: OverMemoryLimitError in pdarrayclass_test by @ajpotts in #3927

New Contributors

@ItsQuinnMoore made their first contribution in #3818

Full Changelog: v2024.10.02...v2024.12.06

Contributors

ajpotts, ItsQuinnMoore, and 6 other contributors


03 Oct 00:12

stress-tess

v2024.10.02

a44dd0f

Release Notes v2024.10.02

Bug Fixes

Issue #3762 - Fix dataframe groupby aggregations when keys contain NaNs
Issues #3658, #3650, #3654, #3783, #3784, #3788 and PR #3386 - Fix IO bugs including:
- reading segarrays containing NaNs and empty segments with hdf5 and parquet
- reading dataframes containing uint and int segarray columns
- CSV address sanitizer "use after free" memory issues
Issues #3648, #3676, #3682, #3679, #3687, #3666 - Fix multidimensional bugs in sorting, nonzero, repeat, flatten, and unflatten
Issue #3367 - Fixes race condition in SegHead function
Issue #3468 - Fixes round trip discrepancies for Index with Categorical values
Issue #3649 - Fixes bitshift failures
Issue #3467 - Fixes indexing error in DataFrame instantiation

Major Updates

Issues #3628, #3703 - Drop Python 3.8 support
Issue #3355 - Pins scipy<=1.13.1
Issues #3332, #3334, #3351, #3360, #3417, #3419, #3504, #3613, #3695, #3769, #3767, #3711 and PRs #3363, #3368, #3379 - Parquet optimizations:
- Added fixed length flag for string reads
- Read strings and byte sizes in batches
- Simplified source code
Issues #3336, #3362, #3183, #3364, #3226, #3523, #3278, #3373, #3372, #3627 - Improve random module with a focus on numpy alignment, adding:
- exponential, lognormal, logistic
- multidimensional functionality to Random module
Issues #3294, #3639, #3665, #3709 - Improve testing and add delete function for multidimensional arrays
Issues #3425, #3526, #3632, #3656, #3631, #3718, #3720, #3722, #3771, #3657 and PRs #3345, #3358, #3359, #3371, #3518, #3474, #3521, #3525, #3590, #3606, #3603, #3685, #3672, #3691, #3789, #3773, #3786, #3634, #3671, #3655, #3697 - Refactor and improve server side message argument handling
PRs #3516, #3593, #3745 - Add initial implementation of sparse matrix functionality including matrix multiplication, fill_vals, and to_pdarray

Minor Updates

Issues #2978, #3702 - Strip out ArrayView (replaced by multidimensional pdarray functionality)
Issue #3302 - Adds GroupBy.head
Issue #3326 - Adds DataFrame.assign
Issues #3510, #3511 - Update DataFrame.to_pandas and Series.to_pandas to handle categoricals
Issues #3293, #3428 - Add putmask functionality
Issue #3297 - Adds array_equal
Issue #3742 - Moves numeric module to arkouda.numpy
Issues #3289, #3288, #3291, #3295, #3299, #3298, #3301, #3287, #3296 - Modify dtypes for better numpy alignment
- rename bool to bool_, align with numpy scalar type, remove translate_np_dtype
Issues #3259, #3265, #3267, #3271, #3275, #3269, #3263, #3273, #3261, #3385, #3400, #3403, #3409, #3440, #3445, #3457, #3448, #3452, #3454, #3459, #3461, #3463, #3465, #3407, #3442, #3446, #3405, #3411, #3389, #3212, #3145, #3144, #3143, #3231, #3441, #3447, #3458, #3443, #3462, #3466, #3455, #3444, #3438, #3450, #3460, #3464, #3388, #3430, #3624, #3453, #3413, #3646, #3402, #3439, #3669, #3415, #3421 - Transitions to the new testing suite, including updating make test
Issues #3508, #3748, #3759, #3727, #3378 - Updates documentation including:
- Chapel tutorial, installation docs, and documentation about memory pressure during server builds
Issues #3793, #3798, #3797 and PR #3730 - Updates to benchmarks
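
Several of the entries above (putmask, array_equal, and the bool to bool_ rename) are described as aligning with numpy. The sketch below shows the numpy behaviors being referenced, using numpy only and not Arkouda. That the Arkouda functions mirror these semantics exactly is an assumption, not something these notes confirm.

```python
import numpy as np

# putmask: overwrite, in place, the elements selected by a boolean mask.
a = np.arange(6)
np.putmask(a, a > 2, -1)
masked = a.tolist()  # elements greater than 2 were replaced by -1

# array_equal: true only when both the shapes and all elements match.
same = np.array_equal(np.array([1, 2, 3]), np.array([1, 2, 3]))
different_shape = np.array_equal(np.array([1, 2, 3]), np.array([[1, 2, 3]]))

# bool_ is numpy's boolean scalar type; the built-in bool maps onto it.
bool_scalar_type = np.dtype(bool).type

print(masked, same, different_shape, bool_scalar_type is np.bool_)
```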

Auto-Generated Release Notes

Closes #3308 Unify file permissions by @ajpotts in #3309
Closes #3332: Split Parquet code into multiple files by @bmcdonald3 in #3333
temporary fix for #3355: pin scipy<=1.13.1 to avoid CI failures by @ajpotts in #3356
Closes #3334, #3351: Simplify server side string code and added fixed length by @bmcdonald3 in #3335
Ignore new Parquet object files by @bmcdonald3 in #3363
Closes #3336, #3362: Reuse random number generation loop structure by @stress-tess in #3352
Closes #3259: deprecate test/scipy/scipy_test.py and special_test.py by @ajpotts in #3260
Closes #3265 deprecate tests/numeric_test by @ajpotts in #3266
Closes #3267 deprecate tests/dtypes_test by @ajpotts in #3268
Closes #3271 deprecate tests/index_test by @ajpotts in #3272
Closes #3275 deprecate tests/categorical_test by @ajpotts in #3277
Closes #3360: Reduce code duplication in Parquet read code with templates by @bmcdonald3 in #3361
Closes #3183, #3364: Add exponential distribution and aggregation to random generator loop by @stress-tess in #3310
Simplify Command Map by @jeremiah-corrado in #3345
Adds arkouda.testing module by @ajpotts in #3186
Closes #3269 deprecate tests/datetime_test by @ajpotts in #3270
Remove string.doFormat, replacing with string.format by @jeremiah-corrado in #3365
Closes #3302 GroupBy.head by @ajpotts in #3324
Refactor MessageArgs by @jeremiah-corrado in #3358
Closes Ticket #3263: deprecate tests/dataframe_test by @ajpotts in #3264
Closes #3273 deprecate tests/series_test by @ajpotts in #3274
Remove number of files multiplication for IO benchmark by @bmcdonald3 in #3368
3231 unique unit tests by @drculhane in #3258
Adds missing numpy dtypes by @ajpotts in #3330
Closes #3367 racy condition in SegHead function by @ajpotts in #3369
Closes #3261 deprecate tests/numpy by @ajpotts in #3262
Closes #3375: Cleanup indexof1d code by @stress-tess in #3377
Disable Parquet multi row group test until resolved by @bmcdonald3 in #3379
Closes #3326 DataFrame.assign by @ajpotts in #3327
Refactor SymbolTable and error handling by @jeremiah-corrado in #3359
Resolves #3294 - Add numpy-like delete function by @jeremiah-corrado in #3321
Closes #3281 rename bool to bool_ to match numpy by @ajpotts in #3282
Fixes #3392: Fix mypy CI failures by @stress-tess in #3394
Resolve CSV Asan "use after free" memory issues by @ShreyasKhandekar in #3386
Closes #3376 more numpy imports by @ajpotts in #3381
Closes #3385 groupby_test.py by @ajpotts in #3397
Closes #3400 deprecate alignment_tests.py by @ajpotts in #3401
Closes #3403 deprecate bigint_agg_test.py by @ajpotts in #3404
Closes #3409 deprecate tests/client_dtypes_test.py by @ajpotts in #3410
Closes #3417: Separate Parquet string read code from generic read function by @bmcdonald3 in #3418
Closes #3419: Remove intertwined list column and string column byte calculation logic by @bmcdonald3 in #3420
Closes #3425: Improve Msg Function Registration for Module Tracking by @bmcdonald3 in #3424
Closes #3226: Adds parameterization to test_shuffle and test_permutation by @drculhane in #3320
Closes #3414 deprecate compare_test.py by @ajpotts in #3416
Closes #3293: Add putmask by @drculhane in #3370
add-path modification for building on Horizon by @brandon-neth in #3423
Closes #3405 deprecate tests/bitops_test.py by @ajpotts in #3406
Closes #3411 deprecate tests/client_test.py by @ajpotts in #3412
Automated command registration by @jeremiah-corrado in #3371
Closes #3475 make fails when lib and lib64 directories are both present by @ajpotts in #3503
Closes #3504: Improve Parquet Integration: Stop Using Array Views by @bmcdonald3 in #3505
Remove support for pre-2.0 versions of Chapel by @jeremiah-corrado in #3477
Adds skip configuration for multidimensional histogram test by @brandon-neth in #3506
Updates PROTOs pdarray_creation_test by @drculhane in #3393
Closes #3514-add pandas-stubs to arkouda-env-dev.yml by @ajpotts in #3515
Stop requiring manual installation of chapel-py to register commands by @jeremiah-corrado in #3518
Closes #3407 deprecate tests/check.py by @ajpotts in #34...

Contributors

ajpotts, jabraham17, and 6 other contributors


21 Jun 19:30

stress-tess

v2024.06.21

cf6eeac

Release Notes v2024.06.21

Bug Fixes

Issues #3074, #3234 - Fix bug reading Segarrays from parquet files
Issues #3001, #3185 - Fix broadcast bugs involving nans and Strings
Issue #3156 - Fixes Categorical.sort_values bug
Issues #3311, #3112 - Fix Parquet multi column byte writing and Parquet string column free
Issue #3115 - Fixes non-deterministic sparse_sum failure
Issue #3089 - Avoids out-of-memory crashes caused by "in" argument intents on makeDistArray
Issue #3009 and PRs #3232, #3316 - Improve performance of indexof1d and fix handling of null values
Issues #3158, #3222 - Fix print bugs involving a DataFrame or Series containing a SegArray

Major Updates

PR #3303 - Drops support for Chapel 1.31
Issues #3343, #3346 - Pin numpy < 2.0 and python < 3.12.4
Issue #3148 - Updates IO functions to always return a dictionary
PRs #3238, #3314 and Issue #3347 - Reimplements CSV read to increase performance
Issue #3108 - Adds groupby.sample and dataframe.groupby.sample
Issue #2893 - Changes the behavior of dataframe.GroupBy.count to align with pandas
Issues #3086, #3118, #3245, #3322, #3167 and PRs #3110, #3280 - Add updates to Random module:
- Adds choice, poisson, normal to random number generators
PRs #3242, #3305, #3160, #3223, #3237, #3142 - Improvements to Array API:
- Add documentation for Array API functions
- Add implementations of vstack, clip, diff, pad, and missing stats, search, and sort functions to Array API module
- Compatibility improvements for Xarray chunk-manager
Issues #3213, #3206, #3202, #3208, #3217, #3188 - Add Index and MultiIndex properties:
- Including levels, equals, names, ndim, etc.
Issues #3050, #3192, #3128, #3196, #3198, #3200, #3130, #3123, #3194 - Work on proto tests:
- Improvements to tests for dataframe, dtypes, groupby, io, numeric, symbol_table
- Adds make-proto-tests command and updates our CI to run it
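
The entry above for Issue #2893 says dataframe.GroupBy.count now aligns with pandas but does not spell out the difference. As a point of reference, the sketch below shows what pandas itself does: count reports non-null values per group, while size reports rows per group. It uses pandas only. The column names are made up for illustration, and it is an assumption that this distinction is the behavior the change targets.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" has one missing value.
df = pd.DataFrame({"key": ["a", "a", "b"], "value": [1.0, np.nan, 3.0]})

# count ignores missing values within each group.
counts = df.groupby("key")["value"].count().to_dict()

# size counts every row in each group, missing or not.
sizes = df.groupby("key").size().to_dict()

print(counts, sizes)
```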

Minor Updates

Issues #3006, #3007 - Add median and count_nonzero
Issues #3079, #3080 - Add sum and += for boolean pdarrays
PRs #3221, #3211 - Add NYC taxi tutorial from CUG 2024

Auto-Generated Release Notes

Closes #3068 add doc strings for numpy imports by @ajpotts in #3077
Add a random sampling with support for a weights array by @jeremiah-corrado in #3110
Closes #3112: Fix Parquet string column free by @bmcdonald3 in #3113
Closes #3115: Fix non-deterministic sparse_sum failure by @stress-tess in #3117
Closes #3086: Add choice to random number generators by @stress-tess in #3114
Closes #3118: Move choice implementation into arkouda by @stress-tess in #3138
Closes #2947 change the name of the class dataframe.GroupBy by @ajpotts in #3146
Avoid a warning about mismatched parSafe settings for list initialization by @lydia-duncan in #3149
Closes #3116 remove DataFrame._columns by @ajpotts in #3147
Closes #3124-dataframe.pyi-file and Closes #3097 numpy import docs at module level by @ajpotts in #3141
Closes #3135 Update scipy/special_test by @ajpotts in #3137
3050 groupby etc by @drculhane in #3111
multidimensional array bug fixes by @jeremiah-corrado in #3142
Closes #3123-make-proto-tests by @ajpotts in #3126
Closes #2893 dataframe.GroupBy.count to align with pandas by @ajpotts in #3125
Closes #3051 Update akscipy_test by @ajpotts in #3136
Fixes #3158: Dataframe containing a Segarray .__str__() bug by @stress-tess in #3161
Closes #3089: Avoid OOM Crashes caused due to in intents on makeDistArray by @ShreyasKhandekar in #3163
Resolve deprecation warning about not using 'new' in dmapped expressions by @jeremiah-corrado in #3162
Closes #3079 and #3080: Sum and Plus Equal of Boolean Arrays by @jaketrookman in #3154
Closes #3108: Add groupby.sample and dataframe.groupby.sample by @stress-tess in #3157
Closes #3174: loosens type return restrictions of sum by @stress-tess in #3175
Fixes #3001: nan broadcast bug by @stress-tess in #3173
Dataframe Indexing by @brandon-neth in #3109
Closes 3190 add mypy.ini by @ajpotts in #3191
Closes #3192 PROTO_tests/tests/dtypes_test.py is failing by @ajpotts in #3193
Fixes #3156:Categorical.sort_values bug by @stress-tess in #3168
Closes #3148: Update IO functions to always return a dictionary by @stress-tess in #3164
Re # 3128 fixes errors and omissions in PROTO-tests version of datafr… by @drculhane in #3139
3130 numeric test slight revamp by @drculhane in #3151
1D implementations of median and count_nonzero by @drculhane in #3187
Closes #3196 PROTO_tests/tests/symbol_table.py failing by @ajpotts in #3197
Closes #3198 PROTO_tests/tests/io_test.py failing by @ajpotts in #3199
Closes #3200 PROTO_tests/tests/dataframe_test.py failing by @ajpotts in #3201
Closes #3204 is_numeric to handle Index and Series type by @ajpotts in #3205
Closes #3206 MultiIndex.levels by @ajpotts in #3207
Array-API slice Assignment by @jeremiah-corrado in #3166
Implement missing stats, search and sort functions for Array API by @jeremiah-corrado in #3160
Closes #3202 Index.inferred_type by @ajpotts in #3203
Closes #3208-Index.equals by @ajpotts in #3209
Closes #3194 add proto tests to CI by @ajpotts in #3195
Add benchmark for for CSV Read and write perf by @ShreyasKhandekar in #3189
Fixes #3185: strings broadcast bug by @stress-tess in #3210
Closes #3167: Add normal to random number generators by @stress-tess in #3180
Add NYC taxi tutorial from CUG 2024 by @bmcdonald3 in #3211
Fix jupyter notebook formatting by @bmcdonald3 in #3221
Closes #3009: indexof1d to handle null values by @stress-tess in #3169
Compatibility improvements for Xarray chunk-manager by @jeremiah-corrado in #3223
Closes #3215: Index.__get__item can accept a list by @ajpotts in #3216
Closes #3217: MultiIndex.get_level_values by @ajpotts in #3218
Move some definitions from ArrowFunctions header to source by @e-kayrakli in #3236
Reduce file size for csvIO benchmark by @ShreyasKhandekar in #3239
Part of #3229: CI failures due to indexof1d by @stress-tess in #3232
Fixes #3074: Bug reading segarrays from parquet files by @stress-tess in #3233
Closes #3227 add pandas stubs library by @ajpotts in #3228
Closes #3213 Index properties by @ajpotts in #3214
Add implementations of clip, diff,pad to Array API module by @jeremiah-corrado in #3237
Closes #3188 multi index.equals by @ajpotts in #3225
Fixes #3222: series of segarray print bug by @stress-tess in #3240
Fixing a missing iloc usage by @brandon-neth in #3243
Closes #3249: Fix issue with finding incorrect conftest file for proto tests by @bmcdonald3 in #3250
Fixes #3234: segarray with empty segments and nans parquet bug by @stress-tess in #3241
Array API Documentation by @jeremiah-corrado in #3242
Fixes #3252: proto test_segarray_read failure with multi-locale by @stress-tess in #3254
Closes #3255 move numeric.floor to numpy module by @ajpotts in #3257
Remove single-column cases from multi-col-merge test. by @brandon-neth in #3248
Benchmark Display P...

Contributors

lydia-duncan, e-kayrakli, and 8 other contributors


19 Apr 20:37

stress-tess

v2024.04.19

8ac2645

Release Notes v2024.04.19

Bug Fixes

PR #3091 - Fixes Parquet double reads to properly account for null values
Issue #3087 - Fixes bug when reading non-float parquet columns with null values
Issue #3088 and PR #3090 - Fix an off-by-one bug in sparse_sum_helper

Major Updates

Issue #3083 - Optimizes Parquet Strings read
Issues #3033, #3054 - Optimize CSV write
Issues #3020, #3040 - Adds nan functions to DataFrame and Series
- isna, notna, dropna, ...
Issues #3071, #3084 - Add permutation and shuffle to random number generators
Issue #3030 - Creates numpy subdirectory as part of the alignment effort
PRs #3056, #3093, #3070, #3072 - Improves and adds Array API functionality including manipulation and set functions
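
The nan functions added to DataFrame and Series above (isna, notna, dropna) share their names with pandas methods. The sketch below shows the pandas versions on a small Series as a reference. It uses pandas only, and it is an assumption that the Arkouda methods return equivalent results.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

missing = s.isna().tolist()    # True where the value is missing
present = s.notna().tolist()   # elementwise negation of isna
kept = s.dropna().tolist()     # values with the missing entries removed

print(missing, present, kept)
```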

Minor Updates

PR #3076 - Adds support for large string Parquet type
Issue #3092 - Adds support for TLS token authentication
Issue #3045 - Adds map method to Index
Issue #3065 - Adds count to DataFrame
Issue #2913 - Adds isdecimal to Strings
Issue #3002 - Adds clip to pdarray
Issue #3062 - Enhances arkouda metrics capability

Auto-Generated Release Notes

Closes #3030 numpy alignment directory structure by @ajpotts in #3038
Closes #3040 isna and notna for series by @ajpotts in #3048
Closes #3033: Optimize CSV write by @stress-tess in #3053
Closes #3063: Fix deprecation warnings by @stress-tess in #3064
Closes #2913: add isdecimal by @jaketrookman in #3015
Closes #3045 index.map by @ajpotts in #3057
Array API Set functions by @jeremiah-corrado in #3070
Fix --print-used-modules for functions registered with @arkouda.registerND by @jeremiah-corrado in #3072
Closes #3054: Dynamically switch to batching for larger csv writes by @stress-tess in #3061
Closes #3002: Add ak.clip functionality by @drculhane in #3043
enhance arkouda metrics capability by @hokiegeek2 in #3067
Partially addresses issue #3050, updating PROTO tests. by @drculhane in #3075
Add support for large string Parquet type by @bmcdonald3 in #3076
Closes #3071: Add permutation to our generators by @stress-tess in #3078
Closes #3065 DataFrame.count by @ajpotts in #3081
Part of #3088: Generate seed for sparse sum test by @stress-tess in #3090
Fix Parquet double reads to properly account for null values by @bmcdonald3 in #3091
Array API Manipulation Function Improvements by @jeremiah-corrado in #3056
Closes #3084: Add shuffle to random number generators by @stress-tess in #3085
Pdarray indexing by @jeremiah-corrado in #3093
Fixes #3087: Failure reading non-float parquet columns with null values by @stress-tess in #3094
Closes #3083: Optimize Parquet string read code by @bmcdonald3 in #3082
add support for TLS token authentication by @hokiegeek2 in #3096
Fixes #3088: sparse sum nightly failures by @stress-tess in #3098
Closes #3020 dataframe.dropna by @ajpotts in #3101

New Contributors

@drculhane made their first contribution in #3043

Full Changelog: v2024.03.18...v2024.04.19

Contributors

hokiegeek2, ajpotts, and 5 other contributors


18 Mar 22:51

stress-tess

v2024.03.18

e07f70e

Release Notes v2024.03.18

Bug Fixes

Issue #3035 - Fixes inconsistent results when broadcasting with empty segments
Issue #2939 - Fixes TypeError in DataFrame.reset_index
Issue #2966 - Fixes error when pip installing from a tar file
Issue #2897 - Fixes bug where DataFrame.corr returns DataFrame without index
PR #3021 - Adds SegArray optimization and benchmark bug fix

Major Updates

Issue #2958 - Renames akstats to akscipy
Issue #2942 - Removes DataFrame.sorted
Issue #3024 and PR #2976 - Add sparse sum helper to util with merge based and sort based workflows
Issues #2993, #3008, #3017 - Add a random subfolder and stateful Generator objects
Issue #2974 - Adds Series.map
Issue #3019 - Adds outer join option to DataFrame merge
PRs #2936, #2967, #3014, #3027 - Improve Array API functionality, specifically adding stats and manipulation functions
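
For readers unfamiliar with the outer join option added to DataFrame merge (Issue #3019), the sketch below shows what an outer join produces in pandas: every key from either side is kept, and columns from the side with no match are filled with missing values. It uses pandas only. The frames and column names are hypothetical, and it is an assumption that Arkouda's merge follows the same semantics.

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2], "x": ["a", "b"]})
right = pd.DataFrame({"id": [2, 3], "y": ["c", "d"]})

# how="outer" keeps the union of keys from both frames.
merged = left.merge(right, on="id", how="outer")

ids = merged["id"].tolist()
x_missing = merged["x"].isna().tolist()  # id 3 has no match on the left
y_missing = merged["y"].isna().tolist()  # id 1 has no match on the right

print(merged)
```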

Minor Updates

Issue #2929 - Updates DataFrame.size to match pandas
Issues #2906, #2945 - Add shift operators between two bool pdarrays and between combinations of bool and int64 pdarrays
Issues #2916, #2919 - Add isspace and capitalize to Strings
Issue #3023 - Adds to_markdown to DataFrame and Series
Issue #2957 - Adds dot function
Issue #2960 - Adds memory_usage functions
Issue #2924 - Updates DataFrame documentation
Issue #2896 - Updates DataFrame columns to return an Index
Issue #2952 - Makes Chapel 1.33 release default for CI testing
Issue #2985 - Updates libzmq version in Makefile
Issue #2981 - Adds LICENSES folder including the licenses for numpy, pandas, and scipy
Issues #2969, #2971, #2977, #2989 - Update failing proto_tests

Auto-Generated Release Notes

Add sort compat modules for new sorting algorithm by @bmcdonald3 in #2941
Closes #2906 shift operator for boolean vectors by @ajpotts in #2944
Closes #2916 add isspace for pdarrays by @ajpotts in #2946
Closes #2949: Add compat modules for 1.34 by @bmcdonald3 in #2950
Closes #2952: Make Chapel 1.33 release default for CI testing by @bmcdonald3 in #2951
Remove deprecation warnings about domain(?) vs. domain by @jeremiah-corrado in #2953
Closes #2919 add capitalize to pdarrays by @ajpotts in #2948
Closes #2945: add Shift Operators for Boolean and Int64 by @jaketrookman in #2954
Closes #2924 update pydoc strings for arkouda dataframe by @ajpotts in #2943
Closes #2942 bug in sorted by @ajpotts in #2955
Closes #2958 rename akstats to akscipy by @ajpotts in #2959
Closes #2929-dataframe-size-to-match-pandas by @ajpotts in #2961
Closes #2963-PROTO_tests-tests-akscipy-unit-tests-failing by @ajpotts in #2964
Fixes #2966: pip install from tar error by @stress-tess in #2968
Array API manipulation functions by @jeremiah-corrado in #2936
Add sparse sum helper to util by @stress-tess in #2976
Closes #2969 PROTO_tests/tests/client_test.py unit tests failing by @ajpotts in #2970
Closes #2971 PROTO_tests/tests/dtypes_test.py unit tests failing by @ajpotts in #2972
Closes #2977 PROTO_tests/tests/setops_test.py unit tests failing by @ajpotts in #2979
Address Random.choice deprecation by @jeremiah-corrado in #2983
Closes #2985: Update libzmq version in Makefile by @stress-tess in #2986
Resolve formatting issue in NumpyDType server docs by @jeremiah-corrado in #2992
Closes #2981 add licenses by @ajpotts in #2988
Closes #2989 PROTO_tests/tests/pdarray_creation_test.py has failing test by @ajpotts in #2990
Remove deprecation messages for reader/writer locking default change by @jeremiah-corrado in #2987
Closes #2994: Remove upper bound on pandas version by @stress-tess in #2995
Closes #2896 DataFrame columns should return an Index by @ajpotts in #2962
Closes #2993: Create random subfolder and foundation for generator by @stress-tess in #2997
Closes #2957: Dot Function by @jaketrookman in #2996
Closes #2934 DataFrame.unregister_dataframe_by_name string return typ… by @ajpotts in #3011
Closes #2897 DataFrame.corr returns dataframe without index by @ajpotts in #3012
Closes #2939 DataFrame.reset_index by @ajpotts in #3013
Array API stats functions by @jeremiah-corrado in #2967
Support for creating Array API objects from numpy arrays by @jeremiah-corrado in #3014
Closes #2960 int version of memory_usage by @ajpotts in #3018
SegArray optimization & bug fix by @brandon-neth in #3021
Closes #3019 Add outer join option for dataframe merge by @ajpotts in #3022
Closes #3031: Update Arkouda for upcoming Chapel 2.0 release by @bmcdonald3 in #3032
Closes #3008: Add generator sym entry and stateful uniform distribution by @stress-tess in #3016
Closes #2974 Series.map by @ajpotts in #3010
Closes #3024: Add merge based workflow and update sort workflow for sparse sum helper by @stress-tess in #3025
Remove I/O-locking and randomStream.skipTo deprecation messages by @jeremiah-corrado in #3037
Closes #3023 to_markdown by @ajpotts in #3026
Array API improvements by @jeremiah-corrado in #3027
Fixes #3035 - Inconsistent results when broadcasting with empty segments by @stress-tess in #3039
Closes #3017: Add documentation for our random number generation by @stress-tess in #3044

Full Changelog: v2024.02.02...v2024.03.18

Contributors

ajpotts, bmcdonald3, and 4 other contributors


Releases: Bears-R-Us/arkouda
