hybridarray

A NumPy-inspired array library for Go, providing efficient columnar data structures with zero-copy views.

Overview

hybridarray combines three key data structures:

Map: O(1) column name lookups (like NumPy's structured array field access)
Linked list: Preserves insertion order for column iteration
Columnar arrays: Cache-friendly data access patterns (like NumPy's contiguous storage)

This hybrid approach enables NumPy-like operations with Go's type safety and performance.
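
As a rough illustration of how the map and the linked list can work together, the sketch below pairs a Go map with container/list from the standard library. The column and hybrid types and their methods are hypothetical names chosen for this example; they are not taken from the library's source.

```go
package main

import (
	"container/list"
	"fmt"
)

// column is a hypothetical stand-in for a named column of values.
type column struct {
	Name string
	Data []any
}

// hybrid pairs a map (lookup by name) with a linked list (insertion order).
// Each list element holds a *column, and the map points at the list element.
type hybrid struct {
	index map[string]*list.Element
	order *list.List
}

func newHybrid() *hybrid {
	return &hybrid{index: make(map[string]*list.Element), order: list.New()}
}

// add appends a column and reports false if the name is already taken.
func (h *hybrid) add(name string, data []any) bool {
	if _, exists := h.index[name]; exists {
		return false
	}
	h.index[name] = h.order.PushBack(&column{Name: name, Data: data})
	return true
}

// get finds a column by name through the map.
func (h *hybrid) get(name string) (*column, bool) {
	el, ok := h.index[name]
	if !ok {
		return nil, false
	}
	return el.Value.(*column), true
}

// names walks the linked list, so the result follows insertion order.
func (h *hybrid) names() []string {
	out := make([]string, 0, h.order.Len())
	for el := h.order.Front(); el != nil; el = el.Next() {
		out = append(out, el.Value.(*column).Name)
	}
	return out
}

func main() {
	h := newHybrid()
	h.add("y", []any{10.0, 20.0})
	h.add("x", []any{1.0, 2.0})
	fmt.Println(h.names()) // [y x]
	c, _ := h.get("x")
	fmt.Println(c.Data) // [1 2]
}
```

A plain Go map alone would not do here, because map iteration order is unspecified; the list is what keeps column order stable.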

Features

✅ Zero-copy views: Slicing and column selection without data duplication
✅ Type-aware columns: Runtime type information with DType
✅ Ordered iteration: Preserves column insertion order
✅ View composition: Views of views work correctly
✅ Minimal API: Small, focused surface area for research workflows

Installation

go get github.com/alexshd/hybridarray

Quick Start

package main

import (
    "fmt"
    "log"

    "github.com/alexshd/hybridarray"
)

func main() {
    // Create from map
    arr, err := hybridarray.FromMap(map[string][]any{
        "x": {1.0, 2.0, 3.0, 4.0, 5.0},
        "y": {10.0, 20.0, 30.0, 40.0, 50.0},
    })
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(arr.Shape()) // 5 2

    // Zero-copy slice (rows 1-3)
    view, _ := arr.Slice(1, 4)
    first, _ := view.At(0, "x")
    fmt.Println(first) // 2 (row 1 of arr)

    // Select columns
    xy, _ := arr.Select("x", "y")
    y0, _ := xy.At(0, "y")
    fmt.Println(y0) // 10

    // Access values
    val, _ := arr.At(2, "x")
    fmt.Println(val) // 3

    // Iterate columns
    for col := range arr.Columns() {
        fmt.Printf("%s: %v\n", col.Name, col.Data)
    }
}

API Reference

Creating Arrays

// New array with specified rows
arr := hybridarray.New(100)

// From map of columns
arr, err := hybridarray.FromMap(map[string][]any{
    "temperature": {20.5, 21.0, 19.8},
    "humidity":    {65.0, 68.0, 62.0},
})

Adding Columns

data := []any{1.0, 2.0, 3.0}
err := arr.AddColumn("sensor", data, hybridarray.DTypeFloat64)

Accessing Data

// Single value
val, err := arr.At(row, "column")

// Full row as map
row, err := arr.Row(5)

// Column lookup
col := arr.GetColumn("temperature")

// Shape
nrows, ncols := arr.Shape()

// Column names
names := arr.ColumnNames()

Zero-Copy Operations

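// Row slicing [start:end)
rows, err := arr.Slice(10, 100)

// Column selection
cols, err := arr.Select("x", "y", "z")

// Combine operations (each call returns an error, so chain in two steps)
window, err := arr.Slice(50, 150)
if err != nil {
    // handle the out-of-range slice
}
filtered, err := window.Select("sensor", "value")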
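
To make the composition of row slices and column selections concrete, here is a minimal sketch of one way a view could track its window. The view type, its fields, and the slice, sel, and at methods are hypothetical and only illustrate the bookkeeping; the library's real internals may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// view is a hypothetical window onto column storage shared with its parent.
type view struct {
	cols       map[string][]any // shared backing data, never copied
	names      []string         // visible columns
	start, end int              // visible rows [start, end) in parent coordinates
}

// visible reports whether name is one of the view's columns.
func (v view) visible(name string) bool {
	for _, n := range v.names {
		if n == name {
			return true
		}
	}
	return false
}

// slice narrows the row window; start and end are relative to this view.
func (v view) slice(start, end int) (view, error) {
	if start < 0 || end < start || end > v.end-v.start {
		return view{}, errors.New("slice bounds out of range")
	}
	v.start, v.end = v.start+start, v.start+end
	return v, nil
}

// sel narrows the visible columns; the data itself is untouched.
func (v view) sel(names ...string) (view, error) {
	for _, n := range names {
		if !v.visible(n) {
			return view{}, fmt.Errorf("unknown column %q", n)
		}
	}
	v.names = names
	return v, nil
}

// at reads one value, translating the view-relative row to a parent row.
func (v view) at(row int, name string) (any, error) {
	if row < 0 || row >= v.end-v.start || !v.visible(name) {
		return nil, errors.New("index out of range")
	}
	return v.cols[name][v.start+row], nil
}

func main() {
	base := view{
		cols: map[string][]any{
			"sensor": {"a", "b", "c", "d", "e"},
			"value":  {1.0, 2.0, 3.0, 4.0, 5.0},
		},
		names: []string{"sensor", "value"},
		end:   5,
	}
	rows, err := base.slice(1, 4) // rows 1-3 of base
	if err != nil {
		panic(err)
	}
	filtered, err := rows.sel("value")
	if err != nil {
		panic(err)
	}
	val, _ := filtered.at(2, "value")
	fmt.Println(val) // 4 (row 3 of base)
}
```

Because each view only stores offsets and a column list, a view of a view costs a few integers and a slice header regardless of how many rows it covers.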

Iteration

// Range over columns (Go 1.23+ iter.Seq)
for col := range arr.Columns() {
    fmt.Printf("%s (%s): %d values\n",
        col.Name, col.DType, len(col.Data))
}

Data Types

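// Declaration sketch: the underlying type and iota ordering are illustrative.
type DType int

const (
    DTypeFloat64 DType = iota // float64, float32
    DTypeInt64                // int, int64, int32, int16, int8
    DTypeString               // string
    DTypeBool                 // bool
    DTypeAny                  // any (type-erased)
)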

Types are inferred automatically in FromMap or can be specified in AddColumn.
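
One plausible way to infer a column's type from a []any is a type switch over its elements, following the mapping in the comments above. In this sketch the inferDType and classify functions are hypothetical, the numeric values of the constants are arbitrary, and the rule that mixed or empty columns fall back to DTypeAny is an assumption rather than documented behavior.

```go
package main

import "fmt"

// DType mirrors the constants listed above; the numeric values are arbitrary.
type DType int

const (
	DTypeFloat64 DType = iota
	DTypeInt64
	DTypeString
	DTypeBool
	DTypeAny
)

// String makes DType print readably with %s and %v.
func (d DType) String() string {
	names := [...]string{"float64", "int64", "string", "bool", "any"}
	if d < 0 || int(d) >= len(names) {
		return "unknown"
	}
	return names[d]
}

// classify maps a single Go value to a DType.
func classify(v any) DType {
	switch v.(type) {
	case float64, float32:
		return DTypeFloat64
	case int, int64, int32, int16, int8:
		return DTypeInt64
	case string:
		return DTypeString
	case bool:
		return DTypeBool
	default:
		return DTypeAny
	}
}

// inferDType returns the shared DType of all values.
// Assumption: empty or mixed columns are reported as DTypeAny.
func inferDType(data []any) DType {
	if len(data) == 0 {
		return DTypeAny
	}
	first := classify(data[0])
	for _, v := range data[1:] {
		if classify(v) != first {
			return DTypeAny
		}
	}
	return first
}

func main() {
	fmt.Println(inferDType([]any{20.5, 21.0, 19.8})) // float64
	fmt.Println(inferDType([]any{1, 2, 3}))          // int64
	fmt.Println(inferDType([]any{"a", 1}))           // any
}
```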

Zero-Copy Semantics

Views created with Slice() and Select() share underlying data:

arr, _ := hybridarray.FromMap(map[string][]any{
    "x": {1.0, 2.0, 3.0, 4.0, 5.0},
})

view, _ := arr.Slice(1, 4) // Rows 1-3

// Modifying original data affects view
arr.GetColumn("x").Data[2] = 99.0

val, _ := view.At(1, "x") // 99.0 (sees the change)

This enables efficient data pipelines without copying large arrays.
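
The same sharing behavior can be seen with plain Go slices, which is presumably what makes zero-copy views cheap in Go. The sketch below uses only the standard language; window and snapshot are hypothetical helper names, not library functions.

```go
package main

import "fmt"

// window returns rows [start, end) of data without copying.
// The third index caps capacity, so an append on the result
// reallocates instead of overwriting the parent's later rows.
func window(data []any, start, end int) []any {
	return data[start:end:end]
}

// snapshot returns an independent copy of the same rows.
func snapshot(data []any, start, end int) []any {
	return append([]any(nil), data[start:end]...)
}

func main() {
	x := []any{1.0, 2.0, 3.0, 4.0, 5.0}
	v := window(x, 1, 4)   // shares x's backing array
	s := snapshot(x, 1, 4) // owns its own storage

	x[2] = 99.0

	fmt.Println(v[1]) // 99 (the window sees the change)
	fmt.Println(s[1]) // 3 (the copy does not)
}
```

The flip side of sharing is aliasing: a write through any view is visible to every other view of the same rows, so code that needs isolation should take an explicit copy.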

Performance

Benchmarks on M1 MacBook Pro (example):

BenchmarkFromMap-8              50000    25000 ns/op
BenchmarkSlice-8              5000000      250 ns/op   (zero-copy)
BenchmarkSelect-8             1000000     1500 ns/op   (zero-copy)
BenchmarkAt-8                20000000       65 ns/op
BenchmarkGetColumn-8        100000000       12 ns/op   (map lookup)

Run benchmarks:

go test -bench=. -benchmem

Testing

# Unit tests
go test -v

# Fuzz tests (Go 1.18+)
go test -fuzz=FuzzFromMap -fuzztime=30s
go test -fuzz=FuzzSlice -fuzztime=30s
go test -fuzz=FuzzSelect -fuzztime=30s

# Race detection
go test -race

# Coverage
go test -cover

Design Philosophy

hybridarray is designed as a minimal reference implementation for scientific computing workflows. It prioritizes:

Simplicity: Small API surface, easy to understand
Zero-copy: Memory-efficient view semantics
Type awareness: Runtime type info without generics overhead
Research-friendly: Quick iteration on data transformations

It is not designed for:

Production databases (no ACID guarantees)
Distributed computing (single-machine only)
Complex query optimization (no query planner)

NumPy Comparison

Similar to NumPy's ndarray:

Zero-copy slicing (like NumPy views)
Typed columns (analogous to structured arrays with dtype)
Efficient iteration
Field-based access (like structured arrays: arr['field'])

Different from NumPy:

Columnar storage instead of row-major (more like pandas DataFrame)
Field lookup goes through a Go map (O(1), comparable to NumPy's structured arrays, which resolve field names through the dtype)
No multi-dimensional indexing yet (1D + columns only)
No broadcasting or vectorized operations (yet)

Direct NumPy API Equivalents:

# NumPy structured arrays
arr = np.array([(1, 2.5), (2, 3.5)], dtype=[('x', 'i4'), ('y', 'f8')])
view = arr[0:1]    # Zero-copy slice
x_col = arr['x']   # Field access

// hybridarray (Go)
arr, _ := hybridarray.FromMap(map[string][]any{"x": {1, 2}, "y": {2.5, 3.5}})
view, _ := arr.Slice(0, 1)  // Zero-copy slice
xCol := arr.GetColumn("x")  // Field access

Go 1.25.3 Features

Uses latest Go features:

iter.Seq for range-over-func column iteration (Go 1.23+)
Improved generic type inference
Native fuzzing with go test -fuzz (Go 1.18+)

Future Enhancements

Potential NumPy-inspired additions (not implemented):

Vectorized operations: Add(), Mul(), Apply() (like NumPy ufuncs)
Aggregations: Sum(), Mean(), Std() (like NumPy reductions)
Boolean indexing: Where(predicate) (like NumPy boolean mask indexing)
Sorting: Sort(), Argsort() (like NumPy sorting)
Set operations: Unique(), Intersect() (like NumPy set routines)
Broadcasting: Automatic shape alignment (NumPy's killer feature)
Multi-dimensional: True ndarray with arbitrary dimensions
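
None of these exist in the library yet. As a sketch of how the aggregation item could be prototyped on top of []any column data, the following uses hypothetical toFloat, sum, and mean helpers; the names, signatures, and error behavior are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// toFloat converts the numeric types listed under Data Types to float64.
func toFloat(v any) (float64, bool) {
	switch n := v.(type) {
	case float64:
		return n, true
	case float32:
		return float64(n), true
	case int:
		return float64(n), true
	case int64:
		return float64(n), true
	case int32:
		return float64(n), true
	case int16:
		return float64(n), true
	case int8:
		return float64(n), true
	default:
		return 0, false
	}
}

// sum adds up a column and fails on the first non-numeric value.
func sum(data []any) (float64, error) {
	total := 0.0
	for i, v := range data {
		f, ok := toFloat(v)
		if !ok {
			return 0, fmt.Errorf("row %d: non-numeric value %v", i, v)
		}
		total += f
	}
	return total, nil
}

// mean divides the sum by the row count; an empty column is an error.
func mean(data []any) (float64, error) {
	if len(data) == 0 {
		return 0, errors.New("mean of empty column")
	}
	total, err := sum(data)
	if err != nil {
		return 0, err
	}
	return total / float64(len(data)), nil
}

func main() {
	avg, err := mean([]any{20.5, 21.0, 19.8})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.2f\n", avg) // 20.43
}
```

The per-element type switch is the cost of storing values as any; a typed backing slice per DType would avoid it and is closer to how NumPy reductions work.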

Contributing

This is a minimal reference implementation inspired by NumPy. For production scientific computing in Go, consider:

Apache Arrow Go
Gota DataFrames
Custom columnar stores

License

Apache License 2.0 - See LICENSE file for details.

Credits

Primary inspiration: NumPy's ndarray architecture

Additional influences:

NumPy structured arrays (field access, zero-copy views, dtype system)
pandas DataFrame API (columnar storage, named columns)
Apache Arrow columnar format (memory layout)

This is a learning/research implementation to understand NumPy's design principles in Go.
