A NumPy-inspired array library for Go, providing efficient columnar data structures with zero-copy views.
hybridarray combines three key data structures:
- Map: O(1) column name lookups (like NumPy's structured array field access)
- Linked list: Preserves insertion order for column iteration
- Columnar arrays: Cache-friendly data access patterns (like NumPy's contiguous storage)
This hybrid approach enables NumPy-like operations with Go's type safety and performance.
- ✅ Zero-copy views: Slicing and column selection without data duplication
- ✅ Type-aware columns: Runtime type information with DType
- ✅ Ordered iteration: Preserves column insertion order
- ✅ View composition: Views of views work correctly
- ✅ Minimal API: Small, focused surface area for research workflows
go get github.com/alexshd/hybridarray
package main
import (
"fmt"
"github.com/alexshd/hybridarray"
)
func main() {
// Create from map
arr, _ := hybridarray.FromMap(map[string][]any{
"x": {1.0, 2.0, 3.0, 4.0, 5.0},
"y": {10.0, 20.0, 30.0, 40.0, 50.0},
})
fmt.Println(arr.Shape()) // 5 2 (rows, columns)
// Zero-copy slice (rows 1-3)
view := arr.Slice(1, 4)
// Select columns
xy := arr.Select("x", "y")
// Access values
val, _ := arr.At(2, "x") // 3.0
// Iterate columns
for col := range arr.Columns() {
fmt.Printf("%s: %v\n", col.Name, col.Data)
}
}
// New array with specified rows
arr := hybridarray.New(100)
// From map of columns
arr, err := hybridarray.FromMap(map[string][]any{
"temperature": {20.5, 21.0, 19.8},
"humidity": {65.0, 68.0, 62.0},
})
data := []any{1.0, 2.0, 3.0}
err := arr.AddColumn("sensor", data, hybridarray.DTypeFloat64)
// Single value
val, err := arr.At(row, "column")
// Full row as map
row, err := arr.Row(5)
// Column lookup
col := arr.GetColumn("temperature")
// Shape
nrows, ncols := arr.Shape()
// Column names
names := arr.ColumnNames()
// Row slicing [start:end)
view, err := arr.Slice(10, 100)
// Column selection
view, err := arr.Select("x", "y", "z")
// Combine operations
filtered := arr.Slice(50, 150).Select("sensor", "value")
// Range over columns (Go 1.23+ iter.Seq)
for col := range arr.Columns() {
fmt.Printf("%s (%s): %d values\n",
col.Name, col.DType, len(col.Data))
}
const (
DTypeFloat64 // float64, float32
DTypeInt64 // int, int64, int32, int16, int8
DTypeString // string
DTypeBool // bool
DTypeAny // any (type-erased)
)
Types are inferred automatically in FromMap or can be specified in AddColumn.
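As an illustration of how automatic inference could work, the sketch below classifies each value with a type switch and falls back to DTypeAny for empty or mixed columns. The constant values, the `classify` and `inferDType` helpers, and the fallback policy are all assumptions made for this example; the library's actual rules may differ.

```go
package main

import "fmt"

// DType mirrors the names listed above; the underlying int type and
// iota values are assumptions made for this sketch.
type DType int

const (
	DTypeFloat64 DType = iota
	DTypeInt64
	DTypeString
	DTypeBool
	DTypeAny
)

// classify maps one Go value to a DType using the groupings listed above.
func classify(v any) DType {
	switch v.(type) {
	case float64, float32:
		return DTypeFloat64
	case int, int64, int32, int16, int8:
		return DTypeInt64
	case string:
		return DTypeString
	case bool:
		return DTypeBool
	default:
		return DTypeAny
	}
}

// inferDType returns the common DType of a column, falling back to
// DTypeAny for empty or mixed columns (an assumed policy).
func inferDType(data []any) DType {
	if len(data) == 0 {
		return DTypeAny
	}
	first := classify(data[0])
	for _, v := range data[1:] {
		if classify(v) != first {
			return DTypeAny
		}
	}
	return first
}

func main() {
	fmt.Println(inferDType([]any{20.5, 21.0, 19.8}) == DTypeFloat64) // true
	fmt.Println(inferDType([]any{1, "two"}) == DTypeAny)             // true
}
```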
Views created with Slice() and Select() share underlying data:
arr, _ := hybridarray.FromMap(map[string][]any{
"x": {1.0, 2.0, 3.0, 4.0, 5.0},
})
view := arr.Slice(1, 4) // Rows 1-3
// Modifying original data affects view
arr.GetColumn("x").Data[2] = 99.0
val, _ := view.At(1, "x") // 99.0 (sees the change)
This enables efficient data pipelines without copying large arrays.
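One way to picture this sharing is a view that records only a row range and a column list over the same backing slices. The sketch below is a standalone illustration under that assumption; the `view` type and its `slice`, `sel`, and `at` methods are hypothetical and do not reproduce hybridarray's implementation.

```go
package main

import (
	"fmt"
	"slices"
)

// view is a hypothetical window over shared column data: it stores
// row bounds and column names, never a copy of the values.
type view struct {
	cols       map[string][]any // shared backing slices
	names      []string         // columns visible through this view
	start, end int              // half-open row range [start, end)
}

// slice narrows the row range relative to this view. Bounds are not
// validated here to keep the sketch short.
func (v view) slice(start, end int) view {
	return view{cols: v.cols, names: v.names, start: v.start + start, end: v.start + end}
}

// sel keeps only the named columns; the data slices are still shared.
func (v view) sel(names ...string) view {
	return view{cols: v.cols, names: names, start: v.start, end: v.end}
}

// at translates a view-relative row into an index into the shared slice.
func (v view) at(row int, name string) (any, bool) {
	if !slices.Contains(v.names, name) || row < 0 || v.start+row >= v.end {
		return nil, false
	}
	return v.cols[name][v.start+row], true
}

func main() {
	base := view{
		cols: map[string][]any{
			"x": {1.0, 2.0, 3.0, 4.0, 5.0},
			"y": {10.0, 20.0, 30.0, 40.0, 50.0},
		},
		names: []string{"x", "y"},
		end:   5,
	}
	v := base.slice(1, 4).sel("x") // rows 1-3, column x only
	w := v.slice(1, 3)             // view of a view: original rows 2-3

	base.cols["x"][2] = 99.0 // write through the original data

	a, _ := v.at(1, "x")
	b, _ := w.at(0, "x")
	_, ok := v.at(0, "y") // y was not selected
	fmt.Println(a, b, ok) // 99 99 false
}
```

Because each derived view adds its offset to the parent's, a view of a view resolves to the same underlying element without any intermediate copy.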
Benchmarks on M1 MacBook Pro (example):
BenchmarkFromMap-8 50000 25000 ns/op
BenchmarkSlice-8 5000000 250 ns/op (zero-copy)
BenchmarkSelect-8 1000000 1500 ns/op (zero-copy)
BenchmarkAt-8 20000000 65 ns/op
BenchmarkGetColumn-8 100000000 12 ns/op (map lookup)
Run benchmarks:
go test -bench=. -benchmem
# Unit tests
go test -v
# Fuzz tests (Go 1.18+)
go test -fuzz=FuzzFromMap -fuzztime=30s
go test -fuzz=FuzzSlice -fuzztime=30s
go test -fuzz=FuzzSelect -fuzztime=30s
# Race detection
go test -race
# Coverage
go test -cover
hybridarray is designed as a minimal reference implementation for scientific computing workflows. It prioritizes:
- Simplicity: Small API surface, easy to understand
- Zero-copy: Memory-efficient view semantics
- Type awareness: Runtime type info without generics overhead
- Research-friendly: Quick iteration on data transformations
It is not designed for:
- Production databases (no ACID guarantees)
- Distributed computing (single-machine only)
- Complex query optimization (no query planner)
Similar to NumPy's ndarray:
- Zero-copy slicing (like NumPy views)
- Typed columns (analogous to structured arrays with dtype)
- Efficient iteration
- Field-based access (like structured arrays: arr['field'])
Different from NumPy:
- Columnar storage instead of row-major (more like pandas DataFrame)
- Map-based field lookup (O(1) like NumPy's structured arrays)
- No multi-dimensional indexing yet (1D + columns only)
- No broadcasting or vectorized operations (yet)
Direct NumPy API Equivalents:
# NumPy structured arrays
arr = np.array([(1, 2.5), (2, 3.5)], dtype=[('x', 'i4'), ('y', 'f8')])
view = arr[10:20] # Zero-copy slice
x_col = arr['x'] # Field access
# hybridarray (Go)
arr, _ := hybridarray.FromMap(map[string][]any{"x": {1, 2}, "y": {2.5, 3.5}})
view := arr.Slice(10, 20) // Zero-copy slice
x_col := arr.GetColumn("x") // Field access
Uses latest Go features:
- iter.Seq for range-over-func column iteration (Go 1.23+)
- Improved generic type inference
- Enhanced fuzzing support
Potential NumPy-inspired additions (not implemented):
- Vectorized operations: Add(), Mul(), Apply() (like NumPy ufuncs)
- Aggregations: Sum(), Mean(), Std() (like NumPy reductions)
- Boolean indexing: Where(predicate) (like NumPy fancy indexing)
- Sorting: Sort(), Argsort() (like NumPy sorting)
- Set operations: Unique(), Intersect() (like NumPy set routines)
- Broadcasting: Automatic shape alignment (NumPy's killer feature)
- Multi-dimensional: True ndarray with arbitrary dimensions
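Until aggregations exist, a reduction can be written by hand over a column's values. The sketch below shows one possible shape for Sum and Mean on a float64 column stored as []any; the `sum` and `mean` helpers are hypothetical and are not part of hybridarray.

```go
package main

import (
	"errors"
	"fmt"
)

// sum adds up a column of float64 values stored as []any.
// Hypothetical helper: hybridarray does not provide it.
func sum(data []any) (float64, error) {
	total := 0.0
	for i, v := range data {
		f, ok := v.(float64)
		if !ok {
			return 0, fmt.Errorf("row %d: expected float64, got %T", i, v)
		}
		total += f
	}
	return total, nil
}

// mean divides the sum by the row count and rejects empty columns.
func mean(data []any) (float64, error) {
	if len(data) == 0 {
		return 0, errors.New("mean of empty column")
	}
	s, err := sum(data)
	if err != nil {
		return 0, err
	}
	return s / float64(len(data)), nil
}

func main() {
	m, err := mean([]any{1.0, 2.0, 3.0, 4.0})
	fmt.Println(m, err) // 2.5 <nil>
}
```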
This is a minimal reference implementation inspired by NumPy. For production scientific computing in Go, consider:
- Apache Arrow Go
- Gota DataFrames
- Custom columnar stores
Apache License 2.0 - See LICENSE file for details.
Copyright 2025 Alex Shadrin
Primary inspiration: NumPy's ndarray architecture
Additional influences:
- NumPy structured arrays (field access, zero-copy views, dtype system)
- pandas DataFrame API (columnar storage, named columns)
- Apache Arrow columnar format (memory layout)
This is a learning/research implementation to understand NumPy's design principles in Go.