29 commits
1645e11
feat(unixfs): add SizeEstimationMode for HAMT threshold decisions
lidel Jan 16, 2026
5c4e853
feat(unixfs): add UnixFSProfile for IPIP-499 CID determinism
lidel Jan 16, 2026
d28c8e5
feat(files): add DereferenceSymlinks option for IPIP-499
lidel Jan 17, 2026
4ff72d0
chore(unixfs): remove unnecessary uint64 conversions
lidel Jan 17, 2026
6707376
fix(unixfs): align HAMT sharding threshold with JS implementation
lidel Jan 19, 2026
ffe1a9c
chore: gofmt and changelog PR references
lidel Jan 19, 2026
ebdaf07
fix: nil filter check and thread-safety docs
lidel Jan 20, 2026
3895e9e
fix: correct go-ipfs-chunker URL in comment
lidel Jan 20, 2026
486f900
Add circular symlink test
gammazero Jan 21, 2026
5cf2219
Merge branch 'main' into feat/ipip-499-unixfs-2025
gammazero Jan 21, 2026
3e8339c
feat(unixfs): optimize SizeEstimationBlock and add mode/mtime tests
lidel Jan 27, 2026
d6ef697
refactor(unixfs): unify size tracking and make SizeEstimationMode imm…
lidel Jan 27, 2026
6141039
refactor(unixfs): use arithmetic for exact block size calculation
lidel Jan 27, 2026
56cf0ae
Merge remote-tracking branch 'origin/main' into feat/ipip-499-unixfs-…
lidel Jan 28, 2026
009ea0b
docs(unixfs): clarify protobuf tag encoding comments
lidel Jan 28, 2026
c95efb5
docs(unixfs): clarify varintLen and negative timestamp encoding
lidel Jan 30, 2026
c910c48
fix(mfs): produce raw leaves for single-block files when RawLeaves=true
lidel Feb 1, 2026
54e044f
feat(mfs): add RootOption for chunker, maxLinks, and sizeEstimationMode
lidel Feb 2, 2026
3c593af
test(mfs): add tests for RootOption propagation
lidel Feb 2, 2026
ac97424
feat(mfs): add WithMaxHAMTFanout and WithHAMTShardingSize RootOptions
lidel Feb 2, 2026
7884ae2
fix(unixfs/mod): update curNode after sparse expansion
lidel Feb 2, 2026
eaaff60
chore: modernize for loops and enhance package docs
lidel Feb 3, 2026
c3efc5f
fix(unixfs): preserve mode/mtime during HAMT conversions
lidel Feb 3, 2026
e728f8b
fix(mfs): propagate Chunker to parent directories in Mkdir
lidel Feb 3, 2026
c6829fe
fix(unixfs): fix CI failures in directory_test.go
lidel Feb 3, 2026
0a22cde
refactor(unixfs): simplify maxLinks check in addLinkChild
lidel Feb 3, 2026
5d1c720
fix(unixfs/mod): check Mode in maybeCollapseToRawLeaf
lidel Feb 3, 2026
f34e528
docs(unixfs/mod): add doc.go with package documentation
lidel Feb 3, 2026
1e30b95
Merge remote-tracking branch 'origin/main' into feat/ipip-499-unixfs-…
lidel Feb 4, 2026
10 changes: 9 additions & 1 deletion CHANGELOG.md
@@ -16,21 +16,29 @@ The following emojis are used to highlight certain changes:

### Added

- `ipld/unixfs/io`: added `SizeEstimationMode` for configurable HAMT sharding threshold decisions. Supports legacy link-based estimation (`SizeEstimationLinks`), accurate block-based estimation (`SizeEstimationBlock`), or disabling size-based thresholds (`SizeEstimationDisabled`). [#1088](https://github.com/ipfs/boxo/pull/1088), [IPIP-499](https://github.com/ipfs/specs/pull/499)
- `ipld/unixfs/io`: added `UnixFSProfile` with `UnixFS_v0_2015` and `UnixFS_v1_2025` presets for CID-deterministic file and directory DAG construction. [#1088](https://github.com/ipfs/boxo/pull/1088), [IPIP-499](https://github.com/ipfs/specs/pull/499)
- `files`: `NewSerialFileWithOptions` now supports controlling whether symlinks are preserved or dereferenced before being added to IPFS. See `SerialFileOptions.DereferenceSymlinks`. [#1088](https://github.com/ipfs/boxo/pull/1088), [IPIP-499](https://github.com/ipfs/specs/pull/499)

### Changed

- 🛠 `chunker`: `DefaultBlockSize` changed from `const` to `var` to allow runtime configuration via global profiles. [#1088](https://github.com/ipfs/boxo/pull/1088), [IPIP-499](https://github.com/ipfs/specs/pull/499)
- `gateway`: ✨ [IPIP-523](https://github.com/ipfs/specs/pull/523) `?format=` URL query parameter now takes precedence over `Accept` HTTP header, ensuring deterministic HTTP cache behavior and allowing browsers to use `?format=` even when they send `Accept` headers with specific content types. [#1074](https://github.com/ipfs/boxo/pull/1074)

### Removed

### Fixed

- 🛠 `ipld/unixfs/io`: fixed HAMT sharding threshold comparison to use `>` instead of `>=`. A directory exactly at the threshold now stays as a basic (flat) directory, aligning behavior with code documentation and the JS implementation. This is a theoretical breaking change, but unlikely to impact real-world users as it requires a directory to be exactly at the threshold boundary. If you depend on the old behavior, adjust `HAMTShardingSize` to be 1 byte lower. [#1088](https://github.com/ipfs/boxo/pull/1088), [IPIP-499](https://github.com/ipfs/specs/pull/499)
- `ipld/unixfs/mod`: fixed sparse file writes in MFS. Writing past the end of a file (e.g., `ipfs files write --offset 1000 /file` on a smaller file) would lose data because `expandSparse` created the zero-padding node but didn't update the internal pointer. Subsequent writes went to the old unexpanded node.
- `ipld/unixfs/io`: fixed mode/mtime metadata loss during Basic<->HAMT directory conversions. Previously, directories with `WithStat(mode, mtime)` would lose this metadata when converting between basic and sharded formats, or when reloading a HAMT directory from disk.

### Security


## [v0.36.0]

### Added

- `routing/http`: `GET /routing/v1/dht/closest/peers/{key}` per [IPIP-476](https://github.com/ipfs/specs/pull/476)
- `ipld/merkledag`: Added fetched node size reporting to the progress tracker. See [kubo#8915](https://github.com/ipfs/kubo/issues/8915)
- `gateway`: Added a configurable fallback timeout for the gateway handler, defaulting to 1 hour. Configurable via `MaxRequestDuration` in the gateway config.
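
Review note on the "Fixed" entry for the HAMT sharding threshold above: the change from `>=` to `>` only affects a directory whose estimated size equals the threshold exactly. The sketch below illustrates that boundary and the changelog's suggestion to lower `HAMTShardingSize` by 1 byte to recover the old behavior. `shouldShard` and `shouldShardLegacy` are hypothetical helpers written for this note, not boxo functions, and the threshold value is only an illustrative assumption.

```go
package main

import "fmt"

// shouldShard is a hypothetical helper (not the boxo API). It mirrors the
// fixed comparison: shard only when the size is strictly above the threshold.
func shouldShard(estimatedSize, threshold int) bool {
	return estimatedSize > threshold
}

// shouldShardLegacy mirrors the previous comparison (>=) for contrast.
func shouldShardLegacy(estimatedSize, threshold int) bool {
	return estimatedSize >= threshold
}

func main() {
	const threshold = 256 * 1024 // illustrative value, an assumption of this sketch

	for _, size := range []int{threshold - 1, threshold, threshold + 1} {
		fmt.Printf("size=%d new=%v legacy=%v new(threshold-1)=%v\n",
			size,
			shouldShard(size, threshold),
			shouldShardLegacy(size, threshold),
			shouldShard(size, threshold-1), // one byte lower restores the old decision
		)
	}
}
```

Only the middle row (size equal to the threshold) differs between the new and legacy columns, which is why the changelog describes the break as theoretical.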
15 changes: 9 additions & 6 deletions chunker/parse.go
@@ -8,13 +8,16 @@ import (
 	"strings"
 )

-const (
-	// DefaultBlockSize is the chunk size that splitters produce (or aim to).
-	DefaultBlockSize int64 = 1024 * 256
+// DefaultBlockSize is the chunk size that splitters produce (or aim to).
+// Can be modified to change the default for all subsequent chunker operations.
+// For CID-deterministic imports, prefer using UnixFSProfile presets from
+// ipld/unixfs/io/profile.go which set this and other related globals.
+var DefaultBlockSize int64 = 1024 * 256

-	// No leaf block should contain more than 1MiB of payload data ( wrapping overhead aside )
-	// This effectively mandates the maximum chunk size
-	// See discussion at https://github.com/ipfs/boxo/chunker/pull/21#discussion_r369124879 for background
+const (
+	// ChunkSizeLimit is the maximum allowed chunk size.
+	// No leaf block should contain more than 1MiB of payload data (wrapping overhead aside).
+	// See discussion at https://github.com/ipfs/go-ipfs-chunker/pull/21#discussion_r369124879
 	ChunkSizeLimit int = 1048576
 )
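
Review note on the `const` to `var` change above: a package-level `var` can be reassigned at runtime, which a `const` cannot. The sketch below uses local stand-ins (`defaultBlockSize`, `chunkSizeLimit`) rather than importing the chunker package, and `withBlockSize` is a hypothetical helper invented for this note. Checking the new value against the 1 MiB limit is an assumption of the sketch, not something this diff does.

```go
package main

import "fmt"

// defaultBlockSize stands in for chunker.DefaultBlockSize. Because it is a
// var, the assignment inside withBlockSize compiles; with a const it would not.
var defaultBlockSize int64 = 1024 * 256

// chunkSizeLimit stands in for chunker.ChunkSizeLimit (1 MiB).
const chunkSizeLimit = 1048576

// withBlockSize is a hypothetical helper: it sets the global default, runs fn,
// and restores the previous value. The global is unsynchronized, so this
// sketch assumes no concurrent readers or writers.
func withBlockSize(size int64, fn func()) error {
	if size <= 0 || size > chunkSizeLimit {
		return fmt.Errorf("block size %d outside (0, %d]", size, chunkSizeLimit)
	}
	prev := defaultBlockSize
	defaultBlockSize = size
	defer func() { defaultBlockSize = prev }()
	fn()
	return nil
}

func main() {
	err := withBlockSize(1024*1024, func() {
		fmt.Println("inside:", defaultBlockSize)
	})
	fmt.Println("after:", defaultBlockSize, "err:", err)
}
```

Scoping the override and restoring it afterwards keeps the rest of a program on the original default; the doc comment in the diff recommends the `UnixFSProfile` presets for CID-deterministic imports instead of setting the value by hand.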

74 changes: 59 additions & 15 deletions files/serialfile.go
@@ -9,19 +9,37 @@ import (
 	"time"
 )

+// SerialFileOptions configures file traversal behavior for NewSerialFileWithOptions.
+type SerialFileOptions struct {
+	// Filter determines which files to include or exclude during traversal.
+	// If nil, all files are included.
+	Filter *Filter
+
+	// DereferenceSymlinks controls symlink handling during file traversal.
+	// When false (default), symlinks are stored as UnixFS nodes with
+	// Data.Type=symlink (4) containing the target path as specified in
+	// https://specs.ipfs.tech/unixfs/
+	// When true, symlinks are dereferenced and replaced with their target:
+	// symlinks to files become regular file nodes, symlinks to directories
+	// are traversed recursively.
+	DereferenceSymlinks bool
+}
+
 // serialFile implements Node, and reads from a path on the OS filesystem.
 // No more than one file will be opened at a time.
 type serialFile struct {
-	path   string
-	files  []os.FileInfo
-	stat   os.FileInfo
-	filter *Filter
+	path                string
+	files               []os.FileInfo
+	stat                os.FileInfo
+	filter              *Filter
+	dereferenceSymlinks bool
 }

 type serialIterator struct {
-	files  []os.FileInfo
-	path   string
-	filter *Filter
+	files               []os.FileInfo
+	path                string
+	filter              *Filter
+	dereferenceSymlinks bool

 	curName string
 	curFile Node
@@ -44,10 +62,25 @@ func NewSerialFile(path string, includeHidden bool, stat os.FileInfo) (Node, err
 	return NewSerialFileWithFilter(path, filter, stat)
 }

-// NewSerialFileWith takes a filepath, a filter for determining which files should be
+// NewSerialFileWithFilter takes a filepath, a filter for determining which files should be
 // operated upon if the filepath is a directory, and a fileInfo and returns a
 // Node representing file, directory or special file.
 func NewSerialFileWithFilter(path string, filter *Filter, stat os.FileInfo) (Node, error) {
+	return NewSerialFileWithOptions(path, stat, SerialFileOptions{Filter: filter})
+}
+
+// NewSerialFileWithOptions creates a Node from a filesystem path with configurable options.
+// The stat parameter should be obtained via os.Lstat (not os.Stat) to correctly detect symlinks.
+func NewSerialFileWithOptions(path string, stat os.FileInfo, opts SerialFileOptions) (Node, error) {
+	// If dereferencing symlinks and this is a symlink, stat the target instead
+	if opts.DereferenceSymlinks && stat.Mode()&os.ModeSymlink != 0 {
+		targetStat, err := os.Stat(path) // follows symlink
+		if err != nil {
+			return nil, err
+		}
+		stat = targetStat
+	}
+
 	switch mode := stat.Mode(); {
 	case mode.IsRegular():
 		file, err := os.Open(path)
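
Review note on the `os.Lstat` requirement in the hunk above: `os.Lstat` describes the link itself, while `os.Stat` follows it, so only an `Lstat` result can carry `os.ModeSymlink` and trigger the `DereferenceSymlinks` branch. The sketch below demonstrates the difference with the standard library only. `isSymlink` and `statLink` are hypothetical helpers for this note, and the example assumes a platform where creating symlinks is permitted.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isSymlink reports whether fi describes a symlink itself.
func isSymlink(fi os.FileInfo) bool {
	return fi.Mode()&os.ModeSymlink != 0
}

// statLink creates target.txt and a symlink to it in a temporary directory,
// then returns the link's os.Lstat and os.Stat results, in that order.
func statLink() (os.FileInfo, os.FileInfo, error) {
	dir, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "target.txt")
	if err := os.WriteFile(target, []byte("hello"), 0o644); err != nil {
		return nil, nil, err
	}
	link := filepath.Join(dir, "link")
	if err := os.Symlink(target, link); err != nil {
		return nil, nil, err
	}

	lst, err := os.Lstat(link) // describes the link itself
	if err != nil {
		return nil, nil, err
	}
	st, err := os.Stat(link) // follows the link to target.txt
	if err != nil {
		return nil, nil, err
	}
	return lst, st, nil
}

func main() {
	lst, st, err := statLink()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Lstat sees symlink:", isSymlink(lst))
	fmt.Println("Stat sees symlink:", isSymlink(st))
}
```

A caller that passes an `os.Stat` result would never reach the symlink case of the switch, because the mode already belongs to the target.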
@@ -70,8 +103,15 @@ func NewSerialFileWithFilter(path string, filter *Filter, stat os.FileInfo) (Nod
 			}
 			contents = append(contents, content)
 		}
-		return &serialFile{path, contents, stat, filter}, nil
+		return &serialFile{
+			path:                path,
+			files:               contents,
+			stat:                stat,
+			filter:              opts.Filter,
+			dereferenceSymlinks: opts.DereferenceSymlinks,
+		}, nil
 	case mode&os.ModeSymlink != 0:
+		// Only reached if DereferenceSymlinks is false
 		target, err := os.Readlink(path)
 		if err != nil {
 			return nil, err
@@ -98,7 +138,7 @@ func (it *serialIterator) Next() bool {

 	stat := it.files[0]
 	it.files = it.files[1:]
-	for it.filter.ShouldExclude(stat) {
+	for it.filter != nil && it.filter.ShouldExclude(stat) {
 		if len(it.files) == 0 {
 			return false
 		}
@@ -113,7 +153,10 @@ func (it *serialIterator) Next() bool {
 	// recursively call the constructor on the next file
 	// if it's a regular file, we will open it as a ReaderFile
 	// if it's a directory, files in it will be opened serially
-	sf, err := NewSerialFileWithFilter(filePath, it.filter, stat)
+	sf, err := NewSerialFileWithOptions(filePath, stat, SerialFileOptions{
+		Filter:              it.filter,
+		DereferenceSymlinks: it.dereferenceSymlinks,
+	})
 	if err != nil {
 		it.err = err
 		return false
@@ -130,9 +173,10 @@ func (it *serialIterator) Err() error {

 func (f *serialFile) Entries() DirIterator {
 	return &serialIterator{
-		path:   f.path,
-		files:  f.files,
-		filter: f.filter,
+		path:                f.path,
+		files:               f.files,
+		filter:              f.filter,
+		dereferenceSymlinks: f.dereferenceSymlinks,
 	}
 }

@@ -156,7 +200,7 @@ func (f *serialFile) Size() (int64, error) {
 			return err
 		}

-		if f.filter.ShouldExclude(fi) {
+		if f.filter != nil && f.filter.ShouldExclude(fi) {
 			if fi.Mode().IsDir() {
 				return filepath.SkipDir
 			}
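
Review note on the `filter != nil` guards added in `Next` and `Size`: `SerialFileOptions.Filter` is documented as optional ("If nil, all files are included"), so a zero-value `SerialFileOptions{}` now reaches these call sites with a nil pointer. The sketch below shows the same guard pattern on a hypothetical `filter` type. It is a stand-in written for this note and assumes, like the guard itself implies, that the real `ShouldExclude` reads fields of its receiver.

```go
package main

import "fmt"

// filter is a hypothetical stand-in for files.Filter.
type filter struct {
	excludeHidden bool
}

// shouldExclude reads a field of f, so calling it on a nil *filter would
// panic with a nil pointer dereference.
func (f *filter) shouldExclude(name string) bool {
	return f.excludeHidden && len(name) > 0 && name[0] == '.'
}

// included applies the guard from the diff: a nil filter includes everything,
// and the short-circuit || keeps shouldExclude from running on nil.
func included(f *filter, name string) bool {
	return f == nil || !f.shouldExclude(name)
}

func main() {
	var none *filter // what a zero-value options struct would carry
	hidden := &filter{excludeHidden: true}

	fmt.Println(included(none, ".git"))
	fmt.Println(included(hidden, ".git"))
	fmt.Println(included(hidden, "README.md"))
}
```

Placing the nil check at each call site, as the diff does, keeps the filter type unchanged; making the method itself nil-safe would be an alternative design that this PR does not take.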