@Technophobe01

Problem

When running xray on Docker images with gzip-compressed layers (which is the default for most Docker Hub images), the command fails with errors like:

error="archive/tar: invalid tar header"

or

error="unexpected EOF"

This happens because LoadPackage in pkg/docker/dockerimage/dockerimage.go tries to read gzip-compressed layer blobs directly as tar archives without decompressing them first.

Root Cause

OCI/Docker images typically use gzip-compressed layers with media types like:

  • application/vnd.docker.image.rootfs.diff.tar.gzip
  • application/vnd.oci.image.layer.v1.tar+gzip
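A minimal sketch of how a media type could be classified as gzip-compressed, using only the two strings listed above. The helper name isGzipLayerMediaType and the constant names are hypothetical and are not taken from the PR's code.

```go
package main

import "fmt"

// Layer media types quoted in the PR description.
const (
	dockerLayerGzip = "application/vnd.docker.image.rootfs.diff.tar.gzip"
	ociLayerGzip    = "application/vnd.oci.image.layer.v1.tar+gzip"
)

// isGzipLayerMediaType is a hypothetical helper (not from the PR) that
// reports whether a manifest media type announces a gzip-compressed layer.
func isGzipLayerMediaType(mediaType string) bool {
	switch mediaType {
	case dockerLayerGzip, ociLayerGzip:
		return true
	}
	return false
}

func main() {
	for _, mt := range []string{
		dockerLayerGzip,
		ociLayerGzip,
		"application/vnd.oci.image.layer.v1.tar", // uncompressed OCI layer
	} {
		fmt.Printf("%s gzip=%v\n", mt, isGzipLayerMediaType(mt))
	}
}
```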

The existing code at line ~1133 passes the raw stream to tar.NewReader():

layer, err := layerFromStream(
    pkg,
    hdr.Name,
    tar.NewReader(tr),  // <- reads gzip data as tar, fails
    ...
)

There's also a TODO comment at line 926 acknowledging this:

// todo: add support for oci.MediaTypeImageLayerGzip and oci.MediaTypeImageLayerZstd

Solution

This PR adds gzip decompression support by:

  1. Checking whether the OCI manifest media type stored in nonLayerFileNames indicates gzip
  2. Attempting gzip decompression with gzip.NewReader(), which validates the gzip magic header
  3. Falling back to raw tar reading if the data is not gzip-compressed
  4. Reporting an error when the media type indicates gzip but decompression fails
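The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: openLayerTar, sampleTar, gzipped, and firstEntry are hypothetical names, and the sketch peeks at the two gzip magic bytes through a bufio.Reader so that the fallback path can still read the stream from its first byte (a failed gzip.NewReader() call would otherwise have consumed part of the stream).

```go
package main

import (
	"archive/tar"
	"bufio"
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// openLayerTar is a hypothetical helper (not from the PR). It returns a tar
// reader over a layer blob, decompressing gzip data when it is present.
func openLayerTar(r io.Reader, mediaTypeSaysGzip bool) (*tar.Reader, error) {
	br := bufio.NewReader(r)
	// Peek does not advance the reader, so the raw tar fallback still works.
	magic, err := br.Peek(2)
	isGzip := err == nil && magic[0] == 0x1f && magic[1] == 0x8b
	if !isGzip {
		if mediaTypeSaysGzip {
			return nil, errors.New("media type indicates gzip, but gzip magic header is missing")
		}
		return tar.NewReader(br), nil // fall back to raw tar
	}
	zr, err := gzip.NewReader(br) // validates the full gzip header
	if err != nil {
		return nil, fmt.Errorf("gzip layer: %w", err)
	}
	return tar.NewReader(zr), nil
}

// sampleTar builds a one-file tar archive in memory for the demo.
func sampleTar() []byte {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	body := []byte("hello")
	if err := tw.WriteHeader(&tar.Header{Name: "etc/hello.txt", Mode: 0644, Size: int64(len(body))}); err != nil {
		panic(err)
	}
	if _, err := tw.Write(body); err != nil {
		panic(err)
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// gzipped compresses data with gzip in memory.
func gzipped(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		panic(err)
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// firstEntry returns the name of the first tar entry in a layer blob.
func firstEntry(blob []byte, mediaTypeSaysGzip bool) (string, error) {
	tr, err := openLayerTar(bytes.NewReader(blob), mediaTypeSaysGzip)
	if err != nil {
		return "", err
	}
	hdr, err := tr.Next()
	if err != nil {
		return "", err
	}
	return hdr.Name, nil
}

func main() {
	raw := sampleTar()
	name, err := firstEntry(gzipped(raw), true) // compressed layer
	fmt.Println(name, err)
	name, err = firstEntry(raw, false) // uncompressed layer, raw tar fallback
	fmt.Println(name, err)
	_, err = firstEntry(raw, true) // media type and data disagree
	fmt.Println(err)
}
```

Peeking rather than calling gzip.NewReader() directly on the raw stream is a design choice of this sketch; the PR itself may handle the fallback differently.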

Testing

Tested with various Docker Hub images that use gzip-compressed layers:

  • autonomousplane/chatgpt-clone:latest (Python)
  • autonomousplane/carbon.now.sh:latest (Node.js)
  • Standard alpine, node, python images

Before fix: archive/tar: invalid tar header error
After fix: xray completes successfully with state=done

Checklist

  • Code compiles without errors
  • Follows existing code style
  • Includes DCO sign-off
  • Backwards compatible (non-gzip layers still work)

Many Docker/OCI images use gzip-compressed layers with media types like:
- application/vnd.docker.image.rootfs.diff.tar.gzip
- application/vnd.oci.image.layer.v1.tar+gzip

Previously, LoadPackage would fail with "archive/tar: invalid tar header"
or "unexpected EOF" when processing these layers because it tried to read
the gzip-compressed data directly as a tar archive.

This fix:
1. Checks OCI manifest media type for gzip indication
2. Attempts gzip decompression using gzip.NewReader() which validates
   the gzip header automatically
3. Falls back to raw tar reading if the data is not gzip-compressed
4. Properly reports errors when media type indicates gzip but
   decompression fails

Fixes processing of standard Docker Hub images that use compressed layers.

Signed-off-by: Technophobe01 <pkjarvis01@gmail.com>
… archives

This adds a new --target-image-archive flag to the xray command that allows
analyzing a pre-saved Docker image tar archive directly, without requiring
access to the Docker daemon.

Use case: when orchestrating multiple xray invocations (e.g., initial
analysis followed by file extraction), the image archive from the first
invocation can be reused in subsequent invocations without re-pulling
from the registry.

Changes:
- Add FlagTargetImageArchive to cliflags.go
- Register flag in xray command (cli.go)
- Add GetArchiveInfo() helper to extract image ID from archive manifest
- Add archive-based analysis path in handler.go
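
For illustration only, here is one way an image ID could be read from an
archive manifest. This is a sketch under assumptions, not the GetArchiveInfo()
implementation from this change: it assumes the archive follows the
"docker save" layout with a top-level manifest.json, and that the image ID is
the hex digest embedded in the Config file name. The names imageIDFromArchive,
manifestEntry, sampleArchive, and idForConfig are hypothetical.

```go
package main

import (
	"archive/tar"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"path"
	"strings"
)

// manifestEntry mirrors one element of manifest.json in a "docker save" archive.
type manifestEntry struct {
	Config   string   `json:"Config"`
	RepoTags []string `json:"RepoTags"`
	Layers   []string `json:"Layers"`
}

// imageIDFromArchive is a hypothetical helper. It scans the archive for
// manifest.json and derives the image ID from the first entry's Config path.
func imageIDFromArchive(r io.Reader) (string, error) {
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return "", errors.New("manifest.json not found in archive")
		}
		if err != nil {
			return "", err
		}
		if path.Clean(hdr.Name) != "manifest.json" {
			continue
		}
		var entries []manifestEntry
		if err := json.NewDecoder(tr).Decode(&entries); err != nil {
			return "", fmt.Errorf("decode manifest.json: %w", err)
		}
		if len(entries) == 0 || entries[0].Config == "" {
			return "", errors.New("manifest.json has no image config entry")
		}
		// Assumption: Config looks like "<hex>.json" or "blobs/sha256/<hex>".
		hex := strings.TrimSuffix(path.Base(entries[0].Config), ".json")
		return "sha256:" + hex, nil
	}
}

// sampleArchive builds a tiny in-memory archive holding only manifest.json.
func sampleArchive(config string) []byte {
	manifest, err := json.Marshal([]manifestEntry{{Config: config, RepoTags: []string{"example:latest"}}})
	if err != nil {
		panic(err)
	}
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	if err := tw.WriteHeader(&tar.Header{Name: "manifest.json", Mode: 0644, Size: int64(len(manifest))}); err != nil {
		panic(err)
	}
	if _, err := tw.Write(manifest); err != nil {
		panic(err)
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// idForConfig runs the helper against a sample archive with the given Config path.
func idForConfig(config string) (string, error) {
	return imageIDFromArchive(bytes.NewReader(sampleArchive(config)))
}

func main() {
	for _, cfg := range []string{"0123abcd.json", "blobs/sha256/0123abcd"} {
		id, err := idForConfig(cfg)
		fmt.Println(id, err)
	}
}
```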

Signed-off-by: Technophobe01 <pkjarvis01@gmail.com>