Skip to content

Abstract reader#39

Draft
vbmithr wants to merge 2 commits intoscosman:masterfrom
deepmarker:abstract-reader
Draft

Abstract reader#39
vbmithr wants to merge 2 commits intoscosman:masterfrom
deepmarker:abstract-reader

Conversation

@vbmithr
Copy link

@vbmithr vbmithr commented Oct 1, 2025

Summary by CodeRabbit

  • New Features
    • Introduced a flexible content reader to fetch file data for ZIP creation, enabling alternative data sources while preserving existing behavior.
  • Refactor
    • Streamlined data retrieval to reduce coupling with HTTP-specific logic and improve maintainability.
    • Migrated legacy I/O utilities to modern standard library equivalents without changing functionality.
  • Tests
    • Updated tests to use current I/O utilities for discarding output; no behavior changes.
  • Chores
    • Minor internal cleanups and naming consistency improvements.

@coderabbitai
Copy link

coderabbitai bot commented Oct 1, 2025

Walkthrough

Refactors IO utilities from ioutil to io in tests and server. Introduces ReaderFunc and SimpleGet for FileEntry, replacing URL storage with a reader abstraction. Updates zip_streamer to stream entries via FileEntry.reader instead of direct HTTP GET, adjusting lifecycle to use r.Close() and io.Copy.

Changes

Cohort / File(s) Summary of Changes
Stdlib ioutil→io migration
test/zip_streamer_test.go, zip_streamer/server.go
Replaced ioutil.Discard→io.Discard and ioutil.ReadAll→io.ReadAll; updated imports to use io.
FileEntry reader abstraction
zip_streamer/file_entry.go
Added exported type ReaderFunc and function SimpleGet(string) ReaderFunc. Replaced FileEntry.url with reader ReaderFunc; removed Url() accessor; added io and net/http imports.
Zip streaming reads via reader
zip_streamer/zip_streamer.go
Replaced http.Get(entry.Url().String()) with entry.reader(); removed HTTP status checks; now defer r.Close(); use io.Copy with the reader.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant S as ZipServer
  participant Z as ZipStreamer
  participant F as FileEntry.reader (ReaderFunc)
  participant H as HTTP Origin (via SimpleGet)

  C->>S: Request ZIP
  S->>Z: Build and stream ZIP
  loop For each FileEntry
    Z->>F: reader()
    activate F
    F->>H: HTTP GET (SimpleGet)
    H-->>F: Body stream
    F-->>Z: io.ReadCloser
    deactivate F
    Z->>Z: io.Copy(entryWriter, reader)
    Z-->>Z: defer reader.Close()
  end
  Z-->>S: ZIP bytes
  S-->>C: ZIP response
  note over Z: ReaderFunc abstracts retrieval\n(old direct http.Get removed)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I hop through streams where bytes now flow,
A ReaderFunc leads where URLs won’t go.
With io’s breeze, I zip and I glide,
Copy, close, and tuck bits inside.
Thump-thump! The archive’s neatly spun—
Less noise, more carrots, work well done. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “Abstract reader” succinctly captures the core change of this pull request, which is the introduction of a ReaderFunc abstraction to decouple file entry reading from a URL-based implementation. It is directly related to the main refactoring effort and conveys the primary intent without extraneous detail.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1fe4904 and 7aa7653.

📒 Files selected for processing (4)
  • test/zip_streamer_test.go (2 hunks)
  • zip_streamer/file_entry.go (2 hunks)
  • zip_streamer/server.go (3 hunks)
  • zip_streamer/zip_streamer.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/zip_streamer_test.go (2)
zip_streamer/zip_streamer.go (1)
  • NewZipStream (17-30)
zip_streamer/file_entry.go (1)
  • FileEntry (15-18)
🔇 Additional comments (5)
zip_streamer/file_entry.go (2)

13-13: LGTM: Clean abstraction for lazy data retrieval.

The ReaderFunc type provides a good abstraction that decouples FileEntry from direct HTTP access, enabling better testability and flexibility.


32-58: LGTM: URL validation preserved with new reader abstraction.

The function correctly validates URL scheme and prefix requirements before creating the FileEntry with the new SimpleGet-based reader. The validation logic remains intact while transitioning to the reader abstraction.

zip_streamer/server.go (1)

66-66: LGTM: Standard migration from deprecated ioutil to io.

The changes correctly update ioutil.ReadAll to io.ReadAll, aligning with Go 1.16+ best practices. No functional changes.

Also applies to: 124-124

test/zip_streamer_test.go (1)

16-16: LGTM: Standard migration from deprecated ioutil to io.

The changes correctly update ioutil.Discard to io.Discard, aligning with Go 1.16+ best practices. No functional changes.

Also applies to: 24-24

zip_streamer/zip_streamer.go (1)

53-53: LGTM: Clean streaming via io.Copy.

The transition from direct HTTP response handling to the reader abstraction is well-implemented. io.Copy efficiently streams data from the reader to the zip entry writer.

Comment on lines +20 to +28
func SimpleGet(url string) ReaderFunc {
return func() (io.ReadCloser, error) {
resp, err := http.Get(url)
if err != nil {
return nil, err
}
return resp.Body, nil
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: HTTP status code not validated.

SimpleGet returns the response body without checking resp.StatusCode. A 404, 500, or any non-2xx response will return an open body that may contain error HTML/JSON instead of the expected content, causing corrupt zip entries.

Apply this diff to validate the status code:

 func SimpleGet(url string) ReaderFunc {
 	return func() (io.ReadCloser, error) {
 		resp, err := http.Get(url)
 		if err != nil {
 			return nil, err
 		}
+		if resp.StatusCode != http.StatusOK {
+			resp.Body.Close()
+			return nil, errors.New("HTTP request failed with status: " + resp.Status)
+		}
 		return resp.Body, nil
 	}
 }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In zip_streamer/file_entry.go around lines 20 to 28, SimpleGet currently returns
resp.Body without validating resp.StatusCode; update the function to check that
resp.StatusCode is in the 2xx range (e.g., resp.StatusCode >= 200 &&
resp.StatusCode < 300), and if not, read/close the body and return an error that
includes the status code (and optionally the response body or status text) so
callers do not receive error pages as valid content; ensure the response body is
closed on error to avoid leaks and only return resp.Body when the status is
successful.

Comment on lines +37 to +41
r, err := entry.reader()
if err != nil {
continue
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
continue
}
defer r.Close()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: defer in loop causes resource leak.

The defer r.Close() on Line 41 is inside a loop, which means all readers remain open until StreamAllFiles() returns. For large file lists, this will exhaust file descriptors and HTTP connections.

Apply this diff to close readers immediately after use:

 	for _, entry := range z.entries {
 		r, err := entry.reader()
 		if err != nil {
 			continue
 		}
-		defer r.Close()
 
 		header := &zip.FileHeader{
 			Name:     entry.ZipPath(),
 			Method:   z.CompressionMethod,
 			Modified: time.Now(),
 		}
 		entryWriter, err := zipWriter.CreateHeader(header)
 		if err != nil {
+			r.Close()
 			return err
 		}
 
 		_, err = io.Copy(entryWriter, r)
+		r.Close()
 		if err != nil {
 			return err
 		}

Alternatively, extract the processing into a separate function where defer will execute after each iteration.

🤖 Prompt for AI Agents
In zip_streamer/zip_streamer.go around lines 37 to 41, the defer r.Close() is
declared inside a loop so closes are deferred until function exit (leaking file
descriptors); instead close each reader immediately after its use — either
replace the defer with an explicit r.Close() call right after processing that
iteration (ensuring it runs even on early returns/errors), or move the per-entry
logic into a small helper/anonymous function and keep defer r.Close() inside
that function so it executes at the end of each iteration.

@vbmithr vbmithr marked this pull request as draft October 1, 2025 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant