Conversation

@hcfw007 hcfw007 commented Apr 15, 2025

Pool-related changes have been removed.

Please move to #90.

@gemini-code-assist gemini-code-assist bot left a comment


Hello @hcfw007, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request merges changes from the @juzi version into the main branch. The primary changes include updating the package name and version, adding environment variables for configuring HTTP timeouts and chunk sizes, modifying the HTTP stream logic to support proxy URLs and chunked downloads, and improving error handling. The pool-related changes have been removed, as noted in the description.

Highlights

  • Package Update: The package name has been changed from file-box to @juzi/file-box, and the version has been updated to 1.7.16.
  • Environment Variables: Added environment variables to configure HTTP request and response timeouts (FILEBOX_HTTP_REQUEST_TIMEOUT, FILEBOX_HTTP_RESPONSE_TIMEOUT), disable slice downloading (NO_SLICE_DOWN), and set the HTTP chunk size (HTTP_CHUNK_SIZE); see the configuration sketch after this list.
  • HTTP Stream Logic: Modified httpStream to support proxy URLs and chunked downloads for large files, improving network resilience.
  • Error Handling: Improved error handling in file-box.ts by adding retry logic to the ready function and ensuring write stream closure on errors.
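
For orientation, here is a minimal sketch of how these environment variables might map to constants in src/config.ts. The variable names come from the PR description and the snippets quoted later in this review; the timeout defaults and the NO_SLICE_DOWN variable name are assumptions for illustration only, while the 512 KiB chunk size and the retry count of 3 match the quoted config.

    // src/config.ts (sketch): read tuning knobs from the environment.
    export const HTTP_REQUEST_TIMEOUT  = Number(process.env['FILEBOX_HTTP_REQUEST_TIMEOUT'])  || 10 * 1000  // default assumed
    export const HTTP_RESPONSE_TIMEOUT = Number(process.env['FILEBOX_HTTP_RESPONSE_TIMEOUT']) || 60 * 1000  // default assumed
    export const NO_SLICE_DOWN         = Boolean(process.env['NO_SLICE_DOWN'])                              // env var name assumed
    export const HTTP_CHUNK_SIZE       = Number(process.env['FILEBOX_HTTP_CHUNK_SIZE']) || 1024 * 512       // 512 KiB, from the quoted config
    export const READY_RETRY           = Number(process.env['FILE_BOX_READY_RETRY'])    || 3                 // from the quoted config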

Changelog

  • .vscode/settings.json
    • Removed the entire file.
  • README.md
    • Added a new 'Environments' section to document the new environment variables.
    • Added a history entry for v1.7 mentioning the environment variables.
  • package.json
    • Updated the package name from file-box to @juzi/file-box (line 2).
    • Updated the package version to 1.7.16 (line 3).
    • Added @types/node, @typescript-eslint/eslint-plugin, eslint-plugin-n, tap, and typescript as dev dependencies (lines 55-65).
    • Added https-proxy-agent as a dependency (line 70).
    • Removed the git configuration (lines 79-83).
  • src/config.ts
    • Added constants for HTTP_REQUEST_TIMEOUT, HTTP_RESPONSE_TIMEOUT, NO_SLICE_DOWN, HTTP_CHUNK_SIZE, and READY_RETRY based on environment variables (lines 4-15).
  • src/file-box.ts
    • Imported READY_RETRY from ./config.js (line 28).
    • Added headers property to the FileBox constructor (line 390).
    • Added url getter to return the remote URL (lines 511-513).
    • Added proxyUrl property and useProxyUrl method (lines 515-518); a usage sketch follows this changelog.
    • Added retry logic to the ready function (lines 618-642).
    • Modified _syncUrlMetadata to include headers and proxy URL when calling httpHeadHeader (line 661).
    • Modified _transformUrlToStream to include headers and proxy URL when calling httpStream (line 891).
    • Ensured write stream closure on errors in toFile (lines 946-948).
    • Refactored the pipe function to handle errors more effectively (lines 1045-1052).
  • src/misc.spec.ts
    • Added a test case for httpStream to verify chunked downloads (lines 72-79).
  • src/misc.ts
    • Added imports for various modules including assert, crypto, events, fs, os, path, stream, and url (lines 1-12).
    • Added constants for HTTP timeouts and chunk size (lines 15-19).
    • Added getProtocol function to retrieve protocol-specific agents and request methods (lines 28-31).
    • Modified httpHeadHeader to accept headers and proxy URL, and use fetch for making HEAD requests (lines 44-74).
    • Modified httpStream to support chunked downloads and proxy URLs (lines 93-113).
    • Added fetch function to handle HTTP requests with timeout and proxy support (lines 115-144).
    • Added downloadFileInChunks function to download files in chunks (lines 146-203).
    • Modified streamToBuffer to use async iteration (lines 205-211).
    • Added setProxy function to set proxy agent for request options (lines 213-217).
  • src/mod.ts
    • Exported all members from misc.js (line 35).
  • src/pure-functions/sized-chunk-transformer.ts
    • Minor change: Added space between brackets in Buffer.concat([ buffer, chunk ]) (line 21).
  • src/urn-registry/uniform-resource-name-registry.ts
    • Minor change: Added space between brackets in array destructuring (lines 145, 294).
  • src/version.ts
    • Removed an empty line at the beginning of the file.
  • tests/fixtures/smoke-testing.ts
    • Updated import statement to use @juzi/file-box instead of file-box (line 6).
  • tests/network-timeout.spec.ts
    • Added a new test file to verify network timeout scenarios.
  • tsconfig.json
    • Set noUnusedLocals to false in compiler options (line 5).
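
To make the new surface concrete, here is a hedged usage sketch that combines the added headers option, the useProxyUrl method, and the retrying ready call. The exact option names and method signatures are not spelled out in this PR, so the headers option shape and the proxy-URL argument shown here are assumptions based on the changelog entries above.

    // Usage sketch only: option names and signatures are assumed, not confirmed by this PR.
    import { FileBox } from '@juzi/file-box'

    async function downloadBehindProxy (): Promise<void> {
      const box = FileBox.fromUrl('https://example.com/big-file.zip', {
        headers: { Authorization: 'Bearer <token>' },  // assumed to be forwarded to httpHeadHeader/httpStream
      })
      box.useProxyUrl('http://127.0.0.1:7890')         // assumed to take a proxy URL handled via https-proxy-agent
      await box.ready()                                // now retries up to READY_RETRY times on failure
      await box.toFile('big-file.zip', true)           // large files download in chunks unless NO_SLICE_DOWN is set
    }
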
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

  • Code Review: /gemini review performs a code review for the current pull request in its current state.
  • Pull Request Summary: /gemini summary provides a summary of the current pull request in its current state.
  • Comment: @gemini-code-assist responds in comments when explicitly tagged, both in issue comments and review comments.
  • Help: /gemini help displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


A file in a box,
Across the network it flies,
Chunks make it faster.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces several changes, including updating the package name and version, adding environment variables for configuring HTTP timeouts and chunk size, modifying the file-box class to handle headers and proxy URLs, and adding a new test case for network timeouts. Overall, the changes seem well-structured and address relevant issues. However, some areas could benefit from further review and refinement.

Summary of Findings

  • Error Handling in Chunked Download: The chunked download logic includes a retry mechanism, but the error handling could be improved to provide more context about the failure and potentially implement more sophisticated retry strategies.
  • Potential Resource Leak: In the downloadFileInChunks function, if the retries are exhausted, the temporary file is removed, but it's not guaranteed that the writeStream is properly closed before the rm operation. This could lead to a resource leak or an error during file deletion.
  • Inconsistent Naming: The naming of environment variables and related constants could be more consistent. For example, FILEBOX_HTTP_REQUEST_TIMEOUT and FILEBOX_HTTP_RESPONSE_TIMEOUT use the FILEBOX_ prefix, but FILE_BOX_READY_RETRY uses a different naming convention.

Merge Readiness

The pull request introduces significant changes that require careful review. While the new features and configurations are valuable, the identified issues related to error handling and potential resource leaks should be addressed before merging. I am unable to approve this pull request; I recommend that it not be merged until the critical and high-severity issues are addressed and others have reviewed and approved this code.

Comment on lines +185 to +189
      if (--retries <= 0) {
        void rm(tmpFile, { force: true })
        writeStream.close()
        throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err })
      }


Severity: high

If the retries are exhausted, the temporary file is removed, but it's not guaranteed that the writeStream is properly closed before the rm operation. This could lead to a resource leak or an error during file deletion. Ensure that the writeStream is closed before attempting to remove the temporary file.

Suggested change

      const err = error as Error
      if (--retries <= 0) {
        writeStream.close()
        void rm(tmpFile, { force: true })
        throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err })
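
For completeness, a hedged sketch of how the close could be made explicit before the temp file is deleted; the discardPartialDownload helper and the use of once from node:events are illustrations, not code from this PR.

    // Sketch only: wait until the stream is fully closed before removing the temp file.
    import { once } from 'node:events'
    import type { WriteStream } from 'node:fs'
    import { rm } from 'node:fs/promises'

    async function discardPartialDownload (writeStream: WriteStream, tmpFile: string): Promise<void> {
      writeStream.close()
      await once(writeStream, 'close')    // resolves once the underlying file descriptor is released
      await rm(tmpFile, { force: true })  // safe to delete now; nothing holds the file open
    }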

export const HTTP_CHUNK_SIZE = Number(process.env['FILEBOX_HTTP_CHUNK_SIZE'])
  || 1024 * 512

export const READY_RETRY = Number(process.env['FILE_BOX_READY_RETRY']) || 3


Severity: medium

Consider using a consistent naming convention for environment variables. The other variables use FILEBOX_, while this one uses FILE_BOX_. Using a consistent prefix improves readability and maintainability.

Suggested change
export const READY_RETRY = Number(process.env['FILE_BOX_READY_RETRY']) || 3
export const READY_RETRY = Number(process.env['FILEBOX_READY_RETRY']) || 3

Comment on lines +946 to +949
        .once('error', (error) => {
          writeStream.close()
          reject(error)
        })


Severity: medium

It's good practice to close the writeStream in the error handler to release resources and prevent potential issues with file access. This ensures that the file is properly closed even if an error occurs during the pipe operation.

          writeStream.close()
          reject(error)
        })

      if (--retries <= 0) {
        void rm(tmpFile, { force: true })
        writeStream.close()
        throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err })


Severity: medium

The error message could be improved to provide more context about the failure, such as the URL being downloaded and the specific error that occurred. This would make it easier to diagnose and resolve issues with chunked downloads.

        throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err })
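
As a small illustration of that suggestion, the final error could carry the URL and the fact that retries were exhausted; the url variable is assumed to be in scope inside downloadFileInChunks.

        throw new Error(`Download file with chunk failed for ${url} after exhausting retries: ${err.message}`, { cause: err })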

Comment on lines +173 to +190
    try {
      const res = await fetch(url, requestOptions, proxyUrl)
      assert(allowStatusCode.includes(res.statusCode ?? 0), `Request failed with status code ${res.statusCode}`)
      assert(Number(res.headers['content-length']) > 0, 'Server returned 0 bytes of data')
      for await (const chunk of res) {
        assert(Buffer.isBuffer(chunk))
        downSize += chunk.length
        writeStream.write(chunk)
      }
      res.destroy()
    } catch (error) {
      const err = error as Error
      if (--retries <= 0) {
        void rm(tmpFile, { force: true })
        writeStream.close()
        throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err })
      }
    }


Severity: medium

The error handling in the chunked download logic could be improved. Currently, if a chunk fails to download, the function retries a few times. However, it might be beneficial to implement more sophisticated retry strategies, such as exponential backoff, to avoid overwhelming the server with retries in quick succession. Additionally, consider adding a mechanism to detect and handle permanent errors (e.g., 404 Not Found) to avoid unnecessary retries.

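As a concrete illustration of the exponential-backoff idea (this is not code from the PR), each chunk request could go through a small retry wrapper along these lines; the delay schedule, the withRetries name, and the treatment of 4xx status codes as permanent failures are assumptions.

    // Sketch only: retry with exponential backoff, bailing out early on permanent HTTP errors.
    async function withRetries<T> (task: () => Promise<T>, retries = 3, baseDelayMs = 500): Promise<T> {
      for (let attempt = 0; ; attempt++) {
        try {
          return await task()
        } catch (error) {
          const err = error as Error & { statusCode?: number }
          const permanent = typeof err.statusCode === 'number' && err.statusCode >= 400 && err.statusCode < 500
          if (permanent || attempt + 1 >= retries) {
            throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err })
          }
          await new Promise<void>(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt))  // 500 ms, 1 s, 2 s, ...
        }
      }
    }

The chunk loop inside downloadFileInChunks could then wrap each request in withRetries, keeping the temp-file cleanup in a single place once the wrapper finally throws.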

@huan huan requested a review from Copilot April 15, 2025 08:22
Copilot AI left a comment


Copilot reviewed 13 out of 15 changed files in this pull request and generated 1 comment.

Files not reviewed (2)
  • .vscode/settings.json: Language not supported
  • package.json: Language not supported

Comments suppressed due to low confidence (1)

src/config.ts:15

  • [nitpick] The environment variable 'FILE_BOX_READY_RETRY' is inconsistent with the naming convention used in other configuration variables (e.g. FILEBOX_HTTP_REQUEST_TIMEOUT). Consider renaming it to 'FILEBOX_READY_RETRY' for consistency.
export const READY_RETRY = Number(process.env['FILE_BOX_READY_RETRY']) || 3

if (this.size === UNKNOWN_SIZE) {
this._size = (await this.toBuffer()).length
let tryCount = 0
while (1) {

Copilot AI Apr 15, 2025


The retry loop in the ready() method currently has no delay between retries, which may lead to a tight loop consuming excessive CPU. Consider adding a delay or implementing exponential backoff between retries.

@hcfw007 hcfw007 closed this Apr 15, 2025