-
Notifications
You must be signed in to change notification settings - Fork 9
Merge @juzi version into main #89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixes: Get rid of deprecated Node.js API: url.parse #20
Feat/ready retry
Feat/headers
Feat/url pool
Revert "Feat/url pool"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @hcfw007, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request merges changes from the @juzi version into the main branch. The primary changes include updating the package name and version, adding environment variables for configuring HTTP timeouts and chunk sizes, modifying the HTTP stream logic to support proxy URLs and chunked downloads, and improving error handling. The pool related changes have been removed as noted in the description.
Highlights
- Package Update: The package name has been changed from
file-boxto@juzi/file-box, and the version has been updated to1.7.16. - Environment Variables: Added environment variables to configure HTTP request and response timeouts (
FILEBOX_HTTP_REQUEST_TIMEOUT,FILEBOX_HTTP_RESPONSE_TIMEOUT), disable slice downloading (NO_SLICE_DOWN), and set HTTP chunk size (HTTP_CHUNK_SIZE). - HTTP Stream Logic: Modified
httpStreamto support proxy URLs and chunked downloads for large files, improving network resilience. - Error Handling: Improved error handling in
file-box.tsby adding retry logic to thereadyfunction and ensuring write stream closure on errors.
Changelog
Click here to see the changelog
- .vscode/settings.json
- Removed the entire file.
- README.md
- Added a new 'Environments' section to document the new environment variables.
- Added a history entry for v1.7 mentioning the environment variables.
- package.json
- Updated the package name from
file-boxto@juzi/file-box(line 2). - Updated the package version to
1.7.16(line 3). - Added
@types/node,@typescript-eslint/eslint-plugin,eslint-plugin-n,tap, andtypescriptas dev dependencies (lines 55-65). - Added
https-proxy-agentas a dependency (line 70). - Removed the
gitconfiguration (lines 79-83).
- Updated the package name from
- src/config.ts
- Added constants for
HTTP_REQUEST_TIMEOUT,HTTP_RESPONSE_TIMEOUT,NO_SLICE_DOWN,HTTP_CHUNK_SIZE, andREADY_RETRYbased on environment variables (lines 4-15).
- Added constants for
- src/file-box.ts
- Imported
READY_RETRYfrom./config.js(line 28). - Added
headersproperty to the FileBox constructor (line 390). - Added
urlgetter to return the remote URL (lines 511-513). - Added
proxyUrlproperty anduseProxyUrlmethod (lines 515-518). - Added retry logic to the
readyfunction (lines 618-642). - Modified
_syncUrlMetadatato include headers and proxy URL when callinghttpHeadHeader(line 661). - Modified
_transformUrlToStreamto include headers and proxy URL when callinghttpStream(line 891). - Ensured write stream closure on errors in
toFile(lines 946-948). - Refactored the
pipefunction to handle errors more effectively (lines 1045-1052).
- Imported
- src/misc.spec.ts
- Added a test case for
httpStreamto verify chunked downloads (lines 72-79).
- Added a test case for
- src/misc.ts
- Added imports for various modules including
assert,crypto,events,fs,os,path,stream, andurl(lines 1-12). - Added constants for HTTP timeouts and chunk size (lines 15-19).
- Added
getProtocolfunction to retrieve protocol-specific agents and request methods (lines 28-31). - Modified
httpHeadHeaderto accept headers and proxy URL, and usefetchfor making HEAD requests (lines 44-74). - Modified
httpStreamto support chunked downloads and proxy URLs (lines 93-113). - Added
fetchfunction to handle HTTP requests with timeout and proxy support (lines 115-144). - Added
downloadFileInChunksfunction to download files in chunks (lines 146-203). - Modified
streamToBufferto use async iteration (lines 205-211). - Added
setProxyfunction to set proxy agent for request options (lines 213-217).
- Added imports for various modules including
- src/mod.ts
- Exported all members from
misc.js(line 35).
- Exported all members from
- src/pure-functions/sized-chunk-transformer.ts
- Minor change: Added space between brackets in
Buffer.concat([ buffer, chunk ])(line 21).
- Minor change: Added space between brackets in
- src/urn-registry/uniform-resource-name-registry.ts
- Minor change: Added space between brackets in array destructuring (lines 145, 294).
- src/version.ts
- Removed an empty line at the beginning of the file.
- tests/fixtures/smoke-testing.ts
- Updated import statement to use
@juzi/file-boxinstead offile-box(line 6).
- Updated import statement to use
- tests/network-timeout.spec.ts
- Added a new test file to verify network timeout scenarios.
- tsconfig.json
- Set
noUnusedLocalstofalsein compiler options (line 5).
- Set
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
A file in a box,
Across the network it flies,
Chunks make it faster.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
The pull request introduces several changes, including updating the package name and version, adding environment variables for configuring HTTP timeouts and chunk size, modifying the file-box class to handle headers and proxy URLs, and adding a new test case for network timeouts. Overall, the changes seem well-structured and address relevant issues. However, some areas could benefit from further review and refinement.
Summary of Findings
- Error Handling in Chunked Download: The chunked download logic includes a retry mechanism, but the error handling could be improved to provide more context about the failure and potentially implement more sophisticated retry strategies.
- Potential Resource Leak: In the
downloadFileInChunksfunction, if the retries are exhausted, the temporary file is removed, but it's not guaranteed that thewriteStreamis properly closed before thermoperation. This could lead to a resource leak or an error during file deletion. - Inconsistent Naming: The naming of environment variables and related constants could be more consistent. For example,
FILEBOX_HTTP_TIMEOUTandFILEBOX_HTTP_RESPONSE_TIMEOUTare used, butFILE_BOX_READY_RETRYuses a different naming convention.
Merge Readiness
The pull request introduces significant changes that require careful review. While the new features and configurations are valuable, the identified issues related to error handling and potential resource leaks should be addressed before merging. I am unable to approve this pull request, and recommend that it not be merged until the critical and high severity issues are addressed, and that others review and approve this code before merging.
| if (--retries <= 0) { | ||
| void rm(tmpFile, { force: true }) | ||
| writeStream.close() | ||
| throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err }) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the retries are exhausted, the temporary file is removed, but it's not guaranteed that the writeStream is properly closed before the rm operation. This could lead to a resource leak or an error during file deletion. Ensure that the writeStream is closed before attempting to remove the temporary file.
const err = error as Error
if (--retries <= 0) {
writeStream.close()
void rm(tmpFile, { force: true })
throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err })| export const HTTP_CHUNK_SIZE = Number(process.env['FILEBOX_HTTP_CHUNK_SIZE']) | ||
| || 1024 * 512 | ||
|
|
||
| export const READY_RETRY = Number(process.env['FILE_BOX_READY_RETRY']) || 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a consistent naming convention for environment variables. The other variables use FILEBOX_, while this one uses FILE_BOX_. Using a consistent prefix improves readability and maintainability.
| export const READY_RETRY = Number(process.env['FILE_BOX_READY_RETRY']) || 3 | |
| export const READY_RETRY = Number(process.env['FILEBOX_READY_RETRY']) || 3 |
| .once('error', (error) => { | ||
| writeStream.close() | ||
| reject(error) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if (--retries <= 0) { | ||
| void rm(tmpFile, { force: true }) | ||
| writeStream.close() | ||
| throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message could be improved to provide more context about the failure, such as the URL being downloaded and the specific error that occurred. This would make it easier to diagnose and resolve issues with chunked downloads.
throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err })| try { | ||
| const res = await fetch(url, requestOptions, proxyUrl) | ||
| assert(allowStatusCode.includes(res.statusCode ?? 0), `Request failed with status code ${res.statusCode}`) | ||
| assert(Number(res.headers['content-length']) > 0, 'Server returned 0 bytes of data') | ||
| for await (const chunk of res) { | ||
| assert(Buffer.isBuffer(chunk)) | ||
| downSize += chunk.length | ||
| writeStream.write(chunk) | ||
| } | ||
| res.destroy() | ||
| } catch (error) { | ||
| const err = error as Error | ||
| if (--retries <= 0) { | ||
| void rm(tmpFile, { force: true }) | ||
| writeStream.close() | ||
| throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err }) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error handling in the chunked download logic could be improved. Currently, if a chunk fails to download, the function retries a few times. However, it might be beneficial to implement more sophisticated retry strategies, such as exponential backoff, to avoid overwhelming the server with retries in quick succession. Additionally, consider adding a mechanism to detect and handle permanent errors (e.g., 404 Not Found) to avoid unnecessary retries.
| try { | |
| const res = await fetch(url, requestOptions, proxyUrl) | |
| assert(allowStatusCode.includes(res.statusCode ?? 0), `Request failed with status code ${res.statusCode}`) | |
| assert(Number(res.headers['content-length']) > 0, 'Server returned 0 bytes of data') | |
| for await (const chunk of res) { | |
| assert(Buffer.isBuffer(chunk)) | |
| downSize += chunk.length | |
| writeStream.write(chunk) | |
| } | |
| res.destroy() | |
| } catch (error) { | |
| const err = error as Error | |
| if (--retries <= 0) { | |
| void rm(tmpFile, { force: true }) | |
| writeStream.close() | |
| throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err }) | |
| } | |
| } | |
| try { | |
| const res = await fetch(url, requestOptions, proxyUrl) | |
| assert(allowStatusCode.includes(res.statusCode ?? 0), `Request failed with status code ${res.statusCode}`) | |
| assert(Number(res.headers['content-length']) > 0, 'Server returned 0 bytes of data') | |
| for await (const chunk of res) { | |
| assert(Buffer.isBuffer(chunk)) | |
| downSize += chunk.length | |
| writeStream.write(chunk) | |
| } | |
| res.destroy() | |
| } catch (error) { | |
| const err = error as Error | |
| if (--retries <= 0) { | |
| void rm(tmpFile, { force: true }) | |
| writeStream.close() | |
| throw new Error(`Download file with chunk failed! ${err.message}`, { cause: err }) | |
| } | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 13 out of 15 changed files in this pull request and generated 1 comment.
Files not reviewed (2)
- .vscode/settings.json: Language not supported
- package.json: Language not supported
Comments suppressed due to low confidence (1)
src/config.ts:15
- [nitpick] The environment variable 'FILE_BOX_READY_RETRY' is inconsistent with the naming convention used in other configuration variables (e.g. FILEBOX_HTTP_REQUEST_TIMEOUT). Consider renaming it to 'FILEBOX_READY_RETRY' for consistency.
export const READY_RETRY = Number(process.env['FILE_BOX_READY_RETRY']) || 3
| if (this.size === UNKNOWN_SIZE) { | ||
| this._size = (await this.toBuffer()).length | ||
| let tryCount = 0 | ||
| while (1) { |
Copilot
AI
Apr 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retry loop in the ready() method currently has no delay between retries, which may lead to a tight loop consuming excessive CPU. Consider adding a delay or implementing exponential backoff between retries.
已移除 pool 相关改动。
please move to #90