
Conversation

@CJBshuosi
Collaborator

This log module is used by developers to record execution logs during code development, making it easier to debug code.
Related to #4
Signed-off-by: LIU BOYU <oor2020@163.com>

…code testing by developers.

Related to jerry609#4

Signed-off-by: LIU BOYU <oor2020@163.com>
@gemini-code-assist

Summary of Changes

Hello @CJBshuosi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a comprehensive and flexible logging system designed to enhance debugging and operational monitoring for developers. It centralizes log configuration, enables logging to distinct files based on functionality, and includes features like log rotation and trace ID propagation. This significantly improves the ability to track application flow and diagnose issues efficiently.

Highlights

  • Centralized Logging Module: Introduced a new, centralized logging module (logging_config.py) to provide structured and configurable logging capabilities across the application.
  • Configurable Log Files and Rotation: Implemented support for multiple log files, each with its own path defined in log_config.yaml, and automatic log rotation based on size and backup count to manage disk space.
  • Static Logger Class: Added a Logger static class with methods for different log levels (debug, info, warning, error, critical) and the ability to specify a target log file for granular control.
  • Trace ID Management: Integrated trace ID management using Python's contextvars to enable tracking of requests or operations across multiple log entries, improving debugging in concurrent environments (a standalone sketch of this pattern appears just after this list).
  • Environment Variable Configuration: Allowed configuration of logging parameters (log level, base directory, max bytes per file, backup count) via environment variables, offering flexible deployment without code changes.
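
The trace ID mechanism above relies on Python's contextvars module. Below is a minimal, self-contained sketch of that pattern built on the standard logging library; it only illustrates the idea and is not code from this PR (the logger name, format, and TraceIdFilter helper are invented for the example).

import contextvars
import logging

# Context-local trace ID: each thread or async task context sees its own value.
trace_id_var: contextvars.ContextVar[str] = contextvars.ContextVar("trace_id", default="-")

class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto every record so the formatter can print it."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_var.get()
        return True

logger = logging.getLogger("trace_demo")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(asctime)s [%(trace_id)s] %(levelname)s %(message)s"))
handler.addFilter(TraceIdFilter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

trace_id_var.set("req-42")
logger.info("this entry carries the trace ID of the current context")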


Changelog
  • src/paperbot/utils/log_config.yaml
    • Added a new YAML file to define log file paths and the base directory for logs.
  • src/paperbot/utils/logging_config.py
    • Added a new Python module implementing a static Logger class for structured logging.
    • Included functionality for log file rotation, configurable log levels, and dynamic file path resolution.
    • Introduced LogFiles class for accessing predefined log file paths.
    • Implemented contextvars for managing and propagating trace IDs across log messages.
Activity
  • No activity has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the root of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.



@jerry609 merged commit f68178a into jerry609:master on Feb 10, 2026

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a new centralized logging module, offering features like configuration from environment variables, YAML file-based log path definitions, log rotation, and context-aware trace IDs. However, it presents significant security concerns: a high-severity path traversal vulnerability in how log file paths are resolved, and a medium-severity resource exhaustion risk due to unbounded caching of file handlers. These must be resolved to prevent potential abuse such as unauthorized file access or overwrites and application crashes. Beyond security, the implementation has critical thread-safety issues when writing to log files, silently swallows errors while loading the YAML configuration, uses inconsistent casing for log file keys, and largely re-implements functionality available in Python's standard logging module; refactoring to leverage the built-in library would improve robustness and simplicity.

Comment on lines +231 to +232
handler.stream.write(formatted + "\n")
handler.stream.flush()


critical

Writing directly to handler.stream bypasses the thread-safe mechanisms (i.e., locks) provided by Python's logging.Handler. In a multi-threaded application, this can lead to race conditions and result in garbled or interleaved log messages. To ensure thread safety, you should acquire the handler's lock before writing to the stream.

    handler.acquire()
    try:
        handler.stream.write(formatted + "\n")
        handler.stream.flush()
    finally:
        handler.release()

Comment on lines +186 to +198
def _resolve_file_path(file: Optional[str]) -> str:
    """Resolve the full file path for logging."""
    base_dir = _config.get("base_dir", DEFAULT_LOG_DIR)

    if file is None:
        return str(Path(base_dir) / DEFAULT_LOG_FILE)

    # If file contains directory separator, use as relative path under base_dir
    if "/" in file or "\\" in file:
        return str(Path(base_dir) / file)

    # Otherwise, just a filename in base_dir
    return str(Path(base_dir) / file)


security-high high

The _resolve_file_path function is vulnerable to path traversal. It joins the base_dir with the file argument without validating that the resulting path remains within the base_dir. If the file argument is an absolute path or contains traversal sequences like .., it can point to arbitrary files on the system. This could allow an attacker to overwrite sensitive files if they can influence the log file name.

Suggested change
# Current:
def _resolve_file_path(file: Optional[str]) -> str:
    """Resolve the full file path for logging."""
    base_dir = _config.get("base_dir", DEFAULT_LOG_DIR)
    if file is None:
        return str(Path(base_dir) / DEFAULT_LOG_FILE)
    # If file contains directory separator, use as relative path under base_dir
    if "/" in file or "\\" in file:
        return str(Path(base_dir) / file)
    # Otherwise, just a filename in base_dir
    return str(Path(base_dir) / file)

# Suggested:
def _resolve_file_path(file: Optional[str]) -> str:
    """Resolve the full file path for logging and prevent path traversal."""
    base_dir = Path(_config.get("base_dir", DEFAULT_LOG_DIR)).resolve()
    if file is None:
        return str(base_dir / DEFAULT_LOG_FILE)
    # Resolve the path and ensure it's under base_dir
    target_path = (base_dir / file).resolve()
    try:
        target_path.relative_to(base_dir)
    except ValueError:
        # Fallback or raise error if path is outside base_dir
        return str(base_dir / DEFAULT_LOG_FILE)
    return str(target_path)
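
As a standalone illustration of the relative_to check used in the suggestion above (the base directory and file names here are made up for the example):

from pathlib import Path

base_dir = Path("/var/log/paperbot").resolve()

for name in ["api/api.log", "../../etc/passwd"]:
    candidate = (base_dir / name).resolve()
    try:
        candidate.relative_to(base_dir)
        print("accepted:", candidate)               # stays under base_dir
    except ValueError:
        print("rejected, escapes base_dir:", name)  # traversal attempt is refused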

Comment on lines +102 to +103
except Exception:
    pass


high

This broad except Exception: pass will silently ignore all errors during the loading of log_config.yaml, including a missing pyyaml dependency or a malformed YAML file. This can make debugging configuration issues very difficult. It's better to at least log a warning to standard error when the configuration can't be loaded.

Suggested change
# Current:
except Exception:
    pass

# Suggested:
except Exception as e:
    import sys
    print(f"WARNING: Failed to load log config from {LOG_CONFIG_FILE}, using defaults. Error: {e}", file=sys.stderr)

Comment on lines +149 to +163
def _get_file_handler(file_path: str) -> RotatingFileHandler:
    """Get or create a file handler for the given path."""
    if file_path not in _file_handlers:
        path = Path(file_path)
        _ensure_dir(path.parent)

        handler = RotatingFileHandler(
            filename=str(path),
            maxBytes=_config.get("max_bytes", DEFAULT_MAX_BYTES),
            backupCount=_config.get("backup_count", DEFAULT_BACKUP_COUNT),
            encoding="utf-8",
        )
        _file_handlers[file_path] = handler

    return _file_handlers[file_path]


security-medium medium

The _file_handlers dictionary acts as an unbounded cache for file handlers. Since handlers are never removed or closed (except on global shutdown), providing dynamic file names to the logger can lead to a resource exhaustion attack (Denial of Service). An attacker could trigger the creation of thousands of file handlers, exhausting file descriptors and memory.
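
One possible mitigation, sketched below under the assumption that _file_handlers is a plain module-level dict: cap the cache and close the least recently used handler when the cap is exceeded. The MAX_HANDLERS value, rotation parameters, and eviction policy are illustrative, not taken from the PR.

from collections import OrderedDict
from logging.handlers import RotatingFileHandler
from pathlib import Path

MAX_HANDLERS = 32  # illustrative cap, not part of the PR

_file_handlers: "OrderedDict[str, RotatingFileHandler]" = OrderedDict()

def _get_file_handler(file_path: str) -> RotatingFileHandler:
    """Get or create a handler, evicting the least recently used one at the cap."""
    if file_path in _file_handlers:
        _file_handlers.move_to_end(file_path)  # mark as most recently used
        return _file_handlers[file_path]

    if len(_file_handlers) >= MAX_HANDLERS:
        _, oldest = _file_handlers.popitem(last=False)  # drop the stalest entry
        oldest.close()  # release its file descriptor

    path = Path(file_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(str(path), maxBytes=10_000_000, backupCount=5, encoding="utf-8")
    _file_handlers[file_path] = handler
    return handler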

Comment on lines +9 to +13
files:
HARVEST: "harvest/harvest.log"
api: "api/api.log"
error: "errors/error.log"
database: "db/database.log"


medium

The keys under files use mixed casing (HARVEST vs api, error, database). The corresponding Python code in logging_config.py suggests that these are meant to be accessed like constants (e.g., LogFiles.HARVEST), which by convention (PEP 8) should be uppercase. For consistency and improved readability, it's recommended to use uppercase for all keys.

files:
  HARVEST: "harvest/harvest.log"
  API: "api/api.log"
  ERROR: "errors/error.log"
  DATABASE: "db/database.log"

Comment on lines +87 to +91
cls._files = {
    "harvest": "harvest/harvest.log",
    "api": "api/api.log",
    "error": "errors/error.log",
}


medium

The hardcoded default values for log files are incomplete. The log_config.yaml file includes a database log, but it's missing from these defaults. If the YAML file fails to load, any attempt to use LogFiles.DATABASE will result in an AttributeError. To make the fallback behavior more robust and consistent, the default values should include all expected log files.

Suggested change
# Current:
cls._files = {
    "harvest": "harvest/harvest.log",
    "api": "api/api.log",
    "error": "errors/error.log",
}

# Suggested:
cls._files = {
    "harvest": "harvest/harvest.log",
    "api": "api/api.log",
    "error": "errors/error.log",
    "database": "db/database.log",
}

Comment on lines +213 to +214
frame = inspect.currentframe()
caller_frame = frame.f_back.f_back if frame and frame.f_back else None


medium

Using inspect.currentframe() to get caller information can have a significant performance impact, especially if logging is done frequently in tight loops. The standard logging module is highly optimized for this (often in C). While this approach works, it's important to be aware of the potential overhead. If performance becomes a concern, consider refactoring to leverage the standard logging module's more efficient mechanisms for capturing caller info (e.g., using %(filename)s and %(lineno)d in the format string).
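
For reference, a minimal sketch of how the standard logging module reports caller location without manual frame walking; the logger name and format string are illustrative only.

import logging

handler = logging.StreamHandler()
# %(filename)s and %(lineno)d are filled in by the logging machinery itself.
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(filename)s:%(lineno)d %(message)s"))

logger = logging.getLogger("caller_demo")
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

logger.info("caller file and line are captured automatically")
# When logging through a thin wrapper function, the stacklevel argument
# (Python 3.8+) shifts the reported caller up the stack, e.g.
# logger.info(msg, stacklevel=2).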

