feat: implement Automatic Documentation Generator #644
Note: Other AI code review bot(s) detected
CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
📝 Walkthrough
Adds a new DocsGenerator engine (cortex/docs_generator.py) with CLI integration (
Changes
Sequence Diagram(s)
sequenceDiagram
actor User
participant CLI
participant DocsGen as "DocsGenerator"
participant InstallHist as "InstallationHistory"
participant ConfigMgr as "ConfigManager"
participant FS as "FileSystem"
participant Templates
User->>CLI: cortex docs generate nginx
CLI->>DocsGen: generate_software_docs("nginx")
DocsGen->>InstallHist: fetch install history
InstallHist-->>DocsGen: history
DocsGen->>ConfigMgr: query package info
ConfigMgr-->>DocsGen: package_info
DocsGen->>FS: scan config locations (/etc, ~/.config, ...)
FS-->>DocsGen: config_file_paths
DocsGen->>Templates: load software/default templates
Templates-->>DocsGen: template_objs
DocsGen->>DocsGen: render guides (installation, config, quick-start, troubleshooting)
DocsGen->>FS: write ~/.cortex/docs/nginx/*
DocsGen-->>CLI: generated_file_paths
CLI-->>User: ✓ Documentation generated
sequenceDiagram
actor User
participant CLI
participant DocsGen as "DocsGenerator"
participant FS as "FileSystem"
participant Renderer as "Markdown/HTML/PDF"
User->>CLI: cortex docs export nginx --format pdf
CLI->>DocsGen: export_docs("nginx","pdf")
DocsGen->>FS: check ~/.cortex/docs/nginx/
alt docs exist
FS-->>DocsGen: doc_files
else docs missing
DocsGen->>DocsGen: generate_software_docs("nginx")
end
DocsGen->>FS: read and merge .md files
DocsGen->>Renderer: convert merged markdown to pdf/html
alt pdfkit available
Renderer-->>DocsGen: nginx_docs.pdf
else fallback to HTML
Renderer-->>DocsGen: nginx_docs.html
end
DocsGen->>FS: write exported file
DocsGen-->>CLI: export_file_path
CLI-->>User: ✓ Exported to nginx_docs.pdf
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
CLA Verification Passed: All contributors have signed the CLA.
Summary of Changes
Hello @pratyush07-hub, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request integrates an automatic documentation generator into Cortex Linux, significantly improving the accessibility and currency of system and software documentation. By automating the creation and updating of guides, it aims to provide users with up-to-date information on installations, configurations, and usage, directly enhancing the overall user experience and system maintainability.
Code Review
This pull request introduces an automatic documentation generator, a significant new feature for Cortex. The implementation is comprehensive, covering document generation, viewing, and exporting. However, the current implementation has critical security vulnerabilities related to path traversal that must be addressed. Additionally, there are opportunities to improve error handling, dependency management, and code structure for better maintainability and robustness. While the test coverage for the new feature is good, it lacks tests for the identified security vulnerabilities.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@cortex/cli.py`:
- Around line 4022-4043: The current check uses substring heuristics ("failed"
in path.lower()) in the docs export branch; update export logic so export_docs
returns a structured result (e.g., tuple (success, path) or dict {'success':
bool, 'path': str} ) or raises an exception on error, and then change the CLI
branch (the args.docs_action == "export" block that calls docs_gen.export_docs)
to rely on that boolean/exception instead of substring matching; specifically
modify DocsGenerator.export_docs and the caller in cortex/cli.py to use the new
return shape (check success explicitly or catch the exception) and display the
appropriate success or warning message.
In `@cortex/docs_generator.py`:
- Around line 125-129: The code uses user-provided software_name, format, and
filenames directly to build filesystem paths and resolve templates (see
variables/software_name, format, docs_dir and the file-writing loop in
docs_generator.py), which allows path traversal; fix by validating and
normalizing these inputs: reject or sanitize any value containing path
separators, "..", or null bytes; restrict format to an allowlist of known safe
formats; construct paths using pathlib and call resolve() then assert the
resolved path is a child of docs_dir to prevent escapes; also sanitize filenames
produced in docs (and any template lookup keys) to a safe subset of characters
or map them to generated safe names before opening/writing files. Ensure the
same checks are applied at the other occurrences you noted (around the other
blocks at lines 134-139, 228-229, 242-247).
- Around line 23-30: Add explicit return type annotations and a docstring:
annotate the constructor def __init__(self) as -> None and add a one-line
docstring describing initialization (e.g., "Initialize docs generator, configure
paths and helpers."), and annotate the public method def view_guide(self, ...)
as -> None (keep its existing docstring unchanged). Update the function
signatures in the Cortex DocsGenerator class (look for __init__ and view_guide)
to include the -> None return type to satisfy the typing guideline.
In `@cortex/installation_history.py`:
- Around line 366-378: The current docs auto-generation block (guarded by
InstallationStatus.SUCCESS and invoking
cortex.docs_generator.DocsGenerator.generate_software_docs on packages) must be
further gated by the installation's operation_type so it only runs for INSTALL,
UPGRADE, and CONFIG (not for REMOVE/PURGE/ROLLBACK or dry-run). Update the block
to query the installation record (using the installation identifier available in
this context) to fetch operation_type from the DB, check that operation_type is
one of "INSTALL", "UPGRADE", or "CONFIG" before instantiating DocsGenerator and
calling generate_software_docs for each package, and skip generation otherwise;
keep the existing ImportError and generic Exception handling around
DocsGenerator as-is.
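The gating rule requested above can be sketched as follows. This is a minimal illustration, not the PR's code: the two enums are stand-ins for the project's `InstallationType` and `InstallationStatus` (their string values are assumptions), and `should_generate_docs` is a hypothetical helper name.

```python
from enum import Enum


class InstallationType(Enum):
    # Stand-in for the project's enum; the string values are assumptions.
    INSTALL = "install"
    UPGRADE = "upgrade"
    CONFIG = "config"
    REMOVE = "remove"
    PURGE = "purge"
    ROLLBACK = "rollback"


class InstallationStatus(Enum):
    # Stand-in for the project's enum; the string values are assumptions.
    SUCCESS = "success"
    FAILED = "failed"


# Only operations that leave the software present should trigger docs.
DOC_TRIGGER_TYPES = frozenset(
    {InstallationType.INSTALL, InstallationType.UPGRADE, InstallationType.CONFIG}
)


def should_generate_docs(op_type: InstallationType, status: InstallationStatus) -> bool:
    """Hypothetical helper: True only for a successful INSTALL, UPGRADE, or CONFIG."""
    return status is InstallationStatus.SUCCESS and op_type in DOC_TRIGGER_TYPES
```

Keeping the rule in one pure function makes it straightforward to cover every type and status combination in a table-driven test.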
🧹 Nitpick comments (1)
docs/modules/README_DOCS_GENERATOR.md (1)
21-23: Minor wording nit: “CLI Interface” is redundant. Consider shortening to “CLI” for clarity.
✏️ Wording tweak
-- **CLI Interface (`cortex/cli.py`)**: The `cortex docs` command group.
+- **CLI (`cortex/cli.py`)**: The `cortex docs` command group.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cortex/cli.py`:
- Around line 4056-4077: The docs command block lacks exception handling around
DocsGenerator operations (DocsGenerator, generate_software_docs, export_docs,
view_guide), so wrap each docs_action branch (generate, export, view) in a
try/except that catches Exception, logs a user-friendly error via cx_print
(matching other CLI handlers), and returns a non-zero exit code on failure;
ensure you still return the normal success codes when no exception occurs and
include the exception message in the cx_print to aid the user.
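One way to read this request is a small wrapper that turns any exception from a docs action into a message plus a non-zero exit code. The sketch below is an assumption-laden illustration: `run_docs_action` and `DocsExportError` are hypothetical names, and `report` stands in for the project's `cx_print`, whose real signature is not shown in this thread.

```python
from typing import Callable


class DocsExportError(Exception):
    """Hypothetical error type an export routine could raise on failure."""


def run_docs_action(
    action: Callable[[], str],
    report: Callable[[str], None] = print,
) -> int:
    """Run one docs sub-command and translate the outcome into an exit code.

    `action` is any zero-argument callable returning the produced path.
    `report` is a stand-in for the CLI's printing helper (an assumption).
    """
    try:
        path = action()
    except Exception as exc:  # broad on purpose, like a top-level CLI handler
        report(f"Docs command failed: {exc}")
        return 1
    report(f"Done: {path}")
    return 0
```

If the export routine raises on failure instead of returning an error string, the caller no longer needs to inspect the returned path, so a package named `failed-service` cannot be mistaken for a failed export.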
♻️ Duplicate comments (1)
cortex/cli.py (1)
4056-4077: Avoid substring heuristics for export failures.
`"failed" in path.lower()` will misclassify software names like `failed-service`. Prefer a stricter prefix or a structured return from `export_docs`.
🧹 Nitpick comments (1)
cortex/cli.py (1)
4075-4077: Handle missing `docs_action` gracefully.
When the user runs `cortex docs` without a subcommand, `args.docs_action` will be `None`, which falls through to the `else` branch and prints help. This works, but consider adding an explicit check for better clarity, similar to how `notify` (line 223) handles missing subcommands.
♻️ Optional improvement
 elif args.command == "docs":
+    if not args.docs_action:
+        docs_parser.print_help()
+        return 0
     docs_gen = DocsGenerator()
     if args.docs_action == "generate":
Anshgrover23
left a comment
@pratyush07-hub Kindly address all coderabbitai comments and then ping me again.
Also, follow the contributing.md guidelines (i.e., add a demonstration video to the PR description, write the AI Usage section, etc.).
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/installation_history.py (1)
334-341: Lines 335-361 are incorrectly dedented outside the `with` block context.
After `op_type = InstallationType(result[2])` (line 334), the indentation drops from 16 spaces to 12 spaces, exiting the `with self._pool.get_connection() as conn:` context. Lines 335-361 attempt to use `result`, `cursor`, and `conn` outside their valid scope, which would cause a runtime error since the connection context has been closed.
The entire block from line 335 through line 361 (database UPDATE, commit, and logging) must be indented 4 additional spaces to remain within the `with` block.
🐛 Proposed fix
                 packages = json.loads(result[0])
                 op_type = InstallationType(result[2])
-            start_time = datetime.datetime.fromisoformat(result[1])
-            duration = (datetime.datetime.now() - start_time).total_seconds()
-
-            # Create after snapshot
-            after_snapshot = self._create_snapshot(packages)
-
-            # Update record
-            cursor.execute(
+                start_time = datetime.datetime.fromisoformat(result[1])
+                duration = (datetime.datetime.now() - start_time).total_seconds()
+
+                # Create after snapshot
+                after_snapshot = self._create_snapshot(packages)
+
+                # Update record
+                cursor.execute(
(Continue re-indenting all lines through line 361 to be inside the `with` block)
🤖 Fix all issues with AI agents
In `@cortex/docs_generator.py`:
- Around line 287-298: The export_docs method is overwriting the validated path
returned by _get_software_dir and then constructing export_path using the raw
software_name, reintroducing path traversal; keep and use the sanitized
software_dir returned by _get_software_dir (do not reassign software_dir =
self.docs_dir / software_name), ensure generate_software_docs is called with the
original software_name only if software_dir.exists() is false, and build
export_path using a safe basename derived from the validated software_dir (e.g.,
software_dir.name or otherwise sanitized identifier) combined with the validated
format so the filename/export target cannot be influenced by unsanitized input.
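The containment check described above can be sketched with `pathlib`. This is an illustration under stated assumptions, not the PR's `_get_software_dir`: `safe_child_dir` and `export_basename` are hypothetical helpers, the allowed character set is an assumption, and this version rejects unsafe names outright rather than rewriting them. The `<name>_docs` basename follows the `nginx_docs.pdf` example in the walkthrough diagram.

```python
import re
from pathlib import Path

# Assumed allowlist: letters, digits, and a few punctuation marks common in
# package names. Anything else (separators, null bytes, spaces) is rejected.
_SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._+-]*")


def safe_child_dir(base_dir: Path, name: str) -> Path:
    """Return base_dir/name, raising ValueError if it could escape base_dir."""
    if not _SAFE_NAME.fullmatch(name) or ".." in name:
        raise ValueError(f"Invalid software name: {name!r}")
    base = base_dir.resolve()
    candidate = (base / name).resolve()
    # Defense in depth: re-check after symlinks and dots are resolved.
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"Path escapes docs directory: {name!r}")
    return candidate


def export_basename(software_dir: Path) -> str:
    """Derive the export file stem from the validated directory, not raw input."""
    return f"{software_dir.name}_docs"
```

Because the export stem comes from `software_dir.name`, the raw `software_name` is never used for path construction after validation.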
🧹 Nitpick comments (6)
cortex/docs_generator.py (3)
79-92: Inconsistent use of sanitized vs. unsanitized name in data gathering.
Line 84 uses `safe_name` for package lookup, but line 91 uses the original `software_name` for filtering installation history. This inconsistency could cause mismatches if the input contains characters that get sanitized (e.g., `pkg$name` → `pkg_name`).
Consider using `safe_name` consistently:
♻️ Suggested fix
 # Get installation history for this package
 history_records = self.history.get_history(limit=100)
 pkg_history = [
     r
     for r in history_records
-    if software_name in r.packages and r.status == InstallationStatus.SUCCESS
+    if safe_name in r.packages and r.status == InstallationStatus.SUCCESS
 ]
153-173: Consider expanding the docstring for public API.
Per coding guidelines, public APIs should have docstrings. While a one-liner exists, expanding it with parameter and return documentation would improve usability:
📝 Suggested docstring
 def generate_software_docs(self, software_name: str) -> dict[str, str]:
-    """Generate multiple MD documents for a software."""
+    """Generate multiple MD documents for a software.
+
+    Args:
+        software_name: Name of the software package to document.
+
+    Returns:
+        A dict mapping document filenames to their absolute paths.
+
+    Raises:
+        ValueError: If software_name is invalid or contains path traversal attempts.
+    """
     software_dir = self._get_software_dir(software_name)
174-193: Potential path traversal via `guide_name` parameter.
While `software_name` is sanitized, the `guide_name` parameter is used directly in path construction (lines 177-178) without validation. A malicious `guide_name` like `../../etc/passwd` could potentially escape the template directory.
Although this is an internal method and callers use hardcoded guide names, adding validation would provide defense-in-depth:
🔒 Suggested hardening
 def _get_template(self, software_name: str, guide_name: str) -> Template:
     """Load a template for a specific software or the default."""
     safe_name = self._sanitize_name(software_name)
+    # Validate guide_name against known templates
+    valid_guides = {"Installation_Guide", "Configuration_Reference", "Quick_Start", "Troubleshooting"}
+    if guide_name not in valid_guides:
+        logger.warning(f"Unknown guide type: {guide_name}, using fallback")
+        return Template("# ${name}\n\nDocumentation template missing.")
+
     software_template = (self.template_base_dir / safe_name / f"{guide_name}.md").resolve()
     default_template = (self.template_base_dir / "default" / f"{guide_name}.md").resolve()
docs/modules/README_DOCS_GENERATOR.md (1)
23-23: Minor: "CLI Interface" is redundant.
"CLI" already stands for "Command Line Interface", so "CLI Interface" is tautological. Consider simplifying to just "CLI" or expanding to "Command-Line Interface".
📝 Suggested fix
-- **CLI Interface (`cortex/cli.py`)**: The `cortex docs` command group.
+- **CLI (`cortex/cli.py`)**: The `cortex docs` command group.
tests/test_docs_security.py (1)
78-87: Simplify positive test case.
The try/except with `pytest.fail` is unnecessary. If the method raises, pytest will fail the test automatically with a clearer traceback.
♻️ Suggested simplification
 def test_safe_software_name():
     """Verify that legitimate software names are accepted."""
     gen = DocsGenerator()
-    try:
-        gen._sanitize_name("postgresql")
-        gen._sanitize_name("nginx-common")
-        gen._sanitize_name("python3.12")
-        gen._sanitize_name("libssl1.1")
-    except ValueError:
-        pytest.fail("Legitimate software name raised ValueError")
+    # These should not raise - test fails automatically if they do
+    assert gen._sanitize_name("postgresql") == "postgresql"
+    assert gen._sanitize_name("nginx-common") == "nginx-common"
+    assert gen._sanitize_name("python3.12") == "python3.12"
+    assert gen._sanitize_name("libssl1.1") == "libssl1.1"
tests/test_doc_trigger_gating.py (1)
22-30: Consider adding more status combinations for completeness.
The test cases cover the key scenarios but could be more comprehensive by adding `FAILED` status for `UPGRADE` and `CONFIG` operations to ensure consistency:
🧪 Additional test cases
 test_cases = [
     (InstallationType.INSTALL, InstallationStatus.SUCCESS, True),
     (InstallationType.UPGRADE, InstallationStatus.SUCCESS, True),
     (InstallationType.CONFIG, InstallationStatus.SUCCESS, True),
     (InstallationType.REMOVE, InstallationStatus.SUCCESS, False),
     (InstallationType.PURGE, InstallationStatus.SUCCESS, False),
     (InstallationType.ROLLBACK, InstallationStatus.SUCCESS, False),
     (InstallationType.INSTALL, InstallationStatus.FAILED, False),
+    (InstallationType.UPGRADE, InstallationStatus.FAILED, False),
+    (InstallationType.CONFIG, InstallationStatus.FAILED, False),
 ]
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cortex/docs_generator.py`:
- Around line 287-343: The export_docs function validates using format.lower()
but then compares the original format string, causing mixed-case inputs like
"MD" to bypass branches; normalize the format variable immediately after
validation (e.g., set format = format.lower()) so subsequent comparisons (if
format == "md"/"html"/"pdf") and the export_path extension creation use the
normalized value; update any occurrences of export_path and generated filenames
that depend on format or safe_name to use the normalized format to ensure
correct branch selection and output file naming.
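The normalization fix amounts to lowercasing once, validating, and using only the normalized value afterwards. A minimal sketch, assuming the three formats named in the PR summary; `normalize_format` and `export_filename` are hypothetical helpers, not functions from `cortex/docs_generator.py`.

```python
ALLOWED_FORMATS = ("md", "html", "pdf")


def normalize_format(fmt: str) -> str:
    """Lowercase and validate the export format in a single step."""
    normalized = fmt.strip().lower()
    if normalized not in ALLOWED_FORMATS:
        raise ValueError(f"Unsupported export format: {fmt!r}")
    return normalized


def export_filename(safe_name: str, fmt: str) -> str:
    """Build the export file name from an already-sanitized name."""
    fmt = normalize_format(fmt)  # every later comparison sees the same value
    return f"{safe_name}_docs.{fmt}"
```

With this shape, an input such as `"MD"` selects the same branch and produces the same file extension as `"md"`.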
Hey @Anshgrover23,



Related Issue
Closes #58
Summary
Implemented the Automatic Documentation Generator for Cortex Linux.
This PR adds the ability to generate installation guides, system configuration docs, and usage guides; export them as MD, PDF, or HTML; auto-update them on changes; and customize templates. It also includes unit tests with >80% coverage.
Demonstration
Screencast.from.2026-01-19.12-54-56.webm
AI Disclosure
Claude Opus 4.5 (Antigravity Coding Assistant) was used to help improve test case clarity, organize documentation sections, and ensure consistent structure across the docs.
Checklist
- `type(scope): description` or `[scope] description`
- `pytest tests/`
Summary by CodeRabbit
New Features
Documentation
Tests