Conversation

@Sahilbhatane
Collaborator

@Sahilbhatane Sahilbhatane commented Dec 31, 2025

Related Issue

Closes #325

Summary

Adds voice input capability to Cortex Linux, enabling users to speak commands instead of typing them. This implementation uses local speech-to-text processing with Whisper for privacy and low latency.

Features Added

  • Voice command mode: cortex voice for continuous voice input
  • Single command mode: cortex voice --single for one-shot recording
  • Mic flag integration: cortex install --mic and cortex ask --mic
  • Push-to-talk hotkey: F9 (customizable via CORTEX_VOICE_HOTKEY)
  • Confirmation prompt: After voice transcription, users can choose dry-run, execute, or cancel
  • Visual feedback: Recording animation (● Recording...) during speech capture
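
A minimal sketch of how the customizable hotkey could be resolved, assuming the handler reads CORTEX_VOICE_HOTKEY from the environment and falls back to F9. The helper name `resolve_hotkey` and the lowercase normalization are illustrative assumptions, not code taken from this PR:

```python
from __future__ import annotations

import os
from collections.abc import Mapping

DEFAULT_HOTKEY = "f9"  # default named in the PR description


def resolve_hotkey(env: Mapping[str, str] | None = None) -> str:
    """Return the push-to-talk key name, honoring CORTEX_VOICE_HOTKEY.

    Hypothetical helper: the real handler may parse the value differently.
    """
    env = os.environ if env is None else env
    value = env.get("CORTEX_VOICE_HOTKEY", "").strip().lower()
    # An unset or blank variable falls back to the default key.
    return value or DEFAULT_HOTKEY


print(resolve_hotkey({}))                                # f9
print(resolve_hotkey({"CORTEX_VOICE_HOTKEY": " F10 "}))  # f10
```

Passing the environment as a parameter keeps the lookup easy to unit test without touching the real process environment.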

Technical Details

  • Uses faster-whisper (optimized Whisper) for accurate, local STT
  • Default model: base.en (~150MB, good accuracy/speed balance)
  • Optional dependencies via pip install cortex-linux[voice]
  • Cross-platform support (Linux primary, Windows compatible)

Files Changed

  • cortex/voice.py - New voice input handler module
  • cortex/cli.py - Added voice commands and --mic flags
  • cortex/branding.py - Windows ASCII fallback for console output
  • pyproject.toml - Added [voice] optional dependencies
  • docs/VOICE_INPUT.md - User documentation
  • tests/test_voice.py - Unit tests (20 tests)
  • tests/test_ollama_integration.py - Fixed Windows compatibility

Checklist

  • Tests pass (pytest tests/)
  • MVP label added if closing MVP issue
  • Update "Cortex -h" (if needed)
  • Linting passes (ruff check, black --check)
  • Security check passes (bandit -r cortex/)
  • Documentation added (docs/VOICE_INPUT.md)
  • AI Use - IDE - Custom LLM council - Opus 4.5 x 5.1-codex-max x gemini 3 pro

Note from maintainer: This PR implements voice input using a local-first approach with faster-whisper instead of the Fizy AI integration originally proposed in #325. This decision was made for the following reasons:

  1. Privacy: All speech processing happens locally—no data sent to external services
  2. Latency: Local processing is faster than round-trip API calls
  3. Offline support: Works without internet connection
  4. Cost: No per-request API fees for users

The Fizy integration may be revisited in a future phase if there's demand for cloud-based voice processing with additional features.

Additionally, for Wayland-based Ubuntu users (if the hotkey doesn't work):

Option 1: Run with sudo (for global hotkey access)
sudo cortex voice

Option 2: Use X11 session (login screen → gear icon → "Ubuntu on Xorg")

Option 3: Use single-shot mode (no hotkey needed)
cortex voice --single
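
The Wayland fallback above could be automated along these lines. This is only a sketch: it assumes the session type can be read from the standard XDG_SESSION_TYPE and WAYLAND_DISPLAY environment variables, and the helper names `is_wayland_session` and `suggested_voice_command` are hypothetical, not part of the PR:

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def is_wayland_session(env: Mapping[str, str] | None = None) -> bool:
    """Best-effort Wayland detection from the session environment."""
    env = os.environ if env is None else env
    session_type = env.get("XDG_SESSION_TYPE", "").lower()
    return session_type == "wayland" or bool(env.get("WAYLAND_DISPLAY"))


def suggested_voice_command(env: Mapping[str, str] | None = None) -> str:
    """Suggest single-shot mode where a global hotkey may not be available."""
    if is_wayland_session(env):
        return "cortex voice --single"
    return "cortex voice"


print(suggested_voice_command({"XDG_SESSION_TYPE": "wayland"}))  # cortex voice --single
print(suggested_voice_command({"XDG_SESSION_TYPE": "x11"}))      # cortex voice
```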


Summary by CodeRabbit

  • New Features

    • Integrated voice input (push-to-talk, continuous mode) with a top-level voice command and --mic support for CLI.
  • UX

    • Improved Windows console rendering with platform-appropriate icons and separators.
    • Enhanced install flow with clearer progress, parallel/sequential handling, and better success/failure reporting.
  • Documentation

    • New Voice Input guide; updated command reference and README.
  • Dependencies / Packaging

    • Added optional "voice" extras group and updated packaging/requirements.
  • Tests

    • Comprehensive tests for voice input and related integrations.

Copilot AI review requested due to automatic review settings December 31, 2025 13:23
@coderabbitai
Contributor

coderabbitai bot commented Dec 31, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds a push-to-talk voice input subsystem (VoiceInputHandler + CLI wiring), Windows-safe terminal rendering fallbacks, documentation and tests, a new voice optional dependency extra, packaging updates, and refactors the install flow with parallel/sequential execution helpers.

Changes

Cohort / File(s) Summary
Voice core & tests
cortex/voice.py, tests/test_voice.py
New VoiceInputHandler, exceptions (VoiceInputError, MicrophoneNotFoundError, ModelNotFoundError), get_voice_handler(); push-to-talk hotkey, audio capture, lazy model loading (faster-whisper), transcription modes, continuous/single-shot flows, and comprehensive unit tests.
CLI integration
cortex/cli.py
Added `CortexCLI.voice(continuous: bool=False, model: str
Branding / Console
cortex/branding.py
Console initialized with Windows-aware flags; substitutes ASCII icons/separators on Windows and preserves Unicode elsewhere; example strings adjusted. Check rendering-dependent tests/output.
Docs & CLI reference
docs/VOICE_INPUT.md, docs/COMMANDS.md, README.md
New Voice Input docs, Quick Reference entry for cortex voice, and README feature row. Validate examples, requirements, and usage text.
Packaging & deps
pyproject.toml, setup.py, requirements.txt
Added voice optional-deps (faster-whisper, sounddevice, pynput, numpy); updated all extra and extras_require["voice"]; requirements tweaks (PyYAML change). Confirm extras and version bounds.
Tests: infra & tweaks
tests/test_ollama_integration.py, tests/test_api_key_detector.py, tests/test_cli.py, tests/test_cli_extended.py, tests/integration/test_end_to_end.py
Ollama presence check now uses shutil.which; several tests now patch pathlib.Path.cwd instead of home; CI bootstrap updated to install system deps for voice-related native builds. Check CI/environment assumptions.
Git metadata
.gitignore
Added common virtualenv entries (env/, venv/, myenv/, etc.).

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as CortexCLI
    participant VIH as VoiceInputHandler
    participant Audio as Audio System
    participant Model as Whisper Model

    User->>CLI: run `cortex voice` or use `--mic`
    CLI->>VIH: get_voice_handler(model?)
    VIH->>Audio: check microphone availability
    Audio-->>VIH: device OK

    Note over User,VIH: Push-to-talk (hotkey) or single-shot
    User->>Audio: press hotkey & speak
    Audio->>VIH: stream audio chunks
    User->>VIH: release hotkey / end capture
    VIH->>VIH: assemble buffer, stop recording

    VIH->>Model: lazy load & transcribe(audio)
    Model-->>VIH: text/segments
    VIH->>CLI: deliver transcription callback
    CLI->>User: display result or invoke install/ask flow

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

MVP

Suggested reviewers

  • Suyashd999
  • Dhruv-89

Poem

🐰
I press my paw, the hotkey sings,
Tiny hops and audio rings;
Whisper listens, words take flight,
Commands bounce out into the night,
Hooray — the CLI now listens right! 🎙️

🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name | Status | Explanation | Resolution
Linked Issues check | ⚠️ Warning | The PR implements local voice input via faster-whisper instead of the Fizy AI integration specified in #325. While the feature adds voice capabilities, it does not implement the required Fizy integration, endpoints, authentication, or latency requirements documented in the issue. | The PR should either implement the Fizy AI integration as specified in #325, or the linked issue should be updated to reflect the new local-first voice approach or a new issue created for future Fizy integration.
Out of Scope Changes check | ❓ Inconclusive | The PR includes changes beyond the specified issue scope: Windows ASCII fallback in branding.py, setup.py creation, requirements.txt dependency changes, and updates to test utilities for API key detection paths, which are tangential to the core voice feature. | Clarify whether peripheral changes (Windows compatibility, dependency management, test infrastructure updates) are intentional supporting changes or scope creep; document rationale for setup.py and requirements.txt modifications.
✅ Passed checks (3 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The PR title 'feat: Added voice feature for end-user fix #325' accurately describes the primary change, adding voice input functionality to address issue #325, using the conventional commit format.
Description check | ✅ Passed | The PR description is comprehensive, including related issue, detailed summary with features, technical details, files changed, and a complete checklist with AI disclosure. It follows the template structure well.
Docstring Coverage | ✅ Passed | Docstring coverage is 82.05%, which is sufficient. The required threshold is 80.00%.

@github-actions

github-actions bot commented Dec 31, 2025

CLA Verification Passed

All contributors have signed the CLA.

Contributor | Signed As
@Sahilbhatane | @Sahilbhatane

Contributor

Copilot AI left a comment

Pull request overview

This PR adds voice input capability to Cortex Linux, enabling users to speak commands instead of typing them. The implementation uses local speech-to-text processing with faster-whisper for privacy and low latency. However, there are several critical issues that need to be addressed before merging.

Key Changes

  • New voice input module with support for continuous and single-shot voice commands
  • Optional [voice] dependency group in pyproject.toml
  • Windows compatibility improvements in branding module (ASCII fallback for console output)
  • Comprehensive documentation for voice features
  • 20 unit tests for voice functionality

Reviewed changes

Copilot reviewed 12 out of 42 changed files in this pull request and generated 9 comments.

File | Description
tests/test_voice.py | Unit tests for voice input handler with mocked dependencies
requirements.txt | Issue: Voice dependencies incorrectly added as required instead of optional
pyproject.toml | Proper optional voice dependency group configuration
myenv/* | Critical: Entire virtual environment directory committed (should be excluded)
docs/VOICE_INPUT.md | Comprehensive user documentation for voice features
cortex/branding.py | Windows compatibility with ASCII fallback for special characters

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
requirements.txt (1)

9-25: Duplicate PyYAML dependencies with inconsistent versions.

There are three occurrences of PyYAML in this file:

  • Line 9: PyYAML>=6.0.0
  • Line 21: pyyaml>=6.0.0
  • Line 25: PyYAML==6.0.3

This creates confusion and potential version conflicts. Keep only one entry with a consistent version constraint.

🔎 Proposed fix
 # Configuration
 PyYAML>=6.0.0
 
 # Environment variable loading from .env files
 python-dotenv>=1.0.0
 
 # Encryption for environment variable secrets
 cryptography>=42.0.0
 
 # Terminal UI
 rich>=13.0.0
 
-# Configuration
-pyyaml>=6.0.0
-
 # Type hints for older Python versions
 typing-extensions>=4.0.0
-PyYAML==6.0.3
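
Duplicates like the three PyYAML entries could be caught mechanically. The following is a rough sketch, not part of the PR: `find_duplicate_requirements` is a hypothetical helper that compares names case-insensitively, treating runs of `-`, `_` and `.` as equivalent in the spirit of PEP 503 name normalization, and it handles only simple `name<op>version` lines:

```python
from __future__ import annotations

import re
from collections import defaultdict


def find_duplicate_requirements(lines: list[str]) -> dict[str, list[str]]:
    """Group requirement lines by normalized package name; keep repeats only."""
    seen: dict[str, list[str]] = defaultdict(list)
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The package name ends at the first specifier, extra, marker, or space.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        key = re.sub(r"[-_.]+", "-", name).lower()
        seen[key].append(line)
    return {name: entries for name, entries in seen.items() if len(entries) > 1}


sample = ["PyYAML>=6.0.0", "rich>=13.0.0", "pyyaml>=6.0.0", "PyYAML==6.0.3"]
print(find_duplicate_requirements(sample))
# {'pyyaml': ['PyYAML>=6.0.0', 'pyyaml>=6.0.0', 'PyYAML==6.0.3']}
```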
♻️ Duplicate comments (4)
myenv/Scripts/deactivate.bat (1)

1-22: Part of virtual environment that should not be committed.

This file is part of the myenv/ virtual environment directory that should be removed from version control. See the comment on myenv/pyvenv.cfg for details.

myenv/share/man/man1/isympy.1 (1)

1-188: Third-party package artifact that should not be committed.

This is a SymPy man page installed in the virtual environment. It's part of the myenv/ directory that should be removed from version control. See the comment on myenv/pyvenv.cfg for details.

myenv/Scripts/activate.bat (1)

1-34: Part of virtual environment with hardcoded developer paths.

This file contains a hardcoded path (C:\Users\sahil\...) on line 11 and is part of the myenv/ virtual environment that should be removed. See the comment on myenv/pyvenv.cfg for details.

myenv/Scripts/Activate.ps1 (1)

1-528: Part of virtual environment that should not be committed.

This PowerShell activation script is part of the myenv/ virtual environment directory that should be removed from version control. See the comment on myenv/pyvenv.cfg for details.

🧹 Nitpick comments (4)
docs/VOICE_INPUT.md (1)

66-83: Add language identifier to fenced code blocks for markdown lint compliance.

Static analysis flagged missing language specifiers. For terminal output and diagrams, use text or console as the language identifier.

🔎 Suggested fix

Line 66:

-```
+```text
$ cortex voice

Line 146:

-```
+```text
┌──────────────┐    ┌──────────────┐
cortex/voice.py (3)

296-307: Recording indicator bypasses branding utilities.

The recording indicator at line 302 uses raw string formatting " CX | " instead of the cx_print function from cortex/branding.py. This creates inconsistency in terminal output, especially on Windows where cx_print uses ASCII-only icons.

🔎 Suggested approach

Consider using console.print with Rich markup for consistency, or document why direct stdout is necessary (e.g., for \r carriage return updates).

If carriage return updates are required, you could use Rich's Live context or Status for animated updates instead of raw stdout manipulation.


425-428: Busy-wait loop could be replaced with event-based waiting.

The infinite loop with time.sleep(0.1) is a busy-wait pattern. While functional, it wastes CPU cycles polling. Consider using an Event.wait() pattern that blocks until signaled.

🔎 Alternative approach
+        self._exit_event = threading.Event()
+
         try:
             # Keep the main thread alive
-            while True:
-                time.sleep(0.1)
+            self._exit_event.wait()
         except KeyboardInterrupt:
             cx_print("\nVoice mode exited.", "info")

Then set self._exit_event.set() in the stop() method.
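
As a standalone illustration of the event-based approach, here is a minimal sketch. The class name `ExitWaiter` is hypothetical; waiting with a timeout inside a loop is an assumption added as a precaution so that Ctrl+C is still noticed promptly on platforms where an untimed wait may not be interruptible:

```python
import threading


class ExitWaiter:
    """Blocks the calling thread until stop() is called from elsewhere."""

    def __init__(self) -> None:
        self._exit_event = threading.Event()

    @property
    def stopped(self) -> bool:
        return self._exit_event.is_set()

    def wait_until_stopped(self, poll_seconds: float = 0.5) -> None:
        # Event.wait() sleeps inside the interpreter rather than spinning;
        # the timeout only bounds how long a pending interrupt can be delayed.
        while not self._exit_event.wait(timeout=poll_seconds):
            pass

    def stop(self) -> None:
        self._exit_event.set()


waiter = ExitWaiter()
threading.Timer(0.05, waiter.stop).start()  # simulate stop() from another thread
waiter.wait_until_stopped()
print("stopped:", waiter.stopped)
```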


482-487: Broad except clause silently swallows exceptions.

The except Exception: pass at line 485 catches and ignores every ordinary exception raised while stopping the listener. (KeyboardInterrupt and SystemExit derive from BaseException, so they still propagate.) This could mask programming errors during debugging.

🔎 Suggested fix
         if self._hotkey_listener:
             try:
                 self._hotkey_listener.stop()
-            except Exception:
-                pass
+            except OSError:
+                # Listener may already be stopped
+                pass
             self._hotkey_listener = None

Or log the exception at debug level:

except Exception as e:
    logging.debug("Error stopping hotkey listener: %s", e)
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8171eca and 9acb0d7.

⛔ Files ignored due to path filters (29)
  • myenv/Scripts/coloredlogs.exe is excluded by !**/*.exe
  • myenv/Scripts/cortex.exe is excluded by !**/*.exe
  • myenv/Scripts/ct2-fairseq-converter.exe is excluded by !**/*.exe
  • myenv/Scripts/ct2-marian-converter.exe is excluded by !**/*.exe
  • myenv/Scripts/ct2-openai-gpt2-converter.exe is excluded by !**/*.exe
  • myenv/Scripts/ct2-opennmt-py-converter.exe is excluded by !**/*.exe
  • myenv/Scripts/ct2-opennmt-tf-converter.exe is excluded by !**/*.exe
  • myenv/Scripts/ct2-opus-mt-converter.exe is excluded by !**/*.exe
  • myenv/Scripts/ct2-transformers-converter.exe is excluded by !**/*.exe
  • myenv/Scripts/distro.exe is excluded by !**/*.exe
  • myenv/Scripts/dotenv.exe is excluded by !**/*.exe
  • myenv/Scripts/f2py.exe is excluded by !**/*.exe
  • myenv/Scripts/hf.exe is excluded by !**/*.exe
  • myenv/Scripts/httpx.exe is excluded by !**/*.exe
  • myenv/Scripts/humanfriendly.exe is excluded by !**/*.exe
  • myenv/Scripts/isympy.exe is excluded by !**/*.exe
  • myenv/Scripts/markdown-it.exe is excluded by !**/*.exe
  • myenv/Scripts/numpy-config.exe is excluded by !**/*.exe
  • myenv/Scripts/onnxruntime_test.exe is excluded by !**/*.exe
  • myenv/Scripts/openai.exe is excluded by !**/*.exe
  • myenv/Scripts/pip.exe is excluded by !**/*.exe
  • myenv/Scripts/pip3.12.exe is excluded by !**/*.exe
  • myenv/Scripts/pip3.exe is excluded by !**/*.exe
  • myenv/Scripts/pyav.exe is excluded by !**/*.exe
  • myenv/Scripts/pygmentize.exe is excluded by !**/*.exe
  • myenv/Scripts/python.exe is excluded by !**/*.exe
  • myenv/Scripts/pythonw.exe is excluded by !**/*.exe
  • myenv/Scripts/tiny-agents.exe is excluded by !**/*.exe
  • myenv/Scripts/tqdm.exe is excluded by !**/*.exe
📒 Files selected for processing (13)
  • cortex/branding.py
  • cortex/cli.py
  • cortex/voice.py
  • docs/VOICE_INPUT.md
  • myenv/Scripts/Activate.ps1
  • myenv/Scripts/activate
  • myenv/Scripts/activate.bat
  • myenv/Scripts/deactivate.bat
  • myenv/pyvenv.cfg
  • myenv/share/man/man1/isympy.1
  • pyproject.toml
  • requirements.txt
  • tests/test_voice.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/test_voice.py
  • cortex/voice.py
  • cortex/branding.py
  • cortex/cli.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/test_voice.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use Python 3.10 or higher as the minimum supported version

Files:

  • pyproject.toml
🧬 Code graph analysis (3)
tests/test_voice.py (1)
cortex/voice.py (9)
  • VoiceInputHandler (39-495)
  • _ensure_dependencies (87-125)
  • _check_microphone (153-175)
  • transcribe (226-270)
  • stop (477-495)
  • VoiceInputError (21-24)
  • MicrophoneNotFoundError (27-30)
  • ModelNotFoundError (33-36)
  • get_voice_handler (498-517)
cortex/voice.py (1)
cortex/branding.py (1)
  • cx_print (52-82)
cortex/cli.py (1)
cortex/voice.py (4)
  • VoiceInputError (21-24)
  • VoiceInputHandler (39-495)
  • start_voice_mode (399-432)
  • record_single (434-475)
🪛 GitHub Actions: CI
cortex/cli.py

[error] 580-580: Ruff check failed: W293 Blank line contains whitespace. Command: 'ruff check . --output-format=github'.

🪛 GitHub Check: lint
cortex/cli.py

[failure] 591-591: Ruff (W293)
cortex/cli.py:591:1: W293 Blank line contains whitespace


[failure] 588-588: Ruff (W293)
cortex/cli.py:588:1: W293 Blank line contains whitespace


[failure] 580-580: Ruff (W293)
cortex/cli.py:580:1: W293 Blank line contains whitespace

🪛 GitHub Check: Lint
cortex/cli.py

[failure] 591-591: Ruff (W293)
cortex/cli.py:591:1: W293 Blank line contains whitespace


[failure] 588-588: Ruff (W293)
cortex/cli.py:588:1: W293 Blank line contains whitespace


[failure] 580-580: Ruff (W293)
cortex/cli.py:580:1: W293 Blank line contains whitespace

🪛 LanguageTool
docs/VOICE_INPUT.md

[grammar] ~141-~141: Ensure spelling is correct
Context: ...apture** - Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribe...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~141-~141: Ensure spelling is correct
Context: ... Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribes using `fa...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.18.1)
docs/VOICE_INPUT.md

66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


146-146: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Agent
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.10)
🔇 Additional comments (14)
cortex/branding.py (3)

8-14: Good Windows compatibility improvements for Rich console.

The platform-aware Console initialization with force_terminal=True and legacy_windows based on platform detection is a solid approach for cross-platform terminal rendering.


63-79: Well-structured platform-specific icon fallbacks.

The ASCII fallback icons for Windows (|, +, !, x, *) are appropriate replacements for the Unicode characters that may not render correctly in Windows terminals. The conditional structure is clean and maintainable.
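
A sketch of what such a fallback could look like. The five ASCII characters come from the comment above, but which status each one maps to, the Unicode glyphs, and the helper name `pick_icons` are all assumptions made for illustration, not the contents of cortex/branding.py:

```python
from __future__ import annotations

import sys

# Assumed status-to-icon mapping; only the ASCII characters themselves
# are taken from the review comment.
ASCII_ICONS = {"info": "|", "success": "+", "warning": "!", "error": "x", "thinking": "*"}
UNICODE_ICONS = {"info": "│", "success": "✓", "warning": "⚠", "error": "✗", "thinking": "⋯"}


def pick_icons(platform: str | None = None) -> dict[str, str]:
    """Return ASCII icons on Windows and Unicode icons elsewhere."""
    platform = sys.platform if platform is None else platform
    return ASCII_ICONS if platform == "win32" else UNICODE_ICONS


print(pick_icons("win32")["success"])  # +
print(pick_icons("linux")["success"])  # ✓
```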


92-102: Consistent platform-specific separators.

The separator adjustments in cx_step and cx_header align with the icon changes above, ensuring consistent visual appearance across platforms.

pyproject.toml (1)

72-79: Voice optional dependencies are correctly structured.

The optional dependency group is properly defined with appropriate version constraints compatible with Python 3.10+. Including it in the all extra is appropriate for comprehensive installation.

docs/VOICE_INPUT.md (1)

1-46: Well-structured documentation with comprehensive coverage.

The documentation covers all essential aspects: installation, usage modes (continuous and single), configuration options, troubleshooting, privacy considerations, and API reference. The privacy-first approach (local processing, no audio uploads) is well highlighted.

tests/test_voice.py (1)

10-11: Good test organization with clear class-based grouping.

The test suite is well-organized into logical test classes (TestVoiceInputHandler, TestVoiceInputExceptions, TestGetVoiceHandler, TestRecordingState), each with focused test methods and appropriate fixtures. The mocking strategy for optional dependencies is sound.

Also applies to: 199-200, 224-225, 268-269

cortex/cli.py (3)

540-552: Good implementation with proper dependency handling.

The voice method correctly handles the optional dependency import with a helpful error message guiding users to install the voice extras. The API key and provider checks follow the established pattern in the codebase.


560-577: Consider case-insensitive verb matching and edge cases.

The install verb detection uses startswith() after lowercasing, which is good. However, the verb removal at lines 574-577 operates on the original text but uses len(verb) from the lowercase version, which works correctly since length is preserved. The logic is sound.

One edge case: if the user says "Install", the slicing text[len(verb):] correctly preserves the original casing of the software name.
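
The matching described above can be sketched as follows. The verb list and the function name `extract_software` are hypothetical; the actual list in cortex/cli.py is not shown in this review:

```python
from __future__ import annotations

INSTALL_VERBS = ("install", "set up", "setup")  # assumed verb list


def extract_software(text: str) -> str | None:
    """Return the text after a leading install verb, or None if none matches.

    Matching is done on a lowercased copy, while slicing uses the original
    string so the software name keeps the casing the user spoke.
    """
    stripped = text.strip()
    lowered = stripped.lower()
    for verb in INSTALL_VERBS:
        # Require a word boundary so "installer" is not treated as "install".
        if lowered == verb or lowered.startswith(verb + " "):
            return stripped[len(verb):].strip()
    return None


print(extract_software("Install Docker"))  # Docker
print(extract_software("what is nginx"))   # None
```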


1633-1638: Help table correctly advertises new voice capabilities.

The Rich help table now includes voice command entries, making the feature discoverable to users.

cortex/voice.py (5)

21-36: Well-designed exception hierarchy.

The exception classes are properly structured with a base VoiceInputError and specific subclasses for different error conditions. Docstrings are present as required by coding guidelines. The pass statements are appropriate for simple exception classes.
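
For readers without the diff open, the hierarchy described here has roughly this shape. The class names come from the review; the docstrings and the `describe_failure` helper are illustrative assumptions:

```python
class VoiceInputError(Exception):
    """Base class for voice input failures."""


class MicrophoneNotFoundError(VoiceInputError):
    """Raised when no usable microphone is available."""


class ModelNotFoundError(VoiceInputError):
    """Raised when the speech model cannot be loaded."""


def describe_failure(exc: VoiceInputError) -> str:
    """Hypothetical helper: turn an error into a short user-facing message."""
    if isinstance(exc, MicrophoneNotFoundError):
        return "No microphone detected."
    if isinstance(exc, ModelNotFoundError):
        return "Speech model unavailable."
    return "Voice input failed."


try:
    raise MicrophoneNotFoundError("no input device")
except VoiceInputError as err:  # one handler covers every subclass
    print(describe_failure(err))  # No microphone detected.
```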


51-85: Good initialization with sensible defaults and type hints.

The __init__ method has proper type hints, comprehensive docstrings, and sensible defaults. The environment variable fallback for model_name provides flexibility. Internal state management with explicit initialization is clean.


226-270: Transcription method is well-implemented with proper validation.

The transcribe method correctly:

  • Handles lazy model loading (line 238-239)
  • Returns early for empty audio (line 241-242)
  • Validates dtype before processing (line 245-246)
  • Uses appropriate VAD parameters for voice commands
  • Strips and joins segment texts properly
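
The lazy-loading and empty-audio points can be illustrated with a small sketch. It is not the PR's code: `LazyModelHolder` and `transcribe_or_empty` are hypothetical names, the loader is injected so the example runs without faster-whisper, and unlike the order listed above it checks for empty audio before loading the model:

```python
from __future__ import annotations

from typing import Any, Callable


class LazyModelHolder:
    """Defers an expensive model load until the first real use."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._model: Any = None
        self.load_count = 0  # exposed only to make the behaviour observable

    def get(self) -> Any:
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        return self._model


def transcribe_or_empty(
    holder: LazyModelHolder,
    audio: list[float],
    run: Callable[[Any, list[float]], str],
) -> str:
    """Return "" for empty audio; otherwise run the model on the samples."""
    if len(audio) == 0:
        return ""
    return run(holder.get(), audio)


holder = LazyModelHolder(loader=lambda: "fake-model")
print(repr(transcribe_or_empty(holder, [], lambda m, a: "hi")))  # ''
print(transcribe_or_empty(holder, [0.1], lambda m, a: "hi"))     # hi
print(holder.load_count)                                         # 1
```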

498-517: Factory function provides clean API.

The get_voice_handler factory function has proper type hints and docstring. It provides a simpler interface for common use cases while still allowing customization.


177-202: Audio recording implementation is robust.

The _start_recording method properly:

  • Clears the buffer before starting
  • Uses a callback-based approach for non-blocking capture
  • Sets up appropriate stream parameters (16kHz, mono, float32)
  • Handles errors by resetting state and raising a specific exception
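
The callback-based capture described above could be sketched like this. The class name `RecordingBuffer` is hypothetical. The callback takes the same four arguments that sounddevice passes to an InputStream callback, but the example uses plain lists instead of NumPy arrays so it runs without audio hardware or third-party packages:

```python
from __future__ import annotations

import threading

SAMPLE_RATE = 16_000  # 16 kHz, as stated in the review
CHANNELS = 1          # mono


class RecordingBuffer:
    """Collects audio chunks delivered by a stream callback."""

    def __init__(self) -> None:
        self._chunks: list[list[float]] = []
        self._lock = threading.Lock()

    def clear(self) -> None:
        with self._lock:
            self._chunks.clear()

    def callback(self, indata, frames, time_info, status) -> None:
        # Copy the chunk: the audio backend may reuse its buffer after returning.
        with self._lock:
            self._chunks.append(list(indata))

    def samples(self) -> list[float]:
        with self._lock:
            return [sample for chunk in self._chunks for sample in chunk]


buf = RecordingBuffer()
buf.clear()                              # start from an empty buffer
buf.callback([0.1, 0.2], 2, None, None)  # simulate two delivered chunks
buf.callback([0.3], 1, None, None)
print(buf.samples())  # [0.1, 0.2, 0.3]
```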

@Sahilbhatane Sahilbhatane changed the title Issue 325 Added voice feature for end-user fix #325 Dec 31, 2025
@dhvll
Collaborator

dhvll commented Dec 31, 2025

@Sahilbhatane remove all those virtual environment files. You should never push those.

@Sahilbhatane
Collaborator Author

@Sahilbhatane remove all those virtual environment files. You should never push those.

Yeah, I know. .gitignore didn't recognize them, I guess.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (4)
cortex/voice.py (1)

486-490: Improve exception handling in hotkey listener cleanup.

The except Exception: pass block silently swallows errors. Following the pattern used at line 498 for the audio stream, add debug logging to track potential issues during cleanup.

🔎 Proposed fix
         if self._hotkey_listener:
             try:
                 self._hotkey_listener.stop()
-            except Exception:
-                pass
+            except Exception as e:
+                logging.debug("Error stopping hotkey listener: %s", e)
             self._hotkey_listener = None
cortex/cli.py (3)

554-558: Unused provider variable.

The provider variable is retrieved at line 558 but never used in the voice method. The process_voice_command callback already retrieves the API key and uses self.ask() and self.install(), which internally get the provider themselves.

🔎 Consider removing unused variable
         api_key = self._get_api_key()
         if not api_key:
             return 1
-
-        provider = self._get_provider()

579-600: Fix trailing whitespace to resolve pipeline failures.

Based on past review comments, lines 580, 588, and 591 contain trailing whitespace on blank lines, causing CI/linter failures. These must be fixed for the PR to pass.

🔎 Remove trailing whitespace

Ensure lines 580, 588, and 591 are completely empty (no spaces or tabs):

                 cx_print(f"Installing: {software}", "info")
-                
+
                 # Ask user for confirmation
                 console.print()
                 console.print("[bold cyan]Choose an action:[/bold cyan]")
                 console.print("  [1] Dry run (preview commands)")
                 console.print("  [2] Execute (run commands)")
                 console.print("  [3] Cancel")
                 console.print()
-                
+
                 try:
                     choice = input("Enter choice [1/2/3]: ").strip()
-                    
+
                     if choice == "1":
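
To locate W293 offenders before pushing, a small check along these lines may help. It is a sketch with a hypothetical helper name, not a replacement for running ruff itself:

```python
def whitespace_only_lines(source: str) -> list[int]:
    """Return 1-based numbers of lines that contain only spaces or tabs."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if line and not line.strip()
    ]


snippet = "cx_print('a')\n    \nconsole.print()\n\nchoice = input()\t\n"
print(whitespace_only_lines(snippet))  # [2]
```

Line 4 of the snippet is truly empty and line 5 has trailing whitespace after code, so only line 2 is a blank line containing whitespace.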

2000-2015: Add error handling for VoiceInputError and its subclasses.

The install --mic flow catches ImportError but doesn't handle VoiceInputError, MicrophoneNotFoundError, or ModelNotFoundError. If voice recording fails (e.g., no microphone detected, model loading fails), an unhandled exception will propagate.

🔎 Proposed fix
             if getattr(args, "mic", False):
                 try:
-                    from cortex.voice import VoiceInputHandler
+                    from cortex.voice import VoiceInputError, VoiceInputHandler

                     handler = VoiceInputHandler()
                     cx_print("Press F9 to speak what you want to install...", "info")
                     software = handler.record_single()
                     if not software:
                         cx_print("No speech detected.", "warning")
                         return 1
                     cx_print(f"Installing: {software}", "info")
                 except ImportError:
                     cli._print_error("Voice dependencies not installed.")
                     cx_print("Install with: pip install cortex-linux[voice]", "info")
                     return 1
+                except VoiceInputError as e:
+                    cli._print_error(str(e))
+                    return 1
🧹 Nitpick comments (1)
.gitignore (1)

14-20: Consolidate duplicate entries for maintainability.

The .gitignore file contains numerous redundant entries. For example:

  • env/, venv/, ENV/ appear at lines 14–15, 143–145
  • .mypy_cache/, .pytest_cache/, .coverage, htmlcov/ appear at lines 70–71 and 146–149

This duplication reduces readability and makes future maintenance harder.

🔎 Suggested cleanup: Remove duplicates and consolidate sections
  # ==============================
  # Logs & Misc
  # ==============================
  *.log
  logs/
  *.tmp
  *.bak
  *.swp
- .env
- .venv
- env/
- venv/
- ENV/
- .mypy_cache/
- .pytest_cache/
- .coverage
- htmlcov/
  *.out
  *~
  *.swo

Then, verify that all non-duplicate entries are already covered in their respective sections above (Virtual Environments, mypy/Pyre/pytype, etc.).

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9acb0d7 and 891dff5.

⛔ Files ignored due to path filters (23)
  • myenv/Scripts/bandit-baseline.exe is excluded by !**/*.exe
  • myenv/Scripts/bandit-config-generator.exe is excluded by !**/*.exe
  • myenv/Scripts/bandit.exe is excluded by !**/*.exe
  • myenv/Scripts/black.exe is excluded by !**/*.exe
  • myenv/Scripts/blackd.exe is excluded by !**/*.exe
  • myenv/Scripts/coverage-3.12.exe is excluded by !**/*.exe
  • myenv/Scripts/coverage.exe is excluded by !**/*.exe
  • myenv/Scripts/coverage3.exe is excluded by !**/*.exe
  • myenv/Scripts/dmypy.exe is excluded by !**/*.exe
  • myenv/Scripts/mypy.exe is excluded by !**/*.exe
  • myenv/Scripts/mypyc.exe is excluded by !**/*.exe
  • myenv/Scripts/nltk.exe is excluded by !**/*.exe
  • myenv/Scripts/normalizer.exe is excluded by !**/*.exe
  • myenv/Scripts/pip.exe is excluded by !**/*.exe
  • myenv/Scripts/pip3.12.exe is excluded by !**/*.exe
  • myenv/Scripts/pip3.exe is excluded by !**/*.exe
  • myenv/Scripts/py.test.exe is excluded by !**/*.exe
  • myenv/Scripts/pytest.exe is excluded by !**/*.exe
  • myenv/Scripts/ruff.exe is excluded by !**/*.exe
  • myenv/Scripts/safety.exe is excluded by !**/*.exe
  • myenv/Scripts/stubgen.exe is excluded by !**/*.exe
  • myenv/Scripts/stubtest.exe is excluded by !**/*.exe
  • myenv/Scripts/typer.exe is excluded by !**/*.exe
📒 Files selected for processing (6)
  • .gitignore
  • cortex/cli.py
  • cortex/voice.py
  • myenv/share/man/man1/bandit.1
  • tests/test_ollama_integration.py
  • tests/test_voice.py
✅ Files skipped from review due to trivial changes (1)
  • myenv/share/man/man1/bandit.1
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_voice.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/test_ollama_integration.py
  • cortex/voice.py
  • cortex/cli.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/test_ollama_integration.py
🧬 Code graph analysis (3)
tests/test_ollama_integration.py (1)
scripts/setup_ollama.py (1)
  • check_ollama_installed (75-77)
cortex/voice.py (1)
cortex/branding.py (1)
  • cx_print (52-82)
cortex/cli.py (1)
cortex/voice.py (4)
  • VoiceInputError (21-24)
  • VoiceInputHandler (39-499)
  • start_voice_mode (401-436)
  • record_single (438-479)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test (Python 3.12)
  • GitHub Check: Test (Python 3.10)
  • GitHub Check: Test (Python 3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.11)
🔇 Additional comments (10)
tests/test_ollama_integration.py (1)

15-15: LGTM: Clean refactor to shutil.which.

Replacing the subprocess-based Ollama detection with shutil.which is a cleaner, more direct approach that aligns with the pattern used in scripts/setup_ollama.py (lines 74-76).

Also applies to: 29-30, 37-37

cortex/voice.py (5)

1-36: LGTM: Clean module structure and exception hierarchy.

Module docstring clearly describes the feature, and the exception hierarchy provides appropriate granularity for error handling.


39-86: LGTM: Well-structured initialization with environment-driven configuration.

Good use of lazy loading for the model and environment variable overrides. The type hints and docstring meet coding guidelines.


87-270: LGTM: Robust dependency checks and transcription pipeline.

The dependency validation provides helpful user guidance, and the transcription implementation uses appropriate parameters (beam_size=5, VAD filtering, no_speech_threshold=0.6).


272-436: LGTM: Thread-safe hotkey handling with good UX.

The implementation properly uses threading primitives (locks, events) and provides visual feedback during recording. KeyboardInterrupt handling ensures clean exit.


502-521: LGTM: Clean factory function with proper type hints.

The factory function follows Python conventions and provides a clean entry point for creating voice handlers.

cortex/cli.py (4)

1633-1633: LGTM: Help documentation clearly advertises voice features.

The help table updates appropriately document the new voice input capabilities for users.

Also applies to: 1638-1638


1722-1739: LGTM: Well-structured argument parsing for voice features.

The --mic and --single flags are appropriately configured, and making the question argument optional enables the voice input flow.


1743-1755: LGTM: Consistent --mic flag integration.

The install command's --mic flag follows the same pattern as the ask command, maintaining consistency across the CLI.


1989-1998: LGTM: Clean command routing for voice features.

The routing logic appropriately delegates to the voice method for both explicit voice commands and --mic flags, with helpful error messages.

Comment on lines +16 to +29
myenv/
venv312/
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for committed virtual environment files in the current HEAD and recent commits

echo "=== Checking git log for myenv and venv312 directories ==="
git log --all --oneline --full-history -- myenv/ venv312/ 2>&1 | head -20

echo -e "\n=== Checking git status ==="
git status --short | grep -E '(myenv|venv312)' || echo "No myenv/venv312 in current status"

echo -e "\n=== Checking HEAD commit for these files ==="
git ls-tree -r HEAD | grep -E '(myenv|venv312)' || echo "No myenv/venv312 files in HEAD"

echo -e "\n=== Checking recent commits (last 5) ==="
git log --oneline -5

echo -e "\n=== Checking if myenv and venv312 are untracked/ignored ==="
git check-ignore -v myenv/ venv312/ 2>&1 || echo "Checking ignore patterns..."

echo -e "\n=== Listing actual directories in repository root ==="
ls -la | grep -E '(myenv|venv|env)' || echo "No venv directories found in root"

Repository: cortexlinux/cortex

Length of output: 5200


Critical: Virtual environment files committed to git; .gitignore change is incomplete.

The myenv/ directory with virtual environment artifacts (activation scripts, pyvenv.cfg, executables, man pages) is already committed in this PR. Adding myenv/ and venv312/ to .gitignore prevents future commits but does not remove these files from git history.

These directories must be removed from the commit before merging. Use git rm -r --cached myenv/ venv312/ and amend the commit, or rebase to exclude them entirely. Without this cleanup, the repository will include unnecessary virtual environment files that violate the project's policy of never committing local development environments.

🤖 Prompt for AI Agents
In .gitignore around lines 16-17, the review notes that myenv/ and venv312/ were
committed despite being ignored; remove those directories from the repository
index and rewrite the offending commit(s): run git rm -r --cached myenv/
venv312/ to untrack them, commit the removal (or git commit --amend if you want
to fix the latest commit), and if the directories exist in earlier commits use
an interactive rebase or filter-branch/BFG to purge them from history; ensure
.gitignore retains myenv/ and venv312/ so they are not re-added, then force-push
if history was rewritten and notify reviewers.

@Sahilbhatane Sahilbhatane marked this pull request as draft December 31, 2025 13:38
@dhvll dhvll requested a review from Anshgrover23 December 31, 2025 13:55
@Anshgrover23 Anshgrover23 removed their request for review December 31, 2025 14:09
@Sahilbhatane Sahilbhatane removed the request for review from mikejmorgan-ai December 31, 2025 14:10
@Sahilbhatane Sahilbhatane marked this pull request as ready for review December 31, 2025 14:35
@Sahilbhatane Sahilbhatane marked this pull request as draft December 31, 2025 14:35
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
cortex/voice.py (2)

185-189: Consider using logging for audio status messages.

Line 187 prints audio status directly to stderr. For consistency with the rest of the module (which uses cx_print and logging.debug), consider using logging.debug() instead.

🔎 Proposed refactor
         def audio_callback(indata, frames, time_info, status):
             if status:
-                print(f"Audio status: {status}", file=sys.stderr)
+                logging.debug("Audio status: %s", status)
             if self._is_recording:
                 self._audio_buffer.append(indata.copy())

238-250: Redundant model check after lazy loading.

Lines 248-250 check if self._model is None after calling _load_model() on line 239. However, _load_model() either sets self._model or raises ModelNotFoundError, so the model can never be None at line 249. This check is dead code.

🔎 Proposed refactor
         if self._model is None:
             self._load_model()

         if len(audio_data) == 0:
             return ""

         # faster-whisper expects float32 audio normalized to [-1, 1]
         if audio_data.dtype != np.float32:
             audio_data = audio_data.astype(np.float32)

-        # Model should be loaded at this point
-        if self._model is None:
-            raise ModelNotFoundError("Model must be loaded before transcription")
-
         segments, info = self._model.transcribe(
docs/VOICE_INPUT.md (1)

66-66: Consider adding language identifiers to code blocks.

The fenced code blocks at lines 66 and 146 are missing language identifiers. While not critical, adding them improves syntax highlighting and documentation quality.

🔎 Suggested additions

For line 66 (example session output):

-```
+```console
 $ cortex voice

For line 146 (ASCII diagram):

-```
+```text
 ┌──────────────┐    ┌──────────────┐    ┌──────────────┐

Also applies to: 146-157

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 891dff5 and 8f2384f.

📒 Files selected for processing (5)
  • cortex/cli.py
  • cortex/voice.py
  • docs/VOICE_INPUT.md
  • requirements.txt
  • tests/test_voice.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • requirements.txt
  • tests/test_voice.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • cortex/cli.py
  • cortex/voice.py
🧬 Code graph analysis (2)
cortex/cli.py (3)
cortex/voice.py (4)
  • VoiceInputError (21-24)
  • VoiceInputHandler (39-499)
  • start_voice_mode (401-436)
  • record_single (438-479)
cortex/first_run_wizard.py (1)
  • _print_error (746-748)
cortex/branding.py (1)
  • cx_print (52-82)
cortex/voice.py (1)
cortex/branding.py (1)
  • cx_print (52-82)
🪛 LanguageTool
docs/VOICE_INPUT.md

[grammar] ~141-~141: Ensure spelling is correct
Context: ...apture** - Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribe...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~141-~141: Ensure spelling is correct
Context: ... Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribes using `fa...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.18.1)
docs/VOICE_INPUT.md

66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


146-146: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (10)
cortex/voice.py (3)

339-383: LGTM! Thread-safe hotkey handling.

The hotkey setup properly uses a lock for thread safety and correctly sets the recording flag before starting threads. The logic handles both press-to-start and press-to-stop flows correctly.


385-436: LGTM! Proper error handling and resource cleanup.

The recording worker and continuous voice mode implementation correctly handle errors and ensure resources are cleaned up via the finally block.


481-499: LGTM! Exception handling improved.

The cleanup logic properly handles exceptions during shutdown and logs them for debugging. This addresses the past review comment about silent exception handling.

docs/VOICE_INPUT.md (2)

49-49: LGTM! Model size documentation is now consistent.

The documentation correctly states that the default base.en model is ~150MB, which is consistent with the table at line 117. This addresses the past review comment about the documentation inconsistency.


1-261: LGTM! Comprehensive and well-structured documentation.

The documentation provides excellent coverage of the voice input feature, including installation, usage, configuration, troubleshooting, and API reference. The structure is clear and user-friendly.

cortex/cli.py (5)

540-625: LGTM! Well-designed voice input integration.

The voice() method properly handles both continuous and single-shot modes, includes user confirmation for installations, and has comprehensive error handling for missing dependencies and voice input errors.


1989-1996: LGTM! Clean integration of voice input with ask command.

The --mic flag integration properly routes to the voice handler and provides clear error messages when neither a question nor the mic flag is provided.


1998-2027: LGTM! VoiceInputError handling properly implemented.

The install command's --mic integration now correctly imports and catches VoiceInputError (lines 2001, 2014-2016), addressing the past review comment about missing error handling. The implementation provides clear error messages and proper fallback behavior.


1728-1753: LGTM! Parser configuration is correct.

The new voice command and --mic flag integrations are properly configured with clear help text and appropriate argument handling. Making software optional (line 1741) correctly supports the --mic workflow.


1987-1988: LGTM! Voice command routing is correct.

The routing logic correctly maps the --single flag to the continuous parameter by negating it, so that by default (no flag) continuous mode is active, and with --single it switches to single-shot mode.
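
The flag negation described above can be sketched with plain argparse. This is a minimal illustration, assuming a `voice` subcommand with a boolean `--single` flag; `build_parser` and `is_continuous` are hypothetical names, not functions from cortex/cli.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser with a `voice` subcommand (illustrative only)."""
    parser = argparse.ArgumentParser(prog="cortex")
    subparsers = parser.add_subparsers(dest="command")
    voice = subparsers.add_parser("voice", help="Voice input mode")
    voice.add_argument("--single", action="store_true", help="Record one command and exit")
    return parser


def is_continuous(argv: list[str]) -> bool:
    """Return True unless --single was passed, mirroring `continuous = not args.single`."""
    args = build_parser().parse_args(argv)
    return not args.single
```

With no flag, `is_continuous(["voice"])` is true; passing `--single` flips it to false.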

@mikejmorgan-ai mikejmorgan-ai self-assigned this Jan 10, 2026
@Sahilbhatane Sahilbhatane changed the title from "Added voice feature for end-user fix #325" to "feat: Added voice feature for end-user fix #325" Jan 11, 2026
Sahilbhatane added a commit to Sahilbhatane/cortex that referenced this pull request Jan 12, 2026
@Sahilbhatane Sahilbhatane marked this pull request as ready for review January 12, 2026 16:03
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @cortex/cli.py:
- Around line 2656-2674: VoiceInputHandler is instantiated in the --mic install
path but handler.stop() isn't guaranteed to run if an exception occurs; wrap
creation and usage of VoiceInputHandler (the handler = VoiceInputHandler();
software = handler.record_single(); cx_print(...) flow) in a try/finally (or use
a context manager if available) and call handler.stop() in the finally block so
resources are always cleaned up even on exceptions (ensure you still catch
ImportError and VoiceInputError as before and only return after stop() is
called).

In @requirements.txt:
- Around line 22-24: Remove the duplicated dependency and stray comment by
deleting the repeated "# Configuration" comment and the duplicate
"PyYAML>=6.0.0" entry (the second occurrence) so only the original comment and
single "PyYAML>=6.0.0" remain in requirements.txt; ensure no other duplicate
package lines exist to avoid being listed twice by setup.py.
🧹 Nitpick comments (3)
cortex/voice.py (2)

347-360: Consider terminal width for line clearing.

The hardcoded 70 spaces (line 360) may not fully clear the line on wider terminals or may wrap on narrower ones. This is a minor cosmetic issue.

♻️ Optional improvement
         # Clear the line
-        console.print(" " * 70, end="\r")
+        import shutil
+        cols = shutil.get_terminal_size().columns
+        console.print(" " * min(cols - 1, 70), end="\r")

362-388: Consider adding return type hint.

The _get_hotkey_key method could benefit from a return type annotation for clarity, though it's a private method.

♻️ Optional improvement
-    def _get_hotkey_key(self):
+    def _get_hotkey_key(self) -> Any:
         """Get the pynput key object for the configured hotkey."""

Or more precisely with the actual type when pynput is available.

tests/test_cli_extended.py (1)

43-52: LGTM! The added Path.cwd patch aligns with Ollama local key resolution.

The change correctly patches both Path.home and Path.cwd for this specific test case where environment variables are cleared and detection falls through to interactive mode. Other tests (test_get_api_key_openai, test_get_api_key_claude) don't need the Path.cwd patch since they set environment variables that short-circuit before reaching that code path.

Optional: Consider using contextlib.ExitStack or combining patches to reduce nesting depth (4 levels here), though this is consistent with the file's existing style.

♻️ Optional: Flatten nested context managers
     def test_get_api_key_not_found(self) -> None:
         # When no API key is set and user selects Ollama, falls back to Ollama local mode
         from cortex.api_key_detector import PROVIDER_MENU_CHOICES

-        with patch.dict(os.environ, {}, clear=True):
-            with patch("pathlib.Path.home", return_value=self._temp_home):
-                with patch("pathlib.Path.cwd", return_value=self._temp_home):
-                    with patch("builtins.input", return_value=PROVIDER_MENU_CHOICES["ollama"]):
-                        api_key = self.cli._get_api_key()
-                        self.assertEqual(api_key, "ollama-local")
+        with (
+            patch.dict(os.environ, {}, clear=True),
+            patch("pathlib.Path.home", return_value=self._temp_home),
+            patch("pathlib.Path.cwd", return_value=self._temp_home),
+            patch("builtins.input", return_value=PROVIDER_MENU_CHOICES["ollama"]),
+        ):
+            api_key = self.cli._get_api_key()
+            self.assertEqual(api_key, "ollama-local")
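
The contextlib.ExitStack alternative mentioned above might look like the following. It is a sketch only: `read_env_under_patches` is a hypothetical helper written for illustration, and it simply reports what the patched calls return rather than exercising `_get_api_key`.

```python
import os
from contextlib import ExitStack
from pathlib import Path
from unittest.mock import patch


def read_env_under_patches(fake_home: Path) -> tuple[Path, Path, dict[str, str]]:
    """Apply several patches through one ExitStack and observe their effect."""
    with ExitStack() as stack:
        # Each enter_context call registers a patch; all are undone on exit,
        # in reverse order, without adding a nesting level per patch.
        stack.enter_context(patch.dict(os.environ, {}, clear=True))
        stack.enter_context(patch("pathlib.Path.home", return_value=fake_home))
        stack.enter_context(patch("pathlib.Path.cwd", return_value=fake_home))
        return Path.home(), Path.cwd(), dict(os.environ)
```

ExitStack is mostly useful when the set of patches is built dynamically; for a fixed list, the parenthesized `with` form is equally flat.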
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f2384f and 758299d.

📒 Files selected for processing (11)
  • .gitignore
  • cortex/cli.py
  • cortex/voice.py
  • requirements.txt
  • setup.py
  • tests/integration/test_end_to_end.py
  • tests/test_api_key_detector.py
  • tests/test_cli.py
  • tests/test_cli_extended.py
  • tests/test_ollama_integration.py
  • tests/test_voice.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/test_voice.py
  • .gitignore
  • tests/test_ollama_integration.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/test_api_key_detector.py
  • tests/test_cli_extended.py
  • cortex/cli.py
  • tests/test_cli.py
  • setup.py
  • tests/integration/test_end_to_end.py
  • cortex/voice.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/test_api_key_detector.py
  • tests/test_cli_extended.py
  • tests/test_cli.py
  • tests/integration/test_end_to_end.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use Python 3.10 or higher as the minimum supported version

Files:

  • setup.py
🧬 Code graph analysis (4)
tests/test_api_key_detector.py (1)
cortex/api_key_detector.py (1)
  • detect (77-123)
tests/test_cli_extended.py (1)
cortex/cli.py (1)
  • _get_api_key (130-152)
tests/test_cli.py (1)
cortex/cli.py (1)
  • _get_api_key (130-152)
cortex/voice.py (1)
cortex/branding.py (1)
  • cx_print (52-82)
🔇 Additional comments (14)
requirements.txt (1)

26-40: Informational comment block looks good.

The documentation for the optional voice feature dependencies and system requirements is helpful for developers who want to enable voice input.

tests/integration/test_end_to_end.py (1)

20-22: LGTM!

The PIP_BOOTSTRAP_DEV constant is well-defined and appropriately used in test_project_tests_run_inside_container to install dev dependencies needed for pytest.

cortex/cli.py (3)

637-743: Voice method implementation is well-structured.

The voice() method properly handles:

  • Dependency import failures with helpful error messages
  • API key validation
  • Continuous vs single-shot modes
  • Input validation for install commands (length, dangerous characters)
  • Resource cleanup in the finally block

744-901: Refactored install helpers improve maintainability.

The extraction of _handle_parallel_execution, _handle_sequential_execution, and related utilities from the install() method reduces complexity and improves readability. Error handling is consistent throughout.


2253-2286: Voice CLI integration is well-designed.

The voice subcommand with --single flag and --mic flags on ask/install provide flexible voice input options. Help text clearly describes the functionality.

cortex/voice.py (6)

19-34: Exception hierarchy is well-designed.

The exception classes provide clear, specific error types for voice input failures with appropriate inheritance from VoiceInputError.


49-83: Initialization is thorough with sensible defaults.

Environment variable support for CORTEX_WHISPER_MODEL provides flexibility. Threading primitives are properly initialized.


206-273: Recording implementation handles resources correctly.

Good practices observed:

  • Memory protection with ~60 second buffer limit (line 221)
  • Proper stream cleanup in _stop_recording_stream
  • Setting _is_recording = False before stopping stream prevents race conditions
  • Buffer cleared in finally block to prevent memory leaks

275-321: Transcription method is well-configured.

The VAD (Voice Activity Detection) parameters and condition_on_previous_text=False setting are appropriate for command-style speech. Lazy model loading improves startup time.


390-434: Hotkey handling uses proper synchronization.

The recording_lock correctly serializes access to the recording state, preventing race conditions when the hotkey is pressed rapidly. The daemon threads ensure cleanup on program exit.
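
As a rough illustration of why the lock matters, here is a minimal sketch of a lock-guarded toggle. `RecordingToggle` is a hypothetical stand-in, not the handler in cortex/voice.py; it only shows that the read-modify-write of the recording flag happens under one lock.

```python
import threading


class RecordingToggle:
    """Hypothetical stand-in for the handler's recording state."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.is_recording = False
        self.transitions = 0

    def on_hotkey(self) -> bool:
        """Flip the state atomically and return the new value."""
        with self._lock:
            # Without the lock, two rapid presses could both read the old
            # value and leave the flag in an inconsistent state.
            self.is_recording = not self.is_recording
            self.transitions += 1
            return self.is_recording
```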


532-550: Cleanup method is robust.

The stop() method properly handles multiple cleanup scenarios with appropriate exception handling, preventing errors from leaking if resources are already cleaned up.

tests/test_cli.py (1)

46-49: Test isolation improvement is correct.

Adding the pathlib.Path.cwd patch ensures the test doesn't accidentally find .env files in the actual current working directory, improving test reliability and isolation.

tests/test_api_key_detector.py (1)

162-166: Test isolation fix is consistent with other test changes.

Patching pathlib.Path.cwd alongside Path.home ensures complete isolation from the real filesystem, preventing false positives if the test runner's directory contains .env files.

setup.py (1)

48-55: Voice extras definition is correct and properly synced across configuration files.

The optional voice dependency group is properly structured with appropriate version constraints in both setup.py and pyproject.toml, and is correctly included in the all extra for comprehensive dependency installation.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (4)
cortex/cli.py (4)

679-683: Consider expanding dangerous character validation.

The current list covers common shell metacharacters but misses some injection vectors like newlines (\n, \r), backslashes (\), and quotes (', "). For defense-in-depth, consider using an allowlist pattern instead.

♻️ Suggested improvement
-                # Check for dangerous characters that shouldn't be in package names
-                dangerous_chars = [";", "&", "|", "`", "$", "(", ")"]
-                if any(char in software for char in dangerous_chars):
-                    cx_print("Invalid characters detected in software name", "error")
-                    return
+                # Check for dangerous characters that shouldn't be in package names
+                dangerous_chars = [";", "&", "|", "`", "$", "(", ")", "\n", "\r", "\\", "'", '"']
+                if any(char in software for char in dangerous_chars):
+                    cx_print("Invalid characters detected in software name", "error")
+                    return
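
The allowlist approach suggested above could be sketched as follows. The character set and the length cap are assumptions chosen for illustration, not values from the PR; `is_safe_software_name` is a hypothetical helper.

```python
import re

# Hypothetical allowlist: letters, digits, spaces, and a few punctuation marks
# that appear in package names (for example "g++" or "python3.12").
# The first character must be alphanumeric, which also rejects "--option" input.
ALLOWED_SOFTWARE = re.compile(r"[A-Za-z0-9][A-Za-z0-9 .+_-]*")
MAX_LENGTH = 200  # Assumed cap, not taken from the PR.


def is_safe_software_name(software: str) -> bool:
    """Return True only if every character is on the allowlist."""
    if not software or len(software) > MAX_LENGTH:
        return False
    # fullmatch avoids the trailing-newline gotcha of match() with "$".
    return ALLOWED_SOFTWARE.fullmatch(software) is not None
```

An allowlist rejects newlines, quotes, and backslashes without having to enumerate them, which is the main advantage over extending the blocklist.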

695-706: Consider using console.input for consistent styling.

The prompt uses input() while other parts of the CLI use console.input() from Rich for styled prompts. This inconsistency may result in visual differences.

♻️ Optional: Use Rich console for consistent styling
                 try:
-                    choice = input("Enter choice [1/2/3]: ").strip()
+                    choice = console.input("[bold cyan]Enter choice [1/2/3]: [/bold cyan]").strip()

744-753: Hardcoded special case is a code smell.

This normalization logic embeds a specific package combination directly in code. Consider moving such mappings to a configuration file or the existing stacks.json for maintainability.


846-849: Fragile status comparison using string value.

Using getattr(t.status, "value", "") == "failed" is fragile. If t.status is an enum, compare directly against the enum member for type safety.

♻️ Suggested improvement
     def _get_parallel_error_msg(self, parallel_tasks: list) -> str:
         """Extract error message from failed parallel tasks."""
-        failed_tasks = [t for t in parallel_tasks if getattr(t.status, "value", "") == "failed"]
+        from cortex.coordinator import StepStatus
+        failed_tasks = [t for t in parallel_tasks if t.status == StepStatus.FAILED]
         return failed_tasks[0].error if failed_tasks else self.INSTALL_FAIL_MSG
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 758299d and 580eb76.

📒 Files selected for processing (2)
  • cortex/cli.py
  • requirements.txt
🚧 Files skipped from review as they are similar to previous changes (1)
  • requirements.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • cortex/cli.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test (Python 3.10)
  • GitHub Check: Test (Python 3.12)
  • GitHub Check: Test (Python 3.11)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.12)
🔇 Additional comments (11)
cortex/cli.py (11)

10-10: LGTM!

Import is correctly placed with other cortex module imports.


39-41: LGTM!

Good extraction of a repeated string literal into a class constant for maintainability.


154-175: LGTM!

Comment clarifications improve code readability without changing functionality.


712-742: LGTM!

Proper resource cleanup pattern with handler = None initialization before try block and cleanup in finally. The nested try/except in the finally block correctly prevents cleanup errors from masking the original exception.
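
The pattern described here can be sketched in isolation. `FakeHandler` and `record_with_cleanup` are hypothetical stand-ins written for this example; the real code uses VoiceInputHandler and its own exception types.

```python
import logging
from typing import Callable, Optional


class FakeHandler:
    """Hypothetical stand-in for VoiceInputHandler; not the real class."""

    def __init__(self) -> None:
        self.stopped = False

    def record_single(self) -> str:
        raise RuntimeError("microphone unavailable")

    def stop(self) -> None:
        self.stopped = True


def record_with_cleanup(factory: Callable[[], FakeHandler]) -> Optional[str]:
    """Record once and always attempt cleanup, even if recording fails."""
    handler = None  # Defined before try so finally can test it safely.
    try:
        handler = factory()
        return handler.record_single()
    except RuntimeError as exc:
        logging.debug("Recording failed: %s", exc)
        return None
    finally:
        if handler is not None:
            try:
                handler.stop()
            except Exception as exc:  # Cleanup errors must not mask the original one.
                logging.debug("Error during cleanup: %s", exc)
```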


765-827: LGTM!

Good error handling with specific exception types (ValueError, OSError) and a catch-all for unexpected errors. The parallel execution logic correctly propagates errors to history.


851-901: LGTM!

Sequential execution handler is well-structured with proper progress callbacks and comprehensive error recording.


903-978: LGTM!

Good refactoring of the install method. The extraction of parallel and sequential execution into helper methods improves readability and maintainability. Return type annotation adds clarity.


2146-2152: LGTM!

Help table appropriately documents the new voice input features with clear descriptions.


2253-2270: LGTM!

Voice command parser is well-designed with --single flag providing flexibility between continuous and single-shot modes. The optional question argument properly supports the --mic flag workflow.


2656-2694: LGTM!

The --mic flag handling for install is correctly structured:

  • Proper try/finally for resource cleanup
  • Early return on voice detection failure
  • Variable software is correctly set in both branches before use at line 2689

2078-2078: LGTM!

Consistent use of the INSTALL_FAIL_MSG class constant across error handlers.

@Sahilbhatane
Collaborator Author

Sahilbhatane commented Jan 12, 2026

image

base.en model doesn't recognize the speech properly

@Sahilbhatane
Collaborator Author

Added a model selection mode so users can choose the model they want for their system. It also states clearly that a smaller model means lower speech-to-text accuracy.

Collaborator

@Anshgrover23 Anshgrover23 left a comment


@Sahilbhatane Kindly resolve conflicts and address coderabbitai comments.

Sahilbhatane added a commit to Sahilbhatane/cortex that referenced this pull request Jan 19, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cortex/api_key_detector.py (1)

126-147: CORTEX_PROVIDER=anthropic is not recognized.

The check on line 133 only accepts "openai" or "claude", but users might reasonably set CORTEX_PROVIDER=anthropic (which matches the internal provider name and ENV_VAR_PROVIDERS mapping). This would cause the explicit provider preference to be silently ignored.

🔧 Suggested fix
-        if explicit_provider in ["openai", "claude"]:
+        if explicit_provider in ["openai", "claude", "anthropic"]:
             target_env_var = (
                 "OPENAI_API_KEY" if explicit_provider == "openai" else "ANTHROPIC_API_KEY"
             )
             target_provider = "openai" if explicit_provider == "openai" else "anthropic"
🤖 Fix all issues with AI agents
In `@cortex/api_key_detector.py`:
- Around line 160-177: The encrypted-storage branch only recognizes "openai" and
"claude" but not "anthropic", so update the explicit_provider check and mapping
in the function where explicit_provider, target_env_var, and target_provider are
set (in cortex/api_key_detector.py around the encrypted-storage block): include
"anthropic" in the allowed values (e.g., explicit_provider in
["openai","claude","anthropic"]) and adjust the mapping so that both "claude"
and "anthropic" map to target_env_var="ANTHROPIC_API_KEY" and
target_provider="anthropic" while "openai" maps to "OPENAI_API_KEY"/"openai";
leave the env_mgr.get_variable and return behavior unchanged.

In `@cortex/cli.py`:
- Around line 883-893: process_voice_command currently calls input() (the prompt
block printing choices and reading choice) from the recording worker thread
which can hang or corrupt terminal I/O; change this so the recording worker does
not block on stdin but instead sends the user-prompt request to the main/UI
thread (for example via a thread-safe queue.Queue message or an asyncio event
loop call_soon_threadsafe) and returns, then have the main thread display the
console prompts and read input and put the chosen action back to the worker (or
set an Event) for processing; update the logic around the prompt/choice handling
in process_voice_command to enqueue a "prompt user for action" message and
consume the response delivered by the main thread rather than calling input()
directly.

In `@tests/integration/test_end_to_end.py`:
- Around line 20-23: Update the SYSTEM_DEPS_INSTALL string to use dynamic kernel
headers and include Python headers: replace linux-headers-generic with
linux-headers-$(uname -r) and add python3-dev to the package list used by
SYSTEM_DEPS_INSTALL so both PIP_BOOTSTRAP and PIP_BOOTSTRAP_DEV (which reference
SYSTEM_DEPS_INSTALL) will install the correct kernel headers and Python
development headers required for building pynput/evdev.
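
For the cortex/cli.py item above, one possible shape of the queue-based handoff is sketched below. Everything here is hypothetical: `run_with_main_thread_prompt` is not a function in the PR, the `ask` callable stands in for `input`, and the mapping of choices 1/2/3 to dry-run, execute, and cancel is an assumed ordering.

```python
import queue
import threading
from typing import Callable


def run_with_main_thread_prompt(text: str, ask: Callable[[str], str]) -> str:
    """Process one transcription; the worker never reads stdin itself."""
    requests: queue.Queue[str] = queue.Queue()
    replies: queue.Queue[str] = queue.Queue()
    outcome: list[str] = []

    def worker() -> None:
        # Hand the prompt to the main thread and block until it answers.
        requests.put(f"Heard: {text}. Enter choice [1/2/3]: ")
        choice = replies.get()
        # Assumed mapping: 1 = dry-run, 2 = execute, anything else = cancel.
        outcome.append({"1": "dry-run", "2": "execute"}.get(choice, "cancel"))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    prompt = requests.get(timeout=5)  # Main thread owns all terminal I/O.
    replies.put(ask(prompt).strip())
    thread.join(timeout=5)
    return outcome[0]
```

In an interactive session `ask` would be `input`; injecting it keeps the worker free of stdin access and makes the flow testable.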
♻️ Duplicate comments (2)
cortex/cli.py (2)

170-188: Prevent provider/key mismatch when CORTEX_PROVIDER is set.

If CORTEX_PROVIDER is set to a provider different from the detected key, _get_provider() will still return the explicit provider, potentially pairing the wrong SDK with the detected key. That leads to hard-to-debug auth failures. Consider validating the explicit provider against the detected key or making _get_api_key() honor the explicit provider.
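
One way to express the validation suggested here is a small resolver that refuses a mismatch. This is a sketch under stated assumptions: `resolve_provider` and the alias table are hypothetical, and the real `_get_provider()` may be structured differently.

```python
from typing import Optional

# Alias table assumed from the review: "claude" is the user-facing name,
# "anthropic" the internal one.
_ALIASES = {"claude": "anthropic"}


def resolve_provider(explicit: Optional[str], detected: str) -> str:
    """Return the provider to use, refusing an explicit/detected mismatch."""
    if not explicit:
        return detected
    cleaned = explicit.strip().lower()
    normalized = _ALIASES.get(cleaned, cleaned)
    if normalized != detected:
        raise ValueError(
            f"CORTEX_PROVIDER={explicit!r} does not match the detected key's provider {detected!r}"
        )
    return normalized
```

Failing loudly at resolution time turns a later authentication error into an immediate, explainable configuration error.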


894-906: Reuse the session API key/provider for voice commands (avoid re‑prompt).

voice() obtains api_key, but process_voice_command calls self.install() / self.ask() which fetch the key again. In continuous mode this can re-prompt or swap providers mid-session. Consider thin wrappers that accept api_key/provider and bypass _get_api_key().

🧹 Nitpick comments (4)
docs/VOICE_INPUT.md (1)

213-218: Minor clarification: faster-whisper supports GPU acceleration.

The limitation "CPU-based inference (no GPU acceleration by default)" is slightly misleading. faster-whisper automatically uses CUDA if available; there's no additional configuration needed. Consider clarifying this or noting that GPU acceleration requires CUDA-capable hardware and proper drivers.

📝 Suggested clarification
-- CPU-based inference (no GPU acceleration by default)
+- GPU acceleration available automatically with CUDA; falls back to CPU otherwise
cortex/voice.py (2)

85-107: Add numpy to the dependency pre-check for a clearer failure mode.

_start_recording() and transcribe() import numpy, but _ensure_dependencies() doesn’t validate it. If only partial deps are installed, you’ll surface a raw ImportError later instead of a guided message.

♻️ Proposed addition
         try:
             from pynput import keyboard  # noqa: F401
         except ImportError:
             missing.append("pynput")
+
+        try:
+            import numpy  # noqa: F401
+        except ImportError:
+            missing.append("numpy")
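The same pre-check can also be written as a table-driven lookup, sketched here under the assumption that the voice extras import as `faster_whisper`, `sounddevice`, `pynput`, and `numpy`. The helper names are hypothetical. Note that `importlib.util.find_spec` only locates a module without importing it, so it would not catch failures that happen at import time (for example a missing system library).

```python
import importlib.util
from typing import Dict, List

# Package name -> import name (assumed mapping for the [voice] extras).
VOICE_MODULES: Dict[str, str] = {
    "faster-whisper": "faster_whisper",
    "sounddevice": "sounddevice",
    "pynput": "pynput",
    "numpy": "numpy",
}


def find_missing(modules: Dict[str, str]) -> List[str]:
    """Return the package names whose import module cannot be located."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


def ensure_dependencies(modules: Dict[str, str] = VOICE_MODULES) -> None:
    """Raise one guided ImportError listing every missing package."""
    missing = find_missing(modules)
    if missing:
        raise ImportError(
            "Missing voice dependencies: "
            + ", ".join(missing)
            + ". Install with: pip install cortex-linux[voice]"
        )
```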

449-451: Escape transcribed text to avoid Rich markup injection.

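console.print interprets its argument as Rich markup by default. If the transcription contains [ or anything that looks like a Rich tag, the output can be mis-styled, and an unmatched closing tag such as [/x] raises a MarkupError. It is safer to escape the variable part or to disable markup for it.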

♻️ Proposed fix
+from rich.markup import escape
...
-                console.print(f"\n[bold cyan]Heard:[/bold cyan] {text}\n")
+                console.print(f"\n[bold cyan]Heard:[/bold cyan] {escape(text)}\n")
setup.py (1)

8-21: Consider defensive filtering of all pip-option prefixes to future-proof requirement parsing.

The current filter (not line.startswith("-r")) works correctly for the existing requirements.txt, which contains no pip flags and properly keeps voice dependencies (faster-whisper, sounddevice, etc.) in optional comments only. However, to guard against future changes or misconfiguration, tighten the filter to exclude all - prefixed lines:

-            if line.strip() and not line.startswith("#") and not line.startswith("-r")
+            if line.strip() and not line.startswith("#") and not line.startswith("-")

This makes the code more robust against accidental inclusion of pip options (-e, --extra-index-url, -f, etc.) or optional extras if requirements files are later modified.
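For illustration, the tightened filter could look like the following standalone function. `parse_requirements` is a hypothetical name, and this version strips each line before testing it, which may differ slightly from the comprehension in setup.py.

```python
from typing import List


def parse_requirements(text: str) -> List[str]:
    """Keep plain requirement lines; drop blanks, comments, and pip options."""
    requirements = []
    for raw in text.splitlines():
        line = raw.strip()
        # Any line starting with "-" is a pip option (-r, -e, -f, --extra-index-url, ...).
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements
```

Because the check is on the prefix only, a requirement that merely contains a hyphen (such as `faster-whisper`) is still kept.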

Comment on lines +883 to +899
# Ask user for confirmation
console.print()
console.print("[bold cyan]Choose an action:[/bold cyan]")
console.print(" [1] Dry run (preview commands)")
console.print(" [2] Execute (run commands)")
console.print(" [3] Cancel")
console.print()

try:
    choice = input("Enter choice [1/2/3]: ").strip()

⚠️ Potential issue | 🟠 Major

Avoid input() from the recording worker thread.

process_voice_command is invoked by the recording worker thread. Calling input() from a background thread can hang or corrupt terminal I/O on some platforms. Consider queueing the command to the main thread (or a dedicated UI thread) and prompting there.

🤖 Prompt for AI Agents
In `@cortex/cli.py` around lines 883 - 893, process_voice_command currently calls
input() (the prompt block printing choices and reading choice) from the
recording worker thread which can hang or corrupt terminal I/O; change this so
the recording worker does not block on stdin but instead sends the user-prompt
request to the main/UI thread (for example via a thread-safe queue.Queue message
or an asyncio event loop call_soon_threadsafe) and returns, then have the main
thread display the console prompts and read input and put the chosen action back
to the worker (or set an Event) for processing; update the logic around the
prompt/choice handling in process_voice_command to enqueue a "prompt user for
action" message and consume the response delivered by the main thread rather
than calling input() directly.
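The hand-off described above can be sketched with the standard library alone. This is an illustrative outline, not the PR's code: `PromptRequest`, `worker`, and `main_loop` are hypothetical names, and the input function is injected so the main thread is the only place that would ever call `input()`.

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class PromptRequest:
    """A request from the worker asking the main thread to prompt the user."""
    text: str
    reply: queue.Queue = field(default_factory=lambda: queue.Queue(maxsize=1))


def worker(requests: queue.Queue, text: str, results: List[Tuple[str, str]]) -> None:
    """Runs on the recording thread: enqueue the prompt, then wait for the answer."""
    request = PromptRequest(text)
    requests.put(request)
    choice = request.reply.get()  # blocks the worker, but never touches stdin
    results.append((text, choice))


def main_loop(requests: queue.Queue, read_input: Callable[[str], str], expected: int) -> None:
    """Runs on the main thread: the only place that reads user input."""
    for _ in range(expected):
        request = requests.get()
        choice = read_input(f"Heard '{request.text}'. Enter choice [1/2/3]: ").strip()
        request.reply.put(choice)
```

In real use `read_input` would be the built-in `input`; passing it in also makes the flow straightforward to exercise in unit tests with a stub.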

Comment on lines +20 to +23
# Install system dependencies needed for pynput/evdev compilation
SYSTEM_DEPS_INSTALL = "apt-get update && apt-get install -y --no-install-recommends linux-headers-generic build-essential libasound2-dev libportaudio2 portaudio19-dev"
PIP_BOOTSTRAP = f"{SYSTEM_DEPS_INSTALL} && python -m pip install --quiet --upgrade pip setuptools build && python -m pip install --quiet --no-cache-dir -e /workspace"
PIP_BOOTSTRAP_DEV = f"{SYSTEM_DEPS_INSTALL} && python -m pip install --quiet --upgrade pip setuptools build && python -m pip install --quiet --no-cache-dir -e /workspace[dev]"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the test file to understand the context
cat -n tests/integration/test_end_to_end.py | head -50

Repository: cortexlinux/cortex

Length of output: 2542


🏁 Script executed:

# Search for DEFAULT_IMAGE definition in the file
rg "DEFAULT_IMAGE" tests/integration/test_end_to_end.py -A 2 -B 2

Repository: cortexlinux/cortex

Length of output: 620


🏁 Script executed:

# Let's also check if there are any other references to docker images or base images
rg "python:3.11|Debian|Ubuntu|image" tests/integration/test_end_to_end.py -i

Repository: cortexlinux/cortex

Length of output: 194


🌐 Web query:

linux-headers-generic package Debian vs Ubuntu apt repository

💡 Result:

Short answer: In Debian "linux-headers-generic" is a virtual package name (no real Ubuntu-style metapackage) — Debian provides architecture-specific header metapackages such as linux-headers-amd64, linux-headers-686-pae, etc. In Ubuntu "linux-headers-generic" is a real metapackage maintained by the Ubuntu kernel team that depends on the current generic kernel headers for that Ubuntu release and repository (main/updates/security). [1][2]

Implication for apt:

  • On Debian: apt will install one of the real providers (e.g. linux-headers-amd64) that satisfy the virtual linux-headers-generic dependency. [1]
  • On Ubuntu: installing linux-headers-generic pulls in the matching generic linux-headers-VERSION package for that Ubuntu release, giving you the headers for the currently supported generic kernel. [2]

Sources:

  • Debian packages listing (linux-headers-generic is virtual; shows providers). [1]
  • Ubuntu package listing (linux-headers-generic metapackage maintained in Ubuntu releases). [2]

🌐 Web query:

pynput evdev compilation dependencies kernel headers required

💡 Result:

Short answer — to build/install python-evdev (pynput is separate; python-evdev binds the Linux evdev C API) you need:

  • A C toolchain: gcc/clang (build-essential).
  • Python dev headers: python3-dev / python3-devel.
  • Kernel headers (so /usr/include/linux/input.h and optionally input-event-codes.h are available): linux-headers-$(uname -r) on Debian/Ubuntu, kernel-headers-$(uname -r) on RHEL/Fedora, kernel-headers on Arch.
  • pip (or use your distro package / prebuilt wheel).

Typical install examples:

  • Debian/Ubuntu: apt install build-essential python3-dev python3-pip linux-headers-$(uname -r).
  • RHEL/Fedora: dnf install gcc python3-devel python3-pip kernel-headers-$(uname -r).
  • Arch: pacman -S gcc python-pip core/linux-api-headers.

Notes:

  • The setup looks for /usr/include/linux/input.h and input-event-codes.h by default; you can pass --evdev-headers to point to nonstandard locations.
  • Binary wheels (evdev-binary / distro packages) avoid building against your kernel headers but may not expose all features.

Sources: [1][2][3]

References
[1] python-evdev docs (installation, headers, --evdev-headers).
[2] Debian/Ubuntu python-evdev manpage (same requirements).
[3] Community/issue examples showing missing linux/input.h error and required linux-headers.


linux-headers-generic should use dynamic version selection, and python3-dev is missing.

The package linux-headers-generic exists on both Debian and Ubuntu (as a virtual/metapackage), so apt-get won't fail. However, best practice is to use linux-headers-$(uname -r) for dynamic header selection matching the running kernel.

More importantly, python3-dev is missing from SYSTEM_DEPS_INSTALL. Building Python bindings (pynput/evdev) requires Python headers, which are provided by python3-dev. Without it, the build will fail.

Suggested fix:

-SYSTEM_DEPS_INSTALL = "apt-get update && apt-get install -y --no-install-recommends linux-headers-generic build-essential libasound2-dev libportaudio2 portaudio19-dev"
+SYSTEM_DEPS_INSTALL = "apt-get update && apt-get install -y --no-install-recommends linux-headers-$(uname -r) build-essential python3-dev libasound2-dev libportaudio2 portaudio19-dev"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Install system dependencies needed for pynput/evdev compilation
-SYSTEM_DEPS_INSTALL = "apt-get update && apt-get install -y --no-install-recommends linux-headers-generic build-essential libasound2-dev libportaudio2 portaudio19-dev"
-PIP_BOOTSTRAP = f"{SYSTEM_DEPS_INSTALL} && python -m pip install --quiet --upgrade pip setuptools build && python -m pip install --quiet --no-cache-dir -e /workspace"
-PIP_BOOTSTRAP_DEV = f"{SYSTEM_DEPS_INSTALL} && python -m pip install --quiet --upgrade pip setuptools build && python -m pip install --quiet --no-cache-dir -e /workspace[dev]"
+# Install system dependencies needed for pynput/evdev compilation
+SYSTEM_DEPS_INSTALL = "apt-get update && apt-get install -y --no-install-recommends linux-headers-$(uname -r) build-essential python3-dev libasound2-dev libportaudio2 portaudio19-dev"
+PIP_BOOTSTRAP = f"{SYSTEM_DEPS_INSTALL} && python -m pip install --quiet --upgrade pip setuptools build && python -m pip install --quiet --no-cache-dir -e /workspace"
+PIP_BOOTSTRAP_DEV = f"{SYSTEM_DEPS_INSTALL} && python -m pip install --quiet --upgrade pip setuptools build && python -m pip install --quiet --no-cache-dir -e /workspace[dev]"
🤖 Prompt for AI Agents
In `@tests/integration/test_end_to_end.py` around lines 20 - 23, Update the
SYSTEM_DEPS_INSTALL string to use dynamic kernel headers and include Python
headers: replace linux-headers-generic with linux-headers-$(uname -r) and add
python3-dev to the package list used by SYSTEM_DEPS_INSTALL so both
PIP_BOOTSTRAP and PIP_BOOTSTRAP_DEV (which reference SYSTEM_DEPS_INSTALL) will
install the correct kernel headers and Python development headers required for
building pynput/evdev.
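As a possible refinement, the package list could be kept as data and the command strings derived from it, so the two bootstrap variants cannot drift apart. This is a sketch with hypothetical helper names (`apt_install_command`, `pip_bootstrap`); it assumes the string is run through a shell so that `$(uname -r)` is expanded. Inside a container, `uname -r` reports the host kernel, so it is worth confirming that the matching headers package exists in the image's repositories.

```python
from typing import List

# System packages from the suggested fix, kept as a list for easier review.
SYSTEM_PACKAGES: List[str] = [
    "linux-headers-$(uname -r)",  # expanded by the shell at run time
    "build-essential",
    "python3-dev",
    "libasound2-dev",
    "libportaudio2",
    "portaudio19-dev",
]


def apt_install_command(packages: List[str]) -> str:
    """Build the apt-get command that installs the given packages."""
    return (
        "apt-get update && apt-get install -y --no-install-recommends "
        + " ".join(packages)
    )


def pip_bootstrap(extras: str = "") -> str:
    """Build the full bootstrap command, optionally installing an extras group."""
    target = "/workspace" + (f"[{extras}]" if extras else "")
    return (
        f"{apt_install_command(SYSTEM_PACKAGES)}"
        " && python -m pip install --quiet --upgrade pip setuptools build"
        f" && python -m pip install --quiet --no-cache-dir -e {target}"
    )
```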

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@cortex/branding.py`:
- Around line 14-22: The Console is being initialized with force_terminal=True
and legacy_windows=sys.platform == "win32", which overrides Rich's
auto-detection; update the Console creation in branding.py (the Console
constructor assigned to console) to either remove both parameters so Rich can
auto-detect, or if you truly need forced styling keep only force_terminal=True
and drop the legacy_windows platform check so you don't enable legacy mode on
all Windows systems—modify the Console(...) call accordingly and ensure any
downstream code using the console symbol continues to work with the new
initialization.

In `@cortex/voice.py`:
- Around line 444-458: The worker _recording_worker can leave callers blocked
because record_single waits on done_event but the hotkey callback
(_hotkey_callback) is only invoked when text is non-empty; modify
_recording_worker to always signal completion to the caller: ensure that
regardless of whether text is non-empty, empty, or an exception occurs you
invoke the done callback (or call _hotkey_callback with an empty string/explicit
completion token) or otherwise set/trigger the same completion mechanism used by
record_single (referencing record_single, done_event, _hotkey_callback,
record_and_transcribe, and _is_recording) so the waiting thread is unblocked;
keep existing logging behavior for empty transcription and exceptions but
guarantee the completion callback/path is executed in all branches (including
except and finally).

In `@docs/VOICE_INPUT.md`:
- Around line 65-83: The fenced code blocks in VOICE_INPUT.md example session
and the ASCII diagram (around the blocks shown) are missing a language specifier
which triggers MD040; update both triple-backtick fences to include a language
(use "text") so they read ```text at the start of each block (the example
session block and the ASCII diagram block referenced in the diff) to satisfy
markdownlint.

In `@pyproject.toml`:
- Around line 90-98: Update the pynput dependency in the voice extras from
"pynput>=1.7.6" to "pynput>=1.8.1" to ensure compatibility with Python 3.10+,
and add a note in the project installation/setup docs (or README) stating that
sounddevice requires the system PortAudio library (libportaudio2 on Linux) to be
installed before pip installing the dependencies; adjust the "voice" extras
declaration and the docs/installation section accordingly.
♻️ Duplicate comments (4)
cortex/api_key_detector.py (1)

160-176: Explicit provider in encrypted storage still skips anthropic.
The explicit-provider branch doesn’t include "anthropic", so CORTEX_PROVIDER=anthropic won’t be honored before the default order.

🔧 Suggested fix
-            if explicit_provider in ["openai", "claude"]:
+            if explicit_provider in ["openai", "claude", "anthropic"]:
tests/integration/test_end_to_end.py (1)

20-23: Add python3-dev and kernel-matched headers for evdev builds.

pynput/evdev compilation typically requires Python headers, and linux-headers-generic may not match the running kernel in the container. This can cause build failures during integration tests.

🔧 Proposed fix
-SYSTEM_DEPS_INSTALL = "apt-get update && apt-get install -y --no-install-recommends linux-headers-generic build-essential libasound2-dev libportaudio2 portaudio19-dev"
+SYSTEM_DEPS_INSTALL = "apt-get update && apt-get install -y --no-install-recommends linux-headers-$(uname -r) build-essential python3-dev libasound2-dev libportaudio2 portaudio19-dev"
cortex/cli.py (2)

889-909: Avoid input() from the recording worker thread.

process_voice_command() is invoked from the recording worker thread, but it calls input() (Line 897). Blocking stdin on a background thread can deadlock or corrupt terminal I/O on some platforms. Consider queueing the prompt to the main thread (or a UI loop) and handling the choice there.


838-913: Prevent repeated API key prompts in continuous voice sessions.

voice() obtains an API key once (Line 838), but process_voice_command() calls self.install() / self.ask() which fetch keys again. In continuous mode this can re-prompt unexpectedly. Consider passing the session key/provider into internal helpers to avoid re-detection.

🧹 Nitpick comments (2)
cortex/voice.py (1)

370-396: Add a return type for _get_hotkey_key.

Type hints are required, and this method currently has none. Consider -> Any | None (or a forward reference to the pynput Key type) to keep annotations consistent. As per coding guidelines, type hints are required.

cortex/cli.py (1)

966-1103: Remove unreachable install block or wire helpers into the main flow.

The new helper methods are never called because the inline execution path returns earlier, leaving the block at Lines 1351-1355 unreachable. This is dead code and risks drift between duplicate implementations. Consider removing the unreachable block or refactoring install() to route through the helpers.

✂️ Minimal cleanup (remove dead block)
-            print("\nExecuting commands...")
-            if parallel:
-                return self._handle_parallel_execution(commands, software, install_id, history)
-
-            return self._handle_sequential_execution(commands, software, install_id, history)

Also applies to: 1347-1355

Comment on lines +14 to +22
import sys

from rich import box
from rich.console import Console
from rich.panel import Panel
from rich.table import Table

console = Console()
# Use force_terminal and legacy_windows for better Windows compatibility
console = Console(force_terminal=True, legacy_windows=sys.platform == "win32")

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Rich Console force_terminal legacy_windows parameters documentation recommended usage

💡 Result:

Summary — what the parameters do and how to use them

  • force_terminal (Optional[bool]): override auto-detection of a terminal. None = auto-detect; True = emit terminal control/ANSI codes even when writing to a non-tty (e.g., file or pipe); False = suppress control codes. Use only when you explicitly need or must avoid control codes (for example, set True to keep ANSI colors when redirecting output). [1][2]

  • legacy_windows (Optional[bool]): enable legacy Windows mode (None = auto-detect). When True, Rich restricts features/characters that don’t render correctly on older Windows consoles; safe_box and other fallbacks are coordinated with this setting. Only enable manually if you know your target environment is an older Windows console that lacks VT/ANSI support. [1][2]

Recommended usage

  • Leave both as None (defaults) so Rich auto-detects capabilities in most cases. [2]
  • Set force_terminal=True when you need ANSI/control sequences in non-interactive output (e.g., producing colored logs or files) and understand the consumer can handle them. Set False to force plain text. [2]
  • Use legacy_windows=True only for compatibility with legacy Windows console hosts; prefer relying on feature detection or modern terminals (Windows 10+ with VT enabled) when possible. Consider safe_box=True to avoid box-drawing characters on limited consoles. [1][2][3]

Sources

  • Rich Console reference and API docs. [1][2]
  • Microsoft docs on Legacy Console mode (Windows). [3]

Reconsider the Console initialization approach; the current pattern deviates from recommended Rich practices.

The combination force_terminal=True with legacy_windows=sys.platform == "win32" is not the recommended approach. According to Rich documentation:

  • Recommended pattern: Leave both parameters as None (defaults) so Rich auto-detects capabilities.
  • force_terminal=True: Use only when you explicitly need ANSI codes in non-interactive output and understand the consumer can handle them. This may be appropriate here if the module always needs styled output.
  • legacy_windows=sys.platform == "win32": This is problematic because it blindly enables legacy mode on all Windows systems. Modern Windows 10+ with VT support should NOT have this restriction. The docs recommend relying on auto-detection or feature detection instead, not OS platform checks.

Consider either:

  1. Removing both parameters and letting Rich auto-detect, or
  2. If styled output is critical, keep only force_terminal=True and remove the blanket legacy_windows logic (let Rich auto-detect based on actual console capabilities).
🤖 Prompt for AI Agents
In `@cortex/branding.py` around lines 14 - 22, The Console is being initialized
with force_terminal=True and legacy_windows=sys.platform == "win32", which
overrides Rich's auto-detection; update the Console creation in branding.py (the
Console constructor assigned to console) to either remove both parameters so
Rich can auto-detect, or if you truly need forced styling keep only
force_terminal=True and drop the legacy_windows platform check so you don't
enable legacy mode on all Windows systems—modify the Console(...) call
accordingly and ensure any downstream code using the console symbol continues to
work with the new initialization.

Comment on lines +444 to +458
def _recording_worker(self) -> None:
    """Worker thread for recording and transcription."""
    try:
        text = self.record_and_transcribe()

        if text and self._hotkey_callback:
            console.print(f"\n[bold cyan]Heard:[/bold cyan] {text}\n")
            self._hotkey_callback(text)
        elif not text:
            cx_print("No speech detected. Try speaking louder or closer to the mic.", "warning")

    except Exception as e:
        cx_print(f"Recording error: {e}", "error")
    finally:
        self._is_recording = False

⚠️ Potential issue | 🟠 Major

record_single can hang when transcription is empty or errors.

record_single() waits on done_event (Line 532), but _recording_worker only calls the callback when text is non-empty. If no speech is detected or an exception occurs, the callback never fires and the caller blocks indefinitely.

🛠️ Proposed fix
-    def _recording_worker(self) -> None:
-        """Worker thread for recording and transcription."""
-        try:
-            text = self.record_and_transcribe()
-
-            if text and self._hotkey_callback:
-                console.print(f"\n[bold cyan]Heard:[/bold cyan] {text}\n")
-                self._hotkey_callback(text)
-            elif not text:
-                cx_print("No speech detected. Try speaking louder or closer to the mic.", "warning")
-
-        except Exception as e:
-            cx_print(f"Recording error: {e}", "error")
-        finally:
-            self._is_recording = False
+    def _recording_worker(self) -> None:
+        """Worker thread for recording and transcription."""
+        text = ""
+        try:
+            text = self.record_and_transcribe()
+            if text:
+                console.print(f"\n[bold cyan]Heard:[/bold cyan] {text}\n")
+            else:
+                cx_print("No speech detected. Try speaking louder or closer to the mic.", "warning")
+        except Exception as e:
+            cx_print(f"Recording error: {e}", "error")
+        finally:
+            if self._hotkey_callback:
+                self._hotkey_callback(text)
+            self._is_recording = False
🤖 Prompt for AI Agents
In `@cortex/voice.py` around lines 444 - 458, The worker _recording_worker can
leave callers blocked because record_single waits on done_event but the hotkey
callback (_hotkey_callback) is only invoked when text is non-empty; modify
_recording_worker to always signal completion to the caller: ensure that
regardless of whether text is non-empty, empty, or an exception occurs you
invoke the done callback (or call _hotkey_callback with an empty string/explicit
completion token) or otherwise set/trigger the same completion mechanism used by
record_single (referencing record_single, done_event, _hotkey_callback,
record_and_transcribe, and _is_recording) so the waiting thread is unblocked;
keep existing logging behavior for empty transcription and exceptions but
guarantee the completion callback/path is executed in all branches (including
except and finally).
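To make the hang and its fix concrete, here is a self-contained sketch of the pattern. `Recorder` is a simplified, hypothetical stand-in for the voice handler, the transcription step is injected as a plain callable, and the timeout on `wait()` is an extra safeguard that the review does not ask for.

```python
import threading
from typing import Callable, Optional


class Recorder:
    def __init__(self, transcribe: Callable[[], str]) -> None:
        self._transcribe = transcribe  # stand-in for record_and_transcribe()
        self._callback: Optional[Callable[[str], None]] = None
        self._is_recording = False

    def _recording_worker(self) -> None:
        text = ""
        try:
            text = self._transcribe()
        except Exception as exc:
            print(f"Recording error: {exc}")
        finally:
            # Always notify the caller, even for empty text or an exception.
            self._is_recording = False
            if self._callback:
                self._callback(text)

    def record_single(self, timeout: float = 5.0) -> Optional[str]:
        done = threading.Event()
        result = {"text": ""}

        def on_done(text: str) -> None:
            result["text"] = text
            done.set()

        self._callback = on_done
        self._is_recording = True
        threading.Thread(target=self._recording_worker, daemon=True).start()
        # Bounded wait: returns None if the worker never reports back.
        if not done.wait(timeout):
            return None
        return result["text"]
```

Because the callback sits in `finally`, the three outcomes (speech, silence, error) all release the waiting thread; only the returned text differs.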

Comment on lines +65 to +83
**Example session:**
```
$ cortex voice
CX ✓ Voice mode active. Press F9 to speak, Ctrl+C to exit.
CX │ Listening...
[Press F9]
CX │ Recording ●●○ (Press F9 to stop)
[Speak: "Install nginx"]
[Press F9]
CX ⠋ Transcribing...
Heard: Install nginx
CX │ Installing: nginx
CX ⠋ Understanding request...
...
```

⚠️ Potential issue | 🟡 Minor

Specify fenced code block languages (MD040).

markdownlint flags these blocks because the language is missing. Add a language (e.g., text) to both blocks.

✍️ Suggested fix
-```
+```text
 $ cortex voice
 CX ✓ Voice mode active. Press F9 to speak, Ctrl+C to exit.
 CX │ Listening...
@@
 ...
-```
+```

@@
-```
+```text
 ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
 │      F9      │───>│    Record    │───>│  Transcribe  │
 │   Hotkey     │    │    Audio     │    │   (Whisper)  │
 └──────────────┘    └──────────────┘    └──────────────┘
@@
 └──────────────┘    └──────────────┘    └──────────────┘
-```
+```

Also applies to: 146-157

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@docs/VOICE_INPUT.md` around lines 65 - 83, The fenced code blocks in
VOICE_INPUT.md example session and the ASCII diagram (around the blocks shown)
are missing a language specifier which triggers MD040; update both
triple-backtick fences to include a language (use "text") so they read ```text
at the start of each block (the example session block and the ASCII diagram
block referenced in the diff) to satisfy markdownlint.

Comment on lines +90 to 98
 voice = [
     "faster-whisper>=0.10.0",
     "sounddevice>=0.4.6",
     "pynput>=1.7.6",
     "numpy>=1.24.0",
 ]
 all = [
-    "cortex-linux[dev,security,docs,dashboard]",
+    "cortex-linux[dev,security,docs,dashboard,voice]",
 ]

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

faster-whisper 0.10.0 PyPI minimum version requirements system dependencies

💡 Result:

Short answer — for faster-whisper v0.10.0:

  • Minimum Python: 3.9+. [1][2]
  • Python packages/versions called out in the release: ctranslate2 (updated to include 3.22.0) and tokenizers (updated to include 0.15). [1]
  • Audio: PyAV is used (FFmpeg not required system-wide). [2]
  • GPU/system libs (for GPU execution): NVIDIA cuBLAS and cuDNN matching the CUDA toolchain you use (the project targets CUDA 12 / cuBLAS + cuDNN; see ctranslate2/CUDA notes and downgrade-workarounds if you must use CUDA 11). [1][2]

Sources: GitHub release notes for v0.10.0 and the faster-whisper PyPI page. [1] GitHub releases (v0.10.0). [2] faster-whisper · PyPI.


🌐 Web query:

sounddevice 0.4.6 PyPI minimum version requirements system dependencies

💡 Result:

Summary of minimum requirements and system dependencies for sounddevice 0.4.6:

  • Python: any Python version that supports CFFI (practically Python 3.7+ for recent releases). [1]
  • pip/setuptools: required to install the package. [1]
  • CFFI: required (install via package manager or pip). (CFFI >=1.0 is expected by the project). [1]
  • PortAudio library: required on the system (package often named libportaudio, libportaudio2, or portaudio). On Windows and macOS the pip wheel includes PortAudio; on Linux/other Unix you must install PortAudio via your package manager or build it yourself. [1][2]
  • NumPy: optional — only required if you want to use NumPy-array convenience functions (otherwise Raw* stream classes work without NumPy). [1]

Sources: python-sounddevice 0.4.6 installation docs and packaging notes. [1] Installation — python-sounddevice (0.4.6). [2] piwheels release/notes for 0.4.6.


🌐 Web query:

pynput 1.7.6 PyPI minimum version requirements system dependencies

💡 Result:

Summary for pynput 1.7.6 (PyPI release Jan 1, 2022)

  • Minimum Python support: Python 2.7 and Python 3.4–3.9 (classifiers on the PyPI 1.7.6 page). [1]
  • Platform / system dependencies (runtime):
    • Linux/Xorg: python-xlib (python3-xlib) is required; evdev/uinput is used on some backends. [2][3]
    • macOS: PyObjC framework packages (e.g. pyobjc-framework-ApplicationServices, pyobjc-framework-Quartz) are required for macOS backends. [2]
    • Windows: no extra third-party system packages (uses built-in Win32 APIs). [2]

Sources:

  • PyPI — pynput 1.7.6 (classifiers & release). [1]
  • Package dependency notes / platform deps (piwheels / package indexes) and platform specifics (linux/macOS). [2][3]

🌐 Web query:

numpy 1.24.0 PyPI minimum version Python requirements

💡 Result:

Minimum Python version required for NumPy 1.24.0 is Python 3.8. [1][2]

Sources:
[1] NumPy 1.24.0 release notes (supported Python versions: 3.8–3.11).
[2] PyPI release page for numpy 1.24.0 (wheel files for cp38, cp39, cp310, cp311).


🏁 Script executed:

# First, let's examine the pyproject.toml context to understand the overall configuration
head -100 pyproject.toml

Repository: cortexlinux/cortex

Length of output: 2521


🌐 Web query:

pynput latest version Python 3.10 3.11 3.12 support compatibility

💡 Result:

Latest pynput release: 1.8.1 (uploaded Mar 17, 2025). It ships a universal py2.py3 wheel and includes fixes for Python 3.12, so it is compatible with Python 3.10, 3.11 and 3.12. [1][2]

Sources:

  • PyPI release (pynput 1.8.1, uploaded Mar 17, 2025). [1]
  • pynput GitHub / changelog and issues referencing 3.12 fixes/compatibility. [2]

🏁 Script executed:

# Check if there's more recent pynput versions available
curl -s https://pypi.org/pypi/pynput/json | python3 -c "
import json, sys
data = json.load(sys.stdin)
releases = sorted(data['releases'].keys(), reverse=True)[:10]
for v in releases:
    print(f'{v}: Python requirements = {data[\"releases\"][v][0].get(\"requires_python\", \"not specified\") if data[\"releases\"][v] else \"N/A\"}')" 2>/dev/null || echo "Using alternative check"

Repository: cortexlinux/cortex

Length of output: 148


🏁 Script executed:

# Alternative: check pypi data for pynput versions
python3 << 'EOF'
import json
import urllib.request

try:
    url = "https://pypi.org/pypi/pynput/json"
    with urllib.request.urlopen(url) as response:
        data = json.loads(response.read().decode('utf-8'))
    
    print("Recent pynput versions and Python support:")
    releases = sorted(data['releases'].keys(), reverse=True)[:15]
    for version in releases:
        release_info = data['releases'][version]
        if release_info:
            first_file = release_info[0]
            requires_python = first_file.get('requires_python', 'Not specified')
            print(f"  {version}: {requires_python}")
        else:
            print(f"  {version}: (no files)")
except Exception as e:
    print(f"Error fetching PyPI data: {e}")
EOF

Repository: cortexlinux/cortex

Length of output: 148


Raise the pynput floor to 1.8.1: the current floor, 1.7.6, does not declare Python 3.10+ support.

The specified floor, pynput 1.7.6 (January 2022 release), lists only Python 2.7 and 3.4–3.9 in its PyPI classifiers. Since this project declares Python 3.10+ support, this is a version mismatch. Raise the requirement to pynput>=1.8.1 (March 2025), which includes fixes for Python 3.12 and is compatible with Python 3.10–3.12.

Additionally, sounddevice on Linux requires the system PortAudio library (libportaudio2). Ensure this is documented in installation/setup instructions. The other dependencies (faster-whisper, numpy) are compatible with Python 3.10+.

🤖 Prompt for AI Agents
In `@pyproject.toml` around lines 90 - 98, Update the pynput dependency in the
voice extras from "pynput>=1.7.6" to "pynput>=1.8.1" to ensure compatibility
with Python 3.10+, and add a note in the project installation/setup docs (or
README) stating that sounddevice requires the system PortAudio library
(libportaudio2 on Linux) to be installed before pip installing the dependencies;
adjust the "voice" extras declaration and the docs/installation section
accordingly.

Development

Successfully merging this pull request may close these issues.

Fizy AI Integration — Voice DevOps

4 participants