
Dynamic, self-managing, biology-like memory layer for LLMs. Give your agent the ability to remember and forget across three sophisticated layers: short-term memory, medium-term memory, and long-term memory.


Neuroca Logo

Persistent Memory SDK for LLMs (NCA)

NEW: Neuroca benchmark results here

License: MIT   Python Version: 3.10 or 3.11   Codacy Code Quality Badge

➡️ View the full Table of Contents for easy navigation.

📚 Official Documentation: _Neuroca/docs/


Overview

NeuroCognitive Architecture (NCA) is an advanced framework designed to imbue Large Language Models (LLMs) with more sophisticated, human-like cognitive capabilities. It transcends standard Retrieval-Augmented Generation (RAG) by replacing fixed context windows and explicit retrieval calls with a dynamic, multi-tiered memory system inspired by biological cognition. This organic system allows the LLM's effective context to grow and adapt indefinitely, fostering accurate, stable, and long-term conversational understanding. Memory management—including consolidation, decay, and relevance scoring—operates automatically in the background. This enables the LLM to genuinely learn and evolve from interactions, "experiencing" its memories organically rather than relying on external tools to fetch isolated data points.


Key Features

  • Dynamic Multi-Tiered Memory System: Unlike flat vector databases used in typical RAG, NCA features a structured memory hierarchy inspired by human cognition:
    • Short-Term Memory (STM): High-speed, temporary storage for immediate context (akin to working memory). Governed by TTL (Time-To-Live).
    • Medium-Term Memory (MTM): Intermediate storage for recently relevant information, acting as a buffer before long-term consolidation. Managed by capacity limits and decay.
    • Long-Term Memory (LTM): Persistent storage for consolidated knowledge and core facts. Supports efficient retrieval over large datasets.
  • Biologically-Inspired Processes:
    • Memory Consolidation: Automatic background process to move important memories from STM -> MTM -> LTM based on relevance, frequency, and importance scores.
    • Memory Decay: Memories naturally lose relevance over time unless reinforced, mimicking forgetting and keeping the memory system focused (see the sketch after this list).
    • Importance Scoring: Allows explicit weighting of memories, influencing retrieval and consolidation priority.
  • Rich Memory Representation: Stores not just content embeddings but also crucial metadata like timestamps, sources, tags, and importance scores, enabling more complex querying and context association.
  • Dynamically Managed Backends: The system automatically selects and manages the optimal storage backend (e.g., high-speed In-Memory for STM, persistent SQLite/Vector DBs for MTM/LTM) for each memory tier based on configuration. This happens transparently, ensuring the best balance of speed, persistence, and scalability for different types of memory without manual intervention. The architecture is designed for easy extension with new backend types.
  • Advanced Search Capabilities: Goes beyond simple vector similarity to allow filtering and searching based on metadata, time, importance, and tags across different memory tiers.
  • Thread-Safe Design: Engineered for robust performance in asynchronous applications with thread-local connection management for database backends.
  • Organic Learning & Context: Facilitates true learning and adaptation by allowing the LLM's effective context to grow organically and potentially indefinitely. Unlike RAG's manual retrieval, NCA's automatic background processes provide context seamlessly. This enables the LLM to evolve its understanding over time based on its "experiences"—remembering not just facts, but also past interactions, mistakes made, recent documentation changes, version updates, and other crucial context—without explicit tool intervention.
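
To make the decay and consolidation mechanics concrete, here is a minimal sketch of the idea behind the Memory Decay and Consolidation bullets above. Everything in it, from the function names to the exponential half-life model and the per-tier half-lives, is an illustrative assumption rather than Neuroca's actual implementation:

import math
import time

# Hypothetical half-lives per tier (seconds); these values are invented
# for exposition and are not Neuroca's defaults.
HALF_LIFE = {"stm": 60.0, "mtm": 3600.0, "ltm": 7 * 24 * 3600.0}


def effective_relevance(importance: float, created_at: float,
                        tier: str, now: float | None = None) -> float:
    """Exponentially decay a memory's importance with age; reinforcement
    (an access) would reset created_at, mimicking recall strengthening."""
    now = time.time() if now is None else now
    age = max(0.0, now - created_at)
    return importance * 0.5 ** (age / HALF_LIFE[tier])


def should_consolidate(relevance: float, access_count: int,
                       threshold: float = 0.8) -> bool:
    """Promote a memory toward the next tier (STM -> MTM -> LTM) once its
    decayed relevance, boosted by usage frequency, crosses a threshold."""
    return relevance * math.log1p(access_count) >= threshold


# A memory stored an hour ago in MTM with importance 0.9, accessed 5 times,
# decays to 0.45 and still qualifies for consolidation (0.45 * ln 6 ≈ 0.81).
r = effective_relevance(0.9, time.time() - 3600, "mtm")
print(should_consolidate(r, access_count=5))  # True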

NCA vs. Standard RAG: A Deeper Dive

While standard Retrieval-Augmented Generation (RAG) enhances LLMs by fetching external data, NCA offers a fundamentally more dynamic and integrated approach to memory and context:

| Feature | Standard RAG | NeuroCognitive Architecture (NCA) | Example Advantage |
| --- | --- | --- | --- |
| Memory Structure | Typically a flat vector database. | Multi-tiered (STM, MTM, LTM) mimicking cognitive processes. | NCA can prioritize recent info (STM) while retaining core knowledge (LTM), unlike a single-pool vector DB. |
| Context Handling | Relies on fixed context windows & explicit retrieval calls. | Context grows organically; relevant memories surface automatically. | NCA avoids abrupt context loss; past relevant details (e.g., user preferences) remain accessible indefinitely. |
| Info Lifecycle | Data is often static until manually updated. | Memories consolidate, decay, and are updated based on interactions. | In NCA, outdated information naturally fades, while frequently used info strengthens, keeping context relevant. |
| Retrieval | Primarily vector similarity search. | Multi-faceted search (similarity, metadata, time, importance, tags). | NCA can find memories "from last Tuesday about Project X", not just semantically similar text. |
| Learning | Limited to the static data retrieved. | Learns implicitly through memory dynamics and consolidation. | NCA adapts its knowledge base over time based on ongoing interactions, reflecting genuine learning. |
| Integration | Often requires separate tool calls for retrieval. | Memory management (incl. backend selection) is integrated; context provided seamlessly. | The LLM interacts more naturally with NCA, as memory (across dynamically managed backends) feels intrinsic. |

In essence, where standard RAG bolts on an external database, NCA integrates a dynamic, self-maintaining cognitive memory system, enabling LLMs to develop a more persistent, adaptive, and nuanced understanding of context over time.
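
To make the Retrieval row above concrete, here is a hedged sketch modeled on the Memory Manager Quick Start shown later in this README. The metadata_filter, created_after, and min_importance parameters are hypothetical illustrations of multi-faceted search, not confirmed parts of Neuroca's search_memories signature:

import asyncio
from datetime import datetime, timedelta

from neuroca.memory.manager import MemoryManager


async def find_recent_project_memories() -> None:
    manager = MemoryManager()
    await manager.initialize()

    # Combine semantic similarity with time, tag, and importance filters.
    # The three filter parameters below are assumptions for illustration.
    results = await manager.search_memories(
        query="Project X status",
        metadata_filter={"tags": ["project-x"]},
        created_after=datetime.now() - timedelta(days=7),
        min_importance=0.5,
        limit=10,
    )
    for memory in results:
        print(memory.content, memory.metadata)

    await manager.shutdown()


asyncio.run(find_recent_project_memories())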

Example Use Cases

NCA's advanced memory capabilities unlock potential for applications beyond the scope of simpler RAG systems:

  • Truly Stateful Conversational Agents: Build chatbots and virtual assistants that maintain coherent, long-term memory of user preferences, past interactions, and evolving goals across multiple sessions (a sketch follows this list).
  • Personalized Adaptive Learning: Create educational tools that track a student's knowledge progression, remembering areas of difficulty and tailoring content accordingly.
  • Complex Task Automation & Planning: Develop agents that can perform multi-step tasks, remembering previous actions, outcomes, and environmental state changes to inform future decisions.
  • Creative Writing Assistants: Aid writers by maintaining consistent character details, plot points, and world-building elements across long narratives.
  • Cognitive Science Research: Provide a platform for simulating and experimenting with different memory models and cognitive architectures.
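
As a sketch of the first use case, a stateful chat loop can be built directly on the NeuroCognitiveArchitecture facade shown under Usage below; the REPL loop itself is illustrative, not a packaged feature:

from neuroca import NeuroCognitiveArchitecture

# Each turn flows through nca.process(), so details a user mentioned in
# earlier turns (or, via LTM, in earlier sessions) can resurface
# automatically without explicit retrieval calls.
nca = NeuroCognitiveArchitecture()


def chat() -> None:
    while True:
        user_input = input("you> ")
        if user_input.strip().lower() in {"quit", "exit"}:
            break
        print("nca>", nca.process(user_input))


if __name__ == "__main__":
    chat()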

(For a visual overview of the system's structure, see the Component Diagram.)

Beyond the core memory system, the following capability areas are planned or in progress:

  • Health Dynamics (Planned):

    • Energy management and resource allocation
    • Attention and focus mechanisms
    • Cognitive load monitoring and adaptation
  • Biological Inspiration:

    • Neural pathway simulation
    • Neurotransmitter-inspired state management
    • Circadian rhythm effects on performance
  • LLM Integration:

    • Seamless integration with popular LLM frameworks
    • Prompt engineering and context management
    • Response optimization based on cognitive state

Architecture

The NCA system is structured around modular components. The main source code resides in src/neuroca:

src/neuroca/
├── api/                  # API layer and endpoints
├── assets/               # Static assets (images, templates)
├── cli/                  # Command-line interface tools
├── config/               # Configuration files and settings
├── core/                 # Core domain logic and models
├── db/                   # Database interaction layer (schemas, migrations if any)
├── infrastructure/       # Infrastructure setup (e.g., Docker config specific to app)
├── integration/          # LLM and external service integration components
├── memory/               # Memory tier implementations (STM, MTM, LTM, backends)
├── monitoring/           # Monitoring and observability hooks/integrations
├── scripts/              # Standalone utility scripts related to neuroca
├── tools/                # Internal development/operational tools
└── utils/                # Shared utility functions

Note: Top-level directories like docs/, tests/, scripts/ (project-level), etc., exist at the project root (Neuroca/).

Installation

Prerequisites

  • Python 3.10 or 3.11 (primary tested targets for the 1.0.0 GA release; use 3.12 only for CPU-only workflows because GPU-accelerated extras such as PyTorch do not yet ship wheels for 3.12)
  • Docker and Docker Compose (optional, for containerized deployment)
  • Access to LLM API credentials (if integrating remote providers)

Quick Start (PyPI GA)

Set up a fresh virtual environment and install the general-availability build from PyPI. This is the simplest path to evaluate Neuroca with the default SQLite backends.

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install "neuroca==1.0.0"

# Verify the CLI entry point resolves and list the available groups
neuroca --help

Install from Source

# Clone the repository and enter the project root
git clone https://github.com/justinlietz93/Neuro-Cognitive-Architecture.git
cd Neuro-Cognitive-Architecture

# Install the package in editable mode with tooling/test extras
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install -e ".[dev,test]"

# (Optional) Build a wheel if you need an artefact for deployment
python -m build
pip install dist/*.whl

Development Setup

Using Poetry (Recommended for Development)

# Clone the repository
git clone https://github.com/justinlietz93/Neuro-Cognitive-Architecture.git
cd Neuro-Cognitive-Architecture

# Install dependencies using Poetry (with dev/test extras)
poetry install --with dev,test

# Activate the virtual environment
poetry shell

Using Pip (Alternative for Development)

# Clone the repository
git clone https://github.com/justinlietz93/Neuro-Cognitive-Architecture.git
cd Neuro-Cognitive-Architecture

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies with extras for tooling and tests
pip install -e ".[dev,test]"

Using Docker

# Clone the repository
git clone https://github.com/justinlietz93/Neuro-Cognitive-Architecture.git
cd Neuro-Cognitive-Architecture

# Build and run with Docker Compose
docker-compose up -d

Configuration

  1. Copy the example environment file:

    cp .env.example .env
  2. Edit the .env file with your specific configuration (an illustrative example follows this list):

    • LLM API credentials
    • Database connection details
    • Memory system parameters
    • Health dynamics settings
  3. Additional configuration options are available in the config/ directory.
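
As a rough illustration, a filled-in .env might look like the lines below. Apart from NCA_ENV, which appears elsewhere in this README, every key name here is a hypothetical placeholder; consult .env.example for the real keys:

# Hypothetical example; check .env.example for the actual key names
NCA_ENV=development
# The two keys below are illustrative placeholders only:
DATABASE_URL=sqlite:///./neuroca.db
OPENAI_API_KEY=sk-your-key-here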

Usage

Quick Verification

After installing Neuroca, run the smoke tests below to confirm the environment is ready:

# Execute the curated memory system demo (writes to an ephemeral SQLite DB)
python scripts/basic_memory_test.py

# Capture a snapshot of the live tiers via the CLI
neuroca memory inspect --tier stm --limit 3

Both commands should complete without warnings and print at least one stored memory entry.

API

Start the FastAPI-powered service using the provided helper or the module entry point:

make run-api
# or
python -m neuroca.api.server

Once booted, the API listens on http://localhost:8000. Probe the health check to validate the deployment:

curl -sf http://localhost:8000/health

CLI

The neuroca binary exposes scoped command groups for day-to-day operations. After activating your virtual environment, run neuroca --help to see the available top-level groups (llm, memory, system). Common flows:

# Display available commands and options
neuroca --help

# Run an LLM query using the local Ollama provider without touching the live memory tiers
neuroca llm query "summarise the latest log entries" \
  --provider ollama --model gemma3:4b --no-memory --no-health --no-goals

# Seed a short-term memory file and inspect stored entries
neuroca memory seed ./examples/memories.json --tier stm --user demo-user
neuroca memory inspect --tier stm --limit 5

# Create a database backup (PostgreSQL) or copy a local SQLite database
neuroca system backup --path ./backups/$(date +%Y%m%d).sql

# Restore from a previously created backup
neuroca system restore ./backups/20250919.sql

Each command supports --help for additional switches; for example, neuroca memory --help details vector-index maintenance utilities and other tasks covered by the automated tests. The CLI always respects the active configuration file (development or production) so long as the NCA_ENV environment variable is exported.
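
For example, to pin the CLI to the development configuration before inspecting memory:

export NCA_ENV=development
neuroca memory inspect --tier stm --limit 3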

Python Library

from neuroca import NeuroCognitiveArchitecture

# Initialize the architecture
nca = NeuroCognitiveArchitecture()

# Configure memory parameters
nca.configure(
    working_memory_capacity=7,
    episodic_decay_rate=0.05,
    semantic_consolidation_threshold=0.8
)

# Process input through the cognitive architecture
response = nca.process("What is the relationship between quantum physics and consciousness?")

# Access memory components
working_memory = nca.memory.working.get_contents()

Memory Manager Quick Start (Async)

import asyncio
from neuroca.memory.manager import MemoryManager, MemoryTier


async def main() -> None:
    manager = MemoryManager()
    await manager.initialize()

    await manager.add_memory(
        tier=MemoryTier.WORKING,
        content="Remember to announce the beta release",
        metadata={"tags": ["release", "beta"], "importance": 0.9},
    )

    results = await manager.search_memories(
        query="beta release announcement",
        tier=MemoryTier.SEMANTIC,
        limit=5,
    )

    for memory in results:
        print(memory.content, memory.metadata)

    await manager.shutdown()


asyncio.run(main())

Development

Setting Up Development Environment

# Install development dependencies (includes test extras)
poetry install --with dev,test

# Set up pre-commit hooks
pre-commit install

Running Tests

# Run all tests
pytest -q

# Run specific test modules
pytest tests/memory/

Code Quality

# Run linting
ruff check

# Verify formatting
black --check .

# Run type checking
mypy --hide-error-context --no-error-summary src

Documentation

In repo: _Neuroca/docs/

Comprehensive documentation is also available in the docs/ directory and can be built and served locally using MkDocs:

# Navigate to the docs directory within the Neuroca project
cd Neuroca/docs

# Build and serve the documentation site (requires mkdocs installed - use `poetry install --with dev` if using Poetry)
mkdocs serve

The documentation site will typically be available at http://127.0.0.1:8000. Note: The mkdocs.yml configuration file may currently be out of sync with the actual documentation file structure and require updates to build correctly. Refer to the mkdocs.yml file for configuration details.

Contributing

We welcome contributions to the NeuroCognitive Architecture project! Please see CONTRIBUTING.md for guidelines on how to contribute.

Development Workflow

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Roadmap

  • Phase 1: Core memory system implementation
  • Phase 2: Health dynamics integration
  • Phase 3: Advanced biological components
  • Phase 4: LLM integration optimization
  • Phase 5: Performance tuning and scaling

Future Directions & Integrations

The NCA project is actively evolving. Key areas for future development include:

  • Advanced Cognitive Features: Implementing more nuanced cognitive functions like attention mechanisms, emotional modeling, and more sophisticated learning algorithms.
  • Performance Optimization: Continuously improving the speed and efficiency of memory operations, particularly for large-scale deployments. This includes optimizing database interactions, refining consolidation algorithms, and exploring hardware acceleration options.
  • Benchmarking: Conducting rigorous benchmarks comparing NCA's performance (speed, scalability, retrieval accuracy, context quality) against other popular memory systems and RAG frameworks.
  • Expanded Backend Support: Adding support for more database and vector store backends (e.g., PostgreSQL, Redis, specialized vector databases like Weaviate or Pinecone) to provide greater flexibility.
  • Framework Compatibility & Integrations:
    • LangChain: Enhancing the native integration with LangChain, providing seamless compatibility with LangChain's ecosystem of tools and agents. We aim to offer NCA as a sophisticated, stateful memory alternative within LangChain workflows.
    • Other Frameworks: Exploring integrations with other popular AI/LLM frameworks (e.g., LlamaIndex, Haystack) to broaden NCA's applicability.
  • Enhanced Tooling: Developing better tools for monitoring memory state, debugging cognitive processes, and visualizing memory dynamics.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Performance & Benchmarking

For detailed performance comparisons and benchmarks demonstrating Neuroca’s permanent, reliable, accurate, and fast memory capabilities, see the standalone Neuroca Benchmarks Repository.

Acknowledgments

  • Cognitive science research that inspired this architecture
  • Mary Shelley's "Frankenstein; or, The Modern Prometheus"
  • The open-source AI community
  • Contributors and early adopters

Author

  • Justin Lietz - Initial work & Lead Developer
  • Justin's first prototype of the Autonomous Project Generator, built with Claude 3.7 Sonnet (which produced over 70% of the codebase in one prompt)

Contact

For questions, feedback, or collaboration opportunities, please open an issue on this repository or contact Justin Lietz at jlietz93@gmail.com.


Note: NeuroCognitive Architecture is currently in ALPHA. Features and interfaces may change significantly as the project develops. We warmly welcome ANY and ALL feedback, bug reports, and feature requests via GitHub Issues. Your input is invaluable as we work towards integrating NCA into the Apex-CodeGenesis VSCode Extension and replacing the existing Agno memory system within the Cogito platform.


Last updated: April 15, 2025

Demo Script

Run a minimal demo to insert and search a memory (clean, no warnings):

python scripts/basic_memory_test.py

Production via Docker

The Docker image defaults to production settings (ENV/NCA_ENV=production). A production configuration file is provided at config/production.yaml.

Build and run:

docker build -t neuroca:1.0.0 .
docker run --rm -p 8000:8000 neuroca:1.0.0

Prometheus metrics are disabled by default in the demo and can be enabled via configuration in production.

Deploy with Docker Compose (Production-ready quick start)

Run Neuroca with Postgres using the provided compose file:

docker compose -f docker-compose.agent.yml up -d --build
curl -sf http://localhost:8000/health

Environment defaults select production settings and wire the DB connection. Health checks are enabled. For a multi-day soak, see docs/operations/runbooks/soak-test.md.
