-
Notifications
You must be signed in to change notification settings - Fork 3
feat(memory): implement scope and acceptance criteria #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
34d7e2a
feat(memory): implement P0 scope and acceptance criteria
CJBshuosi 78ed180
fix(memory): use SessionProvider instead of DbProvider
CJBshuosi 978fdd6
fix(ci): add python-multipart and fix collector imports
CJBshuosi c3bb7e3
fix: remove SessionProvider arg from MemoryMetricCollector calls
CJBshuosi a6e2ad3
fix(ci): add gitpython to requirements-ci.txt
CJBshuosi eaec917
fix(ci): add requests to requirements-ci.txt
CJBshuosi 6d118c6
fix(ci): add jinja2 to requirements-ci.txt
CJBshuosi 995c0ce
fix(ci): add missing dependencies to requirements-ci.txt
CJBshuosi 982cb19
test(memory): add unit and integration tests for acceptance criteria
CJBshuosi File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,144 @@ | ||
| # Memory Namespace and Isolation Strategy | ||
|
|
||
| This document describes the isolation boundaries and query resolution rules implemented in the memory system. | ||
|
|
||
| > **Reference Implementation:** `src/paperbot/infrastructure/stores/memory_store.py` | ||
|
|
||
| ## Isolation Hierarchy | ||
|
|
||
| Memory items are isolated using a three-level namespace hierarchy, as defined in `memory_todo.md`: | ||
|
|
||
| ``` | ||
| ┌─────────────────────────────────────────────────────────────────┐ | ||
| │ Level 1: user_id (required) │ | ||
| │ ┌───────────────────────────────────────────────────────────┐ │ | ||
| │ │ Level 2: workspace_id (optional) │ │ | ||
| │ │ ┌─────────────────────────────────────────────────────┐ │ │ | ||
| │ │ │ Level 3: scope_type:scope_id (optional) │ │ │ | ||
| │ │ └─────────────────────────────────────────────────────┘ │ │ | ||
| │ └───────────────────────────────────────────────────────────┘ │ | ||
| └─────────────────────────────────────────────────────────────────┘ | ||
| ``` | ||
|
|
||
| ## Level 1: user_id (Required) | ||
|
|
||
| The primary isolation boundary. Cross-user access is forbidden. | ||
|
|
||
| **Implementation:** Every query in `memory_store.py` starts with: | ||
| ```python | ||
| stmt = select(MemoryItemModel).where(MemoryItemModel.user_id == user_id) | ||
| ``` | ||
|
|
||
| ## Level 2: workspace_id (Optional) | ||
|
|
||
| Team or project-level isolation within a user's memories. | ||
|
|
||
| **Implementation:** `memory_store.py` line 293-294: | ||
| ```python | ||
| if workspace_id is not None: | ||
| stmt = stmt.where(MemoryItemModel.workspace_id == workspace_id) | ||
| ``` | ||
|
|
||
| **Behavior:** | ||
| - If `workspace_id` is set on a memory, it is only visible when querying with the same `workspace_id` | ||
| - Memories with `workspace_id=NULL` are user-global | ||
|
|
||
| ## Level 3: scope_type:scope_id (Optional) | ||
|
|
||
| Fine-grained isolation for research tracks, projects, or papers. | ||
|
|
||
| **Implementation:** `memory_store.py` lines 296-301: | ||
| ```python | ||
| if scope_type is not None: | ||
| if scope_type == "global": | ||
| stmt = stmt.where(or_(MemoryItemModel.scope_type == scope_type, MemoryItemModel.scope_type.is_(None))) | ||
| else: | ||
| stmt = stmt.where(MemoryItemModel.scope_type == scope_type) | ||
| if scope_id is not None: | ||
| stmt = stmt.where(MemoryItemModel.scope_id == scope_id) | ||
| ``` | ||
|
|
||
| **Scope Types:** | ||
|
|
||
| | scope_type | scope_id | Usage | | ||
| |------------|----------|-------| | ||
| | `global` | NULL | User-wide memories, visible across all tracks | | ||
| | `track` | track_id | Research direction specific | | ||
| | `project` | project_id | Project-specific context | | ||
| | `paper` | paper_id | Paper-specific notes | | ||
|
|
||
| ## Content Hash Deduplication | ||
|
|
||
| To prevent duplicates while allowing the same content in different scopes, the content hash includes scope information. | ||
|
|
||
| **Implementation:** `memory_store.py` lines 168-171: | ||
| ```python | ||
| # Dedup within a user's scope boundary (prevents cross-track pollution/dedup collisions). | ||
| content_hash = _sha256_text( | ||
| f"{effective_scope_type}:{effective_scope_id or ''}:{m.kind}:{content}" | ||
| ) | ||
| ``` | ||
|
|
||
| ## Status Filtering | ||
|
|
||
| Queries only return approved, non-deleted, non-expired items. | ||
|
|
||
| **Implementation:** `memory_store.py` lines 303-305: | ||
| ```python | ||
| stmt = stmt.where(MemoryItemModel.deleted_at.is_(None)) | ||
| stmt = stmt.where(MemoryItemModel.status == "approved") | ||
| stmt = stmt.where(or_(MemoryItemModel.expires_at.is_(None), MemoryItemModel.expires_at > now)) | ||
| ``` | ||
|
|
||
| ## Auto-Status Assignment | ||
|
|
||
| Status is automatically set based on confidence and PII risk. | ||
|
|
||
| **Implementation:** `memory_store.py` lines 174-178: | ||
| ```python | ||
| effective_status = "approved" if float(m.confidence) >= 0.60 else "pending" | ||
| pii_risk = _estimate_pii_risk(content) | ||
| if pii_risk >= 2 and effective_status == "approved": | ||
| effective_status = "pending" | ||
| ``` | ||
|
|
||
| **Rules:** | ||
| - `confidence >= 0.60` → `approved` | ||
| - `confidence < 0.60` → `pending` | ||
| - `pii_risk >= 2` → override to `pending` (requires human review) | ||
|
|
||
| ## Provider as Metadata | ||
|
|
||
| The source platform (ChatGPT/Gemini/Claude) is stored in `memory_sources.platform`, not in `memory_items`. | ||
|
|
||
| **Rationale:** From `memory_survey.md` section 6.1: | ||
| > 建议拆成三张逻辑表: | ||
| > - `memory_sources`:导入批次(平台、文件名、sha256、统计信息) | ||
| > - `memory_items`:稳定条目 | ||
|
|
||
| Provider is for provenance tracking, not query filtering. | ||
|
|
||
| ## Audit Trail | ||
|
|
||
| All mutations are logged in `memory_audit_log`. | ||
|
|
||
| **Implementation:** `memory_store.py` lines 205-220: | ||
| ```python | ||
| session.add( | ||
| MemoryAuditLogModel( | ||
| ts=now, | ||
| actor_id=actor_id, | ||
| user_id=user_id, | ||
| workspace_id=workspace_id, | ||
| action="create", | ||
| item_id=row.id, | ||
| ... | ||
| ) | ||
| ) | ||
| ``` | ||
|
|
||
| ## References | ||
|
|
||
| - Requirements: `docs/memory_todo.md` P0 section | ||
| - Survey: `docs/memory_survey.md` section 6 | ||
| - Implementation: `src/paperbot/infrastructure/stores/memory_store.py` | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,141 @@ | ||
| # Memory Type Taxonomy | ||
|
|
||
| This document defines the formal taxonomy for memory types in PaperBot's cross-platform memory middleware. | ||
|
|
||
| ## Overview | ||
|
|
||
| Memory items are categorized into three main categories based on their purpose and lifecycle: | ||
|
|
||
| 1. **User Memory** - Stable, cross-session information about the user | ||
| 2. **Episodic Memory** - Session-derived facts with potential decay | ||
| 3. **Workspace/Project Memory** - Scope-specific context | ||
|
|
||
| ## Memory Type Definitions | ||
|
|
||
| ### User Memory | ||
|
|
||
| Long-term, stable information about the user that persists across sessions. | ||
|
|
||
| | Kind | Definition | Persistence | Confidence Threshold | Examples | | ||
| |------|------------|-------------|---------------------|----------| | ||
| | `profile` | Identity facts: name, role, affiliation, background | Permanent until edited | 0.85 | "My name is Jerry", "I'm a PhD student at MIT" | | ||
| | `preference` | Style, format, or interaction preferences | Semi-permanent | 0.72 | "I prefer concise answers", "Use Chinese for responses" | | ||
| | `goal` | Long-term objectives or research directions | Session-spanning | 0.68 | "I'm researching LLM memory systems", "Working on a paper about RAG" | | ||
| | `constraint` | Hard rules or requirements that must be followed | Permanent | 0.62 | "Never include code examples", "Always cite sources" | | ||
| | `project` | Project-level context and background | Project-scoped | 0.70 | "PaperBot is a research workflow tool", "The codebase uses FastAPI" | | ||
|
|
||
| ### Episodic Memory | ||
|
|
||
| Information derived from specific sessions or interactions, subject to decay or update. | ||
|
|
||
| | Kind | Definition | Persistence | Confidence Threshold | Examples | | ||
| |------|------------|-------------|---------------------|----------| | ||
| | `fact` | Specific facts mentioned during interaction | Decaying (use_count tracked) | 0.65 | "User's deadline is Friday", "Paper was submitted to NeurIPS" | | ||
| | `decision` | Key decisions made during sessions | Project-scoped | 0.70 | "We chose SQLite over Postgres", "Using Docker for sandboxing" | | ||
| | `hypothesis` | Tentative assumptions or inferences | Low-priority, may expire | 0.50 | "User may be a graduate student", "Likely interested in NLP" | | ||
| | `todo` | Action items or pending tasks | Time-bound (expires_at) | 0.58 | "Need to implement FTS5", "Review the memory module" | | ||
|
|
||
| ### Workspace/Project Memory | ||
|
|
||
| Scope-specific information tied to a particular workspace or research track. | ||
|
|
||
| | Kind | Definition | Persistence | Confidence Threshold | Examples | | ||
| |------|------------|-------------|---------------------|----------| | ||
| | `keyword_set` | Domain-specific keywords or focus areas | Project-scoped | 0.75 | "Focus areas: RAG, memory, multi-agent" | | ||
| | `note` | Free-form annotations or observations | Project-scoped | 0.60 | "Important: check GDPR compliance", "Consider using vector DB" | | ||
|
|
||
| ## What is NOT Memory | ||
|
|
||
| The following are explicitly **excluded** from the memory system: | ||
|
|
||
| | Item | Reason | Where It Belongs | | ||
| |------|--------|------------------| | ||
| | **Context Caching** | Performance optimization, not long-term storage | LLM provider layer (prompt caching) | | ||
| | **Code Index** | Repository structure, symbols, dependencies | `context_engine/` or `code_index` subsystem | | ||
| | **Search Results** | Ephemeral retrieval results | Not persisted; used immediately | | ||
| | **Embeddings** | Vector representations of content | Separate `memory_embeddings` table (future) | | ||
| | **Raw Conversation History** | Unprocessed message logs | `memory_sources` table (input, not output) | | ||
| | **Temporary Variables** | Session-only working memory | In-memory only, not persisted | | ||
|
|
||
| ## Status Workflow | ||
|
|
||
| Memory items follow a status workflow based on confidence and risk: | ||
|
|
||
| ``` | ||
| Extracted (MemoryCandidate) | ||
| │ | ||
| ├── confidence >= 0.60 AND pii_risk < 2 | ||
| │ │ | ||
| │ ▼ | ||
| │ ┌─────────────┐ | ||
| │ │ approved │ ──── Can be retrieved and injected | ||
| │ └─────────────┘ | ||
| │ | ||
| └── confidence < 0.60 OR pii_risk >= 2 | ||
| │ | ||
| ▼ | ||
| ┌─────────────┐ | ||
| │ pending │ ──── Requires human review | ||
| └─────────────┘ | ||
| │ | ||
| ┌─────────┴─────────┐ | ||
| │ │ | ||
| ▼ ▼ | ||
| ┌─────────────┐ ┌─────────────┐ | ||
| │ approved │ │ rejected │ | ||
| └─────────────┘ └─────────────┘ | ||
| ``` | ||
|
|
||
| ## Confidence Score Guidelines | ||
|
|
||
| When implementing extractors, use these confidence score ranges: | ||
|
|
||
| | Score Range | Meaning | Action | | ||
| |-------------|---------|--------| | ||
| | 0.85 - 1.00 | High confidence, explicit statement | Auto-approve | | ||
| | 0.70 - 0.84 | Good confidence, clear inference | Auto-approve | | ||
| | 0.60 - 0.69 | Moderate confidence | Auto-approve (borderline) | | ||
| | 0.50 - 0.59 | Low confidence | Pending (needs review) | | ||
| | 0.00 - 0.49 | Very low confidence | Pending or reject | | ||
|
|
||
| ## PII Risk Levels | ||
|
|
||
| | Level | Description | Behavior | | ||
| |-------|-------------|----------| | ||
| | 0 | No PII detected | Normal processing | | ||
| | 1 | Possible PII (redacted patterns found) | Normal, but flagged | | ||
| | 2 | High PII risk (raw PII detected) | Auto-pending, requires review | | ||
|
|
||
| ## Usage in Code | ||
|
|
||
| ```python | ||
| from paperbot.memory.schema import MemoryKind, MemoryCandidate | ||
|
|
||
| # Valid memory kinds | ||
| valid_kinds = [ | ||
| # User Memory | ||
| "profile", "preference", "goal", "constraint", "project", | ||
| # Episodic Memory | ||
| "fact", "decision", "hypothesis", "todo", | ||
| # Workspace Memory | ||
| "keyword_set", "note" | ||
| ] | ||
|
|
||
| # Creating a memory candidate | ||
| candidate = MemoryCandidate( | ||
| kind="preference", | ||
| content="User prefers concise answers", | ||
| confidence=0.75, | ||
| tags=["style", "output"], | ||
| evidence="From conversation: 'Please keep responses brief'", | ||
| scope_type="global", | ||
| scope_id=None, | ||
| status="approved" # Will be set by store based on confidence | ||
| ) | ||
| ``` | ||
|
|
||
| ## References | ||
|
|
||
| - Design Document: `docs/P0_memory_scope_acceptance_design.md` | ||
| - Schema Definition: `src/paperbot/memory/schema.py` | ||
| - Store Implementation: `src/paperbot/infrastructure/stores/memory_store.py` |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| """ | ||
| Memory evaluation tests for Scope and Acceptance criteria. | ||
|
|
||
| Test files: | ||
| - test_deletion_compliance.py: Verify deleted items never retrieved | ||
| - test_retrieval_hit_rate.py: Verify relevant memories are found | ||
| """ |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| { | ||
| "description": "Synthetic test queries for retrieval hit rate evaluation", | ||
| "test_cases": [ | ||
| { | ||
| "id": "pref_output_style", | ||
| "query": "output format preference", | ||
| "expected_kinds": ["preference"], | ||
| "expected_keywords": ["concise", "brief", "format"] | ||
| }, | ||
| { | ||
| "id": "project_context", | ||
| "query": "current project", | ||
| "expected_kinds": ["project"], | ||
| "expected_keywords": ["project", "working on", "building"] | ||
| }, | ||
| { | ||
| "id": "user_identity", | ||
| "query": "user name role", | ||
| "expected_kinds": ["profile"], | ||
| "expected_keywords": ["name", "role", "position"] | ||
| }, | ||
| { | ||
| "id": "constraints", | ||
| "query": "rules restrictions", | ||
| "expected_kinds": ["constraint"], | ||
| "expected_keywords": ["never", "always", "must", "don't"] | ||
| }, | ||
| { | ||
| "id": "goals", | ||
| "query": "goals objectives", | ||
| "expected_kinds": ["goal"], | ||
| "expected_keywords": ["goal", "want", "trying to", "working towards"] | ||
| } | ||
| ] | ||
| } |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| { | ||
| "description": "Sample memories for testing extraction and retrieval", | ||
| "user_id": "test_user", | ||
| "memories": [ | ||
| { | ||
| "kind": "profile", | ||
| "content": "User's name is Jerry, a PhD student at MIT", | ||
| "confidence": 0.85, | ||
| "tags": ["identity", "name", "role"] | ||
| }, | ||
| { | ||
| "kind": "preference", | ||
| "content": "User prefers concise, brief responses without excessive explanation", | ||
| "confidence": 0.75, | ||
| "tags": ["style", "output", "format"] | ||
| }, | ||
| { | ||
| "kind": "project", | ||
| "content": "Currently working on PaperBot, a research workflow tool", | ||
| "confidence": 0.80, | ||
| "tags": ["project", "paperbot"] | ||
| }, | ||
| { | ||
| "kind": "constraint", | ||
| "content": "Never include code examples unless explicitly requested", | ||
| "confidence": 0.70, | ||
| "tags": ["rules", "code"] | ||
| }, | ||
| { | ||
| "kind": "goal", | ||
| "content": "Working towards building a memory middleware for LLM applications", | ||
| "confidence": 0.72, | ||
| "tags": ["goal", "memory", "llm"] | ||
| }, | ||
| { | ||
| "kind": "fact", | ||
| "content": "Project deadline is end of Q1 2025", | ||
| "confidence": 0.65, | ||
| "tags": ["deadline", "timeline"] | ||
| } | ||
| ] | ||
| } |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a discrepancy between the query logic described in the main design document (
P0_memory_scope_acceptance_design.md) and the implementation snippet shown here. The design doc states that queries for a specific scope (e.g.,track) should also includeglobalitems. However, the snippet here (and the actual implementation) only filters for the specificscope_type, excluding global items. This could lead to unexpected behavior where global memories are not retrieved when working within a specific track. The documentation and implementation should be aligned with the intended design.