78 changes: 63 additions & 15 deletions README.md
@@ -9,9 +9,9 @@
| Module | Description |
|------|------|
| **Topic Search** | Multi-topic aggregated retrieval across three sources (papers.cool + arXiv API + Hugging Face Daily Papers), cross-query/branch deduplication and score-based ranking, `min_score` quality filtering |
| **DailyPaper** | Daily report generation (Markdown/JSON), optional LLM enrichment (summaries/trends/insights/relevance), scheduled delivery (Email/Slack/DingTalk) |
| **LLM-as-Judge** | 5-dimension scoring (Relevance/Novelty/Rigor/Impact/Clarity) + recommendation tiers (must_read/worth_reading/skim/skip), token-budget control, multi-round calibration |
| **Analyze SSE** | Judge + Trend analysis streamed in real time over SSE, incremental rendering on the frontend (Judge cards one by one / Trend analyses item by item) |
| **DailyPaper** | Daily report generation (Markdown/JSON), full pipeline progress streamed in real time over SSE, optional LLM enrichment (summaries/trends/insights/relevance), automatic filtering of low-quality papers after Judge scoring, scheduled delivery (Email/Slack/DingTalk) |
| **LLM-as-Judge** | 5-dimension scoring (Relevance/Novelty/Rigor/Impact/Clarity) + recommendation tiers (must_read/worth_reading/skim/skip), token-budget control, multi-round calibration, automatic filtering of skip/skim papers after scoring |
| **Analyze SSE** | Judge + Trend analysis streamed in real time over SSE, incremental rendering on the frontend (Judge cards one by one / Trend analyses item by item), full Judge log retained |
| **Scholar Tracking** | Periodic monitoring of scholars' papers, multi-agent collaboration (Research/Code/Quality/Reviewer), PIS impact scoring (citation velocity, trend momentum) |
| **Deep Review** | Simulated peer review (screening → in-depth critique → decision), outputs Summary/Strengths/Weaknesses/Novelty Score |
| **Paper2Code** | Paper-to-code skeleton (Planning→Analysis→Generation→Verification), self-healing debugging, Docker/E2B sandboxed execution |
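
For orientation, a Judge result can be pictured as a small record like the sketch below, mirroring the five dimensions and recommendation tiers listed above (and the `paper_judge_scores` columns added later in this PR). The class name and types are assumptions, not the project's actual model:

```python
from dataclasses import dataclass


@dataclass
class JudgeScore:
    # The five scoring dimensions from the table above.
    relevance: float
    novelty: float
    rigor: float
    impact: float
    clarity: float
    overall: float
    # One of: must_read / worth_reading / skim / skip.
    recommendation: str
    one_line_summary: str = ""
```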
@@ -82,6 +82,24 @@ Input Queries ──→ ├─── arXiv API (relevance sort)
└── Web UI (DAG + Tabs: Papers / Insights / Judge)
```

### DailyPaper SSE Streaming Pipeline

When LLM analysis or Judge scoring is enabled, the `/daily` endpoint returns an SSE streaming response, and the frontend displays the progress of each stage in real time:

```text
Search → Build Report → LLM Enrichment → Judge Scoring → Filter → Save → Notify → Result
  │           │               │                │            │
  │           │               │                │            └─ remove skip/skim papers
  │           │               │                └─ score each paper, push judge events in real time
  │           │               └─ per-paper summaries + trend analysis + insights
  │           └─ assemble the report structure
  └─ multi-source retrieval + dedup + scoring
```
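
As a rough illustration of consuming that stream outside the Web UI, here is a minimal sketch using `httpx`; the request fields and event payload keys are assumptions, not a documented contract:

```python
import json

import httpx  # third-party HTTP client: pip install httpx

# Hypothetical request body: topic queries plus the toggles that switch /daily into SSE mode.
payload = {"queries": ["LLM agents"], "use_llm": True, "use_judge": True}

with httpx.stream(
    "POST",
    "http://localhost:8000/api/research/paperscool/daily",
    json=payload,
    timeout=None,  # the pipeline can run for minutes
) as resp:
    for line in resp.iter_lines():
        # SSE frames arrive as "event: <name>" / "data: <json>" pairs.
        if line.startswith("data:"):
            event = json.loads(line[len("data:"):].strip())
            # e.g. {"stage": "judge", "message": "..."} — field names are illustrative only.
            print(event.get("stage"), event.get("message", ""))
```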

**Post-Judge filtering**: after Judge scoring completes, papers with a recommendation of `skip` or `skim` are automatically removed, keeping only `must_read` and `worth_reading` papers. The full Judge scoring log is retained in `report.filter.log`.
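
The filtering rule itself is straightforward; a minimal sketch (the `judge`/`recommendation` field names are assumed for illustration and may not match the real report schema):

```python
KEEP = {"must_read", "worth_reading"}


def filter_by_recommendation(papers: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split Judge-scored papers into (kept, dropped) by recommendation tier."""
    kept: list[dict] = []
    dropped: list[dict] = []
    for paper in papers:
        rec = paper.get("judge", {}).get("recommendation")
        (kept if rec in KEEP else dropped).append(paper)
    return kept, dropped
```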

**Frontend config persistence**: all feature toggles (LLM/Judge/data sources/email, etc.) are enabled by default and saved in the browser's localStorage, so they survive page refreshes.

## UI Preview

### Terminal UI (Ink)
@@ -114,6 +132,10 @@ Input Queries ──→ ├─── arXiv API (relevance sort)
|---------------|-----------------|
| ![Judge Cards](asset/ui/9-4.png) | ![Judge Radar](asset/ui/9-5.png) |

### Email Notification

![Email Notification](asset/notify.png)

## Quick Start

### 1) Install
@@ -166,17 +188,34 @@ LLM_REASONING_MODEL=...
<details>
<summary>Daily notification configuration (click to expand)</summary>

```bash
# Notification channels
PAPERBOT_NOTIFY_ENABLED=true
PAPERBOT_NOTIFY_CHANNELS=email,slack,dingding
After a DailyPaper report is generated, a summary can be pushed automatically to Email/Slack/DingTalk. There are two ways to configure this:

**Option 1: Web UI configuration (recommended)**

In the Settings panel of the Topic Workflow page:
1. Check "Email Notification"
2. Enter the recipient email address (e.g. `you@example.com`)
3. When DailyPaper runs, the email is sent automatically at the end

> The email address entered in the UI overrides `PAPERBOT_NOTIFY_EMAIL_TO` from the environment variables.
> All settings (LLM/Judge/data sources/email, etc.) are automatically persisted to the browser's localStorage and survive page refreshes.

# Email (SMTP)
PAPERBOT_NOTIFY_SMTP_HOST=smtp.example.com
PAPERBOT_NOTIFY_SMTP_USERNAME=...
PAPERBOT_NOTIFY_SMTP_PASSWORD=...
PAPERBOT_NOTIFY_EMAIL_FROM=bot@example.com
PAPERBOT_NOTIFY_EMAIL_TO=you@example.com
**Option 2: environment variable configuration**

```bash
# Master switch
PAPERBOT_NOTIFY_ENABLED=true                   # enable notifications (must be true for anything to be sent)
PAPERBOT_NOTIFY_CHANNELS=email,slack           # enabled channels (comma-separated)

# Email (SMTP) — required before any email can be sent
PAPERBOT_NOTIFY_SMTP_HOST=smtp.qq.com          # SMTP server address
PAPERBOT_NOTIFY_SMTP_PORT=587                  # SMTP port (587 = STARTTLS, 465 = SSL)
PAPERBOT_NOTIFY_SMTP_USERNAME=your@qq.com      # SMTP login username
PAPERBOT_NOTIFY_SMTP_PASSWORD=your-auth-code   # SMTP password or authorization code
PAPERBOT_NOTIFY_SMTP_USE_TLS=true              # use STARTTLS (true for port 587)
PAPERBOT_NOTIFY_SMTP_USE_SSL=false             # use SSL (true for port 465)
PAPERBOT_NOTIFY_EMAIL_FROM=your@qq.com         # sender address
PAPERBOT_NOTIFY_EMAIL_TO=recipient@example.com # default recipient (can be overridden by the UI)

# Slack
PAPERBOT_NOTIFY_SLACK_WEBHOOK_URL=https://hooks.slack.com/...
@@ -185,14 +224,23 @@
PAPERBOT_NOTIFY_DINGTALK_WEBHOOK_URL=https://oapi.dingtalk.com/robot/send?access_token=...
PAPERBOT_NOTIFY_DINGTALK_SECRET=SEC...

# DailyPaper scheduled task
# DailyPaper scheduled task (ARQ Worker)
PAPERBOT_DAILYPAPER_ENABLED=true
PAPERBOT_DAILYPAPER_CRON_HOUR=8
PAPERBOT_DAILYPAPER_CRON_MINUTE=30
PAPERBOT_DAILYPAPER_NOTIFY_ENABLED=true
PAPERBOT_DAILYPAPER_NOTIFY_CHANNELS=email,slack
```
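
The cron hour/minute values above drive the ARQ worker's schedule. A minimal sketch of how such a job is typically wired up with arq's cron API (the job function and settings class here are illustrative, not the project's actual worker module):

```python
from arq import cron


async def run_dailypaper(ctx):
    # Placeholder for the real DailyPaper job; ctx is arq's per-job context dict.
    ...


class WorkerSettings:
    # Matches PAPERBOT_DAILYPAPER_CRON_HOUR=8 / PAPERBOT_DAILYPAPER_CRON_MINUTE=30 above.
    cron_jobs = [cron(run_dailypaper, hour=8, minute=30)]
```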

**QQ Mail configuration example:**
1. Log in to QQ Mail → Settings → Account → enable the POP3/SMTP service
2. Generate an authorization code (this is not your QQ password)
3. Set `SMTP_HOST=smtp.qq.com`, `SMTP_PORT=587`, `SMTP_USE_TLS=true`

**Gmail configuration example:**
1. Google Account → Security → 2-Step Verification → App passwords
2. Set `SMTP_HOST=smtp.gmail.com`, `SMTP_PORT=587`, `SMTP_USE_TLS=true`
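
Before wiring credentials into the bot, it can help to verify them with a quick standalone check using Python's standard library; the addresses below mirror the placeholders above:

```python
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "paperbot SMTP test"
msg["From"] = "your@qq.com"          # PAPERBOT_NOTIFY_EMAIL_FROM
msg["To"] = "recipient@example.com"  # PAPERBOT_NOTIFY_EMAIL_TO
msg.set_content("If you can read this, the SMTP settings work.")

# Port 587 + STARTTLS, matching SMTP_USE_TLS=true above.
with smtplib.SMTP("smtp.qq.com", 587) as server:
    server.starttls()
    server.login("your@qq.com", "your-auth-code")  # authorization code, not the account password
    server.send_message(msg)
```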

</details>

### 3) Start
@@ -229,7 +277,7 @@ arq paperbot.infrastructure.queue.arq_worker.WorkerSettings
| `/api/review` | POST | Deep review (SSE) |
| `/api/chat` | POST | AI chat (SSE) |
| `/api/research/paperscool/search` | POST | Topic search (multi-source aggregation, supports `min_score` filtering) |
| `/api/research/paperscool/daily` | POST | DailyPaper report (supports `notify` delivery) |
| `/api/research/paperscool/daily` | POST | DailyPaper report (SSE stream when LLM/Judge is enabled, otherwise JSON; supports `notify` delivery) |
| `/api/research/paperscool/analyze` | POST | Judge + Trend streaming analysis (SSE) |
| `/api/research/tracks` | GET/POST | Research track management |
| `/api/research/memory/*` | GET/POST | Memory system (Inbox/review/retrieval) |
89 changes: 89 additions & 0 deletions alembic/versions/0003_paper_registry.py
@@ -0,0 +1,89 @@
"""paper registry

Revision ID: 0003_paper_registry
Revises: 0002_research_eval_runs
Create Date: 2026-02-10

Adds canonical papers table for persistent DailyPaper ingestion.
"""

from __future__ import annotations

import sqlalchemy as sa
from alembic import context, op


revision = "0003_paper_registry"
down_revision = "0002_research_eval_runs"
branch_labels = None
depends_on = None


def _is_offline() -> bool:
    # Offline mode ("alembic upgrade --sql") has no live connection, so the
    # inspection helpers below cannot run and we fall back to unconditional DDL.
    try:
        return bool(context.is_offline_mode())
    except Exception:
        return False


def _insp():
    # Inspector bound to the current migration connection.
    return sa.inspect(op.get_bind())


def _has_table(name: str) -> bool:
    return _insp().has_table(name)


def _get_indexes(table: str) -> set[str]:
    idx = set()
    for i in _insp().get_indexes(table):
        idx.add(str(i.get("name") or ""))
    return idx


def _create_index(name: str, table: str, cols: list[str]) -> None:
    # Idempotent index creation: skip indexes that already exist on a live database.
    if _is_offline():
        op.create_index(name, table, cols)
        return
    if name in _get_indexes(table):
        return
    op.create_index(name, table, cols)


def upgrade() -> None:
    if _is_offline() or not _has_table("papers"):
        op.create_table(
            "papers",
            sa.Column("id", sa.Integer(), primary_key=True, autoincrement=True),
            sa.Column("arxiv_id", sa.String(length=64), nullable=True),
            sa.Column("doi", sa.String(length=128), nullable=True),
            sa.Column("title", sa.Text(), server_default="", nullable=False),
            sa.Column("authors_json", sa.Text(), server_default="[]", nullable=False),
            sa.Column("abstract", sa.Text(), server_default="", nullable=False),
            sa.Column("url", sa.String(length=512), server_default="", nullable=False),
            sa.Column("external_url", sa.String(length=512), server_default="", nullable=False),
            sa.Column("pdf_url", sa.String(length=512), server_default="", nullable=False),
            sa.Column("source", sa.String(length=32), server_default="papers_cool", nullable=False),
            sa.Column("venue", sa.String(length=256), server_default="", nullable=False),
            sa.Column("published_at", sa.DateTime(timezone=True), nullable=True),
            sa.Column("first_seen_at", sa.DateTime(timezone=True), nullable=False),
            sa.Column("keywords_json", sa.Text(), server_default="[]", nullable=False),
            sa.Column("metadata_json", sa.Text(), server_default="{}", nullable=False),
            sa.Column("created_at", sa.DateTime(timezone=True), nullable=False),
            sa.Column("updated_at", sa.DateTime(timezone=True), nullable=False),
            sa.UniqueConstraint("arxiv_id", name="uq_papers_arxiv_id"),
            sa.UniqueConstraint("doi", name="uq_papers_doi"),
        )

    _create_index("ix_papers_arxiv_id", "papers", ["arxiv_id"])
    _create_index("ix_papers_doi", "papers", ["doi"])
    _create_index("ix_papers_title", "papers", ["title"])
    _create_index("ix_papers_source", "papers", ["source"])
    _create_index("ix_papers_published_at", "papers", ["published_at"])
    _create_index("ix_papers_first_seen_at", "papers", ["first_seen_at"])
    _create_index("ix_papers_created_at", "papers", ["created_at"])
    _create_index("ix_papers_updated_at", "papers", ["updated_at"])


def downgrade() -> None:
    op.drop_table("papers")
112 changes: 112 additions & 0 deletions alembic/versions/0004_paper_feedback_judge_links.py
@@ -0,0 +1,112 @@
"""paper feedback/judge links

Revision ID: 0004_paper_feedback_judge_links
Revises: 0003_paper_registry
Create Date: 2026-02-10

Adds:
- paper_judge_scores table
- paper_feedback.paper_ref_id nullable FK-like reference column
"""

from __future__ import annotations

import sqlalchemy as sa
from alembic import context, op


revision = "0004_paper_feedback_judge_links"
down_revision = "0003_paper_registry"
branch_labels = None
depends_on = None


def _is_offline() -> bool:
    try:
        return bool(context.is_offline_mode())
    except Exception:
        return False


def _insp():
    return sa.inspect(op.get_bind())


def _has_table(name: str) -> bool:
    return _insp().has_table(name)


def _get_columns(table: str) -> set[str]:
    cols = set()
    for c in _insp().get_columns(table):
        cols.add(str(c.get("name") or ""))
    return cols


def _get_indexes(table: str) -> set[str]:
    idx = set()
    for i in _insp().get_indexes(table):
        idx.add(str(i.get("name") or ""))
    return idx


def _create_index(name: str, table: str, cols: list[str]) -> None:
    if _is_offline():
        op.create_index(name, table, cols)
        return
    if name in _get_indexes(table):
        return
    op.create_index(name, table, cols)


def upgrade() -> None:
    if _is_offline() or not _has_table("paper_judge_scores"):
        op.create_table(
            "paper_judge_scores",
            sa.Column("id", sa.Integer(), primary_key=True, autoincrement=True),
            sa.Column("paper_id", sa.Integer(), sa.ForeignKey("papers.id"), nullable=False),
            sa.Column("query", sa.String(length=256), server_default="", nullable=False),
            sa.Column("overall", sa.Float(), server_default="0.0", nullable=False),
            sa.Column("relevance", sa.Float(), server_default="0.0", nullable=False),
            sa.Column("novelty", sa.Float(), server_default="0.0", nullable=False),
            sa.Column("rigor", sa.Float(), server_default="0.0", nullable=False),
            sa.Column("impact", sa.Float(), server_default="0.0", nullable=False),
            sa.Column("clarity", sa.Float(), server_default="0.0", nullable=False),
            sa.Column("recommendation", sa.String(length=32), server_default="", nullable=False),
            sa.Column("one_line_summary", sa.Text(), server_default="", nullable=False),
            sa.Column("judge_model", sa.String(length=128), server_default="", nullable=False),
            sa.Column("judge_cost_tier", sa.Integer(), nullable=True),
            sa.Column("scored_at", sa.DateTime(timezone=True), nullable=False),
            sa.Column("metadata_json", sa.Text(), server_default="{}", nullable=False),
            sa.UniqueConstraint("paper_id", "query", name="uq_paper_judge_scores_paper_query"),
        )

    _create_index("ix_paper_judge_scores_paper_id", "paper_judge_scores", ["paper_id"])
    _create_index("ix_paper_judge_scores_query", "paper_judge_scores", ["query"])
    _create_index("ix_paper_judge_scores_recommendation", "paper_judge_scores", ["recommendation"])
    _create_index("ix_paper_judge_scores_scored_at", "paper_judge_scores", ["scored_at"])

    if _is_offline():
        op.add_column("paper_feedback", sa.Column("paper_ref_id", sa.Integer(), nullable=True))
        op.create_index("ix_paper_feedback_paper_ref_id", "paper_feedback", ["paper_ref_id"])
        return

    if "paper_ref_id" not in _get_columns("paper_feedback"):
        with op.batch_alter_table("paper_feedback") as batch_op:
            batch_op.add_column(sa.Column("paper_ref_id", sa.Integer(), nullable=True))

    _create_index("ix_paper_feedback_paper_ref_id", "paper_feedback", ["paper_ref_id"])


def downgrade() -> None:
    with op.batch_alter_table("paper_feedback") as batch_op:
        try:
            batch_op.drop_index("ix_paper_feedback_paper_ref_id")
        except Exception:
            pass
        try:
            batch_op.drop_column("paper_ref_id")
        except Exception:
            pass

    op.drop_table("paper_judge_scores")