A skill for Claude Code (Anthropic's CLI coding agent) that searches the last 30 days of scientific literature and synthesizes findings. No API keys required.
/research30 CRISPR gene editing
/research30 microplastics health effects
/research30 large language model alignment
Claude searches 5 academic databases in parallel, scores and deduplicates results, then synthesizes key findings, trends, and gaps from the top 25 papers.
Prerequisites: Claude Code and Python 3.8+ (no pip packages needed — stdlib only)
1. Clone the repo:
git clone https://github.com/shandley/research30.git
cd research302. Install the skill (run from the repo root):
mkdir -p ~/.claude/skills
ln -s "$(pwd)/research30" ~/.claude/skills/research30This creates a symlink so the skill stays in sync with the repo. The $(pwd) must expand to the repo root (the directory containing research30/, README.md, etc.).
3. Use it:
Start a new Claude Code session (skills are loaded at startup), then:
/research30 your topic here
This works from any directory — the skill is globally installed.
Standalone (no Claude Code): You can also run the script directly from the repo root:
python3 research30/scripts/research30.py "your topic here"This outputs the raw ranked list. Claude Code adds the synthesis layer on top.
| Source | What | Coverage |
|---|---|---|
| OpenAlex | Journals, preprints, conference papers | 250M+ works, topic-augmented full-text search |
| Semantic Scholar | Embedding-based semantic search | Conceptual matching beyond keywords (needs free API key) |
| PubMed | Peer-reviewed journal articles | TIAB field-tagged queries with MeSH metadata |
| arXiv | Physics, math, CS, q-bio preprints | Atom API keyword search |
| HuggingFace | ML models, datasets, daily papers | Hub API |
Additional sources available via --sources=biorxiv or --sources=medrxiv (slower, direct API pagination).
python3 research30/scripts/research30.py <topic> [options]
| Flag | Description |
|---|---|
--quick |
Faster: fewer API results, shows top 10 |
--deep |
Thorough: more API results, shows top 50 |
--refresh |
Bypass 24h cache, fetch fresh results |
--sources=MODE |
all (default), preprints, pubmed, huggingface, openalex, semanticscholar, biorxiv, medrxiv, arxiv |
--emit=MODE |
compact (default), html, json, md, context, path |
--debug |
Verbose HTTP logs to stderr |
--mock |
Use bundled fixtures (for testing) |
# Default: top 25 results from all sources
python3 research30/scripts/research30.py "CRISPR gene editing"
# Quick scan — top 10
python3 research30/scripts/research30.py "protein folding" --quick
# Deep dive — top 50
python3 research30/scripts/research30.py "single-cell transcriptomics" --deep
# PubMed only
python3 research30/scripts/research30.py "Alzheimer's disease biomarkers" --sources=pubmed
# HTML report (opens in browser)
python3 research30/scripts/research30.py "protein folding" --emit=html
# JSON output for programmatic use
python3 research30/scripts/research30.py "protein folding" --emit=jsonBoth keys are optional. The skill works without them.
Semantic Scholar (recommended — adds semantic/embedding search):
# Get a free key at https://www.semanticscholar.org/product/api#api-key-form
mkdir -p ~/.config/research30
echo 'S2_API_KEY=your_key_here' >> ~/.config/research30/.env
chmod 600 ~/.config/research30/.envNCBI (faster PubMed — 10 req/s instead of 3):
# Get a free key at https://www.ncbi.nlm.nih.gov/account/settings/
echo 'NCBI_API_KEY=your_key_here' >> ~/.config/research30/.envSee an example HTML report (download and open in your browser).
Default output is a flat ranked list sorted by score (0-100). Each item includes score, title, source, date, URL, metadata, abstract snippet (first 200 chars), and relevance explanation. When used as a Claude Code skill, Claude reads these and synthesizes key findings, research fronts, methods, and gaps.
Script outputs (written on every run to ~/.local/share/research30/out/):
report.html— interactive HTML with score badges, source tags, collapsible abstractsreport.json— all results (not just top N)report.md— formatted markdowncontext.md— condensed context snippet
Skill outputs (created by Claude when invoked via /research30):
~/.local/share/research30/reports/{topic}-{date}.md— full synthesis report~/.local/share/research30/research-log.md— cumulative research journal
Results are cached for 24 hours at ~/.cache/research30/. Use --refresh to bypass.
Each result is scored 0-100:
Papers (50% relevance + 25% recency + 25% academic signal):
- Keyword relevance: title match 2x weighted, bigram matching for multi-word queries, position boost for OpenAlex/S2 results
- Recency: newer = higher
- Academic: peer review, citations, journal, author count
HuggingFace (45% relevance + 25% recency + 30% engagement):
- Downloads and likes as engagement signal
Cross-source deduplication uses DOI matching + Jaccard title similarity. Priority: PubMed > S2 > OpenAlex > bioRxiv > medRxiv > arXiv > HuggingFace.
OpenAlex discovers relevant topic clusters first (e.g., "virome" maps to topic T11048), then searches within them — combining ML-classified topics with keyword ranking.
PubMed uses explicit [TIAB] field tags instead of Automatic Term Mapping (which can misfire). Multi-word queries get both exact-phrase and AND-combination: ("CRISPR gene editing"[TIAB] OR (CRISPR[TIAB] AND gene[TIAB] AND editing[TIAB])). MeSH terms are extracted as metadata.
Semantic Scholar provides embedding-based search — catches papers that use different terminology for the same concept. Post-filtered with a higher relevance threshold (0.3 vs 0.1).
cd research30
python3 -m pytest tests/ -vAll tests use bundled fixtures and make no network requests.
research30/
SKILL.md -- Claude Code skill manifest
scripts/
research30.py -- Main orchestrator
lib/
openalex.py -- OpenAlex API (topic discovery + works search)
semanticscholar.py -- Semantic Scholar API (semantic search)
pubmed.py -- PubMed E-utilities (TIAB queries, MeSH)
arxiv.py -- arXiv Atom API
huggingface.py -- HuggingFace Hub API
biorxiv.py -- bioRxiv/medRxiv API (parallel pagination)
schema.py -- Data models
normalize.py -- Schema conversion + keyword relevance
score.py -- Academic-signal scoring
dedupe.py -- Cross-source deduplication
render.py -- Output formatting
cache.py -- 24-hour result caching
http.py -- HTTP client (stdlib only)
xml_parse.py -- XML parsers (arXiv Atom, PubMed)
dates.py -- Date utilities
env.py -- Configuration loading
ui.py -- Terminal progress display
tests/ -- Unit tests (103 tests)
fixtures/ -- Mock API responses
MIT