pathXcite is an integrated graphical tool for literature-based over-representation analysis (ORA). It tests a list of genes against curated libraries to find enriched pathways, phenotypes, diseases, and other biological concepts.
When experimental data is limited or analyses are exploratory, a literature-driven approach can reveal relevant associations between your topic and over-represented functional terms.
pathXcite is streamlined and accessible: it covers the full workflow (literature curation, gene extraction and mapping, ranking, and interactive comparative enrichment) in a self-contained environment with minimal setup.
Key benefits:
- End-to-end GUI workflow for reproducible analyses
- Support for curated libraries and custom .gmt files
- Interactive, exportable results and comprehensive documentation
- Usable for both quick exploration and in-depth studies
Links:
- Repository: https://github.com/sysbio-bioinf/pathXcite
- Documentation & Tutorials: https://sysbio.uni-ulm.de/software/pathXcite
Contents:
- Key Features
- Quick Start
- Installation
- OS-Specific Setup
- Verify Installation
- Recommended Configuration
- Workflow at a Glance
- Gene Ranking Strategies
- Inputs & Outputs
- Supported Gene Set Libraries
- System Requirements
- Troubleshooting
- Links
- Citation & License
- Contributing
Key Features
- Retrieve literature (PubMed / PMC) and build a curated corpus
- Extract gene mentions and map to official gene symbols (and NCBI Gene IDs via PubTator Central)
- Rank genes by Absolute Frequency or GF-IDF (gene-frequency inverse document frequency)
- Enrichment against 240+ built-in libraries or custom .gmt files
- Interactive visualizations, exportable results, and local self-contained project folders for reproducibility
Quick Start
Check out the Quick Start and step-by-step tutorials on the website (https://sysbio.uni-ulm.de/software/pathXcite).
git clone https://github.com/sysbio-bioinf/pathXcite.git
cd pathXcite
# macOS / Linux
chmod +x ./setup.sh && ./setup.sh
# Windows (PowerShell)
powershell -ExecutionPolicy Bypass -File ".\setup_win.ps1"
Then follow the 5-step workflow:
- Create project
- Add articles
- Extract & map genes
- Rank genes
- Run enrichment
Note: More information and troubleshooting tips are available on our website.
Installation
- Go to the repository: https://github.com/sysbio-bioinf/pathXcite
- Code → Download ZIP
- Unzip (folder name may be pathXcite or pathXcite-main)
- Move the folder to a convenient location (e.g., Documents/)
Alternatively, clone the repository with Git:
git clone https://github.com/sysbio-bioinf/pathXcite.git
After downloading or cloning, you should have:
app/ main.py
setup.sh setup_win.ps1
requirements.txt setup_gmt_files.py
test_imports.py pathXcite_win.bat
...
- Examples use ~/Documents/pathXcite (macOS/Linux) or %HOMEPATH%\Documents\pathXcite (Windows).
- If using the ZIP, your folder might be pathXcite-main; adjust paths accordingly.
- In code blocks, type everything after $ (do not include the $).
OS-Specific Setup
Windows
Step 1 — Open PowerShell
- Press ⊞ Win, type PowerShell (or Windows Terminal), press Enter.
- Or Win + X → Windows Terminal.
Step 2 — Go to the folder
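cd "$env:USERPROFILE\Documents\pathXcite"
(Tip: In File Explorer, right-click the folder background → Open in Terminal. PowerShell does not expand %HOMEPATH%; in Command Prompt, use cd "%HOMEPATH%\Documents\pathXcite" instead.)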
Step 3 — First-time setup
- Option 1: Double-click setup_win.ps1 in File Explorer. If SmartScreen appears: More info → Run anyway.
- Option 2 (with logging):
  powershell -ExecutionPolicy Bypass -File ".\setup_win.ps1"
- If scripts are blocked for the session:
  Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
What this does: creates a local environment using the bundled Python, installs required packages, and launches pathXcite.
Step 4 — Subsequent launches
powershell -ExecutionPolicy Bypass -File ".\setup_win.ps1"
Security prompts you might see
- SmartScreen: More info → Run anyway.
- “Running scripts is disabled”: use the one-time bypass above (Process scope only).
macOS
Step 1 — Open Terminal
- ⌘ + Space → type Terminal → Return, or Applications → Utilities → Terminal.
Step 2 — Go to the folder
cd "$HOME/Documents/pathXcite"
(Tip: Drag the folder into the Terminal window to autofill the path.)
Step 3 — Make the installer executable (first time)
chmod +x ./setup.sh
chmod +x ./pathXcite_os.sh
Step 4 — Run the installer
./setup.sh
Step 5 — Subsequent launches
./pathXcite_os.sh
Security prompts you might see
- Gatekeeper: if you see “can’t be opened,” go to System Settings → Privacy & Security, click Open Anyway, then run again.
- Permission denied: ensure you ran chmod +x setup.sh.
Linux
Step 1 — Open Terminal
- Ctrl + Alt + T, or search Terminal in Activities.
Step 2 — Go to the folder
cd "$HOME/Documents/pathXcite"
Step 3 — Make the installer executable (first time)
chmod +x ./setup.sh
chmod +x ./pathXcite_os.sh
Step 4 — Run the installer
./setup.sh
Step 5 — Subsequent launches
./pathXcite_os.sh
Common hints
- Permission denied: ensure chmod +x setup.sh was run.
- Missing unzip (only if unzipping via terminal):
  sudo apt update && sudo apt install unzip
Verify Installation
- After setup completes, pathXcite launches automatically.
- Keep the terminal open during the first run; a few packages may finish installing in the background.
- If you moved the folder after installing, re-run the installer once in the new location.
- If it doesn’t start, re-run the OS-specific command from the correct folder.
Recommended Configuration
- Create an API key in your NCBI Account settings.
- In pathXcite, open Settings → Account API.
- Paste the key and your email address, then save. With a key, retrieval is typically about 3-4 times faster.
- In Settings → Libraries, download or update Enrichr libraries used for enrichment.
- To use custom terms, add your own .gmt files.
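The speed-up reported for an API key is consistent with NCBI's E-utilities rate limits: 3 requests per second without a key and 10 with one. The sketch below is not pathXcite's retrieval code; it only illustrates how a key and email address are attached to an efetch request and how the minimum delay between requests changes. The helper names `build_efetch_url` and `min_request_interval` are hypothetical.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_efetch_url(pmids, api_key=None, email=None):
    """Build an efetch URL for a batch of PubMed IDs (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "id": ",".join(str(p) for p in pmids),
        "retmode": "xml",
    }
    if email:
        params["email"] = email
    if api_key:
        params["api_key"] = api_key
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


def min_request_interval(api_key=None):
    """Seconds to wait between requests: 10 req/s with a key, 3 req/s without."""
    return 1 / 10 if api_key else 1 / 3


# Example with placeholder credentials (not real values).
url = build_efetch_url([12345, 67890], api_key="YOUR_KEY", email="you@example.org")
print(url)
print(min_request_interval("YOUR_KEY"), min_request_interval())
```

A client that sleeps for `min_request_interval(...)` between calls stays within the documented limits in either mode.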
Workflow at a Glance
- Create a project — A folder stores corpus, annotations, and results (portable, reproducible).
- Add articles — Supply PMIDs/PMCIDs or use the integrated browser; curate as needed.
- Extract genes — Annotate gene mentions and map to NCBI Gene IDs.
- Rank genes — Choose Absolute Frequency or GF-IDF (topic-specific weighting).
- Run enrichment — Select libraries, inspect interactive visualizations, export results.
See the Quick Start and full walkthroughs with screenshots on the website: https://sysbio.uni-ulm.de/software/pathXcite
Gene Ranking Strategies
- Absolute Frequency — counts mentions; highlights widely studied genes.
- GF-IDF — emphasizes topic-specific genes by down-weighting globally common ones.
Tip: Run enrichment with both lists to reveal complementary signals.
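To make the difference between the two rankings concrete, here is a minimal sketch. This README does not state the exact GF-IDF formula pathXcite uses, so the code assumes a TF-IDF analogue: a gene's share of all mentions in the corpus, multiplied by the log-inverse of how many documents in a background collection mention it. The function names, the toy corpus, and the background counts are all hypothetical.

```python
import math
from collections import Counter


def absolute_frequency(corpus):
    """Count gene mentions; corpus is a list of documents (lists of gene symbols)."""
    counts = Counter()
    for genes in corpus:
        counts.update(genes)
    return counts


def gf_idf(corpus, background_df, n_background):
    """Assumed TF-IDF analogue: corpus mention share times log-inverse background frequency."""
    counts = absolute_frequency(corpus)
    total = sum(counts.values())
    scores = {}
    for gene, count in counts.items():
        gf = count / total
        # +1 avoids division by zero for genes absent from the background.
        idf = math.log(n_background / (1 + background_df.get(gene, 0)))
        scores[gene] = gf * idf
    return scores


# Toy data: TP53 is mentioned most often, but is also common in the background.
corpus = [["TP53", "BRCA1"], ["TP53", "FOXP2"], ["TP53"]]
background_df = {"TP53": 900, "BRCA1": 300, "FOXP2": 10}  # hypothetical counts

by_frequency = [g for g, _ in absolute_frequency(corpus).most_common()]
scores = gf_idf(corpus, background_df, n_background=1000)
by_gf_idf = sorted(scores, key=scores.get, reverse=True)
print(by_frequency)
print(by_gf_idf)
```

In this toy example the frequently mentioned gene leads the Absolute Frequency list, while the gene that is rare in the background moves to the top under the assumed GF-IDF weighting.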
Inputs & Outputs
Inputs
- PMIDs / PMCIDs (or integrated browser search results)
- Optional species filters
- Optional custom libraries via .gmt
Outputs
- Ranked gene lists (Absolute Frequency, GF-IDF) with NCBI Gene IDs
- Enrichment tables (p-values, FDR, overlap)
- Interactive plots; exportable figures and tables
- Self-contained project folders for reproducibility
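The p-value, FDR, and overlap columns of an enrichment table can be illustrated with a small over-representation sketch. This README does not specify which statistical test pathXcite applies, so the code assumes the common choice for ORA: a one-sided hypergeometric test per term, followed by Benjamini-Hochberg correction. All function names and the toy library are hypothetical, and the query genes are assumed to lie within the gene universe.

```python
from math import comb


def hypergeom_pvalue(overlap, set_size, query_size, universe_size):
    """P(X >= overlap) when drawing query_size genes from the universe."""
    upper = min(set_size, query_size)
    tail = sum(
        comb(set_size, i) * comb(universe_size - set_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(universe_size, query_size)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(query_genes, library, universe_size):
    """Return (term, overlap, p-value, FDR) rows sorted by p-value."""
    query = set(query_genes)
    rows = []
    for term, genes in library.items():
        gene_set = set(genes)
        k = len(query & gene_set)
        p = hypergeom_pvalue(k, len(gene_set), len(query), universe_size)
        rows.append((term, k, p))
    fdrs = benjamini_hochberg([p for _, _, p in rows])
    table = [(t, k, p, q) for (t, k, p), q in zip(rows, fdrs)]
    return sorted(table, key=lambda row: row[2])


# Toy library with made-up gene identifiers.
library = {"Term A": ["g1", "g2", "g3"], "Term B": ["g7", "g8"]}
for row in enrich(["g1", "g2", "g4"], library, universe_size=20):
    print(row)
```

The universe size has a strong effect on the p-values, so it should reflect the set of genes that could plausibly have appeared in the query list.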
Supported Gene Set Libraries
- Gene Ontology (BP/MF/CC), HPO
- KEGG, Reactome, WikiPathways, BioCarta, BioPlanet
- TF/regulatory: ChEA, ENCODE, JASPAR, TRRUST
- Disease: DisGeNET, OMIM, ClinVar, Orphanet, MGI
- Drug signatures: LINCS L1000, DSigDB, GEO perturbations
- Cell/tissue: GTEx, ARCHS4, CellMarker, Azimuth, Allen Brain Atlas
- Complexes & PPIs: CORUM, BioPlex, virus–host interaction sets
- Custom: import your own .gmt files
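For anyone preparing a custom library: a .gmt file is tab-separated with one gene set per line, where the first field is the term name, the second a description, and the remaining fields the member genes. The sketch below reads that layout into a dictionary. It is an illustration of the file format, not pathXcite's own loader, and `parse_gmt` is a hypothetical name.

```python
def parse_gmt(lines):
    """Parse GMT-formatted lines into {term: set of gene symbols}."""
    library = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # a valid row needs a name, a description, and at least one gene
        term, _description, *genes = fields
        library[term] = {g.strip() for g in genes if g.strip()}
    return library


# Example rows with made-up term names.
example = [
    "My pathway\tcustom set\tTP53\tBRCA1\tATM\n",
    "Another set\t\tFOXP2\tCNTNAP2\n",
]
print(parse_gmt(example))
```

To read a file from disk, pass an open file handle, for example `with open("my_terms.gmt", encoding="utf-8") as fh: library = parse_gmt(fh)`, where `my_terms.gmt` is a placeholder path.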
System Requirements
- Windows 10+, macOS 14+ (Intel/Apple Silicon), or Ubuntu 20.04+
- ≥ 4 GB RAM (8 GB recommended)
- ~4 GB disk (tool + databases)
- Internet required for literature retrieval and library downloads (after article retrieval and caching, offline use is possible)
Troubleshooting
- “Permission denied” (macOS/Linux):
  chmod +x setup.sh
- “Running scripts is disabled” (Windows): one-time bypass (Process scope only):
  Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
- SmartScreen / Gatekeeper: choose More info → Run anyway (Windows) or Open Anyway in Privacy & Security (macOS).
- Slow first run: set your Entrez API key (Settings → Entrez) and ensure network connectivity.
- Folder name mismatch (after ZIP): your folder may be pathXcite-main; update your cd path.
Links
- Project repo: https://github.com/sysbio-bioinf/pathXcite
- Documentation, Quick Start, Tutorials: https://sysbio.uni-ulm.de/software/pathXcite
Citation & License
If you use pathXcite in your research, please cite:
[../DOI or CITATION.cff]
Licensed under the MIT License. See LICENSE.
Citation details in CITATION (or CITATION.cff).
