Do Whitepaper Claims Predict Market Behavior? Evidence from Cryptocurrency Factor Analysis
Working Paper DAI-2508 | Dissensus AI
This study investigates whether cryptocurrency whitepaper narratives align with empirically observed market factor structure. We construct a pipeline that combines zero-shot NLP classification of 38 whitepapers across 10 semantic categories with CP tensor decomposition of hourly market data (49 assets, 17,543 timestamps). Using Procrustes rotation and Tucker's congruence coefficient, we find weak alignment between claims and market statistics (phi = 0.246, p = 0.339) and negligible alignment between claims and latent factors (phi = 0.058, p = 0.751). Neither result is statistically significant. A methodological validation comparison (statistics versus factors, both derived from market data) achieves significance (p < 0.001), confirming that the pipeline detects real structure when it is present. The null result therefore suggests that whitepaper narratives do not meaningfully predict market factor structure, with implications for narrative economics and investor decision-making. Entity-level analysis shows that specialized tokens (XMR, CRV, YFI) exhibit stronger narrative-market correspondence than broad infrastructure tokens.
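The alignment step described above can be illustrated with a minimal NumPy sketch. It is not the code in `src/alignment/`; the function names, the column-averaged form of the congruence coefficient, and the row-shuffling permutation test are assumptions made for illustration, and both input matrices are assumed to already share the same shape (entities by dimensions).

```python
import numpy as np

def procrustes_rotate(source, target):
    """Rotate `source` onto `target` with the orthogonal matrix R that
    minimizes ||source @ R - target||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)

def tucker_congruence(a, b):
    """Tucker's congruence coefficient per column, averaged over columns.
    Averaging is an assumption; the paper may aggregate differently."""
    num = np.sum(a * b, axis=0)
    den = np.sqrt(np.sum(a ** 2, axis=0) * np.sum(b ** 2, axis=0))
    return float(np.mean(num / den))

def permutation_pvalue(source, target, n_perm=999, seed=0):
    """Hypothetical permutation test: shuffle the rows (entities) of
    `source`, re-align, and count how often phi matches or beats the
    observed value."""
    rng = np.random.default_rng(seed)
    observed = tucker_congruence(procrustes_rotate(source, target), target)
    hits = 0
    for _ in range(n_perm):
        shuffled = source[rng.permutation(source.shape[0])]
        phi = tucker_congruence(procrustes_rotate(shuffled, target), target)
        hits += phi >= observed
    return observed, (hits + 1) / (n_perm + 1)

# Toy check: a rotated copy of a matrix should align almost perfectly.
demo_rng = np.random.default_rng(1)
claims = demo_rng.normal(size=(20, 3))            # stand-in for a claims matrix
rotation, _ = np.linalg.qr(demo_rng.normal(size=(3, 3)))
factors = claims @ rotation                       # same structure, rotated
phi, p_value = permutation_pvalue(claims, factors, n_perm=99)
print(f"phi = {phi:.3f}, p = {p_value:.3f}")
```

Rotating before computing congruence matters because latent factors are only identified up to rotation; without it, two matrices carrying the same structure could appear unrelated.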
| Finding | Result |
|---|---|
| Claims-Statistics alignment | phi = 0.246 (weak, p = 0.339) |
| Claims-Factors alignment | phi = 0.058 (negligible, p = 0.751) |
| Pipeline validation (Stats vs Factors) | Significant (p < 0.001) |
| Variance explained by CP decomposition | 92.45% (rank-2) |
| Assets analyzed | 49 cryptocurrencies, 17,543 timestamps |
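To make the "variance explained" row concrete, here is a minimal sketch of a rank-2 CP decomposition fitted by alternating least squares, with explained variance taken as one minus the ratio of residual to total squared norm. This is an assumption-laden illustration in plain NumPy, not the implementation in `src/tensor_ops/`; the toy tensor shape, the ALS routine, and the uncentered variance definition are all hypothetical.

```python
import numpy as np

def cp_als(tensor, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model to a 3-way tensor by alternating
    least squares. Returns the three factor matrices."""
    rng = np.random.default_rng(seed)
    n_i, n_j, n_k = tensor.shape
    a = rng.normal(size=(n_i, rank))
    b = rng.normal(size=(n_j, rank))
    c = rng.normal(size=(n_k, rank))
    for _ in range(n_iter):
        # Each update solves a linear least-squares problem for one mode
        # while the other two factor matrices are held fixed.
        a = np.einsum("ijk,jr,kr->ir", tensor, b, c) @ np.linalg.pinv((b.T @ b) * (c.T @ c))
        b = np.einsum("ijk,ir,kr->jr", tensor, a, c) @ np.linalg.pinv((a.T @ a) * (c.T @ c))
        c = np.einsum("ijk,ir,jr->kr", tensor, a, b) @ np.linalg.pinv((a.T @ a) * (b.T @ b))
    return a, b, c

def explained_variance(tensor, factors):
    """Share of the tensor's squared norm captured by the CP model."""
    a, b, c = factors
    approx = np.einsum("ir,jr,kr->ijk", a, b, c)
    return float(1.0 - np.sum((tensor - approx) ** 2) / np.sum(tensor ** 2))

# Toy tensor with exact rank-2 structure plus a little noise.
# The mode sizes are placeholders, not the paper's dimensions.
demo_rng = np.random.default_rng(42)
true_a = demo_rng.normal(size=(6, 2))
true_b = demo_rng.normal(size=(5, 2))
true_c = demo_rng.normal(size=(4, 2))
toy = np.einsum("ir,jr,kr->ijk", true_a, true_b, true_c)
toy = toy + 0.01 * demo_rng.normal(size=toy.shape)

ev = explained_variance(toy, cp_als(toy, rank=2))
print(f"explained variance (rank-2): {ev:.4f}")
```

A high value at low rank, as in the table, indicates that a small number of latent factors account for most of the variation in the market tensor.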
cryptocurrency, tensor decomposition, NLP, factor analysis, Procrustes rotation, Tucker's congruence coefficient, zero-shot classification
tensor-defi/
├── src/ # Python modules
│ ├── alignment/ # Procrustes alignment methods
│ ├── nlp/ # NLP classification pipeline
│ ├── tensor_ops/ # CP decomposition operations
│ ├── market/ # Market data processing
│ ├── visualization/ # Plotting utilities
│ └── stats/ # Statistical tests
├── scripts/ # Analysis pipeline scripts
│ ├── run_full_pipeline.py # Complete end-to-end pipeline
│ ├── run_nlp.py # NLP classification
│ ├── run_tensor.py # Tensor construction
│ ├── run_alignment.py # Factor alignment
│ └── run_figures.py # Figure generation
├── paper/ # LaTeX source
│ ├── main-arxiv.tex # Paper source
│ ├── references.bib # Bibliography
│ └── figures/ # Paper figures
├── data/ # Input data (included)
│ ├── whitepapers/ # PDF corpus
│ └── market/ # Parquet market data
├── outputs/ # Pipeline outputs
├── figures/ # Generated figures
├── CITATION.cff
├── requirements.txt
└── LICENSE
python scripts/run_full_pipeline.py
python scripts/run_nlp.py # NLP classification of whitepapers
python scripts/run_tensor.py # Build market tensor
python scripts/run_alignment.py # Compute factor alignment
python scripts/run_figures.py # Generate figures
- RAM: 16GB minimum (32GB recommended for full tensor operations)
- GPU: Optional but recommended for NLP inference (CUDA/ROCm supported)
@article{farzulla2026whitepaper,
author = {Farzulla, Murad},
title = {Do Whitepaper Claims Predict Market Behavior? Evidence from Cryptocurrency Factor Analysis},
year = {2026},
journal = {arXiv preprint arXiv:2601.20336},
doi = {10.5281/zenodo.17917922}
}
- Murad Farzulla, Dissensus AI & King's College London
- ORCID: 0009-0002-7164-8704
- Email: murad@dissensus.ai
Paper content: CC-BY-4.0