Hi, I am Qing (Leah) Li, a Bioinformatician & Computational Biologist 🧬 with 7+ years of experience leading, managing, and analyzing bioinformatics and genomics/multi-omics projects, transforming complex datasets into actionable insights across academia and clinical research 💻.

- Omics Data Analysis: Genomic, transcriptomic, proteomic, and single-cell data processing, integration, and visualization (bulk and spatial).
- Variant Discovery and Association Studies: Germline/somatic variant calling, eQTL/pQTL mapping, GWAS/TWAS, colocalization, and Mendelian randomization.
- Machine Learning and Model Development: Deep learning, autoencoders, transfer learning, and interpretable ML for regulatory variant and gene prediction.
- Workflow and Pipeline Development: Reproducible workflows using Nextflow, Docker, Git, and HPC/cloud systems (AWS, Slurm).
- Statistical and Functional Analysis: Differential expression, functional enrichment, pathway/network analysis, multi-ancestry integrative genomics for cancer risk and drug discovery.
- Programming and Tools: Python, R, Bash (Linux), SQL, GitHub, Power BI, RMarkdown, PyTorch, TensorFlow, Scikit-learn.
- Collaboration and Communication: Cross-disciplinary teamwork, teaching and mentoring, conference presentations, and multilingual communication (Mandarin, French, English).
- Risk Genes, Proteins, and Drugs for Colorectal Cancer
- Li Q*, Song Q*, Chen Z*, Choi J, Moreno V, Ping J, Wen W, Li C, Shu X, Yan J, Shu XO, Cai Q, Long J, Huyghe JR, Pai R, Gruber SB, Casey G, Wang X, Toriola AT, Li L, Singh B, Lau KS, Zhou L, Wu C, Peters U, Zheng W, Long Q, Yin Z, Guo X. Large-scale integration of omics and electronic health records to identify potential risk protein biomarkers and therapeutic drugs for cancer prevention and intervention. medRxiv, 2025+ doi:10.1101/2024.05.29.24308170 — in prep. AJHG.
- Chen Z*, Song W*, Li Q*, et al. Identifying putative susceptible transcription factors and genes for colorectal cancer: an integrated analysis of large-scale genetic and multi-omics data. Nature Communications, 2025+ — under minor revision.
- Multi-Ancestry Genomic Analyses
- Lyu L*, Li Q*, Li C, Wen W, Chen Y, Shu XO, Zheng W, Yin Z, Lau KS, Guo X. Multi-ancestry genome-wide and transcriptome-wide association analyses identify new risk loci and genes in inflammatory bowel disease. medRxiv, 2025+ doi:10.1101/2025.09.23.25336483 — under review, Gut.
- Artificial Intelligence and Deep Learning for Omics
- Li Q, Perera D, Chen Z, Wen W, Wang D, Yan J, Shu X, Zheng W, Guo X, Long Q. Leveraging deep transfer learning and multi-omics integration to enhance regulatory variant prediction. bioRxiv, 2025+ doi:10.1101/2023.09.11.557208 — under review, PLOS Genetics.
- Li Q*, Bian J*, Weeraman J*, Zhang Z, Leung A, Ding QX, Chekouo T, Wu L, Yan J, Wu J, Long Q. Autoencoder-transformed transcriptome improves genotype–phenotype association studies. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2025. PMID: 40811310
- Li Q, Yu Y, Kossinna P, Lun T, Liao W, Zhang Q. XA4C: eXplainable representation learning via autoencoders revealing critical genes. PLOS Computational Biology, 2023. PMID: 37782668
- Statistical Genetics and Method Development
- Li Q, Perera D, Cao C, He J, Bian J, Chen X, Azeem F, Howe A, Au B, Wu J, Yan J, Long Q. Interaction-integrated linear mixed model reveals 3D-genetic basis underlying Autism. Genomics, 2023. PMID: 36758877
- Li Q*, Bian J*, Qian Y, Kossinna P, Gordon P, Yan J, Zhou X, Guo X, Wu J, Long Q. An expression-directed linear mixed model (edLMM) discovering low-effect genetic variants. Genetics, 2024. PMID: 38314848
- Cao C*, Ding B*, Li Q, Kwok D, Wu J, Long Q. Power analysis of transcriptome-wide association studies: implications for practical protocol choice. PLOS Genetics, 2021. PMID: 33635859
- Cao C, Kwok D, Edie S, Li Q, Ding B, Kossinna P, Campbell S, Wu J, Greenberg M, Long Q. kTWAS: integrating kernel-machine with transcriptome-wide association studies improves statistical power and reveals novel genes. Briefings in Bioinformatics, 2021. PMID: 33200776
- (Spatial) Single-cell analysis: Python packages - Squidpy | [Scanpy](https://scanpy.readthedocs.io/en/stable/)
- Statistics & Visualization: R packages - DESeq2 | limma | ggplot2 | UpSetR
- Interactive Visualization: R packages - InteractiveComplexHeatmap | Shiny
- Functional Enrichment Analysis: g:Profiler | Metascape | enrichR
- Network Analysis: Cytoscape | BioNERO R package | STRING
- Variant Analysis: Genome Aggregation Database (gnomAD) | ClinVar | DISGENET
- Workflow Manager & Pipelines: Nextflow-based pipelines on a high-performance computing (HPC) cluster

