RELI (Regulatory Element Locus Intersection) is an algorithm for discovering transcription factors (TFs) that bind a significant number of loci associated with a given disease or phenotype (e.g., loci identified through a genome-wide association study, or GWAS).
The major data components are:
- An input set of disease- or phenotype-associated genetic variants (RS IDs)
- An internal “library” consisting of many ChIP-seq dataset peaks (in the form of .bed files)
- An internal file containing information on genetic variant allele frequencies, etc.
To assess the significance of the intersection between the input disease variants and a given TF ChIP-seq dataset, RELI performs simulations, generating a null distribution used for P-value calculations.
The output of RELI is a series of statistics based upon the significance of the overlap between the input genetic variants and the selected ChIP-seq dataset.
Additional details on RELI and the associated findings can be found in its accompanying publication.
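To make the idea of a simulated null distribution concrete, here is a minimal sketch. It is not RELI's actual implementation: the helper `null_stats` is hypothetical, the counts are made up, and the empirical P-value formula is just one common choice. It reads one simulated overlap count per line and compares them with an observed count.

```shell
# Hypothetical helper: $1 is the observed overlap count; simulated (null)
# overlap counts are read from stdin, one per line.
null_stats() {
  awk -v obs="$1" '
    { n++; sum += $1; sumsq += $1 * $1; if ($1 >= obs) ge++ }
    END {
      mean = sum / n
      var  = sumsq / n - mean * mean          # population variance of the null
      sd   = (var > 0) ? sqrt(var) : 0
      z    = (sd > 0) ? (obs - mean) / sd : 0 # z-score of the observed count
      p    = (ge + 1) / (n + 1)               # empirical P-value with add-one correction
      printf "mean=%.2f sd=%.2f z=%.2f empirical_p=%.4f\n", mean, sd, z, p
    }'
}

# Eight made-up null counts compared against an observed overlap of 9.
result=$(printf '%s\n' 3 5 4 6 2 5 4 3 | null_stats 9)
echo "$result"
```

With only a handful of simulations the empirical P-value cannot get very small, which is one reason a large number of simulations is typically used.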
If you have the Common Workflow Language's cwltool already
installed and Docker available on your system, this is the most
straightforward way to run RELI on some sample data:
# clone the public repository and check out the 'cwl-docker-workflow' branch
git clone https://github.com/WeirauchLab/RELI.git
cd RELI
git checkout cwl-docker-workflow
# retrieve sample data and use CWL to run with example input parameters
make fetchdata
cwltool workflow/reli-docker.cwl workflow/reli-example-eu-ancestry.yaml
This will run RELI on a small input set of ChIP-seq data of European ancestry using a set of lupus (SLE)-associated SNPs.
RELI requires a C++11 compiler (e.g. GNU CC 4.7 or higher) and libgsl and
libgslcblas from the GNU Scientific Library.
You may download the latest release as a compressed archive from GitHub, or clone the repository with Git:
# GitHub
git clone https://github.com/WeirauchLab/RELI.git
# Weirauch Lab GitLab
git clone https://tfwebdev.research.cchmc.org/gitlab/ches2d/RELI_public.git
This is the recommended method if you have some familiarity with Docker, as it does not require you to download a compiler or any of the dependencies necessary to build RELI from source.
You will need to install the appropriate Docker client for your OS. Please see the official docs for help with that.
The most straightforward way to get started is to simply use the pre-built image available on Docker Hub:
docker run -it --rm weirauchlab/reli RELI --help
Or, if you've cloned the RELI_public repository (see above), you can locally
build the CentOS 7-based Docker container and compile RELI from source as
follows:
cd /path/to/cloned/repo
docker build -t reli .
# test to see if it works
docker run -it --rm reli RELI --help
A GNU-style Makefile is provided in the repository. With GSL installed
system-wide, you can build the RELI binary with just
make
then run ./RELI with no arguments to verify that you have a working binary
(you should get a help screen).
To run a test analysis, you need to download the sample data either manually (see the next section) or just type
make test
which will download and validate the sample datasets automatically, then run
example/example_run.sh to invoke RELI on the sample data.
This test analysis requires around 10 GB of RAM to finish successfully; 16 GB is recommended.
The included Makefile will respect CFLAGS and LDFLAGS if set in the
environment, for example, if you have a locally-built GSL that is installed in
a non-standard place (such as in your home directory):
CFLAGS=-I/path/to/include LDFLAGS=-L/path/to/lib make
If g++ is not available in your PATH (or it has a different name), you will
likely want to modify the Makefile directly, beginning around line 33 with the
CC variable.
RELI has also been verified to build and run on the following platforms (in addition to GNU/Linux):
- Windows with Cygwin and GCC 5.4.0 (ensure the gcc-g++, make, gsl, libgsl-devel, and curl packages are installed, at a minimum)
- macOS 10.14.6 (Mojave) with LLVM 10.0.1 (clang-1001.0.46.4) from the Xcode Command Line Tools and GSL installed from MacPorts
  - you need to specify paths to MacPorts' includes/libs like this, before running make:
    export CFLAGS=-I/opt/local/include LDFLAGS=-L/opt/local/lib
On Windows, make sure you run make (or the example/example_run.sh script)
from within the Cygwin shell, not the Windows Command Prompt or PowerShell.
You may need to lightly modify the CDT build toolchain settings if your
installation of Cygwin is not at C:\Cygwin64.
Eclipse CDT project settings files are also included for both of the
above toolchains. Just create a copy (or symlink) of the appropriate one
called .cproject, then choose File → Import... → Existing
Projects into Workspace and browse to where you cloned the repository.
If you have problems with make test (perhaps you don't have curl
available), you can manually download and extract the sample datasets from
such that the decompressed data is inside a data subdirectory, within the
RELI_public repository you cloned above. A .zip-format archive is also
provided, in case for some reason you don't have bzip2 available.
You can run the sample analysis by changing into the example directory and
running example_run.sh in a terminal like so:
user@[/path/to/repo]$ cd example
user@[/path/to/repo]$ ./example_run.sh
Required options are in bold text; all others are marked "(optional)".
| Option | Explanation |
|---|---|
| **-snp** FILE | Phenotype SNP file in 4-column BED format |
| -ld FILE | (optional) Phenotype linkage disequilibrium structure for SNPs, default: no LD file |
| **-index** FILE | ChIP-seq index file |
| **-data** DIR | Directory where ChIP-seq data are stored |
| **-target** STRING | Target label of ChIP-seq experiment to be tested from index file |
| **-build** FILE | Genome build file |
| **-null** FILE | Null model file |
| **-dbsnp** FILE | dbSNP table file |
| **-out** DIR | Output directory name under the current working folder |
| -match | (optional) Boolean switch to turn on minor allele frequency-based matching, default: off |
| -rep NUMBER | (optional) Number of permutations/simulations to be performed, default: 2000 |
| -corr NUMBER | (optional) Bonferroni correction multiplier for multiple testing, default: 1 |
| -phenotype STRING | (optional) User-provided phenotype name, default: "." |
| -ancestry STRING | (optional) User-provided ancestry name, default: "." |
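As an illustration of how the options above fit together, the following dry-run sketch assembles a RELI command line and prints it without executing anything. Every path and the target label are hypothetical placeholders, not files shipped with the repository; the set of options shown as mandatory is assumed to be the ones the table does not mark as optional.

```shell
# Hypothetical placeholder values; substitute your own files and label.
SNP_FILE=/path/to/phenotype.snp      # phenotype SNPs, 4-column BED
INDEX_FILE=/path/to/ChIPseq.index    # ChIP-seq index file
DATA_DIR=/path/to/ChIP-seq           # directory holding the ChIP-seq data
TARGET=example_label                 # a label listed in the index file
BUILD_FILE=/path/to/genome.build     # genome build file
NULL_FILE=/path/to/null.model        # null model file
DBSNP_FILE=/path/to/dbsnp.table      # dbSNP table file
OUT_DIR=output                       # output directory name under the current working folder

# Collect the arguments as positional parameters so quoting is preserved.
set -- \
  -snp "$SNP_FILE" \
  -index "$INDEX_FILE" \
  -data "$DATA_DIR" \
  -target "$TARGET" \
  -build "$BUILD_FILE" \
  -null "$NULL_FILE" \
  -dbsnp "$DBSNP_FILE" \
  -out "$OUT_DIR" \
  -rep 2000 \
  -corr 1

# Dry run: print the command for review instead of executing it.
RELI_CMD="./RELI $*"
echo "$RELI_CMD"
# To run for real once the paths exist: ./RELI "$@"
```

The `-rep 2000` and `-corr 1` arguments simply restate the documented defaults, so they can be dropped or adjusted.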
To add an additional ChIP-seq dataset, create an entry in the ChIP-seq index
file (data/ChIPseq.index) with the following tab-delimited format:
label ⇥ source ⇥ Cell ⇥ TF ⇥ Cell label ⇥ PMID ⇥ Group ⇥ EBV Status ⇥ Species
where label corresponds to the filename, which you should deposit in the
data/ChIP-seq directory (in BED 4 column format).
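The nine-field entry described above could be appended with a small helper like the sketch below. The function name `add_index_entry` and all field values are hypothetical, chosen only to show the tab-delimited layout; the demo writes to a scratch file rather than the real index.

```shell
# Append one tab-delimited entry (9 fields) to a ChIP-seq index file.
# Usage: add_index_entry INDEX label source cell tf cell_label pmid group ebv_status species
add_index_entry() {
  index_file=$1
  shift
  if [ "$#" -ne 9 ]; then
    echo "expected 9 fields, got $#" >&2
    return 1
  fi
  printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' "$@" >> "$index_file"
}

# Demo against a scratch file; use data/ChIPseq.index for a real entry.
INDEX=$(mktemp)
add_index_entry "$INDEX" \
  my_tf_peaks example_source example_cell example_tf example_cell_label \
  00000000 example_group example_status "Homo sapiens"
cat "$INDEX"
```

Because the label corresponds to the filename, the matching 4-column BED file would also need to be placed in the data/ChIP-seq directory.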
To use a different genome build, use the UCSC fetchChromSizes utility
(usage information here) to download chromosome information for that
build. You may wish to prune lines representing unmapped chromosome information
(e.g., chrN_glXXXXXX_random and chrUn_glXXXXXX) from the downloaded data
file.
Be advised, however, that the null model included with the data was generated for Homo sapiens at build hg19; using a later "hg" build may invalidate this model.
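The pruning step mentioned above might look like the following sketch. The helper name `prune_chrom_sizes` and the sample lines are illustrative assumptions; the input is assumed to be the two-column name/size text that `fetchChromSizes <db> > <db>.chrom.sizes` writes.

```shell
# Drop unmapped sequences: names containing "_random" or starting with "chrUn_".
prune_chrom_sizes() {
  grep -v -E '_random|^chrUn_' "$1"
}

# Illustrative input in "name<TAB>size" layout (values are for demonstration only).
DEMO=$(mktemp)
printf 'chr1\t249250621\nchr4_gl000193_random\t189789\nchrUn_gl000211\t166566\nchrX\t155270560\n' > "$DEMO"

PRUNED=$(prune_chrom_sizes "$DEMO")
echo "$PRUNED"
```

Only the chr1 and chrX lines remain in the output; redirect it to a new file to use the result as a genome build file.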
If you need support for a different organism, please contact us via email for additional details (see "Feedback" section, below), or file an issue against the public GitHub repository.
Transcription factors operate across disease loci, with EBNA2 implicated in autoimmunity.
Harley JB, Chen X, Pujato M, Miller D, Maddox A, Forney C, Magnusen AF, Lynch A, Chetal K, Yukawa M, Barski A, Salomonis N, Kaufman KM, Kottyan LC, Weirauch MT.
Nat Genet. 2018 Apr 16. doi: 10.1038/s41588-018-0102-3. Epub 2018 Apr 16.
PMID: 29662164
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See LICENSE.txt for more details.
Please report any issues with RELI (or feature suggestions) in our GitHub issue tracker.
With other questions, you may contact Dr. Chen (the primary author of RELI) or Dr. Weirauch via email.
| Name | Institution | Remarks |
|---|---|---|
| Dr. Xiaoting Chen | Cincinnati Children's Hospital | primary author |
Project avatar based on Wikimedia Commons Chromosome_18.svg