modiff can either look for differentially methylated loci (DMLs)
or check which regions in a pre-defined set are differentially
methylated.
To find DMLs, modiff performs a two-group test:
it fits a logistic regression with an intercept term and a group term,
then tests the significance of the group term
with a likelihood-ratio test.
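
The test described above can be sketched in a few lines of Python. This is only an illustration, not modiff's actual code: it assumes the counts at a locus are binomial and that all depths are non-zero. Under that assumption, the fitted methylation level of each group is its pooled proportion, so no iterative fitting is needed. The function name `lrt_two_groups` is hypothetical.

```python
import math

def xlogy(x, y):
    """Return x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def binom_loglik(mod, depth, p):
    """Binomial log-likelihood, dropping the term that does not depend on p."""
    return xlogy(mod, p) + xlogy(depth - mod, 1.0 - p)

def lrt_two_groups(group1, group2):
    """Likelihood-ratio test for the group term at a single locus.

    group1, group2: lists of (depth, modified) pairs, one pair per sample.
    Returns (statistic, pvalue). Assumes non-zero total depth in each group.
    """
    d1 = sum(d for d, _ in group1)
    m1 = sum(m for _, m in group1)
    d2 = sum(d for d, _ in group2)
    m2 = sum(m for _, m in group2)
    # Full model (intercept + group term): one methylation level per group.
    ll_full = binom_loglik(m1, d1, m1 / d1) + binom_loglik(m2, d2, m2 / d2)
    # Null model (intercept only): one shared methylation level.
    ll_null = binom_loglik(m1 + m2, d1 + d2, (m1 + m2) / (d1 + d2))
    stat = max(0.0, 2.0 * (ll_full - ll_null))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    pvalue = math.erfc(math.sqrt(stat / 2.0))
    return stat, pvalue

stat, pvalue = lrt_two_groups([(20, 7)], [(23, 19)])
print(f"statistic={stat:.2f} pvalue={pvalue:.4f}")
```

The closed-form p-value uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal, which avoids any dependency outside the standard library.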
Usage is as follows:
modiff FILE
where FILE is a TAB-separated file.
The columns are the contig and the genomic coordinate, followed by the depth and the number of modified bases for each sample. For example, with 2 samples an input line looks like the following:
chr1\t10471\t20\t7\t23\t19
modiff will read directly from stdin if the file name is -.
By default, the columns after the genomic coordinate
are split into two equal parts, one for each group.
Alternatively, the option --ng1 can be used
to specify the size of the first group.
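
As an illustration of the input layout, the following sketch parses one input line and splits its samples into two groups. It is not part of modiff. The function name `parse_line` is hypothetical, and two points are assumptions: that the group size given to --ng1 is counted in samples (not columns), and that an odd number of samples without an explicit group size is an error.

```python
def parse_line(line, ng1=None):
    """Parse one TAB-separated input line.

    Returns (contig, position, group1, group2), where each group is a list
    of (depth, modified) pairs, one pair per sample.
    ng1 is the number of samples in the first group (assumed semantics);
    when it is None the samples are split into two equal halves.
    """
    fields = line.rstrip("\n").split("\t")
    contig, position = fields[0], int(fields[1])
    counts = [int(x) for x in fields[2:]]
    if len(counts) % 2 != 0:
        raise ValueError("expected a (depth, modified) pair for each sample")
    samples = [(counts[i], counts[i + 1]) for i in range(0, len(counts), 2)]
    if ng1 is None:
        if len(samples) % 2 != 0:
            # Assumption: an uneven split needs an explicit group size.
            raise ValueError("odd number of samples: give the first group size")
        ng1 = len(samples) // 2
    return contig, position, samples[:ng1], samples[ng1:]

print(parse_line("chr1\t10471\t20\t7\t23\t19"))
```

With the example line above, the first group holds the sample with depth 20 and 7 modified bases, and the second group holds the sample with depth 23 and 19 modified bases.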

In this mode, the output has the following columns:

| column | quantity | comment |
|---|---|---|
| 1 | contig/chromosome | reference sequence |
| 2 | start position | (inclusive, 0-based) |
| 3 | end position | (exclusive, 0-based) |
| 4 | mean methylation by group | comma separated |
| 5 | total depth by group | comma separated |
| 6 | b0 | regression coefficient (intercept term) |
| 7 | b1 | regression coefficient (group term) |
| 8 | statistic | residual deviance of the logistic regression |
| 9 | pvalue | |
| 10 | adjusted pvalue | BH-adjusted pvalue |
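
For readers unfamiliar with the Benjamini-Hochberg (BH) adjustment mentioned in column 10, here is a minimal sketch of the standard procedure. It is the textbook algorithm, not modiff's own code, and the function name `bh_adjust` is hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing in the raw p-value and capped at 1.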
The regions to be checked have to be listed in a BED file, e.g. regions.bed.
The first step is to run modiff normally to list all the DMLs
in the available data.
After that, assuming the results of the first step are stored in dmls.bed,
one can run:
modiff --regions regions.bed dmls.bed

In this mode, the output has the following columns:

| column | quantity | comment |
|---|---|---|
| 1 | contig/chromosome | reference sequence |
| 2 | start position | (inclusive, 0-based) |
| 3 | end position | (exclusive, 0-based) |
| 4 | nCpGs | number of CpGs in region |
| 5 | mean residual deviance | averaged over the region |
| 6 | pvalue | raw pvalue (computed from the mean residual deviance) |
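
The per-region summary in columns 4 and 5 can be pictured with the sketch below. It is an illustration under stated assumptions, not modiff's implementation: the function name `summarize_regions` is hypothetical, nCpGs is taken to be the number of loci from the first step whose start position falls inside the half-open region, and the region pvalue is left out because the source does not spell out how it is derived.

```python
def summarize_regions(regions, loci):
    """Count loci and average their statistic within each region.

    regions: list of (contig, start, end), 0-based, end exclusive.
    loci:    list of (contig, start, statistic) from the per-locus run.
    Returns one (contig, start, end, n_loci, mean_statistic) per region;
    mean_statistic is None for regions containing no loci.
    """
    summary = []
    for contig, start, end in regions:
        # Simple linear scan; fine for a sketch, slow for large inputs.
        stats = [s for c, pos, s in loci if c == contig and start <= pos < end]
        mean = sum(stats) / len(stats) if stats else None
        summary.append((contig, start, end, len(stats), mean))
    return summary

regions = [("chr1", 100, 200), ("chr2", 0, 50)]
loci = [("chr1", 100, 2.0), ("chr1", 150, 4.0), ("chr1", 200, 9.0)]
print(summarize_regions(regions, loci))
```

The locus at position 200 is excluded from the first region because the end coordinate is exclusive, matching the BED convention used in the tables.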

Feedback is welcome via GitHub issues and/or pull requests, or by email to emanuele dot raineri at cnag dot eu.